Bootstrap Confidence Intervals

Every public metric in Shesha supports an optional outer bootstrap that puts a confidence interval around the point estimate. When the bootstrap is enabled, the function returns a dictionary with the bootstrap mean, lower and upper CI bounds, standard deviation, and metadata instead of a single float.

How it works

The rows (samples) of the input data are resampled with replacement.
The metric is recomputed on each resampled dataset.
Percentile-based confidence intervals are computed from the distribution of bootstrap estimates.

This “outer bootstrap” is independent of any internal iterations the metric already performs (e.g. n_splits in feature_split). It quantifies uncertainty due to the finite sample of observations.
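The three steps above can be sketched in plain NumPy. This is a minimal illustration of a percentile bootstrap, not Shesha's actual implementation: the helper name outer_bootstrap_ci and the toy metric are hypothetical, and details such as the standard-deviation convention (ddof=0 here) are assumptions.

```python
import numpy as np

def outer_bootstrap_ci(metric, X, n_bootstrap_ci=1000, ci=0.95, seed=None):
    """Percentile bootstrap over the rows of X (illustrative helper, not part of Shesha)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    estimates = []
    for _ in range(n_bootstrap_ci):
        idx = rng.integers(0, n, size=n)   # step 1: draw row indices with replacement
        value = metric(X[idx])             # step 2: recompute the metric on the resample
        if not np.isnan(value):            # resamples that yield NaN are dropped
            estimates.append(value)
    estimates = np.asarray(estimates)
    alpha = (1.0 - ci) / 2.0
    # step 3: percentile interval from the bootstrap distribution
    ci_low, ci_high = np.quantile(estimates, [alpha, 1.0 - alpha])
    return {
        "mean": float(estimates.mean()),
        "ci_low": float(ci_low),
        "ci_high": float(ci_high),
        "std": float(estimates.std()),     # assumption: population std (ddof=0)
        "n_bootstraps": int(estimates.size),
        "ci_level": ci,
    }

# Toy metric for demonstration only: mean per-feature variance
X = np.random.default_rng(0).normal(size=(200, 5))
result = outer_bootstrap_ci(lambda A: A.var(axis=0).mean(), X,
                            n_bootstrap_ci=200, seed=320)
print(result)
```

Because the metric is passed in as a callable, any internal iterations it performs (such as n_splits) run unchanged inside each resample, which is what makes the bootstrap "outer".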

Usage

Pass n_bootstrap_ci (number of resamples) and optionally ci (confidence level, default 0.95) to any metric function:

import numpy as np
import shesha

X = np.random.randn(500, 768)

# Point estimate (default behaviour, returns float)
stability = shesha.feature_split(X, n_splits=30, seed=320)

# With 95% bootstrap CI (returns dict)
result = shesha.feature_split(X, n_splits=30, seed=320, n_bootstrap_ci=1000)
print(result)
# {'mean': 0.42, 'ci_low': 0.38, 'ci_high': 0.46,
#  'std': 0.021, 'n_bootstraps': 1000, 'ci_level': 0.95}

# 99% CI
result_99 = shesha.feature_split(X, n_splits=30, seed=320,
                                  n_bootstrap_ci=1000, ci=0.99)

Return format

When n_bootstrap_ci is set, the function returns a dictionary:

Key	Description
`mean`	Mean of the bootstrap distribution
`ci_low`	Lower bound of the confidence interval
`ci_high`	Upper bound of the confidence interval
`std`	Standard deviation of bootstrap estimates
`n_bootstraps`	Number of successful resamples (may be < `n_bootstrap_ci` if some resamples yield NaN)
`ci_level`	The confidence level used (e.g. 0.95)
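Since the return type depends on whether n_bootstrap_ci is set, downstream code may need to handle both a float and a dictionary. The following is one possible way to do that. The helper describe_estimate is hypothetical and relies only on the keys listed in the table above; the example dictionary reuses the illustrative values from the Usage section.

```python
def describe_estimate(result):
    """Format a metric result that is either a float or a bootstrap dict (hypothetical helper)."""
    if isinstance(result, dict):
        return (
            f"{result['mean']:.3f} "
            f"({result['ci_level']:.0%} CI {result['ci_low']:.3f} to {result['ci_high']:.3f}, "
            f"{result['n_bootstraps']} resamples)"
        )
    # Plain point estimate: no interval to report
    return f"{result:.3f}"

example = {"mean": 0.42, "ci_low": 0.38, "ci_high": 0.46,
           "std": 0.021, "n_bootstraps": 1000, "ci_level": 0.95}
print(describe_estimate(example))  # 0.420 (95% CI 0.380 to 0.460, 1000 resamples)
print(describe_estimate(0.42))     # 0.420
```

Reporting n_bootstraps alongside the interval makes it visible when fewer resamples succeeded than were requested.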

Examples across modules

Core (unsupervised)

result = shesha.feature_split(X, n_splits=30, n_bootstrap_ci=1000, seed=320)
result = shesha.sample_split(X, n_splits=30, n_bootstrap_ci=1000, seed=320)
result = shesha.anchor_stability(X, n_bootstrap_ci=1000, seed=320)

Core (supervised)

result = shesha.variance_ratio(X, y, n_bootstrap_ci=1000, seed=320)
result = shesha.supervised_alignment(X, y, n_bootstrap_ci=1000, seed=320)
result = shesha.class_separation_ratio(X, y, n_bootstrap_ci=1000, seed=320)
result = shesha.lda_stability(X, y, n_bootstrap_ci=1000, seed=320)

Core (drift)

result = shesha.rdm_similarity(X, Y, n_bootstrap_ci=1000, seed=320)
result = shesha.rdm_drift(X, Y, n_bootstrap_ci=1000, seed=320)

Bio (perturbation analysis)

from shesha.bio import perturbation_stability, perturbation_effect_size

result = perturbation_stability(X_ctrl, X_pert, n_bootstrap_ci=1000, seed=320)
result = perturbation_effect_size(X_ctrl, X_pert, n_bootstrap_ci=1000, seed=320)

Sim (similarity metrics)

from shesha.sim import cka, cka_linear, cka_debiased
from shesha.sim import procrustes_similarity, rdm_similarity

result = cka(X, Y, n_bootstrap_ci=1000, seed=320)
result = cka_linear(X, Y, n_bootstrap_ci=1000, seed=320)
result = cka_debiased(X, Y, n_bootstrap_ci=1000, seed=320)
result = procrustes_similarity(X, Y, n_bootstrap_ci=1000, seed=320)
result = rdm_similarity(X, Y, n_bootstrap_ci=1000, seed=320)

Choosing `n_bootstrap_ci`

Quick exploration: 200–500 resamples
Publication-quality: 1000–10000 resamples
Computational cost: scales linearly with n_bootstrap_ci. Each resample runs the full metric computation, so expensive metrics (e.g. anchor_stability on large data) will take proportionally longer.
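Because cost grows linearly with n_bootstrap_ci, a rough run-time budget can be extrapolated from a few timed point-estimate calls. The sketch below assumes that the resampling overhead is negligible next to the metric itself and that no parallelism is involved. The helper estimate_bootstrap_seconds and the stand-in workload toy_metric are hypothetical; in practice the callable would wrap a point-estimate metric call on your own data.

```python
import time

def estimate_bootstrap_seconds(metric_call, n_bootstrap_ci, n_timing_runs=3):
    """Extrapolate total bootstrap time from a few timed calls (hypothetical helper)."""
    start = time.perf_counter()
    for _ in range(n_timing_runs):
        metric_call()
    seconds_per_call = (time.perf_counter() - start) / n_timing_runs
    # Linear scaling: one full metric computation per resample
    return seconds_per_call * n_bootstrap_ci

def toy_metric():
    # Stand-in workload so the example runs without any data
    return sum(i * i for i in range(50_000))

for n in (200, 1000, 10000):
    seconds = estimate_bootstrap_seconds(toy_metric, n)
    print(f"n_bootstrap_ci={n}: roughly {seconds:.1f} s")
```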

Resampling strategy

Single-matrix metrics (feature_split, sample_split, etc.): rows of X (and y if supervised) are resampled together with the same indices.
Two-matrix metrics (rdm_similarity, cka, etc.): both X and Y are resampled with the same indices (paired bootstrap).
Bio metrics (perturbation_stability, perturbation_effect_size): control and perturbed populations are resampled independently.
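The difference between the three strategies comes down to how the row indices are drawn. The NumPy sketch below illustrates the idea on toy arrays. It is not Shesha's internal code, and the helper resample_indices and all array shapes are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(320)

def resample_indices(n, rng):
    """Draw n row indices with replacement (hypothetical helper)."""
    return rng.integers(0, n, size=n)

# Single-matrix, supervised: X and y share one index draw so labels stay attached
X = rng.normal(size=(100, 8))
y = rng.integers(0, 3, size=100)
idx = resample_indices(len(X), rng)
X_b, y_b = X[idx], y[idx]

# Two-matrix (paired): row i of X and row i of Y describe the same sample,
# so both are indexed with the same draw
Y = X + 1.0  # toy second representation
idx = resample_indices(len(X), rng)
X_p, Y_p = X[idx], Y[idx]

# Two populations (independent): each group gets its own draw and keeps its own size
X_ctrl = rng.normal(size=(60, 8))
X_pert = rng.normal(loc=0.5, size=(40, 8))
ctrl_b = X_ctrl[resample_indices(len(X_ctrl), rng)]
pert_b = X_pert[resample_indices(len(X_pert), rng)]
```

Sharing indices preserves the row-wise correspondence that paired metrics depend on, while independent draws suit control and perturbed populations that have no one-to-one matching and may differ in size.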

Reproducibility

Pass seed for deterministic results:

result1 = shesha.feature_split(X, n_bootstrap_ci=1000, seed=320)
result2 = shesha.feature_split(X, n_bootstrap_ci=1000, seed=320)
assert result1 == result2  # identical