Bootstrap Confidence Intervals
Every public metric in Shesha supports an optional outer bootstrap for computing confidence intervals on the point estimate. When the bootstrap is enabled, the function returns a dictionary with the mean, lower and upper CI bounds, standard deviation, and metadata instead of a single float.
How it works
1. The input data is resampled with replacement (rows/samples).
2. The metric is recomputed on each resampled dataset.
3. Percentile-based confidence intervals are computed from the distribution of bootstrap estimates.
This “outer bootstrap” is independent of any internal iterations the metric
already performs (e.g. n_splits in feature_split). It quantifies
uncertainty due to the finite sample of observations.
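
The three steps above can be sketched in plain NumPy. This is a minimal illustration, not Shesha's implementation: the helper name `percentile_bootstrap_ci`, the toy metric, the choice to skip non-finite estimates, and the use of `ddof=1` for the standard deviation are all assumptions made for the example.

```python
import numpy as np

def percentile_bootstrap_ci(metric, X, n_bootstrap_ci=1000, ci=0.95, seed=None):
    """Hypothetical helper: outer percentile bootstrap over the rows of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    estimates = []
    for _ in range(n_bootstrap_ci):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        value = metric(X[idx])             # recompute the metric on the resample
        if np.isfinite(value):             # assumption: drop failed resamples
            estimates.append(float(value))
    estimates = np.asarray(estimates)

    # Percentile interval: cut (1 - ci) / 2 of the mass from each tail.
    alpha = 1.0 - ci
    ci_low, ci_high = np.percentile(
        estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return {
        "mean": float(estimates.mean()),
        "ci_low": float(ci_low),
        "ci_high": float(ci_high),
        "std": float(estimates.std(ddof=1)),  # assumption: sample std
        "n_bootstraps": int(estimates.size),
        "ci_level": ci,
    }

# Toy metric (illustration only): mean of the first column.
X_demo = np.random.default_rng(0).normal(loc=1.0, size=(200, 5))
demo = percentile_bootstrap_ci(
    lambda A: A[:, 0].mean(), X_demo, n_bootstrap_ci=500, seed=320
)
print(demo)
```

Any callable that maps a row-resampled array to a single number can be passed as `metric`, which is why the outer loop does not depend on what the metric does internally.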
Usage
Pass n_bootstrap_ci (number of resamples) and optionally ci (confidence
level, default 0.95) to any metric function:
import numpy as np
import shesha
X = np.random.randn(500, 768)
# Point estimate (default behaviour, returns float)
stability = shesha.feature_split(X, n_splits=30, seed=320)
# With 95% bootstrap CI (returns dict)
result = shesha.feature_split(X, n_splits=30, seed=320, n_bootstrap_ci=1000)
print(result)
# {'mean': 0.42, 'ci_low': 0.38, 'ci_high': 0.46,
# 'std': 0.021, 'n_bootstraps': 1000, 'ci_level': 0.95}
# 99% CI
result_99 = shesha.feature_split(X, n_splits=30, seed=320,
                                 n_bootstrap_ci=1000, ci=0.99)
Return format
When n_bootstrap_ci is set, the function returns a dictionary:
| Key | Description |
|---|---|
| mean | Mean of the bootstrap distribution |
| ci_low | Lower bound of the confidence interval |
| ci_high | Upper bound of the confidence interval |
| std | Standard deviation of bootstrap estimates |
| n_bootstraps | Number of successful resamples (may be < n_bootstrap_ci) |
| ci_level | The confidence level used (e.g. 0.95) |
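
As a small illustration of consuming the returned dictionary, the sketch below formats it as a one-line summary. The helper `summarize_ci` is hypothetical and not part of Shesha; the dictionary values are copied from the example output in the Usage section rather than computed.

```python
def summarize_ci(result):
    """Hypothetical helper: one-line summary of a bootstrap result dict."""
    pct = round(100 * result["ci_level"])
    return (
        f"{result['mean']:.3f} "
        f"[{pct}% CI: {result['ci_low']:.3f}, {result['ci_high']:.3f}] "
        f"(std={result['std']:.3f}, n={result['n_bootstraps']})"
    )

# Values taken from the example output in the Usage section.
example = {"mean": 0.42, "ci_low": 0.38, "ci_high": 0.46,
           "std": 0.021, "n_bootstraps": 1000, "ci_level": 0.95}
summary = summarize_ci(example)
print(summary)
# 0.420 [95% CI: 0.380, 0.460] (std=0.021, n=1000)
```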
Examples across modules
Core (unsupervised)
result = shesha.feature_split(X, n_splits=30, n_bootstrap_ci=1000, seed=320)
result = shesha.sample_split(X, n_splits=30, n_bootstrap_ci=1000, seed=320)
result = shesha.anchor_stability(X, n_bootstrap_ci=1000, seed=320)
Core (supervised)
result = shesha.variance_ratio(X, y, n_bootstrap_ci=1000, seed=320)
result = shesha.supervised_alignment(X, y, n_bootstrap_ci=1000, seed=320)
result = shesha.class_separation_ratio(X, y, n_bootstrap_ci=1000, seed=320)
result = shesha.lda_stability(X, y, n_bootstrap_ci=1000, seed=320)
Core (drift)
result = shesha.rdm_similarity(X, Y, n_bootstrap_ci=1000, seed=320)
result = shesha.rdm_drift(X, Y, n_bootstrap_ci=1000, seed=320)
Bio (perturbation analysis)
from shesha.bio import perturbation_stability, perturbation_effect_size
result = perturbation_stability(X_ctrl, X_pert, n_bootstrap_ci=1000, seed=320)
result = perturbation_effect_size(X_ctrl, X_pert, n_bootstrap_ci=1000, seed=320)
Sim (similarity metrics)
from shesha.sim import cka, cka_linear, cka_debiased
from shesha.sim import procrustes_similarity, rdm_similarity
result = cka(X, Y, n_bootstrap_ci=1000, seed=320)
result = cka_linear(X, Y, n_bootstrap_ci=1000, seed=320)
result = cka_debiased(X, Y, n_bootstrap_ci=1000, seed=320)
result = procrustes_similarity(X, Y, n_bootstrap_ci=1000, seed=320)
result = rdm_similarity(X, Y, n_bootstrap_ci=1000, seed=320)
Choosing n_bootstrap_ci
- Quick exploration: 200–500 resamples
- Publication-quality: 1000–10000 resamples
- Computational cost: scales linearly with n_bootstrap_ci. Each resample runs the full metric computation, so expensive metrics (e.g. anchor_stability on large data) will take proportionally longer.
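
Because the cost is linear in the number of resamples, one rough way to budget a run is to time a few metric calls on resampled data and multiply. The sketch below assumes a hypothetical helper `estimate_bootstrap_seconds` and a toy metric (the largest singular value); it ignores overhead outside the metric call, so treat the result as an order-of-magnitude estimate.

```python
import time
import numpy as np

def estimate_bootstrap_seconds(metric, X, n_bootstrap_ci, n_timing_runs=3):
    """Hypothetical helper: extrapolate total runtime from a few timed calls."""
    rng = np.random.default_rng(0)
    n = X.shape[0]
    timings = []
    for _ in range(n_timing_runs):
        idx = rng.integers(0, n, size=n)        # one bootstrap resample of the rows
        start = time.perf_counter()
        metric(X[idx])                          # one full metric computation
        timings.append(time.perf_counter() - start)
    per_call = float(np.median(timings))        # median is robust to a slow first call
    return per_call, per_call * n_bootstrap_ci  # linear extrapolation

# Toy metric (illustration only): largest singular value of the data matrix.
def toy_metric(A):
    return np.linalg.svd(A, compute_uv=False)[0]

X_timing = np.random.default_rng(1).normal(size=(300, 40))
per_call, total = estimate_bootstrap_seconds(toy_metric, X_timing, n_bootstrap_ci=1000)
print(f"~{per_call:.6f} s per resample, ~{total:.3f} s for 1000 resamples")
```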
Resampling strategy
- Single-matrix metrics (feature_split, sample_split, etc.): rows of X (and y if supervised) are resampled together with the same indices.
- Two-matrix metrics (rdm_similarity, cka, etc.): both X and Y are resampled with the same indices (paired bootstrap).
- Bio metrics (perturbation_stability, perturbation_effect_size): control and perturbed populations are resampled independently.
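
The difference between paired and independent resampling comes down to whether one index vector or two is drawn. The following NumPy sketch illustrates the idea; the helper names `resample_paired` and `resample_independent` are hypothetical and the arrays are toy data, so this is not a description of Shesha's internals.

```python
import numpy as np

def resample_paired(A, B, rng):
    """One index vector shared by both arrays, so row i of A stays with row i of B."""
    idx = rng.integers(0, A.shape[0], size=A.shape[0])
    return A[idx], B[idx]

def resample_independent(X_ctrl, X_pert, rng):
    """Separate index vectors, so each population keeps its own size."""
    idx_c = rng.integers(0, X_ctrl.shape[0], size=X_ctrl.shape[0])
    idx_p = rng.integers(0, X_pert.shape[0], size=X_pert.shape[0])
    return X_ctrl[idx_c], X_pert[idx_p]

rng = np.random.default_rng(320)

# Paired case: B is a known function of A, so the pairing is easy to verify.
A = np.arange(10, dtype=float).reshape(10, 1)
B = A * 2.0
A_boot, B_boot = resample_paired(A, B, rng)

# Independent case: populations of different sizes.
X_ctrl_demo = np.zeros((8, 3))
X_pert_demo = np.ones((12, 3))
ctrl_boot, pert_boot = resample_independent(X_ctrl_demo, X_pert_demo, rng)

print(np.array_equal(B_boot, 2.0 * A_boot))  # pairing is preserved
print(ctrl_boot.shape, pert_boot.shape)
```

The paired form also covers supervised single-matrix metrics, where `B` would be the label vector `y`.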
Reproducibility
Pass seed for deterministic results:
result1 = shesha.feature_split(X, n_bootstrap_ci=1000, seed=320)
result2 = shesha.feature_split(X, n_bootstrap_ci=1000, seed=320)
assert result1 == result2 # identical
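
To see why a fixed seed makes the result deterministic, it may help to look at the resampling indices alone. The sketch below assumes the indices come from a seeded NumPy generator; the helper `bootstrap_indices` is hypothetical and Shesha's actual random number handling may differ.

```python
import numpy as np

def bootstrap_indices(n_rows, n_bootstrap_ci, seed):
    """Hypothetical helper: one row of resampling indices per bootstrap replicate."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_rows, size=(n_bootstrap_ci, n_rows))

idx_a = bootstrap_indices(500, 10, seed=320)
idx_b = bootstrap_indices(500, 10, seed=320)
idx_c = bootstrap_indices(500, 10, seed=321)

print(np.array_equal(idx_a, idx_b))  # same seed, same resamples
print(np.array_equal(idx_a, idx_c))  # different seed, different resamples
```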