Biological Perturbation Analysis¶

shesha.bio provides metrics for single-cell and CRISPR perturbation experiments. It works natively with AnnData objects.

Compute stability¶

shesha.bio.compute_stability() measures per-perturbation geometric consistency relative to a control population.

from shesha.bio import compute_stability

stability = compute_stability(
    adata_pca,
    perturbation_key='guide_id',
    control_label='NT',
    metric='cosine',
)
print(stability['KLF1'])   # e.g. 0.85

Compute magnitude¶

shesha.bio.compute_magnitude() measures the average distance of perturbed cells from the centroid of the control population.

from shesha.bio import compute_magnitude

magnitude = compute_magnitude(
    adata_pca,
    perturbation_key='guide_id',
    control_label='NT',
    metric='euclidean',
)
print(magnitude['KLF1'])   # e.g. 2.40

Bootstrap confidence intervals¶

The low-level functions perturbation_stability() and perturbation_effect_size() support bootstrap CIs via n_bootstrap_ci. Control and perturbed populations are resampled independently. See Bootstrap Confidence Intervals for full details.

from shesha.bio import perturbation_stability

result = perturbation_stability(X_ctrl, X_pert, n_bootstrap_ci=1000, seed=320)
print(f"{result['mean']:.3f} [{result['ci_low']:.3f}, {result['ci_high']:.3f}]")

Split-half reproducibility¶

shesha.bio.split_half_reproducibility() measures effect-direction reproducibility for each perturbation by repeated 50/50 random cell splits. High values indicate that the perturbation’s directional shift is consistent across independent subsets of cells — a direct assay of biological reproducibility that is distinct from effect magnitude.

from shesha.bio import split_half_reproducibility

repro = split_half_reproducibility(
    adata,
    perturbation_key="perturbation",
    control_label="control",
    n_splits=50,
    random_state=320,
)
# Returns DataFrame: index=perturbation, columns=[split_half_cosine, n_cells]
print(repro.sort_values("split_half_cosine", ascending=False).head())

Magnitude-matched comparison¶

shesha.bio.magnitude_matched_comparison() tests whether stability predicts reproducibility within magnitude bins, controlling for the SNR confound. Perturbations are binned by effect size and, within each bin, the mean split-half cosine is compared between the high-stability and low-stability halves.

from shesha.bio import compute_stability, compute_magnitude, magnitude_matched_comparison
import pandas as pd

sp = compute_stability(adata, perturbation_key="perturbation", control_label="control")
mp = compute_magnitude(adata, perturbation_key="perturbation", control_label="control")

df = repro.copy()
df["Sp"] = pd.Series(sp)
df["Mp"] = pd.Series(mp)

bins = magnitude_matched_comparison(
    df,
    stability_col="Sp",
    repro_col="split_half_cosine",
    magnitude_col="Mp",
    n_bins=4,
)
print(bins[["mag_bin", "n", "high_stability_mean", "low_stability_mean", "difference"]])