Bio API
Shesha Bio: Stability metrics for biological perturbation experiments.
This module provides Shesha variants for single-cell and perturbation biology, measuring the consistency of perturbation effects across individual cells.
- shesha.bio.compute_magnitude(adata, perturbation_key, control_label='control', metric='euclidean', layer=None)
Scanpy-compatible wrapper for perturbation magnitude.
- shesha.bio.compute_stability(adata, perturbation_key, control_label='control', layer=None, method='standard', **kwargs)
Scanpy-compatible wrapper for perturbation stability.
Computes stability for all perturbations in an AnnData object.
- Parameters:
adata (AnnData) – Annotated data matrix.
perturbation_key (str) – Column in adata.obs containing perturbation labels (e.g. ‘guide_id’).
control_label (str) – The label in perturbation_key representing control cells (e.g. ‘NT’).
layer (str, optional) – Layer to use for computation. If None, uses .X.
method ({'standard', 'whitened', 'knn'}, default='standard') – Method for computing stability:
- 'standard': Global control centroid
- 'whitened': Mahalanobis-scaled using control covariance
- 'knn': Local k-NN matched control centroids
**kwargs – Additional arguments passed to perturbation_stability() (e.g., k=50 for knn, regularization=1e-6 for whitened).
- Returns:
Dictionary mapping perturbation names to stability scores.
- Return type:
dict
Examples
>>> import shesha.bio as bio
>>> # `adata` is an existing AnnData object with a "perturbation" column in .obs
>>> # Standard stability
>>> stability = bio.compute_stability(adata, "perturbation")
>>> # Whitened stability
>>> stability_w = bio.compute_stability(adata, "perturbation", method="whitened")
>>> # k-NN stability
>>> stability_knn = bio.compute_stability(adata, "perturbation", method="knn", k=50)
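The shape of the returned dictionary can be pictured with plain NumPy arrays. This is a minimal sketch of a per-perturbation loop, not the library's implementation: `score_by_label` and `centroid_distance` are hypothetical names, `labels` stands in for `adata.obs[perturbation_key]`, and the toy scoring function is an assumption chosen only to keep the example self-contained.

```python
import numpy as np

def score_by_label(X, labels, control_label, score_fn):
    """Group rows of X by label and score each non-control group against control.

    Hypothetical helper for illustration; score_fn(X_control, X_perturbed) -> float.
    """
    labels = np.asarray(labels)
    X_control = X[labels == control_label]
    scores = {}
    for name in np.unique(labels):
        if name == control_label:
            continue  # the control group is the reference, not a perturbation
        scores[str(name)] = float(score_fn(X_control, X[labels == name]))
    return scores

# Toy scoring function (assumption): distance between group centroids.
def centroid_distance(X_control, X_perturbed):
    return np.linalg.norm(X_perturbed.mean(axis=0) - X_control.mean(axis=0))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
labels = np.array(["NT"] * 10 + ["geneA"] * 10 + ["geneB"] * 10)
X[labels == "geneA"] += 2.0  # shift one group away from control

result = score_by_label(X, labels, "NT", centroid_distance)
print(sorted(result))  # ['geneA', 'geneB']
```

The control label never appears as a key, which matches the documented return value of one score per perturbation name.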
- shesha.bio.compute_stability_knn(adata, perturbation_key, control_label='control', layer=None, k=50, metric='euclidean', seed=None, max_samples=1000)
Scanpy-compatible wrapper for k-NN matched control stability.
Convenience wrapper for compute_stability(..., method='knn'). Consider using the unified interface instead.
- Parameters:
adata (AnnData) – Annotated data object containing single-cell data.
perturbation_key (str) – Column in adata.obs containing perturbation labels.
control_label (str, default="control") – Label identifying control/unperturbed cells.
layer (str, optional) – Layer in adata.layers to use. If None, uses adata.X.
k (int, default=50) – Number of nearest control neighbors to use for local centroid.
metric (str, default="euclidean") – Distance metric for k-NN matching: ‘cosine’ or ‘euclidean’.
seed (int, optional) – Random seed for subsampling reproducibility.
max_samples (int, optional, default=1000) – Perturbed populations with more cells than this are randomly subsampled.
- Returns:
Dictionary mapping perturbation names to k-NN matched stability scores.
- Return type:
dict
See also
compute_stability: Unified interface with method='knn'
- shesha.bio.compute_stability_whitened(adata, perturbation_key, control_label='control', layer=None, regularization=1e-06, seed=None, max_samples=1000)
Scanpy-compatible wrapper for whitened perturbation stability.
Convenience wrapper for compute_stability(..., method='whitened'). Consider using the unified interface instead.
- Parameters:
adata (AnnData) – Annotated data object containing single-cell data.
perturbation_key (str) – Column in adata.obs containing perturbation labels.
control_label (str, default="control") – Label identifying control/unperturbed cells.
layer (str, optional) – Layer in adata.layers to use. If None, uses adata.X.
regularization (float, default=1e-6) – Regularization added to covariance diagonal for numerical stability.
seed (int, optional) – Random seed for subsampling reproducibility.
max_samples (int, optional, default=1000) – Perturbed populations with more cells than this are randomly subsampled.
- Returns:
Dictionary mapping perturbation names to whitened stability scores.
- Return type:
dict
See also
compute_stability: Unified interface with method='whitened'
- shesha.bio.perturbation_effect_size(X_control, X_perturbed, metric='euclidean', n_bootstrap_ci=None, ci=0.95, seed=None)
Compute the magnitude of the perturbation effect.
- Parameters:
X_control (np.ndarray) – Control population embeddings.
X_perturbed (np.ndarray) – Perturbed population embeddings.
metric (str, default="euclidean") – Effect-size metric:
- 'euclidean': Raw L2 distance between centroids (magnitude). Use this for geometric plots (stability vs. magnitude).
- 'cohen': Standardized effect size (magnitude / pooled SD). Use this for statistical power analysis.
n_bootstrap_ci (int, optional) – If provided, compute bootstrap confidence interval by resampling control and perturbed populations this many times.
ci (float, default=0.95) – Confidence level for the interval.
seed (int, optional) – Random seed for bootstrap reproducibility.
- Returns:
If n_bootstrap_ci is None: the calculated magnitude/effect size. If n_bootstrap_ci is set: dict with keys ‘mean’, ‘ci_low’, ‘ci_high’, ‘std’, ‘n_bootstraps’, ‘ci_level’.
- Return type:
float or dict
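The two metrics and the bootstrap summary can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the shesha source: `effect_size_sketch` and `bootstrap_ci_sketch` are hypothetical names, the pooled-SD formula (per-feature variances averaged over features, weighted by degrees of freedom) is one common choice, and the percentile bootstrap is an assumed scheme. Only the centroid distance and the dictionary keys come from the description above.

```python
import numpy as np

def effect_size_sketch(X_control, X_perturbed, metric="euclidean"):
    """Centroid-based effect size; the pooled-SD formula is an assumption."""
    diff = X_perturbed.mean(axis=0) - X_control.mean(axis=0)
    magnitude = float(np.linalg.norm(diff))  # raw L2 distance between centroids
    if metric == "euclidean":
        return magnitude
    if metric == "cohen":
        n_c, n_p = len(X_control), len(X_perturbed)
        # Assumed pooling: sample variances averaged over features,
        # then weighted by each population's degrees of freedom.
        var_c = X_control.var(axis=0, ddof=1).mean()
        var_p = X_perturbed.var(axis=0, ddof=1).mean()
        pooled_sd = np.sqrt(((n_c - 1) * var_c + (n_p - 1) * var_p) / (n_c + n_p - 2))
        return magnitude / float(pooled_sd)
    raise ValueError(f"unknown metric: {metric!r}")

def bootstrap_ci_sketch(X_control, X_perturbed, n_bootstrap=200, ci=0.95, seed=None):
    """Percentile bootstrap that resamples both populations (assumed scheme)."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        idx_c = rng.integers(0, len(X_control), size=len(X_control))
        idx_p = rng.integers(0, len(X_perturbed), size=len(X_perturbed))
        stats[b] = effect_size_sketch(X_control[idx_c], X_perturbed[idx_p])
    alpha = (1.0 - ci) / 2.0
    return {
        "mean": float(stats.mean()),
        "ci_low": float(np.quantile(stats, alpha)),
        "ci_high": float(np.quantile(stats, 1.0 - alpha)),
        "std": float(stats.std(ddof=1)),
        "n_bootstraps": n_bootstrap,
        "ci_level": ci,
    }

rng = np.random.default_rng(0)
X_ctrl = rng.normal(size=(200, 10))
X_pert = X_ctrl + 1.0 + rng.normal(scale=0.1, size=(200, 10))

magnitude = effect_size_sketch(X_ctrl, X_pert)
cohen = effect_size_sketch(X_ctrl, X_pert, metric="cohen")
summary = bootstrap_ci_sketch(X_ctrl, X_pert, seed=0)
```

Here every feature shifts by about 1.0, so the Euclidean magnitude is close to the square root of the number of features.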
- shesha.bio.perturbation_stability(X_control, X_perturbed, method='standard', metric='cosine', k=50, regularization=1e-06, seed=None, max_samples=1000, n_bootstrap_ci=None, ci=0.95)
Perturbation stability: consistency of perturbation effects across samples.
Measures whether individual perturbed samples shift in a consistent direction relative to the control population. High values indicate that the perturbation has a coherent, reproducible effect; low values suggest heterogeneous or noisy responses.
- Parameters:
X_control (np.ndarray) – Control population embeddings, shape (n_control, n_features).
X_perturbed (np.ndarray) – Perturbed population embeddings, shape (n_perturbed, n_features).
method ({'standard', 'whitened', 'knn'}, default='standard') – Method for computing stability:
- 'standard': Global control centroid (default)
- 'whitened': Mahalanobis-scaled using control covariance
- 'knn': Local k-NN matched control centroids
metric ({'cosine', 'euclidean'}, default='cosine') – How to measure directional consistency (used for ‘standard’ and ‘knn’ methods).
k (int, default=50) – Number of nearest neighbors (only used when method=’knn’).
regularization (float, default=1e-6) – Regularization for covariance (only used when method=’whitened’).
seed (int, optional) – Random seed for subsampling reproducibility.
max_samples (int, optional, default=1000) – Perturbed populations with more samples than this are randomly subsampled.
n_bootstrap_ci (int, optional) – If provided, compute bootstrap confidence interval by resampling control and perturbed populations this many times.
ci (float, default=0.95) – Confidence level for the interval.
- Returns:
If n_bootstrap_ci is None: stability score in [-1, 1]. Higher = more consistent perturbation effect. If n_bootstrap_ci is set: dict with keys ‘mean’, ‘ci_low’, ‘ci_high’, ‘std’, ‘n_bootstraps’, ‘ci_level’.
- Return type:
float or dict
Examples
>>> import numpy as np
>>> from shesha.bio import perturbation_stability
>>>
>>> # Control and perturbed cell populations
>>> X_ctrl = np.random.randn(500, 50)  # 500 control cells, 50 genes
>>> shift = np.random.randn(50)  # consistent direction
>>> X_pert = X_ctrl + shift + np.random.randn(500, 50) * 0.1
>>>
>>> # Standard stability
>>> stability = perturbation_stability(X_ctrl, X_pert, method='standard')
>>>
>>> # With bootstrap CI
>>> result = perturbation_stability(X_ctrl, X_pert, n_bootstrap_ci=1000)
>>> print(f"{result['mean']:.3f} [{result['ci_low']:.3f}, {result['ci_high']:.3f}]")
Notes
Method selection:
- 'standard': Best for homogeneous controls, computationally fastest
- 'whitened': Better when features have different scales or are correlated
- 'knn': Best for heterogeneous controls with multiple cell types/states
The control reference is computed differently for each method:
- Standard: Global centroid of all control cells
- Whitened: Mahalanobis-scaled space accounting for control covariance
- k-NN: Local centroid of k nearest control cells for each perturbed cell
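To make the idea of "directional consistency relative to a global control centroid" concrete, here is one plausible reading of the standard method in NumPy. The formula (mean cosine similarity between each sample's shift vector and the mean shift) is an assumption for illustration and is not taken from the shesha source; `standard_stability_sketch` is a hypothetical name.

```python
import numpy as np

def standard_stability_sketch(X_control, X_perturbed):
    """Mean cosine similarity between per-sample shifts and the mean shift.

    Assumed formula for illustration only; not taken from the shesha source.
    """
    centroid = X_control.mean(axis=0)      # global control centroid
    shifts = X_perturbed - centroid        # one shift vector per perturbed sample
    mean_shift = shifts.mean(axis=0)
    denom = np.linalg.norm(shifts, axis=1) * np.linalg.norm(mean_shift)
    cosines = shifts @ mean_shift / np.maximum(denom, 1e-12)  # guard against /0
    return float(cosines.mean())

rng = np.random.default_rng(0)
X_ctrl = rng.normal(size=(500, 50))
direction = rng.normal(size=50)

# Coherent perturbation: every cell moves along the same direction.
X_coherent = X_ctrl + 3.0 * direction + rng.normal(scale=0.1, size=(500, 50))
# Incoherent perturbation: each cell moves in its own random direction.
X_noisy = X_ctrl + rng.normal(scale=3.0, size=(500, 50))

coherent_score = standard_stability_sketch(X_ctrl, X_coherent)
noisy_score = standard_stability_sketch(X_ctrl, X_noisy)
```

Under this reading the score is bounded by [-1, 1], and the coherent population scores much higher than the noisy one, in line with the interpretation given for high and low values.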
- shesha.bio.perturbation_stability_knn(X_control, X_perturbed, k=50, metric='euclidean', seed=None, max_samples=1000)
k-NN matched control perturbation stability.
Convenience wrapper for perturbation_stability(..., method='knn'). Consider using the unified interface instead.
- Parameters:
X_control (np.ndarray) – Control population embeddings, shape (n_control, n_features).
X_perturbed (np.ndarray) – Perturbed population embeddings, shape (n_perturbed, n_features).
k (int, default=50) – Number of nearest control neighbors to use for local centroid.
metric (str, default="euclidean") – Distance metric for k-NN matching: ‘cosine’ or ‘euclidean’.
seed (int, optional) – Random seed for subsampling reproducibility.
max_samples (int, optional, default=1000) – Perturbed populations with more samples than this are randomly subsampled.
- Returns:
k-NN matched stability score in [-1, 1].
- Return type:
float
See also
perturbation_stability: Unified interface with method='knn'
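The local control reference described for the k-NN method can be sketched with brute-force Euclidean matching. This is a minimal illustration, not the library's code: `knn_control_centroids` is a hypothetical name, and the dense pairwise-distance matrix is only practical for small inputs.

```python
import numpy as np

def knn_control_centroids(X_control, X_perturbed, k=50):
    """For each perturbed sample, average its k nearest control samples (Euclidean)."""
    k = min(k, len(X_control))
    # Pairwise squared distances, shape (n_perturbed, n_control).
    # Dense broadcasting: fine for a toy example, memory-hungry for large data.
    d2 = ((X_perturbed[:, None, :] - X_control[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k smallest distances per row (order within the k is arbitrary).
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return X_control[nearest].mean(axis=1)  # shape (n_perturbed, n_features)

rng = np.random.default_rng(0)
# Heterogeneous control: two well-separated cell states.
state_a = rng.normal(loc=-5.0, size=(100, 4))
state_b = rng.normal(loc=5.0, size=(100, 4))
X_ctrl = np.vstack([state_a, state_b])
# Perturbed cells come from state B and shift along the first feature.
X_pert = state_b[:20] + np.array([1.0, 0.0, 0.0, 0.0])

local = knn_control_centroids(X_ctrl, X_pert, k=10)
local_shifts = X_pert - local                 # relative to matched controls
global_shifts = X_pert - X_ctrl.mean(axis=0)  # relative to the global centroid
```

With a mixed control population the global centroid sits between the two states, so `global_shifts` is dominated by the state difference, while `local_shifts` isolates the perturbation itself.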
- shesha.bio.perturbation_stability_whitened(X_control, X_perturbed, regularization=1e-06, seed=None, max_samples=1000)
Whitened (Mahalanobis) perturbation stability.
Convenience wrapper for perturbation_stability(..., method='whitened'). Consider using the unified interface instead.
- Parameters:
X_control (np.ndarray) – Control population embeddings, shape (n_control, n_features).
X_perturbed (np.ndarray) – Perturbed population embeddings, shape (n_perturbed, n_features).
regularization (float, default=1e-6) – Regularization added to covariance diagonal for numerical stability.
seed (int, optional) – Random seed for subsampling reproducibility.
max_samples (int, optional, default=1000) – Perturbed populations with more samples than this are randomly subsampled.
- Returns:
Whitened stability score in [-1, 1].
- Return type:
float
See also
perturbation_stability: Unified interface with method='whitened'
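Whitening with a regularized control covariance can be sketched as below. This is an assumption about how a Mahalanobis-scaled space might be built, not the library's implementation: `whiten_with_control` is a hypothetical name, and the symmetric inverse square root via eigendecomposition is one of several valid whitening transforms.

```python
import numpy as np

def whiten_with_control(X_control, X_perturbed, regularization=1e-6):
    """Map both populations into a space where control covariance is ~identity."""
    mu = X_control.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_control, rowvar=False))
    cov = cov + regularization * np.eye(cov.shape[0])  # regularize the diagonal
    # Symmetric inverse square root of the covariance via eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return (X_control - mu) @ W, (X_perturbed - mu) @ W

rng = np.random.default_rng(0)
scales = np.array([1.0, 10.0, 100.0])           # features on very different scales
X_ctrl = rng.normal(size=(1000, 3)) * scales
X_pert = X_ctrl[:200] + np.array([1.0, 1.0, 1.0])

Z_ctrl, Z_pert = whiten_with_control(X_ctrl, X_pert)
cov_whitened = np.cov(Z_ctrl, rowvar=False)     # approximately the identity matrix
```

Euclidean distances in the whitened space correspond to Mahalanobis distances in the original space, so a unit shift in the low-variance feature counts for more than the same shift in the high-variance one.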