Core API
Shesha: Self-consistency Metrics for Representational Stability
Core implementations of Shesha variants for measuring geometric stability of high-dimensional representations.
- shesha.core.anchor_stability(X, n_splits=30, n_anchors=100, n_per_split=200, metric='cosine', rank_normalize=True, seed=None, max_samples=1500)
Anchor-based Shesha: measures stability of distance profiles from fixed anchors.
Selects fixed anchor points, then measures consistency of distance profiles from anchors to random data splits. More robust to sampling variation than pure bootstrap approaches.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
n_splits (int) – Number of random splits.
n_anchors (int) – Number of fixed anchor points.
n_per_split (int) – Number of samples per split.
metric (str) – Distance metric.
rank_normalize (bool) – If True, rank-normalize distances within each anchor before correlating.
seed (int, optional) – Random seed.
max_samples (int, optional) – Subsample to this many samples if exceeded.
- Returns:
Mean correlation of anchor distance profiles across splits.
- Return type:
float
- shesha.core.class_separation_ratio(X, y, n_bootstrap=50, subsample_frac=0.5, metric='euclidean', seed=None)
Class Separation Ratio: ratio of between-class to within-class distances.
Measures how well-separated classes are in the representation space. Uses bootstrap subsampling for computational efficiency and stability. Related to Fisher’s discriminant ratio but operates in distance space.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray) – Class labels of shape (n_samples,).
n_bootstrap (int) – Number of bootstrap iterations for stability.
subsample_frac (float) – Fraction of samples to use per bootstrap (0.0-1.0).
metric (str) – Distance metric: ‘cosine’ or ‘euclidean’.
seed (int, optional) – Random seed for reproducibility.
- Returns:
Mean separation ratio across bootstrap samples. Higher values indicate better class separation. Range: [0, inf), typically [0.5, 5.0].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha.core import class_separation_ratio
>>> # Well-separated classes
>>> X = np.vstack([np.random.randn(100, 10),
...                np.random.randn(100, 10) + 5])
>>> y = np.array([0]*100 + [1]*100)
>>> ratio = class_separation_ratio(X, y)
>>> print(f"Separation: {ratio:.2f}")  # High value
Notes
Higher values indicate representations where same-class samples are closer together than different-class samples, suggesting good discriminability.
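The ratio described above can be illustrated with a short NumPy-only sketch. This is not the library's implementation: the helper names `separation_ratio` and `bootstrap_separation_ratio` are hypothetical, only Euclidean distance is covered, and subsampling without replacement is an assumption about how the bootstrap is drawn.

```python
import numpy as np

def separation_ratio(X, y):
    """Mean between-class distance divided by mean within-class distance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # All pairwise Euclidean distances via broadcasting: shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rows, cols = np.triu_indices(len(X), k=1)  # each unordered pair once
    pair_dist = dist[rows, cols]
    same_class = y[rows] == y[cols]
    return float(pair_dist[~same_class].mean() / pair_dist[same_class].mean())

def bootstrap_separation_ratio(X, y, n_bootstrap=50, subsample_frac=0.5, seed=None):
    """Average the ratio over random subsamples (assumed: drawn without replacement)."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    size = max(2, int(len(X) * subsample_frac))
    ratios = []
    for _ in range(n_bootstrap):
        idx = rng.choice(len(X), size=size, replace=False)
        ratios.append(separation_ratio(X[idx], y[idx]))
    return float(np.mean(ratios))

rng = np.random.default_rng(0)
y = np.array([0] * 100 + [1] * 100)
separated = np.vstack([rng.standard_normal((100, 10)),
                       rng.standard_normal((100, 10)) + 5])
overlapping = rng.standard_normal((200, 10))  # labels carry no structure here
ratio_separated = bootstrap_separation_ratio(separated, y, seed=0)
ratio_overlapping = bootstrap_separation_ratio(overlapping, y, seed=0)
print(f"separated: {ratio_separated:.2f}, overlapping: {ratio_overlapping:.2f}")
```

When the labels carry no geometric structure, between-class and within-class distances have the same distribution, so the ratio sits close to 1. A subsample that happens to contain a single class would produce an undefined ratio; the sketch does not guard against that case.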
- shesha.core.compute_rdm(X, metric='cosine', normalize=True)
Compute Representational Dissimilarity Matrix (RDM).
- Parameters:
- Returns:
Condensed distance vector (upper triangle of RDM).
- Return type:
np.ndarray
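To make the "condensed distance vector" concrete, the following NumPy-only sketch builds the upper triangle of a cosine-distance matrix. The helper name `condensed_cosine_rdm` is hypothetical, and the effect of the `normalize` argument is not modeled because it is not described here.

```python
import numpy as np

def condensed_cosine_rdm(X):
    """Upper triangle (excluding the diagonal) of the cosine-distance matrix."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)   # guard against zero-norm rows
    dist = 1.0 - unit @ unit.T               # full (n, n) cosine distances
    rows, cols = np.triu_indices(X.shape[0], k=1)
    return dist[rows, cols]

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
rdm = condensed_cosine_rdm(X)
print(rdm.shape)  # (15,) because 6 samples give 6 * 5 / 2 = 15 pairs
```

For n samples the vector has n(n-1)/2 entries, which is why the sample-capping arguments (`max_samples`) on the stability functions matter for cost.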
- shesha.core.feature_split(X, n_splits=30, metric='cosine', seed=None, max_samples=1600)
Feature-Split Shesha: measures internal geometric consistency.
Partitions feature dimensions into random disjoint halves, computes RDMs on each half, and measures their rank correlation. High values indicate that geometric structure is distributed across features (redundant encoding).
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
n_splits (int) – Number of random feature partitions to average over.
metric (str) – Distance metric for RDM computation.
seed (int, optional) – Random seed for reproducibility.
max_samples (int, optional) – Subsample to this many samples if exceeded (for efficiency).
- Returns:
Mean Spearman correlation between split-half RDMs. Range: [-1, 1].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha.core import feature_split
>>> X = np.random.randn(500, 768)  # 500 samples, 768-dim embeddings
>>> stability = feature_split(X, n_splits=30, seed=320)
>>> print(f"Feature-split stability: {stability:.3f}")
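The split-half procedure described above can be sketched in plain NumPy. This is an illustration under assumptions, not the library code: `feature_split_sketch` and its helpers are hypothetical names, only cosine distance is used, `max_samples` is ignored, and the Spearman helper ranks with argsort and therefore does not handle ties.

```python
import numpy as np

def _condensed_cosine_rdm(X):
    unit = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    dist = 1.0 - unit @ unit.T
    return dist[np.triu_indices(X.shape[0], k=1)]

def _spearman(a, b):
    # Spearman = Pearson on ranks; argsort-of-argsort ignores ties,
    # which is acceptable for continuous distances.
    ranks_a = np.argsort(np.argsort(a)).astype(float)
    ranks_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])

def feature_split_sketch(X, n_splits=30, seed=None):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    half = n_features // 2
    scores = []
    for _ in range(n_splits):
        perm = rng.permutation(n_features)     # random disjoint halves
        rdm_a = _condensed_cosine_rdm(X[:, perm[:half]])
        rdm_b = _condensed_cosine_rdm(X[:, perm[half:]])
        scores.append(_spearman(rdm_a, rdm_b))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
latent = rng.standard_normal((100, 5))
redundant = latent @ rng.standard_normal((5, 64))  # every feature mixes the same 5 factors
noise = rng.standard_normal((100, 64))             # independent features
score_redundant = feature_split_sketch(redundant, seed=0)
score_noise = feature_split_sketch(noise, seed=0)
print(f"redundant: {score_redundant:.3f}, noise: {score_noise:.3f}")
```

The low-rank matrix scores much higher than pure noise because both feature halves see the same underlying factors, which is the "redundant encoding" case. Note that the `np.random.randn` input used in the documented example is closer to the noise case, so a score near zero is expected there.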
- shesha.core.lda_stability(X, y, n_bootstrap=50, subsample_frac=0.5, seed=None)
LDA Subspace Stability: consistency of linear discriminant direction.
Measures whether the optimal linear decision boundary is robust to sampling variation. Computes LDA on full dataset and bootstrapped subsamples, then measures alignment of discriminant vectors.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray) – Binary class labels of shape (n_samples,). Must have exactly 2 classes.
n_bootstrap (int) – Number of bootstrap iterations.
subsample_frac (float) – Fraction of samples to use per bootstrap (0.0-1.0).
seed (int, optional) – Random seed for reproducibility.
- Returns:
Mean absolute cosine similarity between full and bootstrap discriminant vectors. Range: [0, 1]. Values near 1 indicate stable discriminant subspace.
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha.core import lda_stability
>>> # Create well-separated binary classification data
>>> X = np.vstack([np.random.randn(100, 10),
...                np.random.randn(100, 10) + 3])
>>> y = np.array([0]*100 + [1]*100)
>>> stability = lda_stability(X, y)
>>> print(f"LDA Stability: {stability:.3f}")  # Should be high
Notes
Low values suggest the discriminant subspace is unstable, potentially indicating overfitting to source domain structure. This metric is particularly useful for predicting transfer learning performance.
Only works for binary classification. For multi-class, consider using class_separation_ratio instead.
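A minimal sketch of the idea, assuming the discriminant vector is the classical two-class Fisher direction and that bootstrap subsamples are drawn without replacement. The names `lda_direction` and `lda_stability_sketch`, and the small ridge term, are illustrative assumptions rather than the library's internals.

```python
import numpy as np

def lda_direction(X, y, ridge=1e-6):
    """Unit-norm Fisher discriminant direction for two classes."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = np.unique(y)
    if len(classes) != 2:
        raise ValueError("exactly two classes are required")
    X0, X1 = X[y == classes[0]], X[y == classes[1]]
    # Pooled within-class scatter; the ridge keeps the solve well-conditioned.
    scatter = (np.cov(X0, rowvar=False) * (len(X0) - 1)
               + np.cov(X1, rowvar=False) * (len(X1) - 1))
    scatter = scatter + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(scatter, X1.mean(axis=0) - X0.mean(axis=0))
    return w / np.linalg.norm(w)

def lda_stability_sketch(X, y, n_bootstrap=50, subsample_frac=0.5, seed=None):
    """Mean |cosine| between the full-data direction and subsample directions."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    w_full = lda_direction(X, y)
    size = int(len(X) * subsample_frac)
    sims = []
    for _ in range(n_bootstrap):
        idx = rng.choice(len(X), size=size, replace=False)
        w_boot = lda_direction(X[idx], y[idx])
        sims.append(abs(float(w_full @ w_boot)))  # sign of the direction is arbitrary
    return float(np.mean(sims))

rng = np.random.default_rng(0)
y = np.array([0] * 100 + [1] * 100)
X = np.vstack([rng.standard_normal((100, 10)),
               rng.standard_normal((100, 10)) + 3])
stability = lda_stability_sketch(X, y, seed=0)
print(f"LDA stability (sketch): {stability:.3f}")
```

The absolute value is taken because a discriminant vector and its negation define the same decision boundary, which is why the documented range is [0, 1] rather than [-1, 1].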
- shesha.core.rdm_drift(X, Y, method='spearman', metric='cosine')
Compute representational drift between two representations.
Drift is defined as 1 - rdm_similarity, so higher values indicate more change in geometric structure. This is useful for tracking how much a representation has changed over time or due to some intervention (fine-tuning, perturbation, etc.).
- Parameters:
X (np.ndarray) – First (baseline/before) representation of shape (n_samples, n_features_x).
Y (np.ndarray) – Second (comparison/after) representation of shape (n_samples, n_features_y). Must have the same number of samples as X.
method (str) – Correlation method: ‘spearman’ (rank-based, default) or ‘pearson’.
metric (str) – Distance metric for RDM computation.
- Returns:
Drift score: 1 - RDM_correlation. Range: [0, 2].
- 0: Identical geometric structure
- 1: Uncorrelated (random relationship)
- 2: Perfectly anti-correlated (inverted structure)
- Return type:
float
Examples
>>> from shesha.core import rdm_drift
>>> # Track drift during training
>>> X_epoch0 = model.encode(data)
>>> for epoch in range(10):
...     train_one_epoch(model)
...     X_current = model.encode(data)
...     drift = rdm_drift(X_epoch0, X_current)
...     print(f"Epoch {epoch+1}: drift = {drift:.3f}")
>>> # Measure drift due to noise perturbation
>>> X_clean = model.encode(clean_data)
>>> X_noisy = model.encode(noisy_data)
>>> drift = rdm_drift(X_clean, X_noisy)
>>> print(f"Noise-induced drift: {drift:.3f}")
See also
rdm_similarity: The complementary metric (similarity = 1 - drift).
- shesha.core.rdm_similarity(X, Y, method='spearman', metric='cosine')
Compute RDM similarity between two representations.
Measures how similar the pairwise distance structures are between two representations. Useful for measuring representational drift, comparing models, or tracking changes during training.
- Parameters:
X (np.ndarray) – First representation matrix of shape (n_samples, n_features_x).
Y (np.ndarray) – Second representation matrix of shape (n_samples, n_features_y). Must have the same number of samples as X.
method (str) – Correlation method: ‘spearman’ (rank-based, default) or ‘pearson’.
metric (str) – Distance metric for RDM computation: ‘cosine’, ‘correlation’, or ‘euclidean’.
- Returns:
Correlation between RDMs. Range: [-1, 1]. Higher values indicate more similar geometric structure.
- Return type:
float
Examples
>>> from shesha.core import rdm_similarity
>>> # Compare representations before and after training
>>> X_before = model_before.encode(data)
>>> X_after = model_after.encode(data)
>>> similarity = rdm_similarity(X_before, X_after)
>>> print(f"RDM similarity: {similarity:.3f}")
>>> # Compare two different models
>>> X_model1 = model1.encode(data)
>>> X_model2 = model2.encode(data)
>>> similarity = rdm_similarity(X_model1, X_model2, method='pearson')
Notes
- Spearman (default) is more robust to outliers and non-linear relationships.
- Pearson captures linear relationships in distance magnitudes.
- The representations can have different feature dimensions (only the sample count must match).
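The points above can be checked with a small NumPy-only sketch. The function `rdm_similarity_sketch` and its helpers are hypothetical stand-ins, restricted to cosine distance, and the Spearman branch ranks with argsort (no tie handling). Drift is then taken as 1 - similarity, following the definition given for rdm_drift.

```python
import numpy as np

def _condensed_cosine_rdm(X):
    unit = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    dist = 1.0 - unit @ unit.T
    return dist[np.triu_indices(X.shape[0], k=1)]

def rdm_similarity_sketch(X, Y, method="spearman"):
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of samples")
    a, b = _condensed_cosine_rdm(X), _condensed_cosine_rdm(Y)
    if method == "spearman":
        # Rank-transform, then fall through to Pearson on the ranks.
        a = np.argsort(np.argsort(a)).astype(float)
        b = np.argsort(np.argsort(b)).astype(float)
    elif method != "pearson":
        raise ValueError("method must be 'spearman' or 'pearson'")
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 16))
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))  # random orthogonal matrix
rotated = X @ Q                                     # rotation preserves cosine distances
unrelated = rng.standard_normal((50, 8))            # different feature dimension

sim_rotated = rdm_similarity_sketch(X, rotated)
sim_unrelated = rdm_similarity_sketch(X, unrelated, method="pearson")
drift_rotated = 1.0 - sim_rotated
print(f"rotated: sim={sim_rotated:.3f}, drift={drift_rotated:.3f}")
print(f"unrelated: sim={sim_unrelated:.3f}")
```

A rotated copy keeps every pairwise cosine distance, so its similarity is essentially 1 and its drift essentially 0, while an independent representation with a different width still compares cleanly and lands near 0.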
- shesha.core.sample_split(X, n_splits=30, subsample_fraction=0.4, metric='cosine', seed=None, max_samples=1500)
Sample-Split Shesha (Bootstrap RDM): measures robustness to input variation.
Creates random subsamples of data points, computes RDMs on each, and measures their correlation. Assesses whether distance structure generalizes across different subsets of the data.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
n_splits (int) – Number of bootstrap iterations.
subsample_fraction (float) – Fraction of samples to use in each subsample.
metric (str) – Distance metric for RDM computation.
seed (int, optional) – Random seed for reproducibility.
max_samples (int, optional) – Subsample to this many samples if exceeded.
- Returns:
Mean Spearman correlation between bootstrap RDMs. Range: [-1, 1].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha.core import sample_split
>>> X = np.random.randn(1000, 384)
>>> stability = sample_split(X, n_splits=50, seed=320)
- shesha.core.supervised_alignment(X, y, metric='correlation', seed=None, max_samples=300)
Supervised RDM Alignment: correlation between model RDM and ideal label RDM.
Measures how well the representation’s distance structure aligns with task-defined similarity (same class = similar, different class = dissimilar).
- Parameters:
- Returns:
Spearman correlation between model and ideal RDMs. Range: [-1, 1].
- Return type:
float
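The "ideal label RDM" can be made concrete with a short sketch: pairs from the same class get distance 0, pairs from different classes get distance 1, and that binary vector is rank-correlated with the model RDM. The names below are hypothetical, the model RDM uses correlation distance (the documented default metric), and `seed` and `max_samples` are not modeled.

```python
import numpy as np

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)        # rank of the last member of each tie group
    first = last - counts + 1       # rank of the first member
    return ((first + last) / 2.0)[inverse]

def supervised_alignment_sketch(X, y):
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    rows, cols = np.triu_indices(len(X), k=1)
    model_rdm = (1.0 - np.corrcoef(X))[rows, cols]   # correlation distance between samples
    ideal_rdm = (y[rows] != y[cols]).astype(float)   # 0 = same class, 1 = different class
    # Spearman = Pearson on tie-aware ranks (the ideal RDM is all ties).
    return float(np.corrcoef(average_ranks(model_rdm), average_ranks(ideal_rdm))[0, 1])

rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 30)                      # 3 classes, 30 samples each
prototypes = 2.0 * rng.standard_normal((3, 20))      # one feature pattern per class
X = prototypes[y] + rng.standard_normal((90, 20))
aligned = supervised_alignment_sketch(X, y)
shuffled = supervised_alignment_sketch(X, rng.permutation(y))
print(f"true labels: {aligned:.3f}, shuffled labels: {shuffled:.3f}")
```

Because the ideal RDM takes only two values, the rank correlation cannot reach 1 even for perfectly clustered data, so scores are best compared against each other rather than against the top of the [-1, 1] range.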
- shesha.core.variance_ratio(X, y)
Variance Ratio Shesha: ratio of between-class to total variance.
A simple, efficient measure of how much geometric structure is explained by class labels. Equivalent to the R-squared of predicting coordinates from class membership.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray) – Class labels of shape (n_samples,).
- Returns:
Between-class variance / total variance. Range: [0, 1].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha.core import variance_ratio
>>> X = np.random.randn(500, 768)
>>> y = np.random.randint(0, 10, 500)
>>> vr = variance_ratio(X, y)
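The between-class to total variance ratio can be written out directly as sums of squares. The sketch below is an illustration with a hypothetical function name, `variance_ratio_sketch`, and is not taken from the library source.

```python
import numpy as np

def variance_ratio_sketch(X, y):
    """Between-class sum of squares divided by total sum of squares."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    grand_mean = X.mean(axis=0)
    total = ((X - grand_mean) ** 2).sum()
    between = 0.0
    for label in np.unique(y):
        members = X[y == label]
        # Each class contributes its size times the squared offset of its centroid.
        between += len(members) * ((members.mean(axis=0) - grand_mean) ** 2).sum()
    return float(between / total)

rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=500)
centers = 3.0 * rng.standard_normal((10, 16))
clustered = centers[y] + rng.standard_normal((500, 16))   # labels explain most variance
unstructured = rng.standard_normal((500, 16))             # labels explain almost none
ratio_clustered = variance_ratio_sketch(clustered, y)
ratio_unstructured = variance_ratio_sketch(unstructured, y)
print(f"clustered: {ratio_clustered:.3f}, unstructured: {ratio_unstructured:.3f}")
```

With random data and random labels, as in the documented example, the ratio is expected to be close to 0 because class centroids differ from the grand mean only by sampling noise.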
Unified interface
- shesha.shesha(X, y=None, variant='feature_split', **kwargs)
Unified interface for computing Shesha stability metrics.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray, optional) – Class labels (required for supervised variants).
variant (str) – Which Shesha variant to compute:
- ‘feature_split’: Unsupervised, partitions features
- ‘sample_split’: Unsupervised, bootstrap resampling
- ‘anchor’: Unsupervised, anchor-based stability
- ‘variance’: Supervised, variance ratio
- ‘supervised’: Supervised, RDM alignment
**kwargs – Additional arguments passed to the specific variant function.
- Returns:
Shesha stability score.
- Return type:
float
Examples
>>> from shesha import shesha
>>> # Unsupervised
>>> stability = shesha(X, variant='feature_split', n_splits=30, seed=320)
>>> # Supervised
>>> alignment = shesha(X, y, variant='supervised')
Unsupervised metrics
- shesha.feature_split(X, n_splits=30, metric='cosine', seed=None, max_samples=1600)
Feature-Split Shesha: measures internal geometric consistency.
Partitions feature dimensions into random disjoint halves, computes RDMs on each half, and measures their rank correlation. High values indicate that geometric structure is distributed across features (redundant encoding).
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
n_splits (int) – Number of random feature partitions to average over.
metric (str) – Distance metric for RDM computation.
seed (int, optional) – Random seed for reproducibility.
max_samples (int, optional) – Subsample to this many samples if exceeded (for efficiency).
- Returns:
Mean Spearman correlation between split-half RDMs. Range: [-1, 1].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha import feature_split
>>> X = np.random.randn(500, 768)  # 500 samples, 768-dim embeddings
>>> stability = feature_split(X, n_splits=30, seed=320)
>>> print(f"Feature-split stability: {stability:.3f}")
- shesha.sample_split(X, n_splits=30, subsample_fraction=0.4, metric='cosine', seed=None, max_samples=1500)
Sample-Split Shesha (Bootstrap RDM): measures robustness to input variation.
Creates random subsamples of data points, computes RDMs on each, and measures their correlation. Assesses whether distance structure generalizes across different subsets of the data.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
n_splits (int) – Number of bootstrap iterations.
subsample_fraction (float) – Fraction of samples to use in each subsample.
metric (str) – Distance metric for RDM computation.
seed (int, optional) – Random seed for reproducibility.
max_samples (int, optional) – Subsample to this many samples if exceeded.
- Returns:
Mean Spearman correlation between bootstrap RDMs. Range: [-1, 1].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha import sample_split
>>> X = np.random.randn(1000, 384)
>>> stability = sample_split(X, n_splits=50, seed=320)
- shesha.anchor_stability(X, n_splits=30, n_anchors=100, n_per_split=200, metric='cosine', rank_normalize=True, seed=None, max_samples=1500)
Anchor-based Shesha: measures stability of distance profiles from fixed anchors.
Selects fixed anchor points, then measures consistency of distance profiles from anchors to random data splits. More robust to sampling variation than pure bootstrap approaches.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
n_splits (int) – Number of random splits.
n_anchors (int) – Number of fixed anchor points.
n_per_split (int) – Number of samples per split.
metric (str) – Distance metric.
rank_normalize (bool) – If True, rank-normalize distances within each anchor before correlating.
seed (int, optional) – Random seed.
max_samples (int, optional) – Subsample to this many samples if exceeded.
- Returns:
Mean correlation of anchor distance profiles across splits.
- Return type:
float
Supervised metrics
- shesha.variance_ratio(X, y)
Variance Ratio Shesha: ratio of between-class to total variance.
A simple, efficient measure of how much geometric structure is explained by class labels. Equivalent to the R-squared of predicting coordinates from class membership.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray) – Class labels of shape (n_samples,).
- Returns:
Between-class variance / total variance. Range: [0, 1].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha import variance_ratio
>>> X = np.random.randn(500, 768)
>>> y = np.random.randint(0, 10, 500)
>>> vr = variance_ratio(X, y)
- shesha.supervised_alignment(X, y, metric='correlation', seed=None, max_samples=300)
Supervised RDM Alignment: correlation between model RDM and ideal label RDM.
Measures how well the representation’s distance structure aligns with task-defined similarity (same class = similar, different class = dissimilar).
- Parameters:
- Returns:
Spearman correlation between model and ideal RDMs. Range: [-1, 1].
- Return type:
float
- shesha.class_separation_ratio(X, y, n_bootstrap=50, subsample_frac=0.5, metric='euclidean', seed=None)
Class Separation Ratio: ratio of between-class to within-class distances.
Measures how well-separated classes are in the representation space. Uses bootstrap subsampling for computational efficiency and stability. Related to Fisher’s discriminant ratio but operates in distance space.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray) – Class labels of shape (n_samples,).
n_bootstrap (int) – Number of bootstrap iterations for stability.
subsample_frac (float) – Fraction of samples to use per bootstrap (0.0-1.0).
metric (str) – Distance metric: ‘cosine’ or ‘euclidean’.
seed (int, optional) – Random seed for reproducibility.
- Returns:
Mean separation ratio across bootstrap samples. Higher values indicate better class separation. Range: [0, inf), typically [0.5, 5.0].
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha import class_separation_ratio
>>> # Well-separated classes
>>> X = np.vstack([np.random.randn(100, 10),
...                np.random.randn(100, 10) + 5])
>>> y = np.array([0]*100 + [1]*100)
>>> ratio = class_separation_ratio(X, y)
>>> print(f"Separation: {ratio:.2f}")  # High value
Notes
Higher values indicate representations where same-class samples are closer together than different-class samples, suggesting good discriminability.
- shesha.lda_stability(X, y, n_bootstrap=50, subsample_frac=0.5, seed=None)
LDA Subspace Stability: consistency of linear discriminant direction.
Measures whether the optimal linear decision boundary is robust to sampling variation. Computes LDA on full dataset and bootstrapped subsamples, then measures alignment of discriminant vectors.
- Parameters:
X (np.ndarray) – Data matrix of shape (n_samples, n_features).
y (np.ndarray) – Binary class labels of shape (n_samples,). Must have exactly 2 classes.
n_bootstrap (int) – Number of bootstrap iterations.
subsample_frac (float) – Fraction of samples to use per bootstrap (0.0-1.0).
seed (int, optional) – Random seed for reproducibility.
- Returns:
Mean absolute cosine similarity between full and bootstrap discriminant vectors. Range: [0, 1]. Values near 1 indicate stable discriminant subspace.
- Return type:
float
Examples
>>> import numpy as np
>>> from shesha import lda_stability
>>> # Create well-separated binary classification data
>>> X = np.vstack([np.random.randn(100, 10),
...                np.random.randn(100, 10) + 3])
>>> y = np.array([0]*100 + [1]*100)
>>> stability = lda_stability(X, y)
>>> print(f"LDA Stability: {stability:.3f}")  # Should be high
Notes
Low values suggest the discriminant subspace is unstable, potentially indicating overfitting to source domain structure. This metric is particularly useful for predicting transfer learning performance.
Only works for binary classification. For multi-class, consider using class_separation_ratio instead.
Drift metrics
- shesha.rdm_similarity(X, Y, method='spearman', metric='cosine')
Compute RDM similarity between two representations.
Measures how similar the pairwise distance structures are between two representations. Useful for measuring representational drift, comparing models, or tracking changes during training.
- Parameters:
X (np.ndarray) – First representation matrix of shape (n_samples, n_features_x).
Y (np.ndarray) – Second representation matrix of shape (n_samples, n_features_y). Must have the same number of samples as X.
method (str) – Correlation method: ‘spearman’ (rank-based, default) or ‘pearson’.
metric (str) – Distance metric for RDM computation: ‘cosine’, ‘correlation’, or ‘euclidean’.
- Returns:
Correlation between RDMs. Range: [-1, 1]. Higher values indicate more similar geometric structure.
- Return type:
float
Examples
>>> from shesha import rdm_similarity
>>> # Compare representations before and after training
>>> X_before = model_before.encode(data)
>>> X_after = model_after.encode(data)
>>> similarity = rdm_similarity(X_before, X_after)
>>> print(f"RDM similarity: {similarity:.3f}")
>>> # Compare two different models
>>> X_model1 = model1.encode(data)
>>> X_model2 = model2.encode(data)
>>> similarity = rdm_similarity(X_model1, X_model2, method='pearson')
Notes
- Spearman (default) is more robust to outliers and non-linear relationships.
- Pearson captures linear relationships in distance magnitudes.
- The representations can have different feature dimensions (only the sample count must match).
- shesha.rdm_drift(X, Y, method='spearman', metric='cosine')
Compute representational drift between two representations.
Drift is defined as 1 - rdm_similarity, so higher values indicate more change in geometric structure. This is useful for tracking how much a representation has changed over time or due to some intervention (fine-tuning, perturbation, etc.).
- Parameters:
X (np.ndarray) – First (baseline/before) representation of shape (n_samples, n_features_x).
Y (np.ndarray) – Second (comparison/after) representation of shape (n_samples, n_features_y). Must have the same number of samples as X.
method (str) – Correlation method: ‘spearman’ (rank-based, default) or ‘pearson’.
metric (str) – Distance metric for RDM computation.
- Returns:
Drift score: 1 - RDM_correlation. Range: [0, 2].
- 0: Identical geometric structure
- 1: Uncorrelated (random relationship)
- 2: Perfectly anti-correlated (inverted structure)
- Return type:
float
Examples
>>> from shesha import rdm_drift
>>> # Track drift during training
>>> X_epoch0 = model.encode(data)
>>> for epoch in range(10):
...     train_one_epoch(model)
...     X_current = model.encode(data)
...     drift = rdm_drift(X_epoch0, X_current)
...     print(f"Epoch {epoch+1}: drift = {drift:.3f}")
>>> # Measure drift due to noise perturbation
>>> X_clean = model.encode(clean_data)
>>> X_noisy = model.encode(noisy_data)
>>> drift = rdm_drift(X_clean, X_noisy)
>>> print(f"Noise-induced drift: {drift:.3f}")
See also
rdm_similarity: The complementary metric (similarity = 1 - drift).