Skip to content

raphael-group/MGW

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

70 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Riemannian Metric Learning for Alignment of Spatial Multiomics (Manifold Gromov-Wasserstein or MGW)

This the repository for "Riemannian Metric Learning for Alignment of Spatial Multiomics." a technique which:

  1. Performs Riemannian metric learning across spatial modalities (multimoics, transcriptomics, and so on) using the Riemannian pull-back metric.
  2. Infers Riemannian (geodesic) distances.
  3. Aligns Riemannian distances with Gromov-Wasserstein Optimal Transport.

In the section below, we detail the usage of MGW which complements the simple demo notebooks:

- [demo_mgw_y7.ipynb](demo_mgw_y7.ipynb)
- [mouse_align.ipynb](mouse_align.ipynb)

Contents

mgw/

  • mgw.py — main solver/class for MGW
  • geometry.py — metric-tensor, geodesic distance, k-NN graph, APSP utilities
  • models.py — neural field models (MLP)
  • metric.py — evaluation metrics (e.g. migration, AMI, cosine similarity(
  • plotting.py — visualization utilities
  • utils.py — miscellaneous helpers, barycentric projection

validation/

  • dopamine.py — validation utilities for dopamine experiments (AUROC, AUPRC)
  • run_methods.py — code for running other methods (moscot Translation, SCOT, SCOTv2, PASTE2 FGW spatial, POT FGW spatial only)

demos/

  • demo_mgw_y7.ipynb — demo notebook for running MGW on the Y_7 ccRCC slice (Hu '24)
  • mouse_align.ipynb - demo notebook for aligning Spatiotemporal Transcriptomics with MGW on E9.5-10.5 mouse embryo timepoint pair (Chen '22)
  • riemannian_mouse_geodesics.ipynb — demo visualization of the geodesics in the Riemannian pull-back metric of E9.5-10.5 mouse embryo (Chen '22)

experiments/

  • Reproducible experimental notebooks on Stereo-Seq Mouse Embryo, Visium-Xenium alignment of colorectal cancer, MALDI-MSI metabolomics and Visium transcriptomics alignment of human striatum, AFADESI-MSI and Visium alignment of renal cancer.

Getting Started

**1. Load the two multiomic datasets **

Load two AnnData objects such as spatial transcriptomics (st) and spatial metabolomics (msi) after appropriate filtering.

import anndata as ad
st = ad.read_h5ad(ST_PATH)
msi = ad.read_h5ad(MSI_PATH)

**1. Running MGW's pre-processing (optional) the two multiomic datasets **

Call mgw.mgw_preprocess on two AnnDatas.

You can run PCA (will default to pre-computed PCA if already done) with PCA_comp components, and an additional CCA step for multimodal data. Set use_cca_feeler=True for this CCA step, which involves basic/coarse feeler alignment (spatial_only: bool = True to do a spatial-only feeler, feature_only = True to do a feature-only feeler, or if both False a basic spatial-feature feeler). This subsets feature dimensions which are correlated across modalities, and you can specify the number of final CCA dimensions with CCA_comp.

To run on the raw st.X and msi.X as-is without processing, set use_cca_feeler=False, use_pca_X/Z=False, and log1p_X/Z=False. We do not assume common/joint features in multimodal data generally and do independent internal PCA steps. For unimodal (e.g. transcriptomics-transcriptomics) we recommend an external joint PCA: see, e.g. experiments/mgw_mouse_embryo.ipynb for an example of this pre-processing.

import mgw.mgw as mgw
pre = mgw.mgw_preprocess(
    st, msi,
    PCA_comp=PCA_componet,
    CCA_comp=CCA_componet,
    use_cca_feeler=True, 
    use_pca_X=True,
    use_pca_Z=False, #False if the features from second modality are intensities which doesn't make sense to run pca on
    log1p_X=True,
    log1p_Z=False, #False if the features from second modality are not counts which doesn't make sense to run log1p on
    verbose=True
)

**2. Run MGW **

Next, we run mgw.mgw_align_core on the data pre to both infer the neural fields, learn metric tensors, and align the result with Gromov-Wasserstein.

PHI_ARC = (128,256,256,128)
KNN_K= 12
DEFAULT_GW_PARAMS = dict(verbose=True, inner_maxit=3000, outer_maxit=3000, inner_tol=1e-7,   outer_tol=1e-7,   epsilon=1e-4)
DEFAULT_LR = 1e-3
DEFAULT_EPS = 1e-2
DEFAULT_ITER = 20_000
EXP_PATH = "your_path"
EXP_TAG = "your_tag"

out = mgw.mgw_align_core(
        pre,
        widths=PHI_ARC,
        lr=DEFAULT_LR,
        niter=DEFAULT_ITER,
        knn_k=KNN_K,
        geodesic_eps=DEFAULT_EPS,
        save_dir=EXP_PATH, 
        tag=EXP_TAG, 
        verbose=True,
        plot_net=True, # zoom in to visually check if the two modalities shown similar pattern
        use_cca = True, #for multi-modal, we recommend setting to TRUE
        gw_params = DEFAULT_GW_PARAMS
    )

Here, the key parameters are

  • PHI_ARC: Layers of the MLP
  • KNN_K: Resolution of the K nearest neighbor graph used for Riemannian geodesics
  • DEFAULT_EPS: Epsilon for stability of Jacobian. Generally not an issue, and smaller yields more faithful Riemannian geodesics (e.g. 1e-5 for mouse embryo).
  • DEFAULT_GW_PARAMS: Default parameters for the optimal transport solver of ott jax
  • DEFAULT_LR: Learning-rate for the network.
  • DEFAULT_ITER: Number of training iterations for the network.
  • save_dir: Where to save outputs
  • tag: Tag for generated files.

**3. Return alignment and project across modalities **

We have a number of variables which can be accessed from out.

  • P: MGW coupling/alignment
  • xs: Spatial coordinates 1 (normalized)
  • xs2: Spatial coordinates 2 (normalized)
  • phi: Neural field mapping into modality 1
  • psi: Neural field mapping into modality 2
  • G_M/G_N: Pull-back metric tensor field evaluated at the coordinates
  • C_M/C_N: MGW Riemannian distance matrices

As an example, let us return the alignment and barycentrically project across modalities.

P = out["P"]
from mgw.evaulation import bary_proj
adata_sm2st = bary_proj(st, msi, P)
adata_st2sm = bary_proj(msi, st, P.T)

P represents the MGW alignment, adata_sm2st is the metabolomics to transcriptomics projection (added to st as metabolite annotation), and adata_st2sm is the transcriptomics to metabolomics projection (added to msi as transcriptomics annotation).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages