MacSGP: Mapping Cell-Type-Specific Spatial Gene Programs Uncovers Tissue Architecture and Microenvironment Organization
MacSGP is a scalable statistical and computational approach for MApping Cell-type-specific Spatial Gene Programs (SGPs) in spatial transcriptomic (ST) data.
MacSGP's effectiveness stems from its integration of deep graph neural networks (GNNs) with probabilistic models:
- MacSGP maps the gene expression and spatial information of spots into a shared latent space using deep GNNs, yielding a low-dimensional representation of each spot that captures both gene expression similarity and spatial coherence.
- MacSGP uses this latent representation to generate cell-type-specific SGPs through a probabilistic model, which accounts for cell type mixtures and characterizes cell-type-specific SGPs through a low-rank structure.
- For large-scale high-resolution ST datasets, MacSGP adopts a batch-learning scheme that learns SGPs over small gene patches, enabling scalable training without sacrificing accuracy.
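To make the low-rank mixture idea concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions, not MacSGP's actual model or API: the array names (`Z`, `props`, `baseline`, `loadings`), the additive form, and the random inputs are all hypothetical. It assumes each spot has a latent embedding (standing in for a GNN encoder output) and a vector of cell-type proportions, and that each cell type's program is a linear map from the latent space to genes, which bounds its rank by the latent dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spots, n_genes, n_types, latent_dim = 50, 200, 3, 8

# Hypothetical inputs: per-spot latent embeddings (stand-in for a GNN encoder
# output) and per-spot cell-type proportions (each row sums to 1).
Z = rng.normal(size=(n_spots, latent_dim))
props = rng.dirichlet(np.ones(n_types), size=n_spots)

# Hypothetical parameters: one baseline expression profile and one
# latent-to-gene loading matrix per cell type.
baseline = rng.normal(size=(n_types, n_genes))
loadings = rng.normal(size=(n_types, latent_dim, n_genes))

# Cell-type-specific spatial program for every spot and gene.
# Shape: (n_spots, n_types, n_genes). For a fixed cell type, the
# spots-by-genes matrix is Z @ loadings[k], so its rank is at most latent_dim.
sgp = np.einsum("sd,kdg->skg", Z, loadings)

# Expected expression of a spot: mix the per-type signal (baseline + program)
# using the spot's cell-type proportions.
expected = np.einsum("sk,skg->sg", props, baseline[None] + sgp)

print(sgp.shape, expected.shape)
```

In this toy form, the spatial variation of each cell type is carried entirely by the shared embedding `Z`, while the per-type loading matrices decide which genes respond to it.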
It's recommended to create a virtual environment first.
$ conda create -n MacSGP python=3.11
$ conda activate MacSGP
MacSGP requires PyTorch and PyG.
For PyG, MacSGP also requires its additional libraries, whose installation must match the installed torch version and CUDA version. Run nvcc --version to check the CUDA version before installing.
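The wheel index passed to pip with -f encodes both versions. The following is a small sketch of how that URL can be assembled, assuming the pattern visible in this README's example command (https://data.pyg.org/whl/torch-<torch version>+<CUDA tag>.html). The helper name `pyg_wheel_url` is hypothetical and not part of MacSGP or PyG.

```python
from typing import Optional


def pyg_wheel_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the PyG wheel index URL for a torch/CUDA pair (hypothetical helper)."""
    # Drop any local build suffix, e.g. "2.8.0+cu128" -> "2.8.0".
    base = torch_version.split("+")[0]
    # "12.8" -> "cu128"; no CUDA means the CPU-only wheels.
    tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return f"https://data.pyg.org/whl/torch-{base}+{tag}.html"


print(pyg_wheel_url("2.8.0", "12.8"))
# → https://data.pyg.org/whl/torch-2.8.0+cu128.html
```

With PyTorch already installed, `torch.__version__` and `torch.version.cuda` can supply the two arguments, so the URL matches the environment that is actually in use.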
Here we provide example installation commands for CUDA 12.8.
$ pip install torch_geometric
# Additional dependencies:
$ pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.8.0+cu128.html
MacSGP can be installed from PyPI:
$ pip install MacSGP
Alternatively, MacSGP can be downloaded from GitHub:
# Clone the repository
$ git clone https://github.com/YangLabHKUST/MacSGP.git
$ cd MacSGP
# Install the required packages
$ pip install -r requirements.txt
# Install MacSGP
$ python setup.py build
$ python setup.py install
Tutorials for using MacSGP, together with the code for reproducing the simulation and real data analysis results presented in our paper, are available on the tutorial website (https://macsgp-tutorial.readthedocs.io/).
- Simulation study replicates
- Visium adult mouse brain datasets containing two biological replicates
- Visium kidney cancer dataset at the tumour-normal interface
- Multiple human colorectal cancer datasets generated by different technologies
If you find MacSGP or any of the source code in this repository useful for your work, please cite:
Mapping Cell-Type-Specific Spatial Gene Programs Uncovers Tissue Architecture and Microenvironment Organization.
Yeqin Zeng, Zhiwei Wang, Yuyao Liu, Yuheng Chen, Jiguang Wang, Hao Chen, and Can Yang.
Submitted, 2025.
The software is developed and maintained by Yeqin Zeng.
Please feel free to contact Yeqin Zeng, Zhiwei Wang, or Prof. Can Yang with any inquiries.
