A graph self-supervised residual learning framework for domain identification and data integration of spatial transcriptomics

Jinjin Huang; Xiaoqian Fu; Zhuangli Zhang; Yinfeng Xie; Shangkun Liu; Yarong Wang; Zhihong Zhao; Youmei Peng

doi:10.1038/s42003-024-06814-1

Communications Biology (Sep 2024)

Affiliations

Jinjin Huang: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Xiaoqian Fu: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Zhuangli Zhang: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Yinfeng Xie: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Shangkun Liu: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Yarong Wang: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Zhihong Zhao: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University
Youmei Peng: Henan Key Laboratory for Pharmacology of Liver Diseases, BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University

Journal volume & issue: Vol. 7, no. 1
pp. 1 – 15

Abstract

Spatial transcriptomics (ST) technologies allow comprehensive characterization of gene expression patterns in the context of the tissue microenvironment. However, accurately identifying domains with spatial coherence in both gene expression and histology in situ, and effectively integrating data from multiple samples, remain challenging. Here, we propose ResST, a graph self-supervised residual learning model based on a graph neural network and Margin Disparity Discrepancy (MDD) theory. ResST aggregates gene expression, biological effects, spatial location, and morphological information to capture nonlinear relationships between a cell and its surrounding cells for spatial domain identification. ResST also integrates multiple ST datasets and aligns latent embeddings based on MDD theory to correct batch effects. Results show that ResST identifies continuous spatial domains at a finer scale in ten ST datasets acquired with different technologies. Moreover, ResST efficiently integrates data from multiple tissue sections vertically or horizontally while correcting batch effects. Overall, ResST demonstrates exceptional performance in analyzing ST datasets.

Published in Communications Biology

ISSN: 2399-3642 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General)
Website: https://www.nature.com/commsbio/
