Advanced Science (Jul 2025)
scCompass: An Integrated Multi‐Species scRNA‐seq Database for AI‐Ready
- Pengfei Wang,
- Wenhao Liu,
- Jiajia Wang,
- Yana Liu,
- Pengjiang Li,
- Ping Xu,
- Wentao Cui,
- Ran Zhang,
- Qingqing Long,
- Zhilong Hu,
- Chen Fang,
- Jingxi Dong,
- Chunyang Zhang,
- Yan Chen,
- Chengrui Wang,
- Guole Liu,
- Hanyu Xie,
- Yiyang Zhang,
- Meng Xiao,
- Shubai Chen,
- Haiping Jiang,
- The X‐Compass Consortium,
- Yiqiang Chen,
- Ge Yang,
- Shihua Zhang,
- Zhen Meng,
- Xuezhi Wang,
- Guihai Feng,
- Xin Li,
- Yuanchun Zhou
Affiliations
- Pengfei Wang
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Wenhao Liu
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- Jiajia Wang
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Yana Liu
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- Pengjiang Li
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Ping Xu
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Wentao Cui
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Ran Zhang
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Qingqing Long
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Zhilong Hu
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Chen Fang
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- Jingxi Dong
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- Chunyang Zhang
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Yan Chen
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Chengrui Wang
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Guole Liu
- State Key Laboratory of Multimodal Artificial Intelligence Systems Institute of Automation Chinese Academy of Sciences Beijing 100190 China
- Hanyu Xie
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Yiyang Zhang
- CEMS NCMIS HCMS MDIS RCSDS Academy of Mathematics and Systems Science Chinese Academy of Sciences Beijing 100190 China
- Meng Xiao
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Shubai Chen
- Beijing Key Laboratory of Mobile Computing and Pervasive Device Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China
- Haiping Jiang
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- The X‐Compass Consortium
- Yiqiang Chen
- Beijing Key Laboratory of Mobile Computing and Pervasive Device Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China
- Ge Yang
- State Key Laboratory of Multimodal Artificial Intelligence Systems Institute of Automation Chinese Academy of Sciences Beijing 100190 China
- Shihua Zhang
- CEMS NCMIS HCMS MDIS RCSDS Academy of Mathematics and Systems Science Chinese Academy of Sciences Beijing 100190 China
- Zhen Meng
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Xuezhi Wang
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- Guihai Feng
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- Xin Li
- State Key Laboratory of Stem Cell and Reproductive Biology Institute of Zoology Chinese Academy of Sciences Beijing 100101 China
- Yuanchun Zhou
- Computer Network Information Center Chinese Academy of Sciences Beijing 100083 China
- DOI
- https://doi.org/10.1002/advs.202500870
- Journal volume & issue
- Vol. 12, no. 25
Abstract
Emerging single‐cell sequencing technology has generated large amounts of data, allowing analysis of cellular dynamics and gene regulation at single‐cell resolution. Advances in artificial intelligence enhance life sciences research by delivering critical insights and optimizing data analysis processes. However, inconsistent data processing quality and standards remain a major challenge. Here scCompass is proposed, a comprehensive resource designed to build a large‐scale, multi‐species, and model‐friendly single‐cell data collection. By applying standardized data pre‐processing, scCompass integrates and curates transcriptomic data from nearly 105 million single cells across 13 species. Using this extensive dataset, stable expression genes (SEGs) and organ‐specific expression genes (OSGs) are identified in humans and mice. Different scalable datasets are provided that can be easily adapted for AI model training, together with checkpoints pretrained with state‐of‐the‐art single‐cell foundation models. In summary, scCompass is a highly efficient and scalable AI‐ready database which, combined with user‐friendly data sharing, visualization, and online analysis, greatly simplifies data access and exploitation for researchers in single‐cell biology (http://www.bdbe.cn/kun).
Keywords