miMatch: a microbial metabolic background matching tool for mitigating host confounding in metagenomics research
Lei Liu,
Suqi Cao,
Weili Lin,
Zhigang Gao,
Liu Yang,
Lixin Zhu,
Bin Yang,
Guoqing Zhang,
Ruixin Zhu,
Dingfeng Wu
Affiliations
Lei Liu
Department of Gastroenterology, The Shanghai Tenth People’s Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, P. R. China
Suqi Cao
National Center, Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, P. R. China
Weili Lin
Department of Gastroenterology, The Shanghai Tenth People’s Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, P. R. China
Zhigang Gao
Department of General Surgery, Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, P. R. China
Liu Yang
National Center, Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, P. R. China
Lixin Zhu
Guangdong Institute of Gastroenterology; Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases; Biomedical Innovation Center, Sun Yat-Sen University, Guangzhou, P. R. China
Bin Yang
Shanghai Southgene Technology Co., Ltd., Shanghai, China
Guoqing Zhang
National Genomics Data Center & Bio-Med Big Data Center, Chinese Academy of Sciences Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of the Chinese Academy of Sciences, Shanghai, P. R. China
Ruixin Zhu
Department of Gastroenterology, The Shanghai Tenth People’s Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, P. R. China
Dingfeng Wu
National Center, Children’s Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, P. R. China
Metagenomic research faces a persistent challenge: low concordance across studies. Matching on host confounders can mitigate the impact of individual differences, but because genetics, environment, and lifestyle all shape microbial profiles, fully matched cohorts are exceptionally difficult to assemble. The microbial metabolic background, which modulates microbial composition, reflects the cumulative impact of host confounders and therefore serves as an ideal baseline for matching microbial samples. In this study, we introduce miMatch, a metagenomic sample-matching tool that uses the microbial metabolic background as a comprehensive proxy for host-related variables and employs propensity score matching to build case-control pairs, even when host confounder data are unavailable. In simulated datasets, miMatch effectively eliminated individual differences in metabolic background, thereby improving the accuracy of identifying differential microbial patterns and reducing false positives. In real metagenomic data, miMatch improved result consistency and model generalizability across cohorts of the same disease. A user-friendly web server (https://www.biosino.org/iMAC/mimatch) has been established to promote the integration of multiple metagenomic cohorts and to strengthen causal inference in metagenomic research.
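For readers unfamiliar with propensity score matching, the general idea can be sketched in a few lines of Python. This is a generic illustration, not miMatch's implementation: the function names, the two-column "metabolic background" features, the gradient-descent logistic regression, the greedy 1:1 nearest-neighbor strategy, and the caliper value are all assumptions made for the example.

```python
import math

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(features, labels, lr=0.5, epochs=500):
    """Fit a logistic regression (case = 1, control = 0) by batch gradient descent."""
    n, n_feat = len(features), len(features[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def propensity_scores(features, w, b):
    """Estimated probability of being a case, given the background features."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in features]

def match_pairs(scores, labels, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Each case is paired with the unused control whose score is closest,
    provided the score difference does not exceed the caliper.
    """
    cases = [i for i, y in enumerate(labels) if y == 1]
    free = {i for i, y in enumerate(labels) if y == 0}
    pairs = []
    for c in cases:
        if not free:
            break
        # Ties on distance are broken by the lower sample index.
        best = min(free, key=lambda k: (abs(scores[k] - scores[c]), k))
        if abs(scores[best] - scores[c]) <= caliper:
            pairs.append((c, best))
            free.remove(best)
    return pairs

# Hypothetical samples: each row is a made-up metabolic background profile.
features = [[0.9, 0.2], [0.7, 0.4], [0.8, 0.1],               # cases
            [0.6, 0.5], [0.3, 0.6], [0.2, 0.9], [0.75, 0.3]]  # controls
labels = [1, 1, 1, 0, 0, 0, 0]

w, b = fit_propensity(features, labels)
scores = propensity_scores(features, w, b)
pairs = match_pairs(scores, labels, caliper=1.0)  # caliper of 1.0 accepts any match
```

Each element of `pairs` is a `(case_index, control_index)` tuple. A tighter caliper discards cases that have no sufficiently similar control, trading sample size for closer matches.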