StrainPanDA: Linked reconstruction of strain composition and gene content profiles via pangenome‐based decomposition of metagenomic data

Han Hu; Yuxiang Tan; Chenhao Li; Junyu Chen; Yan Kou; Zhenjiang Zech Xu; Yang‐Yu Liu; Yan Tan; Lei Dai

doi:10.1002/imt2.41

iMeta (Sep 2022)

Affiliations

Han Hu: CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yuxiang Tan: CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Chenhao Li: Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Richard B. Simches Research Center, Boston, Massachusetts, USA
Junyu Chen: CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yan Kou: Bioinformatics Department, Xbiome, Scientific Research Building, Tsinghua High‐Tech Park, Shenzhen, China
Zhenjiang Zech Xu: Department of Food Science and Technology, State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
Yang‐Yu Liu: Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
Yan Tan: Bioinformatics Department, Xbiome, Scientific Research Building, Tsinghua High‐Tech Park, Shenzhen, China
Lei Dai: CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

DOI: https://doi.org/10.1002/imt2.41
Journal volume & issue: Vol. 1, no. 3

Abstract

Microbial strains of variable functional capacities coexist in microbiomes. Current bioinformatics methods of strain analysis cannot provide the direct linkage between strain composition and their gene contents from metagenomic data. Here we present Strain‐level Pangenome Decomposition Analysis (StrainPanDA), a novel method that uses the pangenome coverage profile of multiple metagenomic samples to simultaneously reconstruct the composition and gene content variation of coexisting strains in microbial communities. We systematically validate the accuracy and robustness of StrainPanDA using synthetic data sets. To demonstrate the power of gene‐centric strain profiling, we then apply StrainPanDA to analyze the gut microbiome samples of infants, as well as patients treated with fecal microbiota transplantation. We show that the linked reconstruction of strain composition and gene content profiles is critical for understanding the relationship between microbial adaptation and strain‐specific functions (e.g., nutrient utilization and pathogenicity). Finally, StrainPanDA has minimal requirements for computing resources and can be scaled to process multiple species in a community in parallel. In short, StrainPanDA can be applied to metagenomic data sets to detect the association between molecular functions and microbial/host phenotypes to formulate testable hypotheses and gain novel biological insights at the strain or subspecies level.
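
The abstract does not spell out the decomposition algorithm, so the following is only an illustrative sketch of the general idea: a gene family by sample coverage matrix is factored into a gene family by strain profile and a strain by sample abundance matrix. It assumes non-negative matrix factorization with multiplicative updates as the decomposition step, which is an assumption made here for illustration and not a statement of the StrainPanDA implementation. All names (`nmf`, `strain_proportions`, the toy matrices) are hypothetical.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf(v, n_strains, n_iter=300, seed=0, eps=1e-9):
    """Factor V (genes x samples) into W (genes x strains) and H (strains x samples).

    Uses multiplicative updates for the squared-error objective (assumed technique).
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_strains)] for _ in range(n_genes)]
    h = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(n_strains)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(n_iter):
        # Update strain abundances H with W held fixed.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_samples)]
             for k in range(n_strains)]
        # Update gene content profiles W with H held fixed.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_strains)]
             for i in range(n_genes)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors

def strain_proportions(w, h):
    """Rescale each strain so its largest gene weight is 1, then normalize per sample."""
    scales = [max(row[k] for row in w) for k in range(len(h))]
    scaled = [[h[k][j] * scales[k] for j in range(len(h[0]))] for k in range(len(h))]
    totals = [sum(col) for col in zip(*scaled)]
    return [[scaled[k][j] / totals[j] for j in range(len(totals))]
            for k in range(len(scaled))]

# Toy data (made up): 6 gene families, 2 strains, 4 samples.
true_gene_content = [[1, 1], [1, 1], [1, 0], [0, 1], [1, 0], [0, 1]]
true_abundance = [[8, 2, 5, 1], [2, 8, 5, 9]]
coverage = matmul(true_gene_content, true_abundance)

gene_family_profile, strain_abundance, errors = nmf(coverage, n_strains=2)
proportions = strain_proportions(gene_family_profile, strain_abundance)
for k, row in enumerate(proportions):
    print("strain", k, [round(x, 2) for x in row])
```

In this sketch the two factors come from a single fit, which is what links each strain's estimated abundance to its estimated gene content. The recovered strains may appear in either order, and the number of strains is fixed by hand here, whereas a real analysis would need to choose it from the data.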

Published in iMeta

ISSN: 2770-5986 (Print); 2770-596X (Online)
Publisher: Wiley
Country of publisher: Australia
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://onlinelibrary.wiley.com/journal/2770596x
