Advanced Science (Dec 2024)

DeepPhylo: Phylogeny‐Aware Microbial Embeddings Enhanced Predictive Accuracy in Human Microbiome Data Analysis

  • Bin Wang,
  • Yulong Shen,
  • Jingyan Fang,
  • Xiaoquan Su,
  • Zhenjiang Zech Xu

DOI
https://doi.org/10.1002/advs.202404277
Journal volume & issue
Vol. 11, no. 45

Abstract

Microbial data analysis poses significant challenges due to its high dimensionality, sparsity, and compositionality. Recent advances have shown that integrating abundance and phylogenetic information is an effective strategy for uncovering robust patterns and enhancing predictive performance in microbiome studies. However, existing methods primarily focus on the hierarchical structure of phylogenetic trees, overlooking the evolutionary distances embedded within them. This study introduces DeepPhylo, a novel method that employs phylogeny‐aware amplicon embeddings to effectively integrate abundance and phylogenetic information. DeepPhylo improves both the unsupervised discriminatory power and the supervised predictive accuracy of microbiome data analysis. Compared with existing methods, DeepPhylo yields more biologically relevant insights across five real‐world microbiome use cases, including clustering of skin microbiomes, prediction of host chronological age and gender, diagnosis of inflammatory bowel disease (IBD) across 15 studies, and multilabel disease classification.

Keywords