BMC Genomics (Sep 2022)

A robust and transformation-free joint model with matching and regularization for metagenomic trajectory and disease onset

  • Qian Li,
  • Kendra Vehik,
  • Cai Li,
  • Eric Triplett,
  • Luiz Roesch,
  • Yi-Juan Hu,
  • Jeffrey Krischer

DOI
https://doi.org/10.1186/s12864-022-08890-1
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 15

Abstract

Background: To identify operational taxonomic units (OTUs) signaling disease onset in an observational study, a powerful strategy is to select participants by matched sets and profile their temporal metagenomes, followed by trajectory analysis. Existing trajectory analyses model an individual OTU or the microbial community without adjusting for within-community correlation or matched-set-specific latent factors.

Results: We proposed a joint model with matching and regularization (JMR) to detect OTU-specific trajectories predictive of host disease status. The between- and within-matched-set heterogeneity in OTU relative abundance and disease risk was modeled by nested random effects. The inherent negative correlation in microbiota composition was adjusted for by incorporating and regularizing the top-correlated taxa as longitudinal covariates, pre-selected by Bray-Curtis distance and elastic net regression. We designed a simulation pipeline to generate true biomarkers for disease onset and the pseudo biomarkers caused by compositionality. In a simulation study generating temporal high-dimensional metagenomic counts with random intercept or slope, JMR effectively controlled false discoveries and pseudo biomarkers. Applying the competing methods to the simulated data and the TEDDY cohort showed that JMR outperformed the other methods and identified important taxa in infants’ fecal samples whose dynamics preceded host disease status.

Conclusion: Our method JMR is a robust framework that models taxon-specific trajectory and host disease status for matched participants without transforming relative abundance, improving the power to detect disease-associated microbial features in certain scenarios. JMR is available in the R package mtradeR at https://github.com/qianli10000/mtradeR.
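As a rough illustration of the Bray-Curtis pre-selection step mentioned in the abstract, the sketch below ranks taxa by their Bray-Curtis dissimilarity to a target taxon across samples and keeps the closest ones as candidate covariates. The abstract does not say how the distance is applied, so the ranking rule, the function names `bray_curtis` and `nearest_taxa`, and the toy data are all assumptions made for illustration; this is not the mtradeR implementation, and the subsequent elastic net step is not shown.

```python
import numpy as np

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = (u + v).sum()
    if denom == 0:
        return 0.0  # both vectors are all zero; treat as identical
    return np.abs(u - v).sum() / denom

def nearest_taxa(rel_abund, target, k):
    """Indices of the k taxa (columns) closest to `target` by Bray-Curtis.

    Assumption: "top-correlated" is read here as "smallest dissimilarity".
    """
    n_taxa = rel_abund.shape[1]
    dists = [
        (bray_curtis(rel_abund[:, target], rel_abund[:, j]), j)
        for j in range(n_taxa)
        if j != target
    ]
    dists.sort()
    return [j for _, j in dists[:k]]

# Hypothetical toy data: 4 samples (rows) x 4 taxa (columns), rows sum to 1.
rel_abund = np.array([
    [0.50, 0.30, 0.15, 0.05],
    [0.40, 0.35, 0.20, 0.05],
    [0.60, 0.25, 0.10, 0.05],
    [0.45, 0.30, 0.20, 0.05],
])

# Candidate covariates for taxon 0, to be passed on to a regularized model.
candidates = nearest_taxa(rel_abund, target=0, k=2)
print(candidates)
```

In a fuller pipeline the candidate set would then be narrowed by a penalized regression, as the abstract describes; that step is omitted here.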

Keywords