Genome Biology (Jun 2020)

Approaches for integrating heterogeneous RNA-seq data reveal cross-talk between microbes and genes in asthmatic patients

  • Daniel Spakowicz,
  • Shaoke Lou,
  • Brian Barron,
  • Jose L. Gomez,
  • Tianxiao Li,
  • Qing Liu,
  • Nicole Grant,
  • Xiting Yan,
  • Rebecca Hoyd,
  • George Weinstock,
  • Geoffrey L. Chupp,
  • Mark Gerstein

DOI
https://doi.org/10.1186/s13059-020-02033-z
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 22

Abstract

Sputum induction is a non-invasive method to evaluate the airway environment, particularly for asthma. RNA sequencing (RNA-seq) of sputum samples can be challenging to interpret because the samples are complex, heterogeneous mixtures of human cells and exogenous (microbial) material. In this study, we develop a pipeline that integrates dimensionality reduction and statistical modeling to address this heterogeneity. LDA-link, where LDA stands for latent Dirichlet allocation, connects microbes to genes using reduced-dimensionality LDA topics. We validate our method with single-cell RNA-seq and microscopy and then apply it to the sputum of asthmatic patients to find known and novel relationships between microbes and genes.