Fundamental Research (Jul 2024)

Integrating multi-omics data of childhood asthma using a deep association model

  • Kai Wei,
  • Fang Qian,
  • Yixue Li,
  • Tao Zeng,
  • Tao Huang

Journal volume & issue
Vol. 4, no. 4
pp. 738 – 751

Abstract

Childhood asthma is one of the most common respiratory diseases, with rising mortality and morbidity. Multi-omics data provide a new opportunity to explore collaborative biomarkers and corresponding diagnostic models of childhood asthma. To capture the nonlinear associations in multi-omics data and improve the interpretability of the diagnostic model, we propose a novel deep association model (DAM) and a corresponding efficient analysis framework. First, Deep Subspace Reconstruction is used to fuse the omics data with diagnostic information, thereby correcting the distribution of the original omics data and reducing the influence of unnecessary noise. Second, Joint Deep Semi-Negative Matrix Factorization is applied to identify different latent sample patterns and extract biomarkers from the different omics data levels. Third, our newly proposed Deep Orthogonal Canonical Correlation Analysis ranks the features in the collaborative module; the ranked features are then used to construct a diagnostic model that accounts for the nonlinear correlation between omics data levels. Using DAM, we analyzed the transcriptome and methylation data of childhood asthma in depth. The effectiveness of DAM was verified on an independent test dataset, in terms of both algorithm performance and biological significance, through ablation experiments and comparison with many baseline methods from clinical and biological studies. The DAM-induced diagnostic model achieves a prediction AUC of 0.912, which is higher than that of many alternative methods. In addition, relevant pathways and biomarkers of childhood asthma were found to be collectively altered at the gene expression and methylation levels.
As an interpretable machine learning approach, DAM simultaneously considers the nonlinear associations among samples and those among biological features, which should help in exploring interpretable biomarker candidates and efficient diagnostic models from multi-omics data for complex human diseases.
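
To make the second step more concrete, the following is a minimal sketch of ordinary single-layer semi-nonnegative matrix factorization, the building block that deep and joint variants extend. It is not the authors' Joint Deep Semi-NMF: the function name `semi_nmf`, its parameters, and the random input are assumptions made for illustration, and the update rule is the standard multiplicative scheme for the factorization X ≈ F Gᵀ with G ≥ 0 and F unconstrained.

```python
import numpy as np


def semi_nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Single-layer semi-NMF sketch: X (features x samples) ~ F @ G.T.

    F (features x k) may have mixed signs; G (samples x k) stays nonnegative
    and can be read as soft assignments of samples to k latent patterns.
    Illustrative only; not the deep, joint model described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    # Strictly positive start so the multiplicative updates can move G.
    G = rng.random((n_samples, k)) + 0.1

    def pos(A):
        # Elementwise positive part of A.
        return (np.abs(A) + A) / 2

    def neg(A):
        # Elementwise magnitude of the negative part of A.
        return (np.abs(A) - A) / 2

    for _ in range(n_iter):
        # Least-squares update of the unconstrained factor F for fixed G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF = X.T @ F
        FtF = F.T @ F
        # Multiplicative update keeps every entry of G nonnegative.
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF)
        G *= np.sqrt((num + eps) / (den + eps))

    # Final least-squares refit of F for the returned G.
    F = X @ G @ np.linalg.pinv(G.T @ G)
    return F, G


if __name__ == "__main__":
    # Hypothetical data: 20 features measured on 30 samples.
    X = np.random.default_rng(1).standard_normal((20, 30))
    F, G = semi_nmf(X, k=3)
    print(F.shape, G.shape)
```

In a sketch like this, the rows of G could be inspected to group samples into latent patterns, and the columns of F to see which features drive each pattern. A joint version would share G across omics layers, and a deep version would factorize F further; both are beyond what this example shows.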

Keywords