International Journal of Molecular Sciences (Mar 2021)

Machine Learning Reduced Gene/Non-Coding RNA Features That Classify Schizophrenia Patients Accurately and Highlight Insightful Gene Clusters

  • Yichuan Liu,
  • Hui-Qi Qu,
  • Xiao Chang,
  • Lifeng Tian,
  • Jingchun Qu,
  • Joseph Glessner,
  • Patrick M. A. Sleiman,
  • Hakon Hakonarson

DOI
https://doi.org/10.3390/ijms22073364
Journal volume & issue
Vol. 22, no. 7
p. 3364

Abstract

RNA-seq is a powerful method for detecting differentially expressed genes and long non-coding RNAs (lncRNAs) in schizophrenia (SCZ) patients; however, because of overfitting, differentially expressed targets (DETs) cannot be used reliably as biomarkers. This study used machine learning to reduce the gene/non-coding RNA feature set. Dorsolateral prefrontal cortex (DLPFC) RNA-seq data from 254 individuals were obtained from the CommonMind Consortium. The average predictive accuracy for SCZ patients was 67% based on coding genes and 96% based on lncRNAs. Machine learning is thus a powerful approach for reducing the feature set to functional biomarkers in SCZ patients. lncRNAs capture the characteristics of SCZ tissue more accurately than mRNAs, as the former regulate every level of gene expression, not only mRNA levels.
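
The abstract does not name the algorithm used for feature reduction, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple univariate filter (standardized difference in class means) followed by a nearest-centroid classifier, and it runs on synthetic data rather than CommonMind data. All function names and parameters are hypothetical. Feature selection is repeated inside each leave-one-out fold, which is one common way to keep the accuracy estimate from being inflated by overfitting.

```python
import random
from statistics import mean, pstdev


def make_toy_data(n_per_class=20, n_features=50, n_informative=3, seed=0):
    """Synthetic stand-in for an expression matrix (NOT CommonMind data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                # Shift the first few features so they separate the classes.
                for j in range(n_informative):
                    row[j] += 3.0
            X.append(row)
            y.append(label)
    return X, y


def select_features(X, y, k):
    """Keep the k features with the largest standardized mean difference."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = pstdev(a) + pstdev(b) + 1e-9  # avoid division by zero
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def centroid(X, y, label, feats):
    """Mean profile of one class over the selected features."""
    rows = [row for row, lab in zip(X, y) if lab == label]
    return [mean(r[j] for r in rows) for j in feats]


def predict(row, feats, c0, c1):
    """Assign the class whose centroid is closer (squared Euclidean)."""
    d0 = sum((row[j] - c) ** 2 for j, c in zip(feats, c0))
    d1 = sum((row[j] - c) ** 2 for j, c in zip(feats, c1))
    return 0 if d0 <= d1 else 1


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy; features are re-selected in every fold."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = select_features(X_tr, y_tr, k)
        c0 = centroid(X_tr, y_tr, 0, feats)
        c1 = centroid(X_tr, y_tr, 1, feats)
        correct += predict(X[i], feats, c0, c1) == y[i]
    return correct / len(X)


X, y = make_toy_data()
selected = select_features(X, y, k=3)
accuracy = loo_accuracy(X, y, k=3)
print("selected feature indices:", selected)
print("leave-one-out accuracy:", accuracy)
```

In this toy setting the reduced feature set is far smaller than the full matrix, which mirrors the stated goal of the study; the accuracies reported in the abstract come from the authors' own pipeline and data, not from a sketch like this one.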

Keywords