BMC Bioinformatics (Apr 2021)

SMALF: miRNA-disease associations prediction based on stacked autoencoder and XGBoost

  • Dayun Liu,
  • Yibiao Huang,
  • Wenjuan Nie,
  • Jiaxuan Zhang,
  • Lei Deng

DOI
https://doi.org/10.1186/s12859-021-04135-2
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 18

Abstract

Background: Identifying miRNA-disease associations helps us understand disease mechanisms at the molecular level. However, identifying them through biological experiments alone is usually blind, time-consuming, and small-scale. Hence, developing computational methods to predict unknown miRNA-disease associations is becoming increasingly important.

Results: In this work, we develop a computational framework called SMALF to predict unknown miRNA-disease associations. SMALF first uses a stacked autoencoder to learn miRNA latent features and disease latent features from the original miRNA-disease association matrix. Then, SMALF builds a feature vector representing each miRNA-disease pair by integrating miRNA functional similarity, miRNA latent features, disease semantic similarity, and disease latent features. Finally, XGBoost is used to predict unknown miRNA-disease associations. In cross-validation experiments, SMALF achieved the best AUC value compared with other state-of-the-art methods. We also conduct three case studies, on hepatocellular carcinoma, colon cancer, and breast cancer. The results show that 10, 10, and 9 of the top ten predicted miRNAs, respectively, are verified in MNDR v3.0 or miRCancer.

Conclusion: The comprehensive experimental results demonstrate that SMALF is effective in identifying unknown miRNA-disease associations.
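The feature-integration step described in the abstract can be illustrated with a minimal sketch. It assumes that "integrating" means simple concatenation of the four feature blocks, which the abstract does not state explicitly. The function name `pair_feature`, the matrix names, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper or its data.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def pair_feature(
    m: int,
    d: int,
    mirna_sim: Matrix,
    mirna_latent: Matrix,
    disease_sim: Matrix,
    disease_latent: Matrix,
) -> List[float]:
    """Build one feature vector for the pair (miRNA m, disease d).

    Assumption: the four blocks are concatenated in the order listed
    in the abstract (miRNA similarity, miRNA latent, disease
    similarity, disease latent).
    """
    return (
        list(mirna_sim[m])
        + list(mirna_latent[m])
        + list(disease_sim[d])
        + list(disease_latent[d])
    )


# Toy, made-up inputs: 2 miRNAs, 3 diseases, latent dimension 2.
mirna_sim = [[1.0, 0.4], [0.4, 1.0]]            # miRNA functional similarity
mirna_latent = [[0.1, 0.9], [0.7, 0.2]]         # stand-in for autoencoder output
disease_sim = [[1.0, 0.3, 0.0],
               [0.3, 1.0, 0.5],
               [0.0, 0.5, 1.0]]                  # disease semantic similarity
disease_latent = [[0.2, 0.8], [0.6, 0.1], [0.5, 0.5]]

# Feature vector for miRNA 0 paired with disease 2.
features = pair_feature(0, 2, mirna_sim, mirna_latent,
                        disease_sim, disease_latent)
print(len(features))
```

Vectors built this way for known and unknown pairs would then be passed to a classifier; the abstract names XGBoost for that stage, which this sketch does not cover.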

Keywords