PLoS ONE (Jan 2020)
Principal component analysis of blood microRNA datasets facilitates diagnosis of diverse diseases.
Abstract
Early, ideally pre-symptomatic, recognition of common diseases (e.g., heart disease, cancer, diabetes, Alzheimer's disease) facilitates early treatment or lifestyle modifications, such as diet and exercise. Sensitive, specific identification of diseases using blood samples would facilitate early recognition. We explored the potential of disease identification in high-dimensional blood microRNA (miRNA) datasets using a powerful data reduction method: principal component analysis (PCA). Using Qlucore Omics Explorer (QOE), a dynamic, interactive visualization-guided bioinformatics program with a built-in statistical platform, we analyzed publicly available blood miRNA datasets from the Gene Expression Omnibus (GEO) maintained at the National Center for Biotechnology Information at the National Institutes of Health (NIH). The miRNA expression profiles were generated from real-time PCR arrays, microarrays or next-generation sequencing of biologic materials (e.g., blood, serum or blood components such as platelets). PCA identified the top three principal components that distinguished cohorts of patients with specific diseases (e.g., heart disease, stroke, hypertension, sepsis, diabetes, specific types of cancer, HIV, hemophilia, subtypes of meningitis, multiple sclerosis, amyotrophic lateral sclerosis, Alzheimer's disease, mild cognitive impairment, aging, and autism) from healthy subjects. Literature searches verified the functional relevance of the discriminating miRNAs. Our goal is to assemble PCA and heatmap analyses of existing and future blood miRNA datasets into a clinical reference database to facilitate the diagnosis of diseases using routine blood draws.
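To illustrate the kind of data reduction described in the abstract, the following is a minimal sketch of projecting a samples-by-miRNA expression matrix onto its top three principal components. It is not the QOE workflow used in the study. The data are synthetic stand-ins rather than GEO datasets, the function name `top_principal_components` is hypothetical, and the per-miRNA standardization, cohort sizes, and the number and magnitude of shifted miRNAs are assumptions chosen only for illustration.

```python
import numpy as np


def top_principal_components(expression, n_components=3):
    """Project samples (rows) onto the top principal components.

    expression: 2-D array, one row per sample, one column per miRNA.
    Returns (scores, explained_variance_ratio) for the leading components.
    """
    # Center each miRNA and scale it to unit variance (an assumed
    # normalization; the study's own preprocessing may differ).
    centered = expression - expression.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    scaled = centered / std

    # PCA via singular value decomposition of the scaled matrix.
    u, s, _vt = np.linalg.svd(scaled, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Synthetic example: 30 "healthy" and 30 "disease" samples, 100 miRNAs.
rng = np.random.default_rng(0)
n_per_group, n_mirnas = 30, 100
healthy = rng.normal(0.0, 1.0, size=(n_per_group, n_mirnas))
disease = rng.normal(0.0, 1.0, size=(n_per_group, n_mirnas))
disease[:, :20] += 3.0  # hypothetical: 20 miRNAs shifted in the disease cohort

expression = np.vstack([healthy, disease])
labels = np.array([0] * n_per_group + [1] * n_per_group)

scores, explained = top_principal_components(expression, n_components=3)

# Compare the two cohorts along the first principal component.
pc1_healthy = scores[labels == 0, 0]
pc1_disease = scores[labels == 1, 0]
print("Explained variance ratio (PC1-PC3):", explained)
print("Mean PC1 score, healthy:", pc1_healthy.mean())
print("Mean PC1 score, disease:", pc1_disease.mean())
```

In this toy setting the two cohorts separate along the first component because a block of miRNAs was shifted by construction. Real datasets would need platform-appropriate normalization, and separation in a PCA plot does not by itself establish diagnostic accuracy.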