Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Kerry E. Poppenberg; Vincent M. Tutino; Lu Li; Muhammad Waqas; Armond June; Lee Chaves; Kaiyu Jiang; James N. Jarvis; Yijun Sun; Kenneth V. Snyder; Elad I. Levy; Adnan H. Siddiqui; John Kolega; Hui Meng

doi:10.1186/s12967-020-02550-2

Journal of Translational Medicine (Oct 2020)

Affiliations

Kerry E. Poppenberg: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center
Vincent M. Tutino: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center
Lu Li: Department of Computer Science and Engineering, University at Buffalo
Muhammad Waqas: Department of Neurosurgery, Jacobs School of Medicine and Biomedical Sciences
Armond June: Department of Pathology and Anatomical Sciences, Jacobs School of Medicine and Biomedical Sciences
Lee Chaves: Department of Internal Medicine, Jacobs School of Medicine and Biomedical Sciences
Kaiyu Jiang: Genetics, Genomics, and Bioinformatics Program, Jacobs School of Medicine and Biomedical Sciences
James N. Jarvis: Genetics, Genomics, and Bioinformatics Program, Jacobs School of Medicine and Biomedical Sciences
Yijun Sun: Genetics, Genomics, and Bioinformatics Program, Jacobs School of Medicine and Biomedical Sciences
Kenneth V. Snyder: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center
Elad I. Levy: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center
Adnan H. Siddiqui: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center
John Kolega: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center
Hui Meng: Canon Stroke and Vascular Research Center, Clinical and Translational Research Center

DOI: https://doi.org/10.1186/s12967-020-02550-2
Journal volume & issue: Vol. 18, no. 1
pp. 1 – 19

Abstract

Background: Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs, and trained machine learning models to predict the presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods.

Methods: Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models via 4 well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance by receiver-operating-characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction.

Results: Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. The testing performance across all methods had an average area under the ROC curve (AUC) of 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and IPA network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. In our data, demographics and comorbidities did not affect model performance.

Conclusions: We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and to further investigate the effect of covariates.
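The evaluation scheme described in the Methods (a random 94/40 split of the 134 samples, followed by ROC-based assessment on the held-out cohort) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the helper names `split_cohort`, `roc_auc`, and `accuracy` are hypothetical, the split is a plain unstratified shuffle (the abstract does not say whether the authors stratified), and the classifier scores are synthetic numbers, not study data. Only the cohort sizes come from the abstract.

```python
import random


def split_cohort(n_samples, n_train, seed=0):
    """Randomly split sample indices into training and testing cohorts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, y_true)) / len(y_true)


# Cohort sizes taken from the abstract: 55 IA patients (1), 79 controls (0).
labels = [1] * 55 + [0] * 79
train_idx, test_idx = split_cohort(len(labels), n_train=94, seed=42)

# Synthetic stand-in for a trained model's scores on the testing cohort.
# These values are made up for illustration and are NOT study results.
rng = random.Random(7)
test_labels = [labels[i] for i in test_idx]
test_scores = [min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.15)))
               for y in test_labels]

print("training samples:", len(train_idx), "testing samples:", len(test_idx))
print("test AUC:", round(roc_auc(test_labels, test_scores), 3))
print("test accuracy:", round(accuracy(test_labels, test_scores), 3))
```

The pairwise definition of AUC used here is equivalent to the area under the empirical ROC curve and keeps the sketch free of third-party dependencies. In practice a library routine would typically be used for both the split and the metric.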

Published in Journal of Translational Medicine

ISSN: 1479-5876 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://translational-medicine.biomedcentral.com/