Predicting Astrocytic Nuclear Morphology with Machine Learning: A Tree Ensemble Classifier Study

Piercesare Grimaldi; Martina Lorenzati; Marta Ribodino; Elena Signorino; Annalisa Buffo; Paola Berchialla

doi:10.3390/app13074289

Applied Sciences (Mar 2023)

Affiliations

Piercesare Grimaldi: Department of Public Health and Pediatrics, University of Torino, Via Santena 5 bis, 10126 Torino, Italy
Martina Lorenzati: Department of Neuroscience Rita Levi-Montalcini, University of Torino, Via Cherasco 15, 10126 Torino, Italy
Marta Ribodino: Department of Neuroscience Rita Levi-Montalcini, University of Torino, Via Cherasco 15, 10126 Torino, Italy
Elena Signorino: Department of Neuroscience Rita Levi-Montalcini, University of Torino, Via Cherasco 15, 10126 Torino, Italy
Annalisa Buffo: Department of Neuroscience Rita Levi-Montalcini, University of Torino, Via Cherasco 15, 10126 Torino, Italy
Paola Berchialla: Center for Biostatistics, Epidemiology and Public Health, Department of Clinical and Biological Sciences, University of Torino, Regione Gonzole 43, 10043 Orbassano, Italy

DOI: https://doi.org/10.3390/app13074289
Journal volume & issue: Vol. 13, no. 7
p. 4289

Abstract

Machine learning is usually associated with big data; however, experimental or clinical data are often limited in size. The aim of this study was to describe how supervised machine learning can be used to classify astrocytes from a small sample into different morphological classes. Our dataset was composed of only 193 cells, with unbalanced morphological classes and missing observations. We combined classification trees and ensemble algorithms (boosting and bagging) with undersampling to classify the nuclear morphology (homogeneous, dotted, wrinkled, forming crumples, and forming micronuclei) of astrocytes stained with anti-LMNB1 antibody. Accuracy, sensitivity (recall), specificity, and F1 score were assessed with bootstrapping, leave-one-out cross-validation (LOOCV), and stratified cross-validation. We found that our algorithm performed at rates above chance in predicting the morphological classes of astrocytes based on the nuclear expression of LMNB1. Boosting algorithms (tree ensemble) yielded better classifications than bagging ones (tree bagger). Moreover, leave-one-out and bootstrapping yielded better predictions than the more commonly used k-fold cross-validation. Finally, we could identify four important predictors: the intensity of LMNB1 expression, nuclear area, cellular area, and soma area. Our results show that a tree ensemble can be optimized to classify morphological data from a small sample, even in the presence of highly unbalanced classes and numerous missing data.
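As a rough illustration of two ingredients named in the abstract, the sketch below shows random undersampling of the majority classes and the one-vs-rest computation of accuracy, sensitivity, specificity, and F1 score. It is not the authors' code: the function names `undersample` and `one_vs_rest_metrics`, the toy labels, and the toy predictions are all assumptions made up for the example, and the tree ensemble itself is left out.

```python
import random
from collections import defaultdict


def undersample(labels, seed=0):
    """Return sorted indices keeping an equal number of samples per class.

    Every class is randomly reduced to the size of the rarest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity (recall), specificity, and F1 for one class vs. the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Guard against empty denominators (e.g. a class that never occurs).
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy labels (made up, not the study's data): 6 homogeneous, 2 dotted, 2 wrinkled.
toy_labels = ["homogeneous"] * 6 + ["dotted"] * 2 + ["wrinkled"] * 2
balanced_idx = undersample(toy_labels)
balanced_labels = [toy_labels[i] for i in balanced_idx]

# Hypothetical predictions for the ten toy cells.
toy_pred = (["homogeneous"] * 5 + ["dotted"]
            + ["dotted", "homogeneous"] + ["wrinkled"] * 2)
dotted_metrics = one_vs_rest_metrics(toy_labels, toy_pred, "dotted")
```

In a multi-class setting such as the five morphological classes described above, these per-class metrics would typically be computed once per class and then averaged. Undersampling should be applied to the training portion of each resampling split only, so that the held-out cells keep their original class proportions.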

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
