CellBiAge: Improved single-cell age classification using data binarization
Doudou Yu,
Manlin Li,
Guanjie Linghu,
Yihuan Hu,
Kaitlyn H. Hajdarovic,
An Wang,
Ritambhara Singh,
Ashley E. Webb
Affiliations
Doudou Yu
Molecular Biology, Cell Biology, and Biochemistry Graduate Program, Brown University, Providence, RI 02912, USA; Data Science Institute, Brown University, Providence, RI 02912, USA
Manlin Li
Data Science Institute, Brown University, Providence, RI 02912, USA
Guanjie Linghu
Data Science Institute, Brown University, Providence, RI 02912, USA
Yihuan Hu
Data Science Institute, Brown University, Providence, RI 02912, USA
Kaitlyn H. Hajdarovic
Neuroscience Graduate Program, Brown University, Providence, RI 02912, USA
An Wang
Department of Applied Mathematics & Statistics, Johns Hopkins University, Baltimore, MD 21218, USA
Ritambhara Singh
Department of Computer Science, Brown University, Providence, RI 02912, USA; Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Corresponding author
Ashley E. Webb
Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912, USA; Center on the Biology of Aging, Brown University, Providence, RI 02912, USA; Carney Institute for Brain Science, Brown University, Providence, RI 02912, USA; Center for Translational Neuroscience, Brown University, Providence, RI 02912, USA; Corresponding author
Summary: Aging is a major risk factor for many diseases. Accurate methods for predicting age in specific cell types are essential to understand the heterogeneity of aging and to assess rejuvenation strategies. However, classifying organismal age at single-cell resolution using transcriptomics is challenging due to sparsity and noise. Here, we developed CellBiAge, a robust and easy-to-implement machine learning pipeline, to classify the age of single cells in the mouse brain using single-cell transcriptomics. We show that binarizing gene expression values for the top highly variable genes significantly improves test performance across different models, techniques, sexes, and brain regions, and we identify potential age-related genes that contribute to model prediction. Additionally, we demonstrate CellBiAge’s ability to capture exercise-induced rejuvenation in neural stem cells. This study provides a broadly applicable approach for robust classification of organismal age of single cells in the mouse brain, which may aid in understanding the aging process and evaluating rejuvenation methods.
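
The binarization step named in the summary can be illustrated with a minimal sketch. This is not the CellBiAge implementation: the function names `top_variable_genes` and `binarize`, the use of plain variance to rank genes, and the "expression above zero" threshold are all assumptions made for illustration, and the toy matrix is invented. The sketch uses only the Python standard library.

```python
from statistics import pvariance


def top_variable_genes(counts, n_top):
    """Return indices of the n_top genes with the highest variance across cells.

    Assumption: plain population variance stands in for whatever
    highly-variable-gene criterion the actual pipeline uses.
    """
    n_genes = len(counts[0])
    variances = [pvariance([cell[g] for cell in counts]) for g in range(n_genes)]
    # Sort gene indices by variance, highest first.
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return ranked[:n_top]


def binarize(counts, gene_idx, threshold=0):
    """Keep only the selected genes and map each value to 1 (above threshold) or 0.

    Assumption: a gene counts as "on" when its expression exceeds zero.
    """
    return [[1 if cell[g] > threshold else 0 for g in gene_idx] for cell in counts]


# Hypothetical toy matrix: 4 cells (rows) x 5 genes (columns).
counts = [
    [0, 3, 0, 1, 0],
    [0, 0, 0, 1, 9],
    [2, 5, 0, 1, 0],
    [0, 0, 0, 1, 7],
]

hvg_idx = top_variable_genes(counts, n_top=3)
binary = binarize(counts, hvg_idx)
```

The resulting `binary` matrix holds one 0/1 feature per selected gene and could then be passed to any classifier trained on age labels. Discarding expression magnitude in this way is one plausible reading of how binarization reduces the effect of sparsity and noise.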