Scientific Reports (Oct 2024)
Development of a quantitative prediction algorithm for human cord blood-derived CD34+ hematopoietic stem-progenitor cells using parametric and non-parametric machine learning models
Abstract
The transplantation of CD34+ hematopoietic stem-progenitor cells (HSPCs) derived from cord blood is the standard treatment for selected hematological, oncological, metabolic, and immunodeficiency disorders, and the cell dose is pivotal to the clinical outcome. Using numerous maternal and neonatal parameters, we evaluated how well parametric and non-parametric algorithms predict the proportion of CD34+ cells in the final cryopreserved cord blood product. Twenty-four predictor variables associated with the processing of 802 cord blood units randomly sampled in 2020–2022 were retrieved and analyzed. Prediction models were developed using a parametric model (multivariate linear regression) and two non-parametric models (random forest and back propagation neural network) to investigate the data patterns that determine the single outcome (i.e., the proportion of CD34+ cells). The multivariate linear regression model produced the lowest root-mean-square deviation (0.0982). However, the model created by the back propagation neural network produced the highest median absolute deviation (0.0689) and predictive power (56.99%) in comparison to the random forest and multivariate linear regression models. A predictive model based on a combination of continuous and discrete maternal and neonatal parameters associated with cord blood processing can predict the CD34+ dose in the final product for clinical use. The back propagation neural network algorithm produced the model with the highest predictive power, which could be widely applied to assist cell banks in selecting optimal cord blood units and so maximize the chance of transplantation success.
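The two error metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes that the root-mean-square deviation is the square root of the mean squared prediction error and that the median absolute deviation is the median of the absolute prediction errors; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import median


def rmsd(observed, predicted):
    """Root-mean-square deviation between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def median_absolute_error(observed, predicted):
    """Median of absolute prediction errors (assumed reading of the
    abstract's "median absolute deviation")."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical CD34+ proportions, for illustration only (not study data).
observed_cd34 = [0.30, 0.45, 0.20, 0.60]
predicted_cd34 = [0.35, 0.40, 0.30, 0.55]

print("RMSD:", rmsd(observed_cd34, predicted_cd34))
print("Median absolute error:", median_absolute_error(observed_cd34, predicted_cd34))
```

Because the RMSD squares each error, it weights large individual errors more heavily than the median-based metric does, which is one reason two models can rank differently on the two measures.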
Keywords