Frontiers in Oncology (Dec 2022)
Construction of machine learning-based models for cancer outcomes in low and lower-middle income countries: A scoping review
Abstract
Background: The impact and utility of machine learning (ML)-based prediction tools for cancer outcomes, including assistive diagnosis, risk stratification, and adjunctive decision-making, have largely been described and realized in high-income and upper-middle-income countries. However, statistical projections have estimated higher cancer incidence and mortality risks in low and lower-middle-income countries (LLMICs). Therefore, this review aimed to evaluate the utilization, model construction methods, and degree of implementation of ML-based models for cancer outcomes in LLMICs.
Methods: PubMed/Medline, Scopus, and Web of Science databases were searched, and articles describing the use of ML-based models for cancer among local populations in LLMICs between 2002 and 2022 were included. A total of 140 articles from 22,516 citations that met the eligibility criteria were included in this study.
Results: ML-based models from LLMICs were more often based on traditional ML algorithms than on deep or deep hybrid learning. We found that the construction of ML-based models was skewed toward particular LLMICs such as India, Iran, Pakistan, and Egypt, with a paucity of applications in sub-Saharan Africa. Moreover, models for breast, head and neck, and brain cancer outcomes were frequently explored. Many models were deemed suboptimal according to the Prediction model Risk of Bias Assessment tool (PROBAST) due to sample size constraints and technical flaws in ML modeling, even though their performance accuracy ranged from 0.65 to 1.00. While development and internal validation were described for all models included (n=137), only 4.4% (6/137) have been validated in independent cohorts and 0.7% (1/137) have been assessed for clinical impact and efficacy.
Conclusion: Overall, the application of ML for modeling cancer outcomes in LLMICs is increasing. However, model development is largely unsatisfactory. We recommend model retraining using larger sample sizes, intensified external validation practices, and increased impact assessment studies using randomized controlled trial designs.
Systematic review registration: https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=308345, identifier CRD42022308345.
Keywords