Predicting the risk of cancer in adults using supervised machine learning: a scoping review

Asma Abdullah Alfayez; Holger Kunz; Alvina Grace Lai

doi:10.1136/bmjopen-2020-047755

BMJ Open (Sep 2021)

Affiliations

Asma Abdullah Alfayez: Institute of Health Informatics, University College London, London, UK
Holger Kunz: Institute of Health Informatics, University College London, London, UK
Alvina Grace Lai: Institute of Health Informatics, University College London, London, UK

DOI: https://doi.org/10.1136/bmjopen-2020-047755
Journal volume & issue: Vol. 11, no. 9

Abstract

Objectives The purpose of this scoping review is to: (1) identify existing supervised machine learning (ML) approaches to the prediction of cancer in asymptomatic adults; (2) compare the performance of ML models with each other and (3) identify potential gaps in research.

Design Scoping review using the population, concept and context approach.

Search strategy The PubMed search engine was used from inception to 10 November 2020 to identify literature meeting the following inclusion criteria: (1) a general adult (≥18 years) population, either sex, asymptomatic (population); (2) any study using ML techniques to derive predictive models for future cancer risk using clinical and/or demographic and/or basic laboratory data (concept) and (3) original research articles conducted in all settings in any region of the world (context).

Results The search returned 627 unique articles, of which 580 articles were excluded because they did not meet the inclusion criteria, were duplicates or were related to benign neoplasm. Full-text reviews were conducted for 47 articles and a final set of 10 articles was included in this scoping review. These 10 very heterogeneous studies used ML to predict future cancer risk in asymptomatic individuals. All studies reported area under the receiver operating characteristics curve (AUC) values as metrics of model performance, but no study reported measures of model calibration.

Conclusions Research gaps that must be addressed in order to deliver validated ML-based models to assist clinical decision-making include: (1) establishing model generalisability through validation in independent cohorts, including those from low-income and middle-income countries; (2) establishing models for all cancer types; (3) thorough comparisons of ML models with best available clinical tools to ensure transparency of their potential clinical utility; (4) reporting of model calibration performance and (5) comparisons of different methods on the same cohort to reveal important information about model generalisability and performance.
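Illustrative note The distinction between AUC (discrimination) and calibration drawn in the Results and Conclusions can be sketched in a few lines of Python. This is a minimal sketch on invented toy data, not taken from any of the reviewed studies; the function names, the outcome vector `y` and the risk vectors `risk_a` and `risk_b` are all hypothetical. It suggests why AUC alone may be insufficient: a monotone distortion of the predicted risks leaves AUC unchanged while two simple calibration summaries (Brier score and calibration-in-the-large) get worse.

```python
def auc(y_true, y_score):
    """AUC as the probability that a random case scores higher than a
    random non-case (ties count as half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate (0 is ideal)."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


# Hypothetical toy cohort: 1 = developed cancer, 0 = did not.
y = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

# Hypothetical predicted risks from "model A".
risk_a = [0.02, 0.05, 0.05, 0.10, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60]

# "Model B" ranks people identically but inflates every risk
# (an arbitrary monotone distortion chosen for illustration).
risk_b = [r ** 0.25 for r in risk_a]

for name, risk in (("A", risk_a), ("B", risk_b)):
    print(
        name,
        "AUC:", round(auc(y, risk), 4),
        "Brier:", round(brier_score(y, risk), 4),
        "mean predicted - observed:", round(calibration_in_the_large(y, risk), 4),
    )
```

Both hypothetical models rank individuals in the same order, so their AUC values coincide, yet model B systematically overstates absolute risk. Quantities of this kind are what reporting "measures of model calibration" would add alongside AUC.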

Published in BMJ Open

ISSN: 2044-6055 (Online)
Publisher: BMJ Publishing Group
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://bmjopen.bmj.com