Cancer Management and Research (Apr 2021)

Machine Learning Models to Improve the Differentiation Between Benign and Malignant Breast Lesions on Ultrasound: A Multicenter External Validation Study

  • Huo L,
  • Tan Y,
  • Wang S,
  • Geng C,
  • Li Y,
  • Ma X,
  • Wang B,
  • He Y,
  • Yao C,
  • Ouyang T

Journal volume & issue
Vol. 13
pp. 3367 – 3379

Abstract

Ling Huo,1,* Yao Tan,2,* Shu Wang,3 Cuizhi Geng,4 Yi Li,5 XiangJun Ma,6 Bin Wang,2 YingJian He,1 Chen Yao,2,7 Tao Ouyang1

1Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Breast Center, Peking University Cancer Hospital & Institute, Beijing, People’s Republic of China
2Department of Biostatistics, Peking University First Hospital, Beijing, People’s Republic of China
3Department of Breast Center, Peking University People’s Hospital, Beijing, People’s Republic of China
4The Fourth Hospital of Hebei Medical University, Shijiazhuang, People’s Republic of China
5Shunyi District Health Care Hospital for Women and Children of Beijing, Beijing, People’s Republic of China
6Haidian Maternal and Child Health Hospital, Beijing, People’s Republic of China
7Peking University Clinical Research Institute, Peking University Health Science Center, Beijing, People’s Republic of China
*These authors contributed equally to this work

Correspondence:
Chen Yao, Peking University First Hospital, Xicheng District, Beijing, 100034, People’s Republic of China. Tel +86 18610640562. Email [email protected]
Tao Ouyang, Peking University Cancer Hospital & Institute, Haidian District, Beijing, 100142, People’s Republic of China. Tel +86 010 88121122. Email [email protected]

Purpose: This study aimed to establish and evaluate the usefulness of a simple, practical, and easy-to-promote machine learning model based on ultrasound imaging features for diagnosing breast cancer (BC).

Materials and Methods: Logistic regression, random forest, extra trees, support vector machine, multilayer perceptron, and XGBoost models were developed. The modeling data set of 1345 cases came from a tertiary class A hospital in China. The external validation data set of 1965 cases came from 3 tertiary class A hospitals and 2 primary hospitals. The area under the receiver operating characteristic curve (AUC) was used as the main evaluation index, and pathological biopsy was used as the gold standard for evaluating each model. Diagnostic capability was also compared with that of clinicians.

Results: Among the six models, the logistic model showed superior diagnostic efficiency, with AUCs of 0.771 and 0.906 and Brier scores of 0.181 and 0.165 in the test and validation sets, respectively. In the validation set, the AUCs of clinician diagnosis and the logistic model were 0.913 and 0.906, respectively. The corresponding AUCs were 0.915 and 0.915 in the tertiary class A hospitals and 0.894 and 0.873 in the primary hospitals.

Conclusion: The externally validated logistic model can be used to distinguish between malignant and benign breast lesions in ultrasound images. Compared with clinician diagnosis, the logistic model has better diagnostic efficiency, making it potentially useful to assist in screening, particularly in lower level medical institutions.

Trial Registration: http://www.clinicaltrials.gov. ClinicalTrials.gov ID: NCT03080623.

Keywords: breast cancer, machine learning, diagnostic accuracy, patient stratification, screening modalities, ultrasound imaging
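For readers unfamiliar with the two evaluation indices named in the abstract, the following is a minimal sketch of how an AUC and a Brier score can be computed from predicted malignancy probabilities. It is not the authors' code: the function names `auc` and `brier_score` and the eight toy cases are hypothetical and chosen only for illustration. The AUC is computed here through its pairwise (Mann-Whitney) interpretation, that is, the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign one.

```python
def auc(labels, scores):
    """AUC via the pairwise (Mann-Whitney) formulation.

    labels: 1 = malignant, 0 = benign (hypothetical coding).
    scores: predicted probability of malignancy for each case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    # A malignant case scored above a benign case counts 1; a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Toy data, invented for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

print("AUC:", auc(labels, probs))
print("Brier score:", round(brier_score(labels, probs), 3))
```

A higher AUC indicates better discrimination between malignant and benign lesions, while a lower Brier score indicates that the predicted probabilities lie closer to the observed outcomes.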

Keywords