An advanced machine learning method for simultaneous breast cancer risk prediction and risk ranking in Chinese population: A prospective cohort and modeling study

Liyuan Liu; Yong He; Chunyu Kao; Yeye Fan; Fu Yang; Fei Wang; Lixiang Yu; Fei Zhou; Yujuan Xiang; Shuya Huang; Chao Zheng; Han Cai; Heling Bao; Liwen Fang; Linhong Wang; Zengjing Chen; Zhigang Yu; Yuanyuan Ji

doi:10.1097/CM9.0000000000002891

Chinese Medical Journal (Sep 2024)

Affiliations

Liyuan Liu: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Yong He: 2 School of Mathematics, Shandong University, Jinan, Shandong 250100, China
Chunyu Kao: 3 Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, China
Yeye Fan: 2 School of Mathematics, Shandong University, Jinan, Shandong 250100, China
Fu Yang: 3 Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, China
Fei Wang: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Lixiang Yu: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Fei Zhou: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Yujuan Xiang: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Shuya Huang: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Chao Zheng: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Han Cai: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Heling Bao: 5 Department of Maternal and Child Health, School of Public Health, Peking University, Haidian District, Beijing 100191, China
Liwen Fang: 6 National Center for Chronic and Non-communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100050, China
Linhong Wang: 6 National Center for Chronic and Non-communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100050, China
Zengjing Chen: 2 School of Mathematics, Shandong University, Jinan, Shandong 250100, China
Zhigang Yu: 1 Department of Breast Surgery, The Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
Yuanyuan Ji

DOI: https://doi.org/10.1097/CM9.0000000000002891
Journal volume & issue: Vol. 137, no. 17
pp. 2084 – 2091

Abstract

Background: Breast cancer (BC) risk-stratification tools for Asian women that are both highly accurate and readily interpretable are lacking. We aimed to develop risk-stratification models to predict long- and short-term BC risk among Chinese women and to simultaneously rank potential non-experimental risk factors.

Methods: The Breast Cancer Cohort Study in Chinese Women, a large ongoing prospective dynamic cohort study, includes 122,058 women aged 25–70 years from the eastern part of China. Using parametric machine-learning methods (penalized logistic regression, bootstrap, and ensemble learning), we developed risk prediction models to estimate BC risk: the short-term ensemble penalized logistic regression (EPLR) risk prediction model and the ensemble penalized long-term (EPLT) risk prediction model. The models were assessed for calibration and discrimination and were then externally validated in new study participants enrolled from 2017 to 2020.

Results: The area under the receiver operating characteristic curve (AUC) of the short-term EPLR risk prediction model was 0.800 in the internal validation set and 0.751 in the external validation set. For the long-term EPLT risk prediction model, the AUC was 0.692 and 0.760 in internal and external validation, respectively. In external validation, the net reclassification improvement (NRI) index of the EPLT model relative to the Gail model and the Han Chinese Breast Cancer Prediction Model (HCBCP) was 0.193 and 0.233, respectively, indicating that the EPLT model has higher classification accuracy.

Conclusions: We developed the EPLR and EPLT models to screen populations with a high risk of developing BC. These models can serve as useful tools to aid in risk-stratified screening and BC prevention.
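For readers unfamiliar with the two metrics reported above, the area under the receiver operating characteristic curve (AUC) and the net reclassification improvement (NRI), the following is a minimal sketch of how they are commonly computed. It is not the authors' code. The function names and the toy risk scores are hypothetical, and the abstract does not state which NRI variant was used, so the continuous (category-free) form is assumed here.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs count as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def continuous_nri(old: Sequence[float], new: Sequence[float],
                   labels: Sequence[int]) -> float:
    """Category-free NRI of the `new` model relative to the `old` one.

    NRI = [P(up | case) - P(down | case)]
        + [P(down | non-case) - P(up | non-case)],
    where "up"/"down" mean the new predicted risk is higher/lower
    than the old predicted risk for the same person.
    """
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = n_ctrl = 0
    for o, w, y in zip(old, new, labels):
        if y == 1:
            n_case += 1
            up_case += w > o
            down_case += w < o
        else:
            n_ctrl += 1
            up_ctrl += w > o
            down_ctrl += w < o
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one non-case")
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl


if __name__ == "__main__":
    # Made-up toy data for illustration only (not from the cohort).
    labels = [1, 1, 1, 0, 0, 0, 0]
    reference_risk = [0.30, 0.20, 0.10, 0.25, 0.15, 0.10, 0.05]
    candidate_risk = [0.40, 0.15, 0.20, 0.20, 0.10, 0.12, 0.05]
    print("AUC (reference):", round(auc(reference_risk, labels), 3))
    print("AUC (candidate):", round(auc(candidate_risk, labels), 3))
    print("NRI:", round(continuous_nri(reference_risk, candidate_risk, labels), 3))
```

A positive NRI means the candidate model tends to move predicted risk up for women who developed BC and down for those who did not, which is the sense in which the abstract interprets the values 0.193 and 0.233.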

Published in Chinese Medical Journal

ISSN: 0366-6999 (Print); 2542-5641 (Online)
Publisher: Wolters Kluwer
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://journals.lww.com/cmj/pages/default.aspx
