Machine learning-based genome-wide interrogation of somatic copy number aberrations in circulating tumor DNA for early detection of hepatocellular carcinoma
Kaishan Tao,
Zhenyuan Bian,
Qiong Zhang,
Xu Guo,
Chun Yin,
Yang Wang,
Kaixiang Zhou,
Shaogui Wan,
Meifang Shi,
Dengke Bao,
Chuhu Yang,
Jinliang Xing
Affiliations
Kaishan Tao
Department of Hepatobiliary Surgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi 710032, China
Zhenyuan Bian
Department of Hepatobiliary Surgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi 710032, China; Department of General Surgery, General Hospital of Shenyang Military Area Command, Shenyang, Liaoning 110016, China
Qiong Zhang
Research and Development Division, Oriomics Biotech, Hangzhou, Zhejiang 310018, China
Xu Guo
State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, Shaanxi 710032, China
Chun Yin
State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, Shaanxi 710032, China
Yang Wang
Department of Hepatobiliary Surgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi 710032, China
Kaixiang Zhou
State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, Shaanxi 710032, China
Shaogui Wan
Center for Molecular Pathology, First Affiliated Hospital, Gannan Medical University, Ganzhou, Jiangxi 341000, China
Meifang Shi
Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital of Fudan University, Shanghai 200032, China; Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Shanghai 200032, China
Dengke Bao
Laboratory of Cancer Biomarkers and Liquid Biopsy, School of Pharmacy, Henan University, Kaifeng 475001, China
Chuhu Yang
Research and Development Division, Oriomics Biotech, Hangzhou, Zhejiang 310018, China; Co-Corresponding author. Chunhu Yang, PhD, Research and Development Division, Oriomics Biotech, Hangzhou, Zhejiang 310018, China. Tel: (86)-18616824021.
Jinliang Xing
State Key Laboratory of Cancer Biology and Department of Physiology and Pathophysiology, Fourth Military Medical University, Xi'an, Shaanxi 710032, China; Corresponding author: Jinliang Xing, MD, PhD, State Key Laboratory of Cancer Biology and Experimental Teaching Center of Basic Medicine, Fourth Military Medical University, Xi'an, Shaanxi 710032, China. Tel: (86)-29-84774764; Fax: (86)-29-84774764.
ABSTRACT:
Background: DNA released from tumor cells into the blood (circulating tumor DNA, ctDNA) carries tumor-specific genomic aberrations, providing a non-invasive means of cancer detection. In this study, we aimed to leverage somatic copy number aberrations (SCNAs) in ctDNA to develop an assay for detecting early-stage hepatocellular carcinoma (HCC).
Methods: We conducted low-depth whole-genome sequencing (WGS) to profile SCNAs in 384 plasma samples from patients with hepatitis B virus (HBV)-related HCC and cancer-free HBV patients, using one discovery cohort and two validation cohorts. To fully capture the robust genome-wide signals in the WGS data, we developed a machine learning-based statistical model focused on detection accuracy in early-stage HCC.
Findings: We built the model using a discovery cohort of 209 patients, achieving an overall area under the curve (AUC) of 0.893, with 0.874 for early-stage HCC (Barcelona Clinic Liver Cancer [BCLC] stage 0-A) and 0.933 for advanced-stage HCC (BCLC stage B-D). The performance of the model was then assessed in two validation cohorts (76 and 99 patients) in which all HCC patients had stage 0-A disease. The model showed robust predictive performance, with AUCs of 0.920 and 0.812 in the two validation cohorts, respectively. Further analyses showed that tumor sample heterogeneity in model training affected the detection of early-stage tumors; a refined model that addressed this heterogeneity in the discovery cohort significantly improved performance in validation.
Interpretation: We developed an SCNA-based, machine learning-driven model for the non-invasive detection of early-stage HCC in HBV patients and demonstrated its performance through strict independent validation.
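As an illustration of how an overall AUC and stage-specific AUCs such as those reported above can be computed, the following minimal sketch uses the rank-based definition of AUC: the probability that a randomly chosen case scores higher than a randomly chosen control. It assumes each stage subgroup is compared against the same set of cancer-free controls, which the abstract does not specify. The function name and all scores are hypothetical and are not data from the study.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: fraction of (case, control) pairs in which the
    case scores higher than the control; ties count as half a win."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores for illustration only (NOT study data).
controls = [0.10, 0.20, 0.35, 0.40]   # cancer-free HBV patients
early = [0.30, 0.45, 0.60]            # BCLC stage 0-A cases
advanced = [0.50, 0.80, 0.90]         # BCLC stage B-D cases

# Assumption: each subgroup is scored against the same controls.
auc_early = auc(early, controls)
auc_advanced = auc(advanced, controls)
auc_overall = auc(early + advanced, controls)

print(f"early: {auc_early:.3f}, advanced: {auc_advanced:.3f}, "
      f"overall: {auc_overall:.3f}")
```

In this toy data the advanced-stage cases separate more cleanly from the controls than the early-stage cases do, so the overall AUC falls between the two subgroup values, the same qualitative pattern as the discovery-cohort figures.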