Inference of chromosome selection parameters and missegregation rate in cancer from DNA-sequencing data

Zijin Xiang; Zhihan Liu; Khanh N. Dinh

doi:10.1038/s41598-024-67842-9

Scientific Reports (Jul 2024)

Affiliations

Zijin Xiang: Irving Institute for Cancer Dynamics and Department of Statistics, Columbia University
Zhihan Liu: Irving Institute for Cancer Dynamics and Department of Statistics, Columbia University
Khanh N. Dinh: Irving Institute for Cancer Dynamics and Department of Statistics, Columbia University

Journal volume & issue: Vol. 14, no. 1, pp. 1–12

Abstract

Aneuploidy is frequently observed in cancers and has been linked to poor patient outcome. Analysis of aneuploidy in DNA-sequencing (DNA-seq) data necessitates untangling the effects of the Copy Number Aberration (CNA) occurrence rates and the selection coefficients that act upon the resulting karyotypes. We introduce a parameter inference algorithm that takes advantage of both bulk and single-cell DNA-seq cohorts. The method is based on Approximate Bayesian Computation (ABC) and utilizes CINner, our recently introduced simulation algorithm of chromosomal instability in cancer. We examine three groups of statistics to summarize the data in the ABC routine: (A) Copy Number-based measures, (B) phylogeny tip statistics, and (C) phylogeny balance indices. Using these statistics, our method can recover both the CNA probabilities and selection parameters from ground truth data, and performs well even for data cohorts of relatively small sizes. We find that only statistics in groups A and C are well-suited for identifying CNA probabilities, and only group A carries the signals for estimating selection parameters. Moreover, the low number of CNA events at large scale compared to cell counts in single-cell samples means that statistics in group B cannot be estimated accurately using phylogeny reconstruction algorithms at the chromosome level. As data from both bulk and single-cell DNA-sequencing techniques becomes increasingly available, our inference framework promises to facilitate the analysis of distinct cancer types, differentiation between selection and neutral drift, and prediction of cancer clonal dynamics.
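
The general idea of ABC mentioned in the abstract can be illustrated with a minimal rejection-sampling sketch. This is not the authors' algorithm and does not use CINner: the simulator `simulate_cohort`, the two summary statistics, the uniform prior on [0, 0.1], and all parameter values are hypothetical stand-ins chosen only to show the pattern of drawing parameters from a prior, simulating data, and keeping the draws whose summary statistics lie closest to the observed ones.

```python
import numpy as np

def simulate_cohort(p_misseg, n_samples, n_divisions, rng):
    # Toy stand-in for a real simulator (assumption, not CINner's API):
    # each sample accumulates missegregation events over n_divisions,
    # each division missegregating with probability p_misseg.
    return rng.binomial(n_divisions, p_misseg, size=n_samples)

def summary(counts):
    # Hypothetical summary statistics: mean and standard deviation
    # of event counts across the cohort.
    return np.array([counts.mean(), counts.std()])

def abc_rejection(observed, n_draws, keep_frac, n_divisions, rng):
    s_obs = summary(observed)
    # Assumed prior: missegregation probability uniform on [0, 0.1].
    draws = rng.uniform(0.0, 0.1, size=n_draws)
    dist = np.empty(n_draws)
    for i, p in enumerate(draws):
        sim = simulate_cohort(p, observed.size, n_divisions, rng)
        dist[i] = np.linalg.norm(summary(sim) - s_obs)
    # Keep the fraction of draws with the smallest distance.
    n_keep = max(1, int(keep_frac * n_draws))
    keep = np.argsort(dist)[:n_keep]
    return draws[keep]

rng = np.random.default_rng(0)
true_p = 0.03  # ground-truth value used to generate synthetic "observed" data
observed = simulate_cohort(true_p, 200, 100, rng)
posterior = abc_rejection(observed, 2000, 0.05, 100, rng)
posterior_mean = float(posterior.mean())
print(f"approximate posterior mean: {posterior_mean:.4f}")
```

In this toy setting the retained draws cluster around the value used to generate the synthetic data, which mirrors the ground-truth recovery test described in the abstract in a highly simplified form.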

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
