PLoS Computational Biology (Jul 2014)

Inferring clonal composition from multiple sections of a breast cancer.

  • Habil Zare,
  • Junfeng Wang,
  • Alex Hu,
  • Kris Weber,
  • Josh Smith,
  • Debbie Nickerson,
  • ChaoZhong Song,
  • Daniela Witten,
  • C Anthony Blau,
  • William Stafford Noble

DOI
https://doi.org/10.1371/journal.pcbi.1003703
Journal volume & issue
Vol. 10, no. 7
p. e1003703

Abstract

Cancers arise from successive rounds of mutation and selection, generating clonal populations that vary in size, mutational content and drug responsiveness. Ascertaining the clonal composition of a tumor is therefore important both for prognosis and therapy. Mutation counts and frequencies resulting from next-generation sequencing (NGS) potentially reflect a tumor's clonal composition; however, deconvolving NGS data to infer a tumor's clonal structure presents a major challenge. We propose a generative model for NGS data derived from multiple subsections of a single tumor, and we describe an expectation-maximization procedure for estimating the clonal genotypes and relative frequencies using this model. We demonstrate, via simulation, the validity of the approach, and then use our algorithm to assess the clonal composition of a primary breast cancer and associated metastatic lymph node. After dividing the tumor into subsections, we perform exome sequencing for each subsection to assess mutational content, followed by deep sequencing to precisely count normal and variant alleles within each subsection. By quantifying the frequencies of 17 somatic variants, we demonstrate that our algorithm predicts clonal relationships that are both phylogenetically and spatially plausible. Applying this method to larger numbers of tumors should cast light on the clonal evolution of cancers in space and time.
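
As a rough illustration of what a generative model for multi-section allele counts might look like, the following minimal sketch scores a candidate clonal structure against observed read counts. It is not the authors' model: it assumes every variant is heterozygous in a diploid region, so the expected variant allele fraction is half the summed frequency of the clones carrying the variant, and it assumes read counts are binomially distributed. The function names (`expected_vaf`, `binom_loglik`) and all numbers are hypothetical. An expectation-maximization procedure of the kind described in the abstract would search over genotypes and frequencies to increase a likelihood like this one.

```python
import math

def expected_vaf(genotypes, freqs):
    """Expected variant allele fraction per mutation and tumor section.

    genotypes[m][c] is 1 if clone c carries mutation m, else 0.
    freqs[c][s] is the fraction of cells in section s that belong to clone c
    (normal cells make up the remainder).
    Assumption: heterozygous variants in diploid regions, hence the 0.5 factor.
    """
    n_clones = len(freqs)
    n_sections = len(freqs[0])
    return [
        [0.5 * sum(row[c] * freqs[c][s] for c in range(n_clones))
         for s in range(n_sections)]
        for row in genotypes
    ]

def binom_loglik(variant, depth, vaf, eps=1e-9):
    """Binomial log-likelihood of variant read counts given expected fractions."""
    total = 0.0
    for m, row in enumerate(variant):
        for s, k in enumerate(row):
            n = depth[m][s]
            p = min(max(vaf[m][s], eps), 1.0 - eps)  # keep log() finite
            log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                          - math.lgamma(n - k + 1))
            total += log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

# Hypothetical example: 2 mutations, 2 clones, 2 sections.
genotypes = [[1, 1],   # mutation 0 is present in both clones (ancestral)
             [0, 1]]   # mutation 1 is present only in clone 1 (descendant)
freqs = [[0.6, 0.2],   # clone 0 fraction in sections 0 and 1
         [0.2, 0.6]]   # clone 1 fraction in sections 0 and 1
vaf = expected_vaf(genotypes, freqs)

# Made-up deep-sequencing counts at depth 1000 per site.
variant = [[400, 400], [100, 300]]
depth = [[1000, 1000], [1000, 1000]]

ll_good = binom_loglik(variant, depth, vaf)
# A competing structure with the two mutations' genotypes swapped.
ll_swapped = binom_loglik(variant, depth,
                          expected_vaf([[0, 1], [1, 1]], freqs))
```

In this toy setting the structure that generated the counts scores a higher log-likelihood than the swapped alternative, which is the kind of comparison a likelihood-based inference procedure relies on.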