Identification of Medium-Sized Copy Number Alterations in Whole-Genome Sequencing

Hatice Gulcin Ozer; Aisulu Usubalieva; Adrienne Dorrance; Ayse Selen Yilmaz; Michael Caligiuri; Guido Marcucci; Kun Huang

doi:10.4137/CIN.S14023

Cancer Informatics (Jan 2014)

Affiliations

Hatice Gulcin Ozer: Department of Biomedical Informatics
Aisulu Usubalieva: Department of Biomedical Informatics
Adrienne Dorrance: Division of Hematology, Department of Medicine, The Ohio State University, Columbus, OH, USA.
Ayse Selen Yilmaz: Department of Biomedical Informatics
Michael Caligiuri: Division of Hematology, Department of Medicine, The Ohio State University, Columbus, OH, USA.
Guido Marcucci: Division of Hematology, Department of Medicine, The Ohio State University, Columbus, OH, USA.
Kun Huang: Department of Biomedical Informatics

Journal volume & issue: Vol. 13s3

Abstract

Genome-wide discoveries, such as the detection of copy number alterations (CNAs) from high-throughput whole-genome sequencing data, have enabled new developments in personalized medicine. CNAs have been reported to be associated with various diseases and cancers, including acute myeloid leukemia. However, current CNA detection tools face multiple challenges that lead to high false-positive rates and thus impede their widespread use in cancer research. In this paper, we discuss these issues and propose possible solutions. First, the entire genome cannot be mapped because some regions lack sequence uniqueness, and current methods cannot be appropriately adjusted to handle these regions in the analyses. Detection of medium-sized CNAs is therefore directly affected by these mappability problems. Second, the requirement for matching control samples is an important limitation, because acquiring matching controls might not be possible or cost-efficient. Here we present an approach that addresses these issues and detects medium-sized CNAs in cancer genomes by (1) masking unmappable regions during the initial CNA detection phase, (2) using a pool of a few normal samples as the control, and (3) employing median filtering to adjust CNA ratios to their surrounding coverage and eliminate false positives.
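
The three steps listed in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give bin sizes, normalization, window length, or calling thresholds, so the function names (pool_controls, masked_log2_ratios, median_filter), the window of five bins, the 0.4 log2 threshold, and the coverage values below are all assumptions chosen for illustration. A plain sliding median stands in for whatever filtering the paper actually uses.

```python
import math
from statistics import median


def pool_controls(normals, mappable):
    """Step 2 (assumed form): average several normal samples into one control.
    Each sample is first scaled by its mean coverage over mappable bins."""
    pooled = [0.0] * len(mappable)
    for cov in normals:
        usable = [c for c, ok in zip(cov, mappable) if ok]
        scale = sum(usable) / len(usable)
        for i, c in enumerate(cov):
            pooled[i] += c / scale / len(normals)
    return pooled


def masked_log2_ratios(tumor, pooled, mappable):
    """Step 1 (assumed form): log2(tumor / control) per bin.
    Unmappable or empty bins are masked as None instead of being scored."""
    scale = median(t for t, ok in zip(tumor, mappable) if ok)
    ratios = []
    for t, p, ok in zip(tumor, pooled, mappable):
        if ok and t > 0 and p > 0:
            ratios.append(math.log2((t / scale) / p))
        else:
            ratios.append(None)
    return ratios


def median_filter(ratios, half_window=2):
    """Step 3 (assumed form): replace each ratio with the median of its
    unmasked neighbours, which suppresses isolated single-bin spikes."""
    smoothed = []
    for i, r in enumerate(ratios):
        if r is None:
            smoothed.append(None)
            continue
        lo = max(0, i - half_window)
        hi = min(len(ratios), i + half_window + 1)
        smoothed.append(median(x for x in ratios[lo:hi] if x is not None))
    return smoothed


# Hypothetical coverage for 20 bins; bin 17 is treated as unmappable.
mappable = [i != 17 for i in range(20)]
normals = [[100] * 20, [50] * 20]   # two normals at different depths
tumor = [100] * 20
tumor[3] = 300                      # isolated spike (likely artifact)
tumor[8:13] = [200] * 5             # five-bin gain
tumor[17] = 900                     # pile-up inside the masked region

pooled = pool_controls(normals, mappable)
ratios = masked_log2_ratios(tumor, pooled, mappable)
smoothed = median_filter(ratios, half_window=2)

threshold = 0.4                     # assumed log2-ratio cutoff for a gain
gains = [i for i, r in enumerate(smoothed) if r is not None and r >= threshold]
print(gains)                        # [8, 9, 10, 11, 12]
```

In this toy input the single-bin spike at bin 3 and the pile-up in the masked bin 17 are both discarded, while the five-bin gain survives the filter. A real pipeline would work on binned read counts from aligned reads and a mappability track rather than hand-written lists.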

Published in Cancer Informatics

ISSN: 1176-9351 (Online)
Publisher: SAGE Publishing
Country of publisher: United Kingdom
LCC subjects: Medicine: Internal medicine: Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Website: https://journals.sagepub.com/home/cix
