Accurate computation of survival statistics in genome-wide studies.

Fabio Vandin; Alexandra Papoutsaki; Benjamin J Raphael; Eli Upfal

doi:10.1371/journal.pcbi.1004071

PLoS Computational Biology (May 2015)

Journal volume & issue: Vol. 11, no. 5, p. e1004071

Abstract

A key challenge in genomics is to identify genetic variants that distinguish patients with different survival times following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications, for two reasons: (1) the two populations determined by a genetic variant may have very different sizes; and (2) the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data, where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.
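
The gap between the asymptotic approximation and an exact null distribution can be illustrated on a toy example. The sketch below is not the ExaLT algorithm; it is a brute-force enumeration over all assignments of group labels, which is only feasible for very small samples. The data, the function names, and the choice of a two-sided comparison on the unnormalized observed-minus-expected statistic are all assumptions made for illustration.

```python
from itertools import combinations
from math import erfc, sqrt


def logrank_terms(times, events, in_group):
    """Return (O - E, V): observed minus expected events in the flagged
    group, and the usual hypergeometric variance of that difference."""
    diff = 0.0
    var = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)                               # at risk, both groups
        n1 = sum(1 for i in at_risk if in_group[i])    # at risk, flagged group
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)                                  # events at time t
        d1 = sum(1 for i in dead if in_group[i])       # events in flagged group
        diff += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return diff, var


def asymptotic_p(times, events, in_group):
    """Two-sided p-value from the normal approximation of (O - E) / sqrt(V)."""
    diff, var = logrank_terms(times, events, in_group)
    return erfc(abs(diff) / sqrt(2 * var))


def enumeration_p(times, events, in_group):
    """Two-sided p-value by enumerating every relabeling that keeps the
    flagged group size fixed (assumption: compare |O - E| across labelings)."""
    n, k = len(times), sum(in_group)
    observed = abs(logrank_terms(times, events, in_group)[0])
    hits = total = 0
    for chosen in combinations(range(n), k):
        labels = [i in chosen for i in range(n)]
        stat = abs(logrank_terms(times, events, labels)[0])
        hits += stat >= observed - 1e-12   # tolerance for float round-off
        total += 1
    return hits / total


# Hypothetical toy data: 12 patients, no censoring, and a "mutated"
# group of only 2 patients who have the two shortest survival times.
times = [2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
events = [1] * 12
mutated = [True, True] + [False] * 10

p_asym = asymptotic_p(times, events, mutated)
p_exact = enumeration_p(times, events, mutated)
print(f"asymptotic p = {p_asym:.2e}, enumeration p = {p_exact:.3f}")
```

On this toy input the normal approximation gives a p-value on the order of 2e-4, while the enumeration gives 7/66, about 0.11, since only 66 labelings exist. With populations this unbalanced the approximation can therefore overstate significance by orders of magnitude, which is the effect the abstract describes.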

Published in PLoS Computational Biology

ISSN: 1553-734X (Print); 1553-7358 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Science: Biology (General)
Website: https://journals.plos.org/ploscompbiol/