Current composite-feature classification methods do not outperform simple single-genes classifiers in breast cancer prognosis

Christine Staiger; Sidney Cadot; Balázs Györffy; Lodewyk FA Wessels; Gunnar W Klau

doi:10.3389/fgene.2013.00289

Frontiers in Genetics (Dec 2013)

Affiliations

Christine Staiger: Centrum Wiskunde & Informatica
Christine Staiger: The Netherlands Cancer Institute
Sidney Cadot: The Netherlands Cancer Institute
Balázs Györffy: Hungarian Academy of Sciences
Lodewyk FA Wessels: The Netherlands Cancer Institute
Lodewyk FA Wessels: TU Delft
Gunnar W Klau: Centrum Wiskunde & Informatica
Gunnar W Klau: VU University Amsterdam

DOI: https://doi.org/10.3389/fgene.2013.00289
Journal volume & issue: Vol. 4

Abstract

Integrating gene expression data with secondary data such as pathway or protein-protein interaction (PPI) data has been proposed as a promising approach for improved outcome prediction of cancer patients. Methods employing this approach usually aggregate the expression of genes into new composite features, with the secondary data guiding the aggregation. Previous studies were limited to a few data sets with small numbers of patients. Moreover, each study used different data and evaluation procedures, which makes it difficult to objectively assess the gain in classification performance. Here we introduce the Amsterdam Classification Evaluation Suite (ACES). ACES is a Python package for objectively evaluating classification and feature-selection methods, and it contains methods for pooling and normalizing Affymetrix microarrays from different studies. It is simple to use and therefore facilitates the comparison of new approaches with best-in-class approaches. In addition to the methods described in our earlier study (Staiger et al., 2012, PLoS One 7(4): e34796), we have included two prominent prognostic gene signatures specific to breast cancer outcome, one additional composite-feature selection method, and two network-based gene ranking methods. Employing the evaluation pipeline, we show that current composite-feature classification methods do not outperform simple single-gene classifiers in predicting outcome in breast cancer. Furthermore, we find that the stability of features across different data sets is also not higher for composite features. Most strikingly, we observe that prediction performance is not affected when features are extracted from randomized PPI networks.
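
To make the notion of a composite feature concrete, the following minimal Python sketch averages the expression of the genes in each gene set, patient by patient. It is an illustration only and does not reproduce the ACES interface or any of the evaluated methods: the function name `composite_features`, the gene and module names, and the expression values are all hypothetical, and plain averaging is just one assumed way of aggregating.

```python
from statistics import mean


def composite_features(expression, gene_sets):
    """Average per-patient expression over each gene set (hypothetical helper).

    expression: dict mapping gene name -> list of expression values, one per patient
    gene_sets:  dict mapping feature name -> list of gene names (e.g. a pathway
                or a PPI subnetwork supplied by the secondary data)
    Genes without expression data are skipped; sets with no measured gene are dropped.
    """
    features = {}
    for name, genes in gene_sets.items():
        measured = [expression[g] for g in genes if g in expression]
        if not measured:
            continue
        # zip(*measured) walks through the patients, yielding the values
        # of all member genes for one patient at a time.
        features[name] = [mean(values) for values in zip(*measured)]
    return features


# Toy data, invented for illustration: three genes measured in three patients.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [3.0, 2.0, 1.0],
    "GENE_C": [0.0, 4.0, 8.0],
}
# GENE_X is not measured and is therefore ignored.
gene_sets = {
    "module_1": ["GENE_A", "GENE_B"],
    "module_2": ["GENE_C", "GENE_X"],
}

features = composite_features(expression, gene_sets)
print(features)
```

A single-gene classifier would instead use the rows of `expression` directly as features, which is the baseline the composite features are compared against.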

Published in Frontiers in Genetics

ISSN: 1664-8021 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Biology (General): Genetics
Website: http://journal.frontiersin.org/journal/genetics
