BMC Bioinformatics (Mar 2018)

Ensemble of rankers for efficient gene signature extraction in smoke exposure classification

  • Maurizio Giordano,
  • Kumar Parijat Tripathi,
  • Mario Rosario Guarracino

DOI
https://doi.org/10.1186/s12859-018-2035-3
Journal volume & issue
Vol. 19, no. S2
pp. 41 – 54

Abstract

Background: System toxicology aims to understand the mechanisms by which biological systems respond to toxicants. Such understanding can be leveraged to assess the risk that chemicals, drugs, and consumer products pose to living organisms. In system toxicology, machine learning techniques are applied to develop prediction models that classify the toxicant exposure of biological systems. Gene expression data (RNA/DNA microarray) are often used to develop such models.

Results: The outcome of the present work is an experimental methodology for developing prediction models, based on robust gene signatures, that classify cigarette smoke exposure and cessation in humans. It results from participation in the recent sbv IMPROVER SysTox Computational Challenge. By merging different gene selection techniques, we obtain robust gene signatures, and we investigate the prediction capabilities of different off-the-shelf machine learning techniques, such as artificial neural networks, linear models, and support vector machines. Our signature also includes six novel genes, which we believe should be investigated further as biomarkers of tobacco smoke exposure.

Conclusions: The proposed methodology yields gene signatures that achieve top-ranked prediction performance with the investigated classification methods, as well as new findings on gene signatures as biomarkers of smoke exposure in humans.
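The idea of merging several gene selection techniques into one signature can be illustrated with a small rank-aggregation sketch. The abstract does not state how the rankings are combined, so the Borda-style scoring below is an assumption chosen for illustration, not the authors' method; the function name `merge_rankings` and the gene labels are hypothetical placeholders.

```python
from collections import defaultdict


def merge_rankings(rankings, top_k):
    """Merge several gene rankings with a Borda-style count (illustrative assumption).

    Each ranking lists genes from most to least relevant. A gene at
    0-based position p in a ranking of length n earns n - p points.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for position, gene in enumerate(ranking):
            scores[gene] += n - position
    # Highest total score first; ties broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return ordered[:top_k]


# Hypothetical rankings produced by three different gene selection methods.
rankings = [
    ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    ["GENE_B", "GENE_A", "GENE_D", "GENE_C"],
    ["GENE_A", "GENE_C", "GENE_B", "GENE_D"],
]

signature = merge_rankings(rankings, top_k=2)
print(signature)
```

A gene ranked highly by several methods accumulates a high total score, which is one simple way to make a signature less dependent on any single ranker. The merged signature would then be passed as the feature set to a classifier of choice.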

Keywords