T1000: a reduced gene set prioritized for toxicogenomic studies

Othman Soufan; Jessica Ewald; Charles Viau; Doug Crump; Markus Hecker; Niladri Basu; Jianguo Xia

doi:10.7717/peerj.7975

PeerJ (Oct 2019)

Affiliations

Othman Soufan: Institute of Parasitology, McGill University, Montreal, Canada
Jessica Ewald: Faculty of Agricultural and Environmental Sciences, McGill University, Montreal, Canada
Charles Viau: Institute of Parasitology, McGill University, Montreal, Canada
Doug Crump: Ecotoxicology and Wildlife Health Division, Environment and Climate Change Canada, National Wildlife Research Centre, Carleton University, Ottawa, Canada
Markus Hecker: School of the Environment & Sustainability and Toxicology Centre, University of Saskatchewan, Saskatoon, Canada
Niladri Basu: Faculty of Agricultural and Environmental Sciences, McGill University, Montreal, Canada
Jianguo Xia: Institute of Parasitology, McGill University, Montreal, Canada

DOI: https://doi.org/10.7717/peerj.7975
Journal volume & issue: Vol. 7, p. e7975

Abstract

There is growing interest within regulatory agencies and toxicological research communities in developing, testing, and applying new approaches, such as toxicogenomics, to evaluate chemical hazards more efficiently. Given the complexity of analyzing thousands of genes simultaneously, there is a need to identify reduced gene sets. Though several gene sets have been defined for toxicological applications, few were purposefully derived from toxicogenomics data. Here, we developed and applied a systematic approach to identify 1,000 genes (called Toxicogenomics-1000 or T1000) that are highly responsive to chemical exposures. First, a co-expression network of 11,210 genes was built by leveraging microarray data from the Open TG-GATEs program. This network was then re-weighted based on prior knowledge of the genes' biological (KEGG, MSigDB) and toxicological (CTD) relevance. Finally, weighted correlation network analysis was applied to identify 258 gene clusters. T1000 was defined by selecting the genes from each cluster that were most associated with outcome measures. For model evaluation, we compared the performance of T1000 to that of other gene sets (L1000, S1500, genes selected by Limma, and a random set) using two external datasets based on the rat model. Additionally, a smaller (T384) and a larger (T1500) version of T1000 were used for dose-response modeling to test the effect of gene set size. Our findings demonstrated that the T1000 gene set is predictive of apical outcomes across a range of conditions (e.g., in vitro and in vivo, dose-response, multiple species, tissues, and chemicals), and generally performs as well as, or better than, other available gene sets.
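
The four steps named in the abstract (build a co-expression network, re-weight it with prior knowledge, cluster it, then pick the genes most associated with an outcome) can be illustrated with a toy sketch. This is not the authors' implementation: the gene names, expression values, prior scores, `beta`, and `cutoff` below are all hypothetical, the re-weighting rule (mean prior score of the two genes) is an assumption, and connected components over thresholded edges stand in for the hierarchical clustering used in weighted correlation network analysis.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_network(expr, beta=6):
    """Edge weight = |r|^beta, a soft-threshold adjacency (beta is assumed)."""
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g, h in combinations(sorted(expr), 2)}


def reweight(network, prior):
    """Scale each edge by the mean prior score of its two genes (assumed rule)."""
    return {(g, h): w * (prior.get(g, 0.0) + prior.get(h, 0.0)) / 2
            for (g, h), w in network.items()}


def cluster(network, genes, cutoff):
    """Connected components over edges with weight >= cutoff (union-find).

    A simple stand-in for hierarchical clustering of the weighted network.
    """
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), w in network.items():
        if w >= cutoff:
            parent[find(g)] = find(h)
    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


def select_genes(expr, outcome, groups, per_cluster=1):
    """From each cluster, keep the genes most correlated with the outcome."""
    chosen = []
    for members in groups:
        ranked = sorted(members, key=lambda g: -abs(pearson(expr[g], outcome)))
        chosen.extend(ranked[:per_cluster])
    return chosen


# Hypothetical toy data: four genes measured in six samples.
expr = {
    "g1": [1, 2, 3, 4, 5, 6],
    "g2": [1, 3, 2, 5, 4, 6],
    "g3": [5, 1, 4, 2, 6, 3],
    "g4": [5.1, 1.2, 3.9, 2.1, 6.2, 2.8],
}
prior = {"g1": 1.0, "g2": 0.8, "g3": 0.6, "g4": 0.6}  # assumed relevance scores
outcome = [0, 0, 1, 1, 2, 2]                            # assumed outcome measure

network = reweight(coexpression_network(expr), prior)
groups = cluster(network, expr, cutoff=0.3)
selected = select_genes(expr, outcome, groups)
print(groups)
print(selected)
```

Raising `per_cluster` gives a larger reduced set from the same clusters, which is one plausible way to think about the smaller and larger variants (T384, T1500) mentioned in the abstract; how those variants were actually constructed is not stated here.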

Published in PeerJ

ISSN: 2167-8359 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Medicine; Science: Biology (General)
Website: https://peerj.com/
