Dataset on aquatic ecotoxicity predictions of 2697 chemicals, using three quantitative structure-activity relationship platforms
Patrik Svedberg,
Pedro A. Inostroza,
Mikael Gustavsson,
Erik Kristiansson,
Francis Spilsbury,
Thomas Backhaus
Affiliations
Patrik Svedberg
Department of Biological and Environmental Sciences, University of Gothenburg, PO Box 463, SE-405 30 Gothenburg, Sweden; Corresponding author.
Pedro A. Inostroza
Department of Biological and Environmental Sciences, University of Gothenburg, PO Box 463, SE-405 30 Gothenburg, Sweden; Institute for Environmental Research, RWTH Aachen University, D-52072 Aachen, Germany
Mikael Gustavsson
Department of Biological and Environmental Sciences, University of Gothenburg, PO Box 463, SE-405 30 Gothenburg, Sweden; Department of Economics, University of Gothenburg, PO Box 640, SE-405 30 Gothenburg, Sweden
Erik Kristiansson
Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, SE-412 96 Gothenburg, Sweden
Francis Spilsbury
Department of Biological and Environmental Sciences, University of Gothenburg, PO Box 463, SE-405 30 Gothenburg, Sweden
Thomas Backhaus
Department of Biological and Environmental Sciences, University of Gothenburg, PO Box 463, SE-405 30 Gothenburg, Sweden; Institute for Environmental Research, RWTH Aachen University, D-52072 Aachen, Germany
Empirical and in silico data on the aquatic ecotoxicology of 2697 organic chemicals were collected to compile a dataset for assessing the predictive power of current Quantitative Structure-Activity Relationship (QSAR) models and software platforms. This document presents the dataset and the data pipeline used to create it. Empirical data were collected from the US EPA ECOTOX Knowledgebase (ECOTOX) and the European Food Safety Authority (EFSA) report “Completion of data entry of pesticide ecotoxicology Tier 1 study endpoints in a XML schema – database”. Only data for OECD-recommended algae, daphnia and fish species were retained. QSAR toxicity predictions were calculated for each chemical and each of six endpoints using the ECOSAR, VEGA and Toxicity Estimation Software Tool (T.E.S.T.) platforms. Finally, the dataset was supplemented with SMILES, InChIKey, pKa and logP values collected from webchem and PubChem.
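The species-filtering step of the pipeline can be illustrated with a minimal Python sketch. It is not the authors' code. The allowlist below names only one example species per organism group and is an assumption made for illustration, as are the function name `retain_recommended`, the record fields and the toxicity values, which are made up.

```python
# Hypothetical allowlist: one example species per organism group.
# The species list actually used for the dataset may differ and is longer.
SPECIES_GROUP = {
    "Raphidocelis subcapitata": "algae",
    "Daphnia magna": "daphnia",
    "Danio rerio": "fish",
}


def retain_recommended(records):
    """Keep records whose species is on the allowlist and tag each with its group."""
    kept = []
    for rec in records:
        # Strip stray whitespace so that "Danio rerio " still matches.
        group = SPECIES_GROUP.get(rec["species"].strip())
        if group is not None:
            kept.append({**rec, "group": group})
    return kept


# Made-up example records; the values carry no empirical meaning.
records = [
    {"chemical": "A", "species": "Daphnia magna", "value_mg_per_l": 1.2},
    {"chemical": "A", "species": "Xenopus laevis", "value_mg_per_l": 3.4},
    {"chemical": "B", "species": "Danio rerio ", "value_mg_per_l": 0.5},
]

filtered = retain_recommended(records)
for rec in filtered:
    print(rec["chemical"], rec["species"].strip(), rec["group"])
```

An explicit allowlist keyed on species name keeps the filter transparent: a record is dropped unless its species appears on the list, so unexpected taxa cannot enter the dataset unnoticed.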