Journal of Cheminformatics (Jan 2019)

Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery

  • Nicolas Bosc,
  • Francis Atkinson,
  • Eloy Felix,
  • Anna Gaulton,
  • Anne Hersey,
  • Andrew R. Leach

DOI
https://doi.org/10.1186/s13321-018-0325-4
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 16

Abstract

Structure–activity relationship modelling is frequently used in the early stage of drug discovery to assess the activity of a compound on one or several targets, and can also be used to assess the interaction of compounds with liability targets. QSAR models have been used for these and related applications over many years, with good success. Conformal prediction is a relatively new QSAR approach that provides information on the certainty of a prediction, and so helps in decision-making. However, it is not always clear how best to make use of this additional information. In this article, we describe a case study that directly compares conformal prediction with traditional QSAR methods for large-scale predictions of target-ligand binding. The ChEMBL database was used to extract a data set comprising data from 550 human protein targets with different bioactivity profiles. For each target, a QSAR model and a conformal predictor were trained and their results compared. The models were then evaluated on new data published since the original models were built to simulate a “real world” application. The comparative study highlights the similarities between the two techniques but also some differences that it is important to bear in mind when the methods are used in practical drug discovery applications.

Keywords