An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models

Conor Parks; Zied Gaieb; Rommie E. Amaro

doi:10.3389/fmolb.2020.00093

Frontiers in Molecular Biosciences (Jun 2020)

Journal volume & issue: Vol. 7

Abstract

Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Relying solely on experimental design-make-test cycles is costly and time-consuming, which creates an opportunity for computational methods to assist. Herein, we compare random forest and feed-forward neural network proteochemometric models on their ability to predict pIC50 measurements for held-out generic Bemis-Murcko scaffolds. In addition, we assess whether conformal prediction provides calibrated prediction intervals in both a retrospective test and a semi-prospective test, using the recently released Grand Challenge 4 data set as an external test set. Overall, the random forest and deep neural network proteochemometric models perform well retrospectively but degrade in the semi-prospective setting. The conformal prediction intervals, however, prove to be well calibrated both retrospectively and semi-prospectively, showing that they can be used to guide hit discovery and lead optimization campaigns.
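
To make the notion of a calibrated prediction interval concrete, the following is a minimal sketch of split (inductive) conformal prediction for regression. It is an illustration only and is not taken from the paper: the absolute-residual nonconformity score, the synthetic pIC50-like data, and the function names `conformal_quantile`, `prediction_intervals`, and `empirical_coverage` are all assumptions made for this example. The abstract does not specify the authors' nonconformity measure or calibration scheme.

```python
import numpy as np


def conformal_quantile(residuals, alpha):
    """Return the interval half-width for a target error rate alpha.

    Uses the finite-sample-corrected rank ceil((n + 1) * (1 - alpha))
    over the sorted absolute calibration residuals (an assumed,
    commonly used nonconformity score).
    """
    scores = np.sort(np.abs(np.asarray(residuals, dtype=float)))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points to guarantee this confidence level.
        return np.inf
    return scores[k - 1]


def prediction_intervals(y_pred, half_width):
    """Symmetric intervals around point predictions."""
    y_pred = np.asarray(y_pred, dtype=float)
    return y_pred - half_width, y_pred + half_width


def empirical_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside their intervals."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Synthetic stand-in for model output: "true" pIC50 values plus Gaussian
# prediction error. Real use would take predictions from a trained model
# on a calibration set that was held out from training.
rng = np.random.default_rng(0)
n_cal, n_test = 500, 2000
y_cal_true = rng.uniform(4.0, 10.0, n_cal)
y_cal_pred = y_cal_true + rng.normal(0.0, 0.8, n_cal)
y_test_true = rng.uniform(4.0, 10.0, n_test)
y_test_pred = y_test_true + rng.normal(0.0, 0.8, n_test)

alpha = 0.2  # target 80% coverage
half_width = conformal_quantile(y_cal_true - y_cal_pred, alpha)
lower, upper = prediction_intervals(y_test_pred, half_width)
coverage = empirical_coverage(y_test_true, lower, upper)
print(f"half-width: {half_width:.2f}, empirical coverage: {coverage:.3f}")
```

In this sketch, "well calibrated" means the empirical coverage on the test set is close to the requested 1 - alpha. That guarantee depends on the calibration and test data being exchangeable, which is why checking calibration on an external, semi-prospective set is a stricter test than a retrospective split.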

Published in Frontiers in Molecular Biosciences

ISSN: 2296-889X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Biology (General)
Website: https://www.frontiersin.org/journals/molecular-biosciences
