MolScore: a scoring, evaluation and benchmarking framework for generative models in de novo drug design

Morgan Thomas; Noel M. O’Boyle; Andreas Bender; Chris De Graaf

doi:10.1186/s13321-024-00861-w

Journal of Cheminformatics (May 2024)

Affiliations

Morgan Thomas: Centre for Molecular Informatics, Department of Chemistry, University of Cambridge
Noel M. O’Boyle: Computational Chemistry, Nxera Pharma
Andreas Bender: Centre for Molecular Informatics, Department of Chemistry, University of Cambridge
Chris De Graaf: Computational Chemistry, Nxera Pharma

Journal volume & issue: Vol. 16, no. 1
pp. 1 – 20

Abstract

Generative models are undergoing rapid research and application to de novo drug design. To facilitate their application and evaluation, we present MolScore. MolScore already contains many drug-design-relevant scoring functions commonly used in benchmarks, such as molecular similarity, molecular docking, predictive models, synthesizability, and more. In addition, it provides performance metrics to evaluate generative models based on the chemistry generated. With this unification of functionality, MolScore re-implements commonly used benchmarks in the field (such as GuacaMol, MOSES, and MolOpt). Moreover, new benchmarks can be created trivially. We demonstrate this by testing a chemical language model with reinforcement learning on three new tasks of increasing complexity related to the design of 5-HT2a ligands that utilise either molecular descriptors, 266 pre-trained QSAR models, or dual molecular docking. Lastly, MolScore can be integrated into an existing Python script with just three lines of code. This framework is a step towards unifying generative model application and evaluation as applied to drug design for both practitioners and researchers. The framework can be found on GitHub and downloaded directly from the Python Package Index.

Scientific Contribution

MolScore is an open-source platform to facilitate generative molecular design and evaluation thereof for application in drug design. This platform takes important steps towards unifying existing benchmarks, providing a platform to share new benchmarks, and improving customisation, flexibility and usability for practitioners over existing solutions.
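
The abstract's point that a scoring framework can be attached to an existing generative loop in a few lines can be illustrated with a minimal sketch. The `ToyScorer` class below is a hypothetical stand-in and is not MolScore's actual API. Its score (the fraction of characters in a SMILES string that are `C` or `c`) is purely illustrative and has no chemical meaning, and its uniqueness metric compares raw strings without canonicalisation. It shows only the general pattern: construct the scorer once, score each batch of generated SMILES, then compute an evaluation metric over everything generated.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScorer:
    """Hypothetical stand-in for a scoring framework; NOT the MolScore API."""

    budget: int  # maximum number of molecules to score
    history: list = field(default_factory=list)  # (smiles, score) pairs

    @property
    def finished(self) -> bool:
        # The run is over once the scoring budget has been used up.
        return len(self.history) >= self.budget

    def score(self, smiles_batch):
        scores = []
        for smi in smiles_batch:
            # Illustrative score only: fraction of 'C'/'c' characters.
            # (This naively also counts the 'C' in 'Cl'.)
            n_carbon = sum(ch in "Cc" for ch in smi)
            scores.append(n_carbon / len(smi) if smi else 0.0)
        self.history.extend(zip(smiles_batch, scores))
        return scores

    def uniqueness(self) -> float:
        # Fraction of distinct raw strings among everything scored so far.
        seen = [smi for smi, _ in self.history]
        return len(set(seen)) / len(seen) if seen else 0.0


# The integration itself: create the scorer, score a batch, read a metric.
scorer = ToyScorer(budget=4)
batch_scores = scorer.score(["CCO", "c1ccccc1", "CCO", "N"])
run_uniqueness = scorer.uniqueness()
```

In a real generative loop the batch would come from the model at each step, and the loop would stop once `scorer.finished` becomes true.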

Published in Journal of Cheminformatics

ISSN: 1758-2946 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Chemistry
Website: https://jcheminf.biomedcentral.com/
