Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Daniil Polykovskiy; Alexander Zhebrak; Benjamin Sanchez-Lengeling; Sergey Golovanov; Oktai Tatanov; Stanislav Belyaev; Rauf Kurbanov; Aleksey Artamonov; Vladimir Aladinskiy; Mark Veselov; Artur Kadurin; Simon Johansson; Hongming Chen; Sergey Nikolenko; Alán Aspuru-Guzik; Alex Zhavoronkov

doi:10.3389/fphar.2020.565644

Frontiers in Pharmacology (Dec 2020)

Affiliations

Daniil Polykovskiy: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong
Alexander Zhebrak: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong
Benjamin Sanchez-Lengeling: Chemistry and Chemical Biology Department, Harvard University, Cambridge, MA, United States
Sergey Golovanov: Neuromation OU, Tallinn, Estonia
Oktai Tatanov: Neuromation OU, Tallinn, Estonia
Stanislav Belyaev: Neuromation OU, Tallinn, Estonia
Rauf Kurbanov: Neuromation OU, Tallinn, Estonia
Aleksey Artamonov: Neuromation OU, Tallinn, Estonia
Vladimir Aladinskiy: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong
Mark Veselov: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong
Artur Kadurin: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong
Simon Johansson: Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
Hongming Chen: Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
Sergey Nikolenko: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong
Sergey Nikolenko: Neuromation OU, Tallinn, Estonia
Sergey Nikolenko: Computer Science Department, National Research University Higher School of Economics, St. Petersburg, Russia
Alán Aspuru-Guzik: Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, ON, Canada
Alán Aspuru-Guzik: Department of Computer Science, University of Toronto, Toronto, ON, Canada
Alán Aspuru-Guzik: CIFAR AI Chair, Vector Institute for Artificial Intelligence, Toronto, ON, Canada
Alán Aspuru-Guzik: Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON, Canada
Alex Zhavoronkov: Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong

DOI: https://doi.org/10.3389/fphar.2020.565644
Journal volume & issue: Vol. 11

Abstract

Generative models are becoming a tool of choice for exploring the molecular space. These models learn from a large training dataset and produce novel molecular structures with similar properties. Generated structures can be used for virtual screening or for training semi-supervised predictive models in downstream tasks. While there are plenty of generative models, it is unclear how to compare and rank them. In this work, we introduce a benchmarking platform called Molecular Sets (MOSES) to standardize the training and comparison of molecular generative models. MOSES provides training and testing datasets, and a set of metrics to evaluate the quality and diversity of generated structures. We have implemented and compared several molecular generation models and suggest using our results as reference points for further advancements in generative chemistry research. The platform and source code are available at https://github.com/molecularsets/moses.

Published in Frontiers in Pharmacology

ISSN: 1663-9812 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Therapeutics. Pharmacology
Website: http://journal.frontiersin.org/journals/pharmacology
