Journal of Cheminformatics (Aug 2022)

Using Jupyter Notebooks for re-training machine learning models

  • Aljoša Smajić,
  • Melanie Grandits,
  • Gerhard F. Ecker

DOI
https://doi.org/10.1186/s13321-022-00635-2
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 9

Abstract

Read online

Abstract Machine learning (ML) models require an extensive, user-driven selection of molecular descriptors in order to learn from chemical structures to predict actives and inactives with a high reliability. In addition, privacy concerns often restrict the access to sufficient data, leading to models with a narrow chemical space. Therefore, we propose a framework of re-trainable models that can be transferred from one local instance to another, and further allow a less extensive descriptor selection. The models are shared via a Jupyter Notebook, allowing the evaluation and implementation of a broader chemical space by keeping most of the tunable parameters pre-defined. This enables the models to be updated in a decentralized, facile, and fast manner. Herein, the method was evaluated with six transporter datasets (BCRP, BSEP, OATP1B1, OATP1B3, MRP3, P-gp), which revealed the general applicability of this approach.

Keywords