Communications Chemistry (Apr 2024)

Fast and effective molecular property prediction with transferability map

  • Shaolun Yao,
  • Jie Song,
  • Lingxiang Jia,
  • Lechao Cheng,
  • Zipeng Zhong,
  • Mingli Song,
  • Zunlei Feng

DOI
https://doi.org/10.1038/s42004-024-01169-4
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 10

Abstract

Read online

Abstract Effective transfer learning for molecular property prediction has shown considerable strength in addressing insufficient labeled molecules. Many existing methods either disregard the quantitative relationship between source and target properties, risking negative transfer, or require intensive training on target tasks. To quantify transferability concerning task-relatedness, we propose Principal Gradient-based Measurement (PGM) for transferring molecular property prediction ability. First, we design an optimization-free scheme to calculate a principal gradient for approximating the direction of model optimization on a molecular property prediction dataset. We have analyzed the close connection between the principal gradient and model optimization through mathematical proof. PGM measures the transferability as the distance between the principal gradient obtained from the source dataset and that derived from the target dataset. Then, we perform PGM on various molecular property prediction datasets to build a quantitative transferability map for source dataset selection. Finally, we evaluate PGM on multiple combinations of transfer learning tasks across 12 benchmark molecular property prediction datasets and demonstrate that it can serve as fast and effective guidance to improve the performance of a target task. This work contributes to more efficient discovery of drugs, materials, and catalysts by offering a task-relatedness quantification prior to transfer learning and understanding the relationship between chemical properties.