Scientific Reports (Nov 2023)

Streamlining pipeline efficiency: a novel model-agnostic technique for accelerating conditional generative and virtual screening pipelines

  • Karthik Viswanathan,
  • Manan Goel,
  • Siddhartha Laghuvarapu,
  • Girish Varma,
  • U. Deva Priyakumar

DOI
https://doi.org/10.1038/s41598-023-42952-y
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 11

Abstract

The discovery of potential therapeutic agents for life-threatening diseases has become a significant problem. There is a requirement for fast and accurate methods to identify drug-like molecules that can be used as potential candidates for novel targets. Existing techniques like high-throughput screening and virtual screening are time-consuming and inefficient. Traditional molecule generation pipelines are more efficient than virtual screening but use time-consuming docking software. Such docking functions can be emulated using Machine Learning models with comparable accuracy and faster execution times. However, we find that when pre-trained machine learning models are employed in generative pipelines as oracles, they suffer from model degradation in areas where data is scarce. In this study, we propose an active learning-based model that can be added as a supplement to enhanced molecule generation architectures. The proposed method uses uncertainty sampling on the molecules created by the generator model and dynamically learns as the generator samples molecules from different regions of the chemical space. The proposed framework can generate molecules with high binding affinity with a ∼70% improvement in runtime compared to the baseline model by labeling only ∼30% of molecules compared to the baseline oracle.
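
The uncertainty-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `expensive_oracle`, the bootstrap ensemble of ridge regressors, the random feature vectors standing in for molecules, and the names `EnsembleSurrogate` and `score_batch` are all assumptions made for illustration. The sketch labels only the most uncertain fraction of each generated batch with the costly oracle, retrains the surrogate on the growing labeled set, and uses surrogate predictions for the remaining molecules.

```python
import numpy as np


def expensive_oracle(x):
    """Toy stand-in for a slow docking score (hypothetical, for illustration only)."""
    return np.sin(x).sum(axis=1)


class EnsembleSurrogate:
    """Bootstrap ensemble of ridge regressors (an assumed surrogate model).

    The spread of predictions across ensemble members serves as the
    uncertainty estimate.
    """

    def __init__(self, n_members=5, ridge=1e-3, seed=0):
        self.n_members = n_members
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)
        self.weights = []

    @staticmethod
    def _with_bias(X):
        # Append a constant column so each regressor has an intercept.
        return np.hstack([X, np.ones((len(X), 1))])

    def fit(self, X, y):
        Xb = self._with_bias(X)
        self.weights = []
        for _ in range(self.n_members):
            # Resample the labeled set with replacement for each member.
            idx = self.rng.integers(0, len(Xb), size=len(Xb))
            A = Xb[idx].T @ Xb[idx] + self.ridge * np.eye(Xb.shape[1])
            self.weights.append(np.linalg.solve(A, Xb[idx].T @ y[idx]))

    def predict(self, X):
        Xb = self._with_bias(X)
        preds = np.stack([Xb @ w for w in self.weights])
        return preds.mean(axis=0), preds.std(axis=0)


def score_batch(surrogate, X_batch, X_lab, y_lab, label_fraction=0.3):
    """Score one generated batch, querying the oracle only where uncertain."""
    _, std = surrogate.predict(X_batch)
    k = max(1, int(round(label_fraction * len(X_batch))))
    query = np.argsort(std)[-k:]  # indices of the k most uncertain samples

    # Only the queried samples pay the cost of the expensive oracle.
    y_query = expensive_oracle(X_batch[query])
    X_lab = np.vstack([X_lab, X_batch[query]])
    y_lab = np.concatenate([y_lab, y_query])
    surrogate.fit(X_lab, y_lab)  # update the surrogate with the new labels

    scores, _ = surrogate.predict(X_batch)
    scores[query] = y_query  # keep true oracle values where available
    return scores, query, X_lab, y_lab
```

In a generative loop, `score_batch` would be called once per batch of generated molecules, with the returned `X_lab` and `y_lab` carried over to the next call so the surrogate keeps adapting as the generator moves into new regions of chemical space. The `label_fraction=0.3` default merely echoes the ∼30% labeling figure quoted in the abstract and is not a tuned value.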