Cell Reports Physical Science (Sep 2021)

Combating small-molecule aggregation with machine learning

  • Kuan Lee,
  • Ann Yang,
  • Yen-Chu Lin,
  • Daniel Reker,
  • Gonçalo J.L. Bernardes,
  • Tiago Rodrigues

Journal volume & issue
Vol. 2, no. 9
p. 100573

Abstract

Summary: Biological screens are plagued by false-positive hits resulting from aggregation. Methods to triage small colloidally aggregating molecules (SCAMs) are in high demand. Herein, we disclose a neural network to flag such entities. Our data demonstrate the utility of machine learning for predicting SCAMs, with 80% correct predictions in an out-of-sample evaluation. The tool is competitive with a panel of expert chemists, who correctly predict 61% ± 7% of the same molecules in a Turing-like test. Our computational routine provides insight into features governing aggregation that had remained hidden to expert intuition. Further, we estimate that up to 15%–20% of ligands in publicly available chemogenomic databases have high potential to aggregate at a typical screening concentration (30 μM), which calls for caution in systems biology and drug design programs. Our approach provides a means to augment human intuition and mitigate attrition, and a pathway to accelerate future molecular medicine.

Keywords