Scientific Reports (Feb 2021)

Molecular function recognition by supervised projection pursuit machine learning

  • Tyler Grear,
  • Chris Avery,
  • John Patterson,
  • Donald J. Jacobs

DOI
https://doi.org/10.1038/s41598-021-83269-y
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 15

Abstract

Read online

Abstract Identifying mechanisms that control molecular function is a significant challenge in pharmaceutical science and molecular engineering. Here, we present a novel projection pursuit recurrent neural network to identify functional mechanisms in the context of iterative supervised machine learning for discovery-based design optimization. Molecular function recognition is achieved by pairing experiments that categorize systems with digital twin molecular dynamics simulations to generate working hypotheses. Feature extraction decomposes emergent properties of a system into a complete set of basis vectors. Feature selection requires signal-to-noise, statistical significance, and clustering quality to concurrently surpass acceptance levels. Formulated as a multivariate description of differences and similarities between systems, the data-driven working hypothesis is refined by analyzing new systems prioritized by a discovery-likelihood. Utility and generality are demonstrated on several benchmarks, including the elucidation of antibiotic resistance in TEM-52 beta-lactamase. The software is freely available, enabling turnkey analysis of massive data streams found in computational biology and material science.