Journal of Cheminformatics (Aug 2024)

Automatic molecular fragmentation by evolutionary optimisation

  • Fiona C. Y. Yu,
  • Jorge L. Gálvez Vallejo,
  • Giuseppe M. J. Barca

DOI
https://doi.org/10.1186/s13321-024-00896-z
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 25

Abstract

Molecular fragmentation is an effective suite of approaches to reduce the formal computational complexity of quantum chemistry calculations while enhancing their algorithmic parallelisability. However, the practical applicability of fragmentation techniques remains hindered by a dearth of automation and of effective metrics to assess the quality of a fragmentation scheme. In this article, we present the Quick Fragmentation via Automated Genetic Search (QFRAGS), a novel automated fragmentation algorithm that uses a genetic optimisation procedure to generate molecular fragments that yield low energy errors when adopted in Many-Body Expansions (MBEs). Benchmark testing of QFRAGS on protein systems with fewer than 500 atoms, using two-body (MBE2) and three-body (MBE3) MBE calculations at the HF/6-31G* level, reveals mean absolute energy errors (MAEEs) of 20.6 and 2.2 kJ mol$^{-1}$, respectively. For larger protein systems exceeding 500 atoms, MAEEs are 181.5 kJ mol$^{-1}$ for MBE2 and 24.3 kJ mol$^{-1}$ for MBE3. Furthermore, when compared to three manual fragmentation schemes on a 40-protein dataset, using both MBE and Fragment Molecular Orbital techniques, QFRAGS achieves comparable or often lower MAEEs. When applied to a 10-lipoglycan/glycolipid dataset, MAEEs of 7.9 and 0.3 kJ mol$^{-1}$ were observed at the MBE2 and MBE3 levels, respectively.

Scientific Contribution

This article presents the Quick Fragmentation via Automated Genetic Search (QFRAGS), a molecular fragmentation algorithm that improves upon existing molecular fragmentation approaches by specifically addressing their lack of automation and of effective fragmentation quality metrics. With an evolutionary optimisation strategy, QFRAGS actively pursues high-quality fragments, generating fragmentation schemes that exhibit minimal energy errors on systems with hundreds to thousands of atoms. QFRAGS thereby improves the accessibility and computational feasibility of accurate quantum chemistry calculations.
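
For readers unfamiliar with the MBE2 and MBE3 truncations quoted in the abstract, the following is a minimal sketch of how a truncated many-body expansion energy is assembled from fragment energies. It is an illustration only, not the QFRAGS implementation: the names `mbe_energy` and `toy_energy` are hypothetical, and the toy energy function stands in for a real quantum chemistry calculation (for example HF/6-31G*) on a fragment, pair, or triple. Practical details such as capping broken covalent bonds are omitted.

```python
from itertools import combinations


def mbe_energy(n_fragments, energy_of, order=2):
    """Many-body expansion energy truncated at `order` (1, 2 or 3).

    `energy_of` is an assumed callback: it takes a tuple of fragment
    indices and returns the energy of that fragment, pair or triple.
    """
    if order not in (1, 2, 3):
        raise ValueError("order must be 1, 2 or 3")
    frags = range(n_fragments)

    # One-body term: sum of isolated fragment (monomer) energies.
    e1 = {i: energy_of((i,)) for i in frags}
    total = sum(e1.values())

    # Two-body corrections: dimer energy minus its two monomers.
    d2 = {}
    if order >= 2:
        for i, j in combinations(frags, 2):
            d2[(i, j)] = energy_of((i, j)) - e1[i] - e1[j]
        total += sum(d2.values())

    # Three-body corrections: trimer energy minus all lower-order terms.
    if order >= 3:
        for i, j, k in combinations(frags, 3):
            total += (energy_of((i, j, k))
                      - d2[(i, j)] - d2[(i, k)] - d2[(j, k)]
                      - e1[i] - e1[j] - e1[k])
    return total


def toy_energy(subset):
    """Hypothetical model energy with one-, two- and three-body parts."""
    one = sum(-1.0 * (i + 1) for i in subset)
    two = sum(-0.1 for _ in combinations(subset, 2))
    three = sum(-0.01 for _ in combinations(subset, 3))
    return one + two + three


if __name__ == "__main__":
    n = 4
    exact = toy_energy(tuple(range(n)))
    for order in (1, 2, 3):
        approx = mbe_energy(n, toy_energy, order)
        print(order, abs(approx - exact))
```

In this toy model the energy contains nothing beyond three-body terms, so the third-order truncation recovers the full energy up to floating-point rounding, while the second-order truncation misses only the three-body part. The energy errors reported in the abstract are the analogous difference between the truncated expansion and the unfragmented calculation.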

Keywords