Molecules (Feb 2019)
The Good, the Bad, and the Ugly: “HiPen”, a New Dataset for Validating (S)QM/MM Free Energy Simulations
Abstract
Indirect (S)QM/MM free energy simulations (FES) are vital for efficiently combining sufficient sampling with accurate (QM) energetic evaluations when estimating free energies of practical/experimental interest. Connecting between levels of theory, i.e., calculating $\Delta A^{\mathrm{low}\to\mathrm{high}}$, remains the most challenging step within an indirect FES protocol. To improve calculations of $\Delta A^{\mathrm{low}\to\mathrm{high}}$, we must: (1) compare the performance of all FES methods currently available; and (2) compile and maintain datasets of $\Delta A^{\mathrm{low}\to\mathrm{high}}$ calculated for a wide variety of molecules so that future practitioners may replicate or improve upon the current state of the art. Towards these two aims, we introduce a new dataset, “HiPen”, which tabulates $\Delta A_{\mathrm{gas}}^{\mathrm{MM}\to\mathrm{3ob}}$ (the free energy associated with switching from an MM to an SCC-DFTB molecular description using the 3ob parameter set in the gas phase), calculated for 22 drug-like small molecules. We compare the calculation of this value using free energy perturbation, Bennett’s acceptance ratio, Jarzynski’s equation, and Crooks’ equation. We also predict the reliability of each calculated $\Delta A_{\mathrm{gas}}^{\mathrm{MM}\to\mathrm{3ob}}$ by evaluating several convergence criteria, including sample size hysteresis, overlap statistics, and the bias metric ($\Pi$). Within the total dataset, three distinct categories of molecules emerge: the “good” molecules, for which we can obtain converged $\Delta A_{\mathrm{gas}}^{\mathrm{MM}\to\mathrm{3ob}}$ using Jarzynski’s equation; the “bad” molecules, which require Crooks’ equation to obtain a converged $\Delta A_{\mathrm{gas}}^{\mathrm{MM}\to\mathrm{3ob}}$; and the “ugly” molecules, for which we cannot obtain reliably converged $\Delta A_{\mathrm{gas}}^{\mathrm{MM}\to\mathrm{3ob}}$ with either Jarzynski’s or Crooks’ equation. We discuss, in depth, results from several example molecules in each of these categories and describe how dihedral discrepancies between levels of theory cause convergence failures even for these gas phase free energy simulations.
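For orientation, free energy perturbation and Jarzynski’s equation share the same exponential-average form, $\Delta A = -k_B T \ln \langle e^{-W/k_B T} \rangle$, where $W$ is either the instantaneous energy difference between the two levels of theory (free energy perturbation) or the nonequilibrium switching work (Jarzynski). The following is a minimal sketch of that estimator, not the implementation used for the HiPen dataset; the function name `exponential_average`, the constant `KB_KCAL`, the kcal/mol units, and the default temperature of 300 K are all assumptions made for illustration.

```python
import numpy as np

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption of this sketch.
KB_KCAL = 0.0019872041


def exponential_average(work, temperature=300.0):
    """Estimate dA = -kT ln< exp(-W/kT) > from a set of work samples.

    `work` holds either instantaneous energy differences (free energy
    perturbation) or nonequilibrium switching work values (Jarzynski),
    in kcal/mol. The function name and defaults are hypothetical.
    """
    w = np.asarray(work, dtype=float)
    kT = KB_KCAL * temperature
    x = -w / kT
    x_max = x.max()
    # Log-mean-exp with the maximum factored out for numerical stability.
    log_mean = x_max + np.log(np.mean(np.exp(x - x_max)))
    return -kT * log_mean
```

Because the average is dominated by the rare low-work samples, the estimate never exceeds the mean work and converges slowly when the two levels of theory overlap poorly, which is the situation the convergence criteria above are meant to detect.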
Keywords