Scientific Data (Oct 2024)

Quantum chemical calculation dataset for representative protein folds by the fragment molecular orbital method

  • Daisuke Takaya,
  • Shu Ohno,
  • Toma Miyagishi,
  • Sota Tanaka,
  • Koji Okuwaki,
  • Chiduru Watanabe,
  • Koichiro Kato,
  • Yu-Shi Tian,
  • Kaori Fukuzawa

DOI
https://doi.org/10.1038/s41597-024-03999-2
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 10

Abstract

The function of a biomacromolecule is determined not only by its three-dimensional structure but also by its electronic state. Quantum chemical calculations are a promising non-empirical means of determining the electronic state of a given structure. In this study, we used the fragment molecular orbital (FMO) method, which is applicable to biopolymers such as proteins, to provide physicochemical property values for representative structures in the SCOP2 database of protein families, a subset of the Protein Data Bank. Our dataset was constructed from over 5,000 protein structures and includes over 200 million inter-fragment interaction energies (IFIEs) and their energy components obtained by pair interaction energy decomposition analysis (PIEDA) at the FMO-MP2/6-31G* level. Moreover, three basis sets, 6-31G*, 6-31G**, and cc-pVDZ, were used for the FMO calculations of each structure, making it possible to compare the energies obtained with different basis sets for the same fragment pair. The total data size is approximately 6.7 GB. Our dataset will be useful for functional analyses and machine learning based on the physicochemical property values of proteins.
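
The basis-set comparison mentioned in the abstract could look roughly like the following minimal sketch. It assumes the IFIEs have already been loaded into simple tuples; the record layout, the structure identifier "demo", the helper names, and all energy values are hypothetical placeholders for illustration and do not reflect the dataset's actual file format or contents.

```python
from collections import defaultdict

# Hypothetical records: (structure_id, fragment_i, fragment_j, basis_set, ifie).
# Energies are placeholder numbers (nominally kcal/mol), NOT values from the dataset.
RECORDS = [
    ("demo", 1, 5, "6-31G*", -12.0),
    ("demo", 1, 5, "6-31G**", -12.5),
    ("demo", 1, 5, "cc-pVDZ", -13.0),
    ("demo", 7, 2, "6-31G*", 3.0),
    ("demo", 2, 7, "cc-pVDZ", 2.5),
]


def group_by_pair(records):
    """Collect the IFIE of each fragment pair, keyed by basis set.

    The pair (i, j) is treated as unordered, so (7, 2) and (2, 7)
    end up under the same key.
    """
    by_pair = defaultdict(dict)
    for structure_id, frag_i, frag_j, basis, energy in records:
        key = (structure_id, min(frag_i, frag_j), max(frag_i, frag_j))
        by_pair[key][basis] = energy
    return dict(by_pair)


def basis_differences(by_pair, reference="6-31G*"):
    """Return, per fragment pair, the IFIE shift of each basis set
    relative to the reference basis set."""
    diffs = {}
    for key, energies in by_pair.items():
        if reference not in energies:
            continue  # nothing to compare against for this pair
        ref_energy = energies[reference]
        diffs[key] = {
            basis: energy - ref_energy
            for basis, energy in energies.items()
            if basis != reference
        }
    return diffs


if __name__ == "__main__":
    for pair, shifts in basis_differences(group_by_pair(RECORDS)).items():
        print(pair, shifts)
```

Keying on the unordered fragment pair keeps the comparison robust to the order in which a pair happens to be listed; whether the real files need this normalization is an assumption.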