Computational and Structural Biotechnology Journal (Jan 2021)

Conservation of binding properties in protein models

  • Megan Egbert,
  • Kathryn A. Porter,
  • Usman Ghani,
  • Sergei Kotelnikov,
  • Thu Nguyen,
  • Ryota Ashizawa,
  • Dima Kozakov,
  • Sandor Vajda

Journal volume & issue
Vol. 19
pp. 2549 – 2566

Abstract

We study the models submitted to round 12 of the Critical Assessment of protein Structure Prediction (CASP) experiment to assess how well binding properties are conserved when the X-ray structures of the target proteins are replaced by their models. To explore small-molecule binding, we generate distributions of molecular probes (fragment-sized organic molecules of varying size, shape, and polarity) around the protein and count the number of interactions between each residue and the probes, resulting in a vector of interactions that we call a binding fingerprint. The similarity between two fingerprints, one for the X-ray structure and the other for a model of the protein, is determined by calculating the correlation coefficient between the two vectors. The resulting correlation coefficients are shown to correlate with the global measures of accuracy established in CASP, and the relationship yields an accuracy threshold that has to be reached for meaningful conservation of the binding surface. The clusters formed by the probe molecules reliably predict binding hot spots and ligand binding sites in both X-ray structures and reasonably accurate models of the target, but ensembles of models may be needed to assess whether proper binding pockets are available. We explored ligand docking to the few targets that had bound ligands in the X-ray structure. More targets were available to assess the ability of the models to reproduce protein–protein interactions, which we did by docking both the X-ray structures and the models to their interaction partners in complexes. This application proved more difficult than finding small-ligand binding sites, and the success rates depend heavily on the local structure in the potential interface. In particular, predicted conformations of flexible loops are frequently incorrect in otherwise highly accurate models and may prevent the prediction of correct protein–protein interactions.
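The fingerprint comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which correlation coefficient is used or how a residue-probe interaction is defined, so the sketch assumes Pearson correlation and takes a precomputed list of contacting residue labels as input. The function names `binding_fingerprint` and `pearson`, and the residue labels, are hypothetical.

```python
import math
from collections import Counter


def binding_fingerprint(contacts, residues):
    """Return per-residue probe contact counts in a fixed residue order.

    contacts: residue labels, one entry per probe-residue interaction
              (how an interaction is detected is outside this sketch).
    residues: ordered residue labels shared by the X-ray structure and the model.
    """
    counts = Counter(contacts)
    return [counts[r] for r in residues]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors.

    Assumption: the abstract only says "correlation coefficient";
    Pearson is used here for illustration.
    """
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


# Made-up contact lists for a four-residue toy protein.
residues = ["A10", "A11", "A12", "A13"]
xray_contacts = ["A10", "A10", "A10", "A12", "A12", "A13"]
model_contacts = ["A10", "A10", "A11", "A12", "A12", "A12"]

xray_fp = binding_fingerprint(xray_contacts, residues)
model_fp = binding_fingerprint(model_contacts, residues)
similarity = pearson(xray_fp, model_fp)
```

Both fingerprints are indexed by the same residue list, so a value of `similarity` near 1 means the model attracts probes at the same residues as the X-ray structure. Comparing a model whose residue numbering differs from that of the target would first require aligning the two sequences.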

Keywords