PLoS Computational Biology (Dec 2015)

A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs.

  • Zhichao Miao,
  • Eric Westhof

DOI
https://doi.org/10.1371/journal.pcbi.1004639
Journal volume & issue
Vol. 11, no. 12
p. e1004639

Abstract

Read online

Computational prediction of nucleic acid binding sites in proteins are necessary to disentangle functional mechanisms in most biological processes and to explore the binding mechanisms. Several strategies have been proposed, but the state-of-the-art approaches display a great diversity in i) the definition of nucleic acid binding sites; ii) the training and test datasets; iii) the algorithmic methods for the prediction strategies; iv) the performance measures and v) the distribution and availability of the prediction programs. Here we report a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. Well-defined binary assessment criteria (specificity, sensitivity, precision, accuracy…) are applied. We found that i) the tools have been greatly improved over the years; ii) some of the approaches suffer from theoretical defects and there is still room for sorting out the essential mechanisms of binding; iii) RNA binding and DNA binding appear to follow similar driving forces and iv) dataset bias may exist in some methods.