Data Science Journal (Jun 2020)

Impact of the Protein Data Bank Across Scientific Disciplines

  • Zukang Feng,
  • Natalie Verdiguel,
  • Luigi Di Costanzo,
  • David S. Goodsell,
  • John D. Westbrook,
  • Stephen K. Burley,
  • Christine Zardecki

DOI
https://doi.org/10.5334/dsj-2020-025
Journal volume & issue
Vol. 19, no. 1

Abstract

The Protein Data Bank (PDB) archive was established in 1971 as the first open-access digital data resource for biology and medicine. Today, the PDB contains more than 160,000 atomic-level, experimentally determined 3D biomolecular structures. PDB data are freely and publicly available for download, without restrictions. Each entry contains summary information about the structure and experiment, atomic coordinates, and, in most cases, a citation to a corresponding scientific publication. PDB structures can be downloaded individually or in bulk, or analyzed and visualized online, using tools at RCSB.org. Because access is open and unrestricted, it is challenging to understand and monitor how the data are reused. Citations of the scientific publications describing PDB structures provide one way of understanding which structures are being used, and in which research areas. Our analysis highlights frequently cited structures and identifies milestone structures that have demonstrated impact across scientific fields.

Keywords