Frontiers in Bioinformatics (Dec 2023)

RCSB Protein Data Bank: visualizing groups of experimentally determined PDB structures alongside computed structure models of proteins

  • Joan Segura,
  • Yana Rose,
  • Chunxiao Bi,
  • Jose Duarte,
  • Stephen K. Burley,
  • Sebastian Bittrich

DOI
https://doi.org/10.3389/fbinf.2023.1311287
Journal volume & issue
Vol. 3

Abstract

Recent advances in Artificial Intelligence and Machine Learning (e.g., AlphaFold, RosettaFold, and ESMFold) enable prediction of three-dimensional (3D) protein structures from amino acid sequences alone at accuracies comparable to lower-resolution experimental methods. These tools have been employed to predict structures across entire proteomes and for the results of large-scale metagenomic sequence studies, yielding an exponential increase in available biomolecular 3D structural information. Given the enormous volume of this newly computed biostructure data, there is an urgent need for robust tools to manage, search, cluster, and visualize large collections of structures. Equally important is the capability to efficiently summarize and visualize metadata, biological/biochemical annotations, and structural features, particularly when working with vast numbers of protein structures, both experimentally determined structures from the Protein Data Bank (PDB) and computationally predicted models. Moreover, researchers require advanced visualization techniques that support interactive exploration of multiple sequences and structural alignments. This paper introduces a suite of tools provided on the RCSB PDB research-focused web portal RCSB.org, tailor-made for efficient management, search, organization, and visualization of this burgeoning corpus of 3D macromolecular structure data.

Keywords