BioData Mining (Jul 2017)

The Dark Proteome Database

  • Nelson Perdigão,
  • Agostinho C. Rosa,
  • Seán I. O’Donoghue

DOI
https://doi.org/10.1186/s13040-017-0144-6
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 11

Abstract

Read online

Abstract Background Recently we surveyed the dark-proteome, i.e., regions of proteins never observed by experimental structure determination and inaccessible to homology modelling. Surprisingly, we found that most of the dark proteome could not be accounted for by conventional explanations (e.g., intrinsic disorder, transmembrane domains, and compositional bias), and that nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. In this paper we will present the Dark Proteome Database (DPD) and associated web services that provide access to updated information about the dark proteome. Results We assembled DPD from several external web resources (primarily Aquaria and Swiss-Prot) and stored it in a relational database currently containing ~10 million entries and occupying ~2 GBytes of disk space. This database comprises two key tables: one giving information on the ‘darkness’ of each protein, and a second table that breaks each protein into dark and non-dark regions. In addition, a second version of the database is created using also information from the Protein Model Portal (PMP) to determine darkness. To provide access to DPD, a web server has been implemented giving access to all underlying data, as well as providing access to functional analyses derived from these data. Conclusions Availability of this database and its web service will help focus future structural and computational biology efforts to study the dark proteome, thus providing a basis for understanding a wide variety of biological functions that currently remain unknown. Availability and implementation DPD is available at http://darkproteome.ws . The complete database is also available upon request. Data use is permitted via the Creative Commons Attribution-NonCommercial International license ( http://creativecommons.org/licenses/by-nc/4.0/ ).

Keywords