3DCONS-DB: A Database of Position-Specific Scoring Matrices in Protein Structures
Ruben Sanchez-Garcia,
Carlos Oscar Sanchez Sorzano,
Jose Maria Carazo,
Joan Segura
Affiliations
Ruben Sanchez-Garcia
GN7 of the Spanish National Institute for Bioinformatics (INB) and Biocomputing Unit, National Center of Biotechnology (CSIC)/Instruct Image Processing Center, 28049 Madrid, Spain
Carlos Oscar Sanchez Sorzano
GN7 of the Spanish National Institute for Bioinformatics (INB) and Biocomputing Unit, National Center of Biotechnology (CSIC)/Instruct Image Processing Center, 28049 Madrid, Spain
Jose Maria Carazo
GN7 of the Spanish National Institute for Bioinformatics (INB) and Biocomputing Unit, National Center of Biotechnology (CSIC)/Instruct Image Processing Center, 28049 Madrid, Spain
Joan Segura
GN7 of the Spanish National Institute for Bioinformatics (INB) and Biocomputing Unit, National Center of Biotechnology (CSIC)/Instruct Image Processing Center, 28049 Madrid, Spain
Many studies have used position-specific scoring matrix (PSSM) profiles to characterize residues in protein structures and to predict a broad range of protein features. Moreover, PSSM profiles of Protein Data Bank (PDB) entries have been recalculated in many works for different purposes. Although the computational cost of calculating a single PSSM profile is affordable, many statistical studies and machine learning-based methods use thousands of profiles to achieve their goals, which substantially increases the total computational cost. In this work, we present a new database that compiles PSSM profiles for the proteins of the PDB. Currently, the database contains 333,532 protein chain profiles from 123,135 different PDB entries.