IEEE Access (Jan 2021)

Towards a Shared, Conceptual Model-Based Understanding of Proteins and Their Interactions

  • Ana Leon
  • Oscar Pastor

DOI
https://doi.org/10.1109/ACCESS.2021.3080040
Journal volume & issue
Vol. 9
pp. 73608 – 73623

Abstract

Understanding the human genome is a major research challenge. The complexity and volume of genome data require highly effective and efficient data management policies. A crucial first step is to obtain a shared understanding of the domain, which is a very hard task given the number of different genome data sources. To complicate matters, those data sources deal with different parts of genome-based information: we need not only to understand them well, but also to integrate and interconnect all the relevant information. The protein perspective is a good example: rich, well-known repositories such as UniProt provide a lot of valuable information that is not easy to interpret and manage when we want to generate useful results. Proteomes and basic information, protein-protein interaction, protein structure, protein processing events, protein function, etc. provide a lot of information that needs to be conceptually characterized and delimited. To facilitate the essential common understanding of the domain, this paper uses the case of proteins to analyze the data provided by UniProt and to carry out a sound conceptualization that identifies the relevant domain concepts. The result of this conceptualization process, explained in detail in this work, is a conceptual model of proteins. This holistic conceptual model reflects a precise ontological commitment. It establishes the significant concepts and their relationships, providing a solid basis for efficiently managing relevant genome data related to proteins.

Keywords