Frontiers in Genetics (Apr 2022)

BioTAGME: A Comprehensive Platform for Biological Knowledge Network Analysis

  • Antonio Di Maria,
  • Salvatore Alaimo,
  • Lorenzo Bellomo,
  • Fabrizio Billeci,
  • Paolo Ferragina,
  • Alfredo Ferro,
  • Alfredo Pulvirenti

DOI
https://doi.org/10.3389/fgene.2022.855739
Journal volume & issue
Vol. 13

Abstract

Inferring novel knowledge and new hypotheses from the analysis of current literature is crucial to making new scientific discoveries. In biomedicine, given the enormous amount of literature and the many knowledge bases available, the automatic acquisition of knowledge about relationships among biological elements, in the form of semantically related terms (or entities), is raising novel research challenges and corresponding applications. In this regard, we propose BioTAGME, a system that combines an entity-annotation framework based on the Wikipedia corpus (i.e., the TAGME tool) with a network-based inference methodology (i.e., DT-Hybrid). This integration aims to create an extensive Knowledge Graph that models relations among biological terms and phrases extracted from the titles and abstracts of papers available in PubMed. The framework consists of a back-end and a front-end. The back-end is implemented entirely in Scala and runs on top of a Spark cluster, which distributes the computation across several machines. The front-end is built on the Laravel framework and is connected to a Neo4j graph database that stores the knowledge graph.
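
As a rough illustration of how entity annotations on abstracts might be turned into graph edges, the sketch below counts how often two entities are annotated in the same document. It is only a toy under stated assumptions: the abstract does not say how BioTAGME weights relations, the co-occurrence count is a stand-in chosen for simplicity (it is not DT-Hybrid), and the names `annotated_docs` and `cooccurrence_edges` as well as the sample entities are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: document ID -> entities an annotator linked in its
# title and abstract. The IDs and entities are invented for illustration.
annotated_docs = {
    "doc1": ["TP53", "apoptosis", "MDM2"],
    "doc2": ["TP53", "MDM2"],
    "doc3": ["apoptosis", "caspase-3"],
}


def cooccurrence_edges(docs):
    """Count, for every unordered entity pair, the documents mentioning both.

    Assumption: co-occurrence is used here as a simple proxy for relatedness;
    it is not the scoring used by the published system.
    """
    counts = Counter()
    for entities in docs.values():
        # Deduplicate and sort so each pair has a single canonical orientation.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


edges = cooccurrence_edges(annotated_docs)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (weight {weight})")
```

Each resulting pair could then be stored as a weighted relationship between two nodes in a graph database. In a distributed setting, the same per-document pair generation would be followed by a sum over identical pairs.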

Keywords