Mathematical Biosciences and Engineering (Jun 2022)

Matching biomedical ontologies with GCN-based feature propagation

  • Peng Wang,
  • Shiyi Zou ,
  • Jiajun Liu,
  • Wenjun Ke

DOI
https://doi.org/10.3934/mbe.2022394
Journal volume & issue
Vol. 19, no. 8
pp. 8479 – 8504

Abstract

Read online

With an increasing number of biomedical ontologies being evolved independently, matching these ontologies to solve the interoperability problem has become a critical issue in biomedical applications. Traditional biomedical ontology matching methods are mostly based on rules or similarities for concepts and properties. These approaches require manually designed rules that not only fail to address the heterogeneity of domain ontology terminology and the ambiguity of multiple meanings of words, but also make it difficult to capture structural information in ontologies that contain a large amount of semantics during matching. Recently, various knowledge graph (KG) embedding techniques utilizing deep learning methods to deal with the heterogeneity in knowledge graphs (KGs), have quickly gained massive attention. However, KG embedding focuses mainly on entity alignment (EA). EA tasks and ontology matching (OM) tasks differ dramatically in terms of matching elements, semantic information and application scenarios, etc., hence these methods cannot be applied directly to biomedical ontologies that contain abstract concepts but almost no entities. To tackle these issues, this paper proposes a novel approach called BioOntGCN that directly learns embeddings of ontology-pairs for biomedical ontology matching. Specifically, we first generate a pair-wise connectivity graph (PCG) of two ontologies, whose nodes are concept-pairs and edges correspond to property-pairs. Subsequently, we learn node embeddings of the PCG to predicate the matching results through following phases: 1) A convolutional neural network (CNN) to extract the similarity feature vectors of nodes; 2) A graph convolutional network (GCN) to propagate the similarity features and obtain the final embeddings of concept-pairs. Consequently, the biomedical ontology matching problem is transformed into a binary classification problem. We conduct systematic experiments on real-world biomedical ontologies in Ontology Alignment Evaluation Initiative (OAEI), and the results show that our approach significantly outperforms other entity alignment methods and achieves state-of-the-art performance. This indicates that BioOntGCN is more applicable to ontology matching than the EA method. At the same time, BioOntGCN substantially achieves superior performance compared with previous ontology matching (OM) systems, which suggests that BioOntGCN based on the representation learning is more effective than the traditional approaches.

Keywords