Unsupervised Event Graph Representation and Similarity Learning on Biomedical Literature

Giacomo Frisoni; Gianluca Moro; Giulio Carlassare; Antonella Carbonaro

doi:10.3390/s22010003

Sensors (Dec 2021)

Affiliations

Giacomo Frisoni: Department of Computer Science and Engineering (DISI), University of Bologna, 40126 Bologna, Italy
Gianluca Moro: Department of Computer Science and Engineering (DISI), University of Bologna, 40126 Bologna, Italy
Giulio Carlassare: Independent Researcher, 48018 Faenza, Italy
Antonella Carbonaro: Department of Computer Science and Engineering (DISI), University of Bologna, 40126 Bologna, Italy

DOI: https://doi.org/10.3390/s22010003
Journal volume & issue: Vol. 22, no. 1, p. 3

Abstract

The automatic extraction of biomedical events from the scientific literature has drawn keen interest in the last several years, recognizing complex and semantically rich graphical interactions otherwise buried in texts. However, very few works revolve around learning embeddings or similarity metrics for event graphs. This gap leaves biological relations unlinked and prevents the application of machine learning techniques to promote discoveries. Taking advantage of recent deep graph kernel solutions and pre-trained language models, we propose Deep Divergence Event Graph Kernels (DDEGK), an unsupervised inductive method to map events into low-dimensional vectors, preserving their structural and semantic similarities. Unlike most other systems, DDEGK operates at a graph level and does not require task-specific labels, feature engineering, or known correspondences between nodes. To this end, our solution compares events against a small set of anchor ones, trains cross-graph attention networks for drawing pairwise alignments (bolstering interpretability), and employs transformer-based models to encode continuous attributes. Extensive experiments have been done on nine biomedical datasets. We show that our learned event representations can be effectively employed in tasks such as graph classification, clustering, and visualization, also facilitating downstream semantic textual similarity. Empirical results demonstrate that DDEGK significantly outperforms other state-of-the-art methods.
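The anchor-based idea in the abstract, representing each event graph by how much it diverges from a small set of anchor graphs, can be illustrated with a minimal sketch. This is not the DDEGK implementation: the paper learns divergences with cross-graph attention networks and transformer-encoded attributes, whereas the sketch below swaps in a plain Jensen-Shannon divergence between node-label histograms as a stand-in. The function names, the event labels, and the representation of a graph as a flat list of node labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def label_distribution(labels, vocabulary):
    """Normalized histogram of node labels over a fixed vocabulary."""
    if not labels:
        raise ValueError("a graph needs at least one node label")
    counts = Counter(labels)
    total = sum(counts.values())
    return [counts[v] / total for v in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def embed(graph_labels, anchors, vocabulary):
    """Embed a graph as its vector of divergences from each anchor graph.

    The embedding has one dimension per anchor. Stand-in only: DDEGK
    learns these scores instead of comparing label histograms.
    """
    p = label_distribution(graph_labels, vocabulary)
    return [js_divergence(p, label_distribution(a, vocabulary)) for a in anchors]


# Hypothetical event graphs, reduced to their node labels.
anchors = [
    ["Gene_expression", "Protein"],
    ["Phosphorylation", "Protein", "Protein"],
]
vocabulary = sorted({label for anchor in anchors for label in anchor})

event = ["Gene_expression", "Protein"]
embedding = embed(event, anchors, vocabulary)
```

The resulting vector has as many dimensions as there are anchors, so a small anchor set yields a low-dimensional representation that can be passed to an ordinary classifier or clustering routine. Unlike the learned method the abstract describes, this histogram stand-in ignores edges and continuous text attributes entirely.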

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
