Unsupervised Heterogeneous Graph Neural Networks for One-Class Tasks: Exploring Early Fusion Operators

Marcos Paulo Silva Gôlo; Marcelo Isaias de Moraes Junior; Rudinei Goularte; Ricardo Marcondes Marcacini

doi:10.5753/jis.2024.4109

Journal on Interactive Systems (May 2024)

Affiliations

Marcos Paulo Silva Gôlo: University of São Paulo
Marcelo Isaias de Moraes Junior: University of São Paulo
Rudinei Goularte: University of São Paulo
Ricardo Marcondes Marcacini: University of São Paulo

DOI: https://doi.org/10.5753/jis.2024.4109
Journal volume & issue: Vol. 15, no. 1

Abstract

Heterogeneous graphs are an essential structure for modeling real-world data through different types of nodes and the relationships between them, including multimodal data, which comprises different data types such as text, image, and audio. Graph Neural Networks (GNNs) are a prominent graph representation learning method that exploits the graph structure and its attributes; when applied to a multimodal heterogeneous graph, they learn a single semantic space shared by the different modalities. Consequently, they allow multimodal fusion through simple operators such as sum, average, or multiplication, generating unified representations that consider the supplementary and complementary relationships between the modalities. In multimodal heterogeneous graphs, the labeling process tends to be even more costly because multiple modalities must be analyzed, in addition to the class imbalance inherent to some applications. To overcome these problems in applications that comprise a class of interest, One-Class Learning (OCL) is used. Given the lack of studies on multimodal early fusion in heterogeneous graphs for OCL tasks, we proposed a method based on unsupervised GNNs for heterogeneous graphs and evaluated different early fusion operators. In this paper, we extend that earlier work by evaluating the behavior of the main GNN convolutions in the method. We highlight that average, addition, and subtraction were the best early fusion operators. In addition, GNN layers that do not use an attention mechanism performed better. We therefore argue for multimodal heterogeneous graph neural networks that use simple early fusion operators instead of the commonly used concatenation, together with less complex convolutions.
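
To make the idea of simple early fusion operators concrete, the following is a minimal sketch, not the authors' implementation. It assumes two modality embeddings (here called text_embedding and image_embedding, both hypothetical names with made-up values) that already live in the same semantic space and have the same dimension, and it compares the element-wise operators named in the abstract with concatenation.

```python
from typing import Callable, Dict, List

Vector = List[float]


def _check_same_length(a: Vector, b: Vector) -> None:
    # Element-wise operators only make sense when both modalities
    # share one embedding space of the same dimension (an assumption here).
    if len(a) != len(b):
        raise ValueError("element-wise fusion needs vectors of equal length")


def fuse_sum(a: Vector, b: Vector) -> Vector:
    _check_same_length(a, b)
    return [x + y for x, y in zip(a, b)]


def fuse_average(a: Vector, b: Vector) -> Vector:
    _check_same_length(a, b)
    return [(x + y) / 2 for x, y in zip(a, b)]


def fuse_subtract(a: Vector, b: Vector) -> Vector:
    _check_same_length(a, b)
    return [x - y for x, y in zip(a, b)]


def fuse_multiply(a: Vector, b: Vector) -> Vector:
    _check_same_length(a, b)
    return [x * y for x, y in zip(a, b)]


def fuse_concat(a: Vector, b: Vector) -> Vector:
    # Concatenation keeps both vectors, so the fused dimension grows.
    return a + b


# Hypothetical registry of operators, keyed by illustrative names.
FUSION_OPERATORS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "sum": fuse_sum,
    "average": fuse_average,
    "subtract": fuse_subtract,
    "multiply": fuse_multiply,
    "concat": fuse_concat,
}

# Illustrative values only; they do not come from the paper.
text_embedding: Vector = [0.5, 1.0, -2.0]
image_embedding: Vector = [1.5, 3.0, 2.0]

fused = {
    name: op(text_embedding, image_embedding)
    for name, op in FUSION_OPERATORS.items()
}

for name, vector in fused.items():
    print(f"{name:>8}: dim={len(vector)} values={vector}")
```

In this sketch the element-wise operators keep the fused representation at the original dimension, while concatenation doubles it, which is one practical difference between the two families of operators.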

Published in Journal on Interactive Systems

ISSN: 2763-7719 (Online)
Publisher: Brazilian Computer Society
Country of publisher: Brazil
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software; Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware
Website: https://sol.sbc.org.br/journals/index.php/jis/
