Emergent communication of multimodal deep generative models based on Metropolis-Hastings naming game

Nguyen Le Hoang; Tadahiro Taniguchi; Yoshinobu Hagiwara; Akira Taniguchi

doi:10.3389/frobt.2023.1290604

Frontiers in Robotics and AI (Jan 2024)

Affiliations

Nguyen Le Hoang: Graduate School of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
Tadahiro Taniguchi: College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
Yoshinobu Hagiwara: Research Organization of Science and Technology, Ritsumeikan University, Kusatsu, Shiga, Japan
Akira Taniguchi: College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan

DOI: https://doi.org/10.3389/frobt.2023.1290604
Journal volume & issue: Vol. 10

Abstract

Deep generative models (DGMs) are increasingly employed in emergent communication systems, but their application to multimodal data remains limited. This study proposes a novel model that combines a multimodal DGM with the Metropolis-Hastings (MH) naming game, enabling two agents to attend jointly to a shared subject and develop a common vocabulary. The model is shown to handle multimodal data, even when modalities are missing. Integrating the MH naming game with multimodal variational autoencoders (VAEs) allows agents to form perceptual categories and exchange signs in multimodal contexts. Moreover, tuning the weight ratio to favor the modality that the model could learn and categorize more readily improved communication. Our evaluation of three multimodal approaches, mixture-of-experts (MoE), product-of-experts (PoE), and mixture-of-product-of-experts (MoPoE), suggests that the choice of approach affects the formation of latent spaces, the internal representations of agents. Our results from experiments with the MNIST + SVHN and Multimodal165 datasets indicate that combining the Gaussian mixture model (GMM), PoE multimodal VAE, and MH naming game substantially improved information sharing, knowledge formation, and data reconstruction.
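
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and nothing in it is taken from the abstract itself: it assumes one-dimensional Gaussian experts for the PoE fusion, and it uses the generic Metropolis-Hastings acceptance form, min(1, ratio), for the listener's decision on a proposed sign. The function names `poe_fuse`, `mh_accept_prob`, and `listener_update`, and the `likelihood` callback, are hypothetical.

```python
import random


def poe_fuse(mus, variances):
    """Fuse 1-D Gaussian experts N(mu_i, var_i) by multiplying their densities.

    The product of Gaussians is Gaussian: its precision is the sum of the
    expert precisions, and its mean is the precision-weighted average.
    (Assumption: any prior expert is passed in like the other experts.)
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, mus)) / total_precision
    return mean, 1.0 / total_precision


def mh_accept_prob(lik_proposed, lik_current):
    """Generic MH acceptance probability, min(1, proposed / current)."""
    if lik_current == 0.0:
        return 1.0  # any proposal is at least as good as a zero-likelihood state
    return min(1.0, lik_proposed / lik_current)


def listener_update(current_sign, proposed_sign, likelihood, rng=random):
    """Listener keeps or replaces its sign for one object.

    `likelihood(sign)` is a hypothetical callback returning how well the
    listener's own representation of the object is explained by `sign`.
    """
    accept = mh_accept_prob(likelihood(proposed_sign), likelihood(current_sign))
    return proposed_sign if rng.random() < accept else current_sign
```

For example, `poe_fuse([0.0, 2.0], [1.0, 1.0])` gives a fused mean midway between two equally confident experts with half their variance, and `listener_update` always accepts a proposal that the listener's own likelihood rates at least as highly as its current sign.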

Published in Frontiers in Robotics and AI

ISSN: 2296-9144 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Technology: Mechanical engineering and machinery; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/robotics-and-ai
