Machine Learning and Knowledge Extraction (Dec 2022)

Multimodal AutoML via Representation Evolution

  • Blaž Škrlj,
  • Matej Bevec,
  • Nada Lavrač

DOI
https://doi.org/10.3390/make5010001
Journal volume & issue
Vol. 5, no. 1
pp. 1 – 13

Abstract

As the amount of available data grows, learning simultaneously from different types of inputs is becoming necessary to obtain robust, well-performing models. With the advent of representation learning in recent years, lower-dimensional vector-based representations have become available for both images and texts, yet automating simultaneous learning from multiple modalities remains a challenging problem. This paper presents an AutoML (automated machine learning) approach that automatically identifies model configurations for data composed of two modalities: texts and images. The approach is based on the idea of representation evolution, the process of automatically amplifying heterogeneous representations across several modalities, optimized jointly with a collection of fast, well-regularized linear models. The proposed approach is benchmarked against 11 unimodal and multimodal (texts and images) approaches on four real-life benchmark datasets from different domains. It achieves competitive performance with minimal human effort and low computing requirements, enabling learning from multiple modalities in an automated manner for a wider community of researchers.
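
The idea of evolving how strongly each representation contributes, jointly with a fast regularized linear model, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`fit_ridge`, `combine`, `evolve_weights`), the mutation scale, the closed-form ridge classifier, and the simple hill-climbing search are all assumptions chosen to keep the example small. Each modality is assumed to arrive as a precomputed feature matrix (for example, one text embedding block and one image embedding block).

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on +1/-1 targets (a fast regularized linear model)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def accuracy(w, X, y):
    """Fraction of examples whose predicted sign matches the label."""
    return float(np.mean(np.sign(X @ w) == y))


def combine(blocks, weights):
    """Scale each modality's feature block by its weight, then concatenate."""
    return np.hstack([wt * b for wt, b in zip(weights, blocks)])


def evolve_weights(train_blocks, y_train, val_blocks, y_val,
                   generations=20, population=8, sigma=0.3, seed=0):
    """Hill-climbing search over non-negative per-representation weights.

    Starts from equal weights and keeps a mutated candidate only when it
    improves validation accuracy (hypothetical stand-in for the evolution step).
    """
    rng = np.random.default_rng(seed)

    def score(weights):
        model = fit_ridge(combine(train_blocks, weights), y_train)
        return accuracy(model, combine(val_blocks, weights), y_val)

    best_w = np.ones(len(train_blocks))
    best_score = score(best_w)
    for _ in range(generations):
        for _ in range(population):
            # Mutate the current best weights; clip so no block gets a negative scale.
            cand = np.clip(best_w + rng.normal(0.0, sigma, best_w.shape), 0.0, None)
            cand_score = score(cand)
            if cand_score > best_score:
                best_w, best_score = cand, cand_score
    return best_w, best_score
```

Because ridge regularization penalizes all coefficients equally, rescaling a feature block changes how much the model can rely on it, so the search can amplify the representations that help and shrink those that do not. A call such as `evolve_weights([text_train, image_train], y_train, [text_val, image_val], y_val)` returns the selected weights and the validation accuracy they reach.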

Keywords