Automated Audio Captioning With Topic Modeling

Aysegul Ozkaya Eren; Mustafa Sert

doi:10.1109/ACCESS.2023.3235733

IEEE Access (Jan 2023)

Affiliations

Aysegul Ozkaya Eren: Department of Computer Engineering, Başkent University, Ankara, Turkey
Mustafa Sert: Department of Computer Engineering, Başkent University, Ankara, Turkey

Journal volume & issue: Vol. 11, pp. 4983–4991

Abstract

Automatic audio captioning (AAC) is an important area of research aimed at generating meaningful descriptions for audio clips. Most existing methods use relevant semantic information to improve AAC performance and have demonstrated the feasibility of semantic information extraction. Audio events and keywords are commonly used for this purpose. Unlike previous studies, this study uses topic modeling to obtain relevant semantic content, since topic models explore the main themes of documents. To this end, we present a framework that integrates audio embeddings with audio topics in a transformer-based encoder-decoder architecture. First, we represent each audio clip with a set of topics using a pre-trained topic model, BERTopic. Then, we design a multilayer perceptron (MLP)-based multi-label classifier to predict the topics of audio clips in the testing phase. Finally, we feed the audio embeddings and the extracted topics into the transformer model to generate captions. The results show that the proposed model improves performance and is competitive with the most advanced methods that use additional external data for training. We believe that topic modeling can be used to extract semantic content in the AAC task.
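
As a rough illustration of the multi-label topic step described in the abstract, the sketch below encodes a clip's topic set as a multi-hot target, selects topics from classifier scores, and combines the result with an audio embedding. It is not the authors' implementation: the function names, the 0.5 threshold, and the use of simple concatenation to combine embedding and topics are all assumptions made for illustration, and the abstract does not specify how the two inputs are fused.

```python
import math


def multi_hot(topic_ids, num_topics):
    """Encode a clip's topic ids as a 0/1 vector of length num_topics."""
    vec = [0] * num_topics
    for t in topic_ids:
        vec[t] = 1
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_topics(logits, threshold=0.5):
    """Multi-label decision: keep every topic whose sigmoid score
    reaches the threshold (the threshold value is an assumption)."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


def fuse(audio_embedding, topic_vector):
    """Hypothetical fusion: concatenate the audio embedding with the
    topic vector before passing it to a captioning model."""
    return list(audio_embedding) + list(topic_vector)


# Training-time target for a clip assigned topics 0 and 2 out of 4.
target = multi_hot([0, 2], num_topics=4)

# Test-time: made-up classifier outputs (logits) for one clip.
predicted = predict_topics([2.0, -1.0, 0.0, -3.0])
model_input = fuse([0.1, 0.2], multi_hot(predicted, num_topics=4))
```

Each topic is decided independently here, which is what distinguishes a multi-label classifier from a single-label (softmax) one: a clip may receive several topics or none.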

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
