LMGAN: Linguistically Informed Semi-Supervised GAN with Multiple Generators

Whanhee Cho; Yongsuk Choi

doi:10.3390/s22228761

Sensors (Nov 2022)

Affiliations

Whanhee Cho: Department of Computer Science, Hanyang University, Seoul 04763, Korea
Yongsuk Choi: Department of Computer Science, Hanyang University, Seoul 04763, Korea

Journal volume & issue: Vol. 22, no. 22, p. 8761

Abstract

Semi-supervised learning is an active research topic. One earlier approach addresses semi-supervised text classification with a generative adversarial network (GAN). However, its generator is limited in its ability to produce a fake data distribution that resembles the real data distribution: because the real data distribution changes frequently, the generator cannot create adequate fake data. To overcome this problem, we present LMGAN (Linguistically Informed SeMi-Supervised GAN with Multiple Generators), a novel GAN-based approach to semi-supervised text classification. LMGAN builds on a trained bidirectional encoder representations from transformers (BERT) model and the discriminator from GAN-BERT. In addition, LMGAN has multiple generators and uses the hidden layers of BERT. To reduce the discrepancy between the fake and real data distributions, LMGAN uses a fine-tuned BERT together with the GAN-BERT discriminator. However, since injecting a fine-tuned BERT could induce an incorrect fake data distribution, we use linguistically meaningful intermediate hidden layer outputs of BERT to enrich the fake data distribution. Our model produces well-distributed fake data, whereas the earlier GAN-based approach failed to generate adequate high-quality fake data. Moreover, with extremely limited amounts of labeled data, our model achieves performance gains of up to 20.0% over the baseline GAN-based model.
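The abstract names the building blocks (several generators, a discriminator, encoder representations) without giving details, so the following is only a toy sketch of how such pieces could fit together, not the authors' implementation. It assumes a discriminator that outputs k real classes plus one extra "fake" class, which is how GAN-BERT-style semi-supervised GANs are commonly described. The layer sizes, the number of generators, the random stand-in for a BERT hidden layer output, and all function names are hypothetical.

```python
import math
import random

HIDDEN = 8        # toy representation size (assumption; a real encoder is far larger)
NUM_CLASSES = 3   # k real classes (assumption); index k is the extra "fake" class
NUM_GENERATORS = 3  # assumption; the abstract only says "multiple"


def make_linear(n_in, n_out, rng):
    """Create a toy linear layer: random weights and a zero bias."""
    weights = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(x, layer):
    """Apply a linear layer to the vector x."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def generate(gen, noise):
    """One generator: noise vector -> fake representation squashed into [-1, 1]."""
    return [math.tanh(v) for v in linear(noise, gen)]


def discriminate(disc, rep):
    """Discriminator: representation -> probabilities over k + 1 classes."""
    return softmax(linear(rep, disc))


def d_loss_unsupervised(disc, real, fakes):
    """Unsupervised discriminator loss: real should not look fake, fakes should."""
    loss = -math.log(1.0 - discriminate(disc, real)[-1])
    for fake in fakes:
        loss += -math.log(discriminate(disc, fake)[-1]) / len(fakes)
    return loss


rng = random.Random(0)
generators = [make_linear(HIDDEN, HIDDEN, rng) for _ in range(NUM_GENERATORS)]
discriminator = make_linear(HIDDEN, NUM_CLASSES + 1, rng)

# Random stand-in for an encoder hidden layer output on a real sentence.
real_rep = [rng.uniform(-1, 1) for _ in range(HIDDEN)]

# Each generator produces its own fake representation from fresh noise.
fake_reps = [
    generate(g, [rng.gauss(0, 1) for _ in range(HIDDEN)]) for g in generators
]

print(d_loss_unsupervised(discriminator, real_rep, fake_reps))
```

Using several generators means the discriminator sees fake samples drawn from more than one learned mapping, which is one plausible way to read the abstract's goal of a better-spread fake data distribution. No training loop is included here.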

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
