IEEE Access (Jan 2021)

LETS: A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model

  • Heereen Shim
  • Dietwig Lowet
  • Stijn Luca
  • Bart Vanrumste

DOI: https://doi.org/10.1109/ACCESS.2021.3101867
Journal volume & issue: Vol. 9, pp. 115563–115578

Abstract

Recently proposed pre-trained language models can be easily fine-tuned for a wide range of downstream tasks. However, fine-tuning requires a large-scale labelled task-specific dataset, creating a bottleneck in the development of machine learning applications. To foster fast development by reducing manual labelling efforts, we propose a Label-Efficient Training Scheme (LETS). LETS consists of three elements: (i) task-specific pre-training to exploit unlabelled task-specific corpus data, (ii) label augmentation to maximise the utility of labelled data, and (iii) active learning to label data strategically. In this paper, we apply LETS to a novel aspect-based sentiment analysis (ABSA) use case: analysing reviews of a health-related program that supports people in improving their sleep quality. We validate LETS on a custom dataset of health-related program reviews and on another ABSA benchmark dataset. Experimental results show that LETS reduces manual labelling efforts by a factor of 2-3 compared to labelling with random sampling on both datasets, and that it outperforms other state-of-the-art active learning methods. Furthermore, thanks to the task-specific pre-training and the proposed label augmentation, LETS achieves better generalisability on both datasets than competing methods. We expect this work to contribute to the natural language processing (NLP) domain by addressing the high cost of manually labelling data, and to the healthcare domain by introducing a new potential application of NLP techniques.
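To make the active-learning element of the abstract concrete, the sketch below shows a generic uncertainty-sampling loop of the kind such schemes typically use: fine-tune a model, query a human annotator for labels on the pool examples the model is least certain about, and repeat. This is an illustrative assumption, not the paper's exact method; the `model` interface (`fit`/`predict_proba`) and the `oracle_label` callback are hypothetical placeholders.

```python
import numpy as np

def uncertainty_sampling(probs: np.ndarray, k: int) -> np.ndarray:
    """Pick the k pool examples whose predicted class distribution
    has the highest entropy, i.e. where the model is least certain."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(entropy)[-k:]

def active_learning_loop(model, pool_texts, oracle_label, rounds=5, k=50):
    """Hypothetical LETS-style loop (a sketch, not the paper's code).

    `model` is assumed to expose fit(texts, labels) and
    predict_proba(texts); `oracle_label` stands in for the human
    annotator who labels each queried example.
    """
    labelled_texts, labelled_labels = [], []
    pool = list(pool_texts)
    for _ in range(rounds):
        probs = model.predict_proba(pool)        # shape: (len(pool), n_classes)
        picked = uncertainty_sampling(probs, k)
        for i in sorted(picked, reverse=True):   # pop from the back so indices stay valid
            text = pool.pop(i)
            labelled_texts.append(text)
            labelled_labels.append(oracle_label(text))  # manual annotation step
        model.fit(labelled_texts, labelled_labels)      # fine-tune on all labels so far
    return model, labelled_texts, labelled_labels
```

Under this reading, the label-efficiency gain comes from spending each annotation on an example the current model finds informative rather than on a randomly sampled one, which is the random-sampling baseline the abstract compares against.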

Keywords