Multi-Domain Aspect Extraction Using Bidirectional Encoder Representations From Transformers

Brucce Neves Dos Santos; Ricardo Marcondes Marcacini; Solange Oliveira Rezende

doi:10.1109/ACCESS.2021.3089099

IEEE Access (Jan 2021)

Affiliations

Brucce Neves Dos Santos: ORCiD; Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil
Ricardo Marcondes Marcacini: ORCiD; Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil
Solange Oliveira Rezende: Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil

DOI: https://doi.org/10.1109/ACCESS.2021.3089099
Journal volume & issue: Vol. 9, pp. 91604–91613

Abstract

Deep learning and neural language models have obtained state-of-the-art results in aspect extraction tasks, in which the objective is to automatically extract characteristics of products and services that are the target of consumer opinion. However, these methods require a large amount of labeled data to achieve such results. Since data labeling is costly, labeled data are not available for all domains. In this paper, we propose an approach for aspect extraction in a multi-domain transfer learning scenario, leveraging labeled data from different source domains to extract aspects from a new, unlabeled target domain. Our approach, called MDAE-BERT (Multi-Domain Aspect Extraction using Bidirectional Encoder Representations from Transformers), explores neural language models to deal with two major challenges in multi-domain learning: (1) inconsistency of aspects between target and source domains and (2) context-based semantic distance between ambiguous aspects. We evaluated MDAE-BERT from two perspectives: (1) aspect extraction performance, using F1-Macro and Accuracy measures; and (2) a comparison between multi-domain and single-domain aspect extraction models. In the first perspective, our method outperforms the LSTM-based approach. In the second perspective, our approach proved to be a competitive alternative to the single-domain model trained in a specific domain, even in the absence of labeled data from the target domain.
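
The evaluation setting described in the abstract can be illustrated with a minimal sketch. It only shows how labeled data from several source domains might be pooled while one target domain is held out, and how Accuracy and F1-Macro are computed over token labels. It does not reproduce MDAE-BERT itself. The domain names, the two-label scheme ("ASP" for aspect tokens, "O" for everything else), and the helper names are assumptions made for illustration and are not taken from the paper.

```python
def leave_one_domain_out(datasets, target):
    """Pool examples from every domain except `target` (hypothetical helper).

    `datasets` maps a domain name to a list of (tokens, labels) pairs.
    Returns (pooled source examples, target examples).
    """
    source = [ex for dom, exs in datasets.items() if dom != target for ex in exs]
    return source, list(datasets[target])


def accuracy(gold, pred):
    """Fraction of token labels predicted correctly."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 scores."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Toy data; the domains and sentences are invented for this example.
datasets = {
    "laptop": [(["battery", "is", "great"], ["ASP", "O", "O"])],
    "restaurant": [(["tasty", "pizza"], ["O", "ASP"])],
    "hotel": [(["the", "room", "was", "clean"], ["O", "ASP", "O", "O"])],
}

source, target = leave_one_domain_out(datasets, "hotel")

# Stand-in predictions for the single target sentence (not model output).
gold = target[0][1]
pred = ["O", "ASP", "ASP", "O"]
print(accuracy(gold, pred), macro_f1(gold, pred))
```

In this setting the target-domain labels are used only for scoring; a model would be trained on `source` alone, which mirrors the absence of labeled target data mentioned in the abstract.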

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
