Augmenting Semantic Lexicons Using Word Embeddings and Transfer Learning

Thayer Alshaabi; Colin M. Van Oort; Mikaela Irene Fudolig; Michael V. Arnold; Christopher M. Danforth; Peter Sheridan Dodds

doi:10.3389/frai.2021.783778

Frontiers in Artificial Intelligence (Jan 2022)

Affiliations

Thayer Alshaabi: Advanced Bioimaging Center, University of California, Berkeley, Berkeley, CA, United States
Thayer Alshaabi: Vermont Complex Systems Center, University of Vermont, Burlington, VT, United States
Colin M. Van Oort: Vermont Complex Systems Center, University of Vermont, Burlington, VT, United States
Colin M. Van Oort: The MITRE Corporation, McLean, VA, United States
Mikaela Irene Fudolig: Vermont Complex Systems Center, University of Vermont, Burlington, VT, United States
Michael V. Arnold: Vermont Complex Systems Center, University of Vermont, Burlington, VT, United States
Christopher M. Danforth: Vermont Complex Systems Center, University of Vermont, Burlington, VT, United States
Christopher M. Danforth: Department of Mathematics & Statistics, University of Vermont, Burlington, VT, United States
Peter Sheridan Dodds: Vermont Complex Systems Center, University of Vermont, Burlington, VT, United States
Peter Sheridan Dodds: Department of Computer Science, University of Vermont, Burlington, VT, United States

Journal volume & issue: Vol. 4

Abstract

Sentiment-aware intelligent systems are essential to a wide array of applications. These systems are driven by language models that broadly fall into two paradigms: lexicon-based and contextual. Although recent contextual models are increasingly dominant, we still see demand for lexicon-based models because of their interpretability and ease of use. For example, lexicon-based models allow researchers to readily determine which words and phrases contribute most to a change in measured sentiment. A challenge for any lexicon-based approach is that the lexicon needs to be routinely expanded with new words and expressions. Here, we propose two models for automatic lexicon expansion. Our first model establishes a baseline: a simple, shallow neural network initialized with pre-trained word embeddings, using a non-contextual approach. Our second model improves upon our baseline, featuring a deep Transformer-based network that draws on word definitions to estimate each word's lexical polarity. Our evaluation shows that both models are able to score new words with accuracy similar to that of reviewers from Amazon Mechanical Turk, but at a fraction of the cost.
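
To make the interpretability point concrete, the following is a minimal sketch of how a lexicon-based model can attribute a change in measured sentiment to individual words. It is an illustration only, not the authors' implementation: the lexicon entries, their scores, and the function names are all hypothetical, and the score of a text is assumed to be the frequency-weighted mean of the scores of its words.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> sentiment score); values are illustrative only.
TOY_LEXICON = {"love": 8.0, "happy": 8.0, "rain": 5.0, "hate": 2.0, "sad": 2.0}


def average_sentiment(tokens, lexicon):
    """Frequency-weighted mean score over the tokens found in the lexicon."""
    counts = Counter(t for t in tokens if t in lexicon)
    total = sum(counts.values())
    if total == 0:
        return None  # no scored words, so the average is undefined
    return sum(lexicon[w] * n for w, n in counts.items()) / total


def word_contributions(ref_tokens, cmp_tokens, lexicon):
    """Per-word contribution to the change in average sentiment (cmp minus ref).

    Each word contributes score * (relative frequency in cmp - relative
    frequency in ref), so the contributions sum to the overall change.
    """
    ref_counts = Counter(t for t in ref_tokens if t in lexicon)
    cmp_counts = Counter(t for t in cmp_tokens if t in lexicon)
    ref_total = sum(ref_counts.values())
    cmp_total = sum(cmp_counts.values())
    if ref_total == 0 or cmp_total == 0:
        return {}
    contributions = {}
    for word in set(ref_counts) | set(cmp_counts):
        delta_p = cmp_counts[word] / cmp_total - ref_counts[word] / ref_total
        contributions[word] = lexicon[word] * delta_p
    # Largest absolute contributions first.
    return dict(sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True))


reference = "i love the rain".split()
comparison = "i hate the rain".split()
contribs = word_contributions(reference, comparison, TOY_LEXICON)
for word, value in contribs.items():
    print(f"{word}: {value:+.2f}")
```

Words missing from the lexicon ("i" and "the" here) are simply ignored, which is why coverage matters and why the lexicon needs to be expanded as new words and expressions appear.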

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
