Explaining pretrained language models' understanding of linguistic structures using construction grammar

Leonie Weissweiler; Valentin Hofmann; Abdullatif Köksal; Hinrich Schütze

doi:10.3389/frai.2023.1225791

Frontiers in Artificial Intelligence (Oct 2023)

Affiliations

Leonie Weissweiler: Center for Information and Language Processing, LMU Munich, Munich, Germany
Leonie Weissweiler: Munich Center for Machine Learning, Munich, Germany
Valentin Hofmann: Center for Information and Language Processing, LMU Munich, Munich, Germany
Valentin Hofmann: Faculty of Linguistics, University of Oxford, Oxford, United Kingdom
Abdullatif Köksal: Center for Information and Language Processing, LMU Munich, Munich, Germany
Abdullatif Köksal: Munich Center for Machine Learning, Munich, Germany
Hinrich Schütze: Center for Information and Language Processing, LMU Munich, Munich, Germany
Hinrich Schütze: Munich Center for Machine Learning, Munich, Germany

Journal volume & issue: Vol. 6

Abstract

Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasizing the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step toward assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). We conduct experiments examining the classification accuracy of a syntactic probe on the one hand and the models' behavior in a semantic application task on the other, with BERT, RoBERTa, and DeBERTa as the example PLMs. Our results show that all three investigated PLMs, as well as OPT, are able to recognize the structure of the CC but fail to use its meaning. While human-like performance of PLMs on many NLP tasks has been alleged, this indicates that PLMs still suffer from substantial shortcomings in central domains of linguistic knowledge.

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
