Transactions of the Association for Computational Linguistics (Jan 2021)

Measuring and Improving Consistency in Pretrained Language Models

Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, Yoav Goldberg

DOI: https://doi.org/10.1162/tacl_a_00410
Journal volume & issue: Vol. 9, pp. 1012–1031

Abstract

Consistency of a model—that is, the invariance of its behavior under meaning-preserving alternations in its input—is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel🤘, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel🤘, we show that the consistency of all PLMs we experiment with is poor—though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge robustly. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.
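To make the notion of consistency concrete, the sketch below probes a masked language model with two cloze-style paraphrases of the same fact and checks whether the top prediction is invariant. This is only an illustration of the idea, not the paper's evaluation protocol: ParaRel's actual setup restricts predictions to a relation-specific candidate set, which is omitted here, and the choice of bert-base-cased and the example prompts are assumptions for the sake of a runnable snippet.

```python
from transformers import pipeline

# Minimal sketch of consistency probing: two paraphrases of the same
# cloze query should elicit the same answer from a consistent model.
# NOTE: ParaRel's real protocol constrains predictions to a candidate
# set per relation; this sketch just compares unconstrained top-1 tokens.
unmasker = pipeline("fill-mask", model="bert-base-cased")

paraphrases = [
    "Homeland premiered on [MASK].",
    "Homeland was originally aired on [MASK].",
]

predictions = []
for prompt in paraphrases:
    # Substitute the model's own mask token (a no-op for BERT,
    # but keeps the sketch portable across masked LMs).
    text = prompt.replace("[MASK]", unmasker.tokenizer.mask_token)
    top = unmasker(text, top_k=1)[0]["token_str"].strip()
    predictions.append(top)
    print(f"{prompt!r} -> {top!r}")

# The model is consistent on this pair iff both paraphrases
# yield the same top prediction.
print("consistent:", len(set(predictions)) == 1)
```

Aggregating this pairwise check over all paraphrase pairs of all tuples in a relation yields the kind of per-relation consistency score whose high variance the abstract reports.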