MeaningBERT: assessing meaning preservation between sentences

David Beauchemin; Horacio Saggion; Richard Khoury

doi:10.3389/frai.2023.1223924

Frontiers in Artificial Intelligence (Sep 2023)

Affiliations

David Beauchemin: Group for Research in Artificial Intelligence of Laval University, Department of Computer Science and Software Engineering, Université Laval, Québec, QC, Canada
Horacio Saggion: Large Scale Text Understanding System Lab, Natural Language Processing Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
Richard Khoury: Group for Research in Artificial Intelligence of Laval University, Department of Computer Science and Software Engineering, Université Laval, Québec, QC, Canada

Journal volume & issue: Vol. 6

Abstract

In automatic text simplification, assessing whether the meaning of the original text has been preserved during simplification is of paramount importance. Metrics based on n-gram overlap may struggle with simplifications that replace complex phrases with simpler paraphrases. Evaluation metrics for meaning preservation based on large language models (LLMs) have been proposed, such as BertScore in machine translation and QuestEval in summarization. However, none correlates strongly with human judgment of meaning preservation. Moreover, such metrics have not been assessed in the context of text simplification research. In this study, we present a meta-evaluation of several metrics that we apply to measure content similarity in text simplification. We also show that these metrics are unable to pass two trivial, inexpensive content preservation tests. Another contribution of this study is MeaningBERT (https://github.com/GRAAL-Research/MeaningBERT), a new trainable metric designed to assess meaning preservation between two sentences in text simplification, and we show how it correlates with human judgment. To demonstrate its quality and versatility, we also present a compilation of datasets used to assess meaning preservation and benchmark MeaningBERT against a large selection of popular metrics.
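
The limitation of n-gram overlap mentioned in the abstract can be illustrated with a minimal sketch. The function `ngram_overlap_f1` and the example sentences are hypothetical and written only for illustration; this is a generic overlap score, not one of the metrics evaluated in the study and not MeaningBERT itself.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_overlap_f1(source, candidate, n=1):
    """Hypothetical overlap score: F1 over n-grams shared by two sentences.

    Tokenization is a plain lowercase whitespace split (an assumption made
    to keep the sketch self-contained).
    """
    src = ngrams(source.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    if not src or not cand:
        return 0.0
    matched = sum((src & cand).values())  # clipped count of shared n-grams
    if matched == 0:
        return 0.0
    precision = matched / sum(cand.values())
    recall = matched / sum(src.values())
    return 2 * precision * recall / (precision + recall)


source = "the physician administered the medication"
paraphrase = "the doctor gave the medicine"          # same meaning, simpler words
altered = "the physician refused the medication"     # different meaning, similar words

identical_score = ngram_overlap_f1(source, source)
paraphrase_score = ngram_overlap_f1(source, paraphrase)
altered_score = ngram_overlap_f1(source, altered)

print(identical_score, paraphrase_score, altered_score)
```

In this toy case the meaning-preserving paraphrase shares only the two occurrences of "the" with the source, while the sentence that changes the meaning shares four of five tokens, so the overlap score ranks the altered sentence above the faithful paraphrase.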

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
