Evaluating Text Generation Model Performance by Combining Semantic Meaning and Word Order

Erik Novak; Luka Bizjak; Dunja Mladenic; Marko Grobelnik

doi:10.1109/ACCESS.2024.3426082

IEEE Access (Jan 2024)

Affiliations

Erik Novak: Department for Artificial Intelligence, Jožef Stefan Institute, Ljubljana, Slovenia
Luka Bizjak: Département de Mathématique, Université Libre de Bruxelles, Brussels, Belgium
Dunja Mladenic: Department for Artificial Intelligence, Jožef Stefan Institute, Ljubljana, Slovenia
Marko Grobelnik: Department for Artificial Intelligence, Jožef Stefan Institute, Ljubljana, Slovenia

DOI: https://doi.org/10.1109/ACCESS.2024.3426082
Journal volume & issue: Vol. 12
pp. 95265 – 95277

Abstract

Modern text generation metrics use semantic representations of words to assess the quality of a text generation model without considering the fluency of the generated text. This paper proposes a novel text generation metric that combines adequacy and fluency to measure the quality of the generated text. When computing the final score using optimal transport, the metric considers semantic meaning and word order. We evaluate the metric on text translation data sets consisting of 20 language pairs from various language families and scripts. Using a novel statistic for measuring word order sensitivity, we analyze its adequacy-based performance using Pearson’s $r$ and Kendall’s $\tau$ correlation coefficients and their sensitivity to fluency-related modifications. Results show that the proposed metric is the most sensitive to fluency-related changes among all top-performing embedding-based metrics, which were found to be relatively invariant to variations in word order. The proposed metric’s overall adequacy-based performance is lower than the best embedding-based metric but higher than the n-gram matching metrics. Our code is publicly available on GitHub (https://github.com/eriknovak/metric-OPWScore) under the BSD-2-Clause license.
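
As an illustration of the two correlation coefficients named in the abstract, the sketch below computes Pearson's r and Kendall's tau between a list of metric scores and a list of human judgments. It is not taken from the authors' repository: the function names and the example values are hypothetical, and the tau shown is the plain tau-a variant, which may differ from the variant used in the paper.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for illustration only (not data from the paper).
metric_scores = [0.10, 0.40, 0.35, 0.80]
human_scores = [1.0, 3.0, 2.0, 4.0]

print("Pearson r:", pearson_r(metric_scores, human_scores))
print("Kendall tau:", kendall_tau(metric_scores, human_scores))
```

Pearson's r measures linear agreement between the two score lists, while Kendall's tau only looks at whether pairs of items are ranked in the same order, so a metric can rank translations perfectly and still have r below 1.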

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
