IEEE Access (Jan 2018)

No Landslide for the Human Journalist - An Empirical Study of Computer-Generated Election News in Finland

  • Magnus Melin,
  • Asta Back,
  • Caj Sodergard,
  • Myriam D. Munezero,
  • Leo J. Leppanen,
  • Hannu Toivonen

DOI
https://doi.org/10.1109/ACCESS.2018.2861987
Journal volume & issue
Vol. 6
pp. 43356 – 43367

Abstract

Read online

In an age of struggling news media, automated generation of news via natural language generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is big, and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the Finnish municipal elections of 2017. To evaluate the quality of Valtteri-produced articles and to identify aspects to improve, n = 152 users were asked to evaluate the output of Valtteri. Each evaluator rated six preselected computer-generated articles, four control articles written by journalists, and four computer-generated articles of their own choice. All the articles were evaluated along four dimensions: credibility, liking, quality, and representativeness. As expected, the texts written by Valtteri received lower ratings than those written by journalists, but overall the ratings were satisfactory (average 2.9 versus 4.0 for journalists on a five-point scale). Valtteri's best rating (3.6) was for credibility. The computer-written articles that the evaluators could freely select got slightly better ratings than the preselected computer-written articles. When looking at the results by demographic groups, males aged 55 or more liked the automatic articles best and females aged 34 or less liked them the least. Evaluators mistook 21% of the computer-written articles as written by humans and 10% of the human-written articles as computer-written. The share of users making these mistakes grew with the age. Overall, the male evaluators made less writer-identification mistakes than female evaluators did.

Keywords