Applied AI Letters (Sep 2024)

Fine‐Tuned Pretrained Transformer for Amharic News Headline Generation

  • Mizanu Zelalem Degu,
  • Million Meshesha

DOI
https://doi.org/10.1002/ail2.98
Journal volume & issue
Vol. 5, no. 3

Abstract

Amharic is an under‐resourced language, which makes news headline generation particularly challenging because the high‐quality linguistic datasets needed to train effective natural language processing models are scarce. In this study, we fine‐tuned the small checkpoint of the T5v1.1 model (t5‐small) for Amharic news headline generation on an Amharic dataset comprising over 70k news articles paired with their headlines. The fine‐tuning pipeline involved collecting the dataset from Amharic news websites, cleaning the text, optimizing news article size using the TF‐IDF algorithm, and tokenization. In addition, a tokenizer model was developed using the byte pair encoding (BPE) algorithm before the dataset was fed to the model for feature extraction and summarization. The model was evaluated with the Rouge‐L, BLEU, and Meteor metrics and achieved scores of 0.5, 0.24, and 0.71, respectively, on the test partition of the dataset, which contains 7230 instances. These scores are well above those of the T5 model without fine‐tuning, which are 0.1, 0.03, and 0.14, respectively. A rule‐based postprocessing technique was then applied to further improve the summaries generated by the model. With postprocessing, the system achieved Rouge‐L, BLEU, and Meteor scores of 0.72, 0.52, and 0.81, respectively. These results are better than those of the T5v1.1 model without fine‐tuning and than the 0.27 Rouge‐L score reported in previous studies on abstractive text summarization for the Amharic language. The study offers insight for practical application and points to future improvements of the model: increasing the article length, using more training data, using machine learning–based adaptive postprocessing techniques, and fine‐tuning other available pretrained models for text summarization.
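
The abstract does not spell out how the TF‐IDF algorithm is used to optimize article size. One plausible reading, offered here only as a minimal sketch and not as the authors' method, is to score each sentence by the TF‐IDF weight of its tokens and keep the highest‐scoring sentences so that the article fits the model's input budget. The function names `sentence_scores` and `shorten_article`, the whitespace tokenization, and the choice to treat each sentence as one "document" for IDF are all assumptions made for illustration.

```python
import math
from collections import Counter


def sentence_scores(sentences):
    """Score each sentence by the TF-IDF weight of its tokens.

    Assumptions (not from the paper): tokens are whitespace-separated,
    and every sentence counts as one "document" when computing IDF.
    """
    tokenized = [s.split() for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences does each token occur?
    df = Counter(tok for toks in tokenized for tok in set(toks))

    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        # Sum of (relative term frequency * smoothed IDF) over distinct tokens.
        # The IDF form ln((1 + n) / (1 + df)) + 1 is the common smoothed variant.
        score = sum(
            (count / len(toks)) * (math.log((1 + n) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        )
        scores.append(score)
    return scores


def shorten_article(sentences, max_sentences):
    """Keep the max_sentences best-scoring sentences in their original order."""
    scores = sentence_scores(sentences)
    # sorted() is stable, so tied sentences keep their original relative order.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

For Amharic text, the list of sentences could be obtained by splitting an article on the Ethiopic full stop (።) before calling `shorten_article`; that splitting rule is likewise an assumption rather than a detail reported in the abstract.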

Keywords