Systems and Soft Computing (Dec 2024)

Harmony search for hyperparameters optimization of a low resource language transformer model trained with a novel parallel corpus Ocelotl Nahuatl – Spanish

  • Máximo Enrique Pacheco Martínez,
  • Maya Carrillo Ruiz,
  • María de Lourdes Sandoval Solís

Journal volume & issue
Vol. 6
p. 200152

Abstract

Read online

Nahuatl, a low-resource language, does not have an online translator application. Instead, resources are limited to dictionaries, web pages, or digital books. Given this condition, it is vital to provide as much support to the language as possible. This research aims to enhance the BLEU score in machine translation by applying the harmony search heuristic method to state-of-the-art transformers models. This is conducted by finding the optimal hyperparameter settings for the models. Models are trained and tested using a fresh moderate-size parallel corpus of 1.5k phrases. By utilizing harmony search, the study shows an improvement in the BLEU score, enhancing it by 2.569%. In order to accomplish this, various factors related to the hyperparameters need to be considered. The application of harmony search with transformers can be extended to various parallel corpora or models, taking these considerations into account.

Keywords