International Journal of Computational Intelligence Systems (May 2024)

PERCORE: A Deep Learning-Based Framework for Persian Spelling Correction with Phonetic Analysis

  • Seyed Mohammad Sadegh Dashti,
  • Amid Khatibi Bardsiri,
  • Mehdi Jafari Shahbazzadeh

DOI
https://doi.org/10.1007/s44196-024-00459-y
Journal volume & issue
Vol. 17, no. 1
pp. 1 – 23

Abstract

This research introduces a state-of-the-art Persian spelling correction system that integrates deep learning techniques with phonetic analysis, enhancing the accuracy and efficiency of natural language processing (NLP) for Persian. Using a fine-tuned language representation model, our methodology combines deep contextual analysis with phonetic insights to correct both non-word and real-word spelling errors. This strategy proves particularly effective in tackling the unique complexities of Persian spelling, including its elaborate morphology and the challenge of homophony. A thorough evaluation on a wide-ranging dataset confirms our system’s superior performance compared to existing methods, with F1-Scores of 0.890 for detecting real-word errors and 0.905 for correcting them. Additionally, the system demonstrates a strong capability in non-word error correction, achieving an F1-Score of 0.891. These results illustrate the significant benefits of incorporating phonetic insights into deep learning models for spelling correction. Our contributions not only advance Persian language processing by providing a versatile solution for a variety of NLP applications but also pave the way for future research in the field, emphasizing the critical role of phonetic analysis in developing effective spelling correction systems.

Keywords