Journal of Cloud Computing: Advances, Systems and Applications (Sep 2022)
Automated all in one misspelling detection and correction system for Ethiopian languages
Abstract
In this paper, a misspelling detection and correction system is developed for six Ethiopian languages (Amharic, Afan Oromo, Tigrinya, Hadiyyisa, Kambatissa, and Awngi). A few typo detection and correction systems exist for some of these languages, but an effective all-in-one typo detector and corrector for Ethiopian languages has yet to be developed. A dictionary-based methodology is used to detect and rectify various forms of misspelling. The proposed model presents suggestions for each detected error and can automatically correct the error using the first suggestion. The model is evaluated using dictionary-based data sets for all six languages. The corpora were gathered from a variety of sources, including economic, political, social, and related publications, newspapers, and magazines. Users can handle all spelling-related tasks within a single system (all-in-one): a user working in Amharic, for example, can switch to another preferred language without moving to a different graphical user interface (GUI). This saves time, simplifies the task, and can help users improve their skills in the selected languages. Precision, recall, and F-measure were computed for each language. The system achieves F-measures of 89.57%, 87.57%, 88.31%, 86.83%, 81.83%, and 87.59% for Amharic, Afan Oromo, Tigrinya, Hadiyyisa, Kambatissa, and Awngi, respectively. Recommendations are also provided for future researchers.
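The dictionary-based workflow summarized above (detect a word missing from the selected language's dictionary, rank suggestions, auto-correct with the first suggestion, then score with the F-measure) can be illustrated with a minimal sketch. This is not the system's implementation: the abstract does not state how suggestions are ranked, so Levenshtein edit distance is assumed here, and the function names, the distance threshold, and the toy word lists are all hypothetical.

```python
# Illustrative sketch only. Ranking by Levenshtein distance, all function
# names, and the tiny word lists below are assumptions, not the paper's method.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest(word, dictionary, max_distance=2, limit=5):
    """Return dictionary words within max_distance, closest first."""
    scored = sorted((edit_distance(word, w), w) for w in dictionary)
    return [w for d, w in scored if d <= max_distance][:limit]


def check_text(text, dictionaries, language):
    """Return (token, corrected, suggestions) for each token.

    A token found in the selected dictionary is kept unchanged; otherwise
    it is replaced by the first suggestion, if any suggestion exists.
    """
    dictionary = dictionaries[language]
    results = []
    for token in text.split():
        if token in dictionary:
            results.append((token, token, []))
        else:
            suggestions = suggest(token, dictionary)
            corrected = suggestions[0] if suggestions else token
            results.append((token, corrected, suggestions))
    return results


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy dictionaries (hypothetical); a real system would load full lexicons.
DICTIONARIES = {
    "amharic": {"ሰላም", "ቤት", "ውሃ"},
    "afan_oromo": {"nagaa", "mana", "bishaan"},
}

# Switching language only changes the dictionary key, not the interface.
report = check_text("nagaa manq", DICTIONARIES, "afan_oromo")
```

Keeping one dictionary per language behind a single `check_text` entry point mirrors the all-in-one idea: the caller changes only the `language` argument, while detection and correction logic stay the same.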
Keywords