RULE-BASED SYLLABIFICATION OF KOREAN WORDS WRITTEN IN LATIN USING DETERMINISTIC FINITE AUTOMATA MODELS

Rouly Doharma Sihite; Aditya Wikan Mahastama

doi:10.21460/jutei.2018.21.77

JUTEI (Jurnal Terapan Teknologi Informasi) (Aug 2018)

Affiliations

Rouly Doharma Sihite: Program Studi Manajemen Informatika, STMIK Widuri
Aditya Wikan Mahastama: Prodi Informatika, Universitas Kristen Duta Wacana

Journal volume & issue: Vol. 2, no. 1
pp. 75 – 85

Abstract

Transliteration remains a challenge in helping people read or write across writing systems. Korean transliteration has been studied as a way to automate conversion between Hangul (the Korean writing system) and Latin characters. Previous work on transliterating Hangul to Latin has used a statistical approach (72.2% accuracy) and Extended Markov Models (54.9% accuracy). This research focuses on transliterating Latin (romanised) Korean words into Hangul, since many learners of Korean begin with Latin script. The selected method models the probable vowel and consonant forms, and the probable vowel and consonant sequences, using Finite State Automata, which avoids the need for training. These models are then coded into rules, which are applied to and tested on 100 random Korean words. The initial test yields only a 40% success rate, because consonants must be labelled as the initial or final of a syllable and some consonants were missed by the modelled rules. Additional rules were then added to catch these consonants and merge them into the proper existing syllables, which increased the success rate to 92%. Further analysis of this result found that certain consonant sequences cause syllabification problems when they occur in certain positions. Further rules were added, yielding a final success rate of 99%, which is also the accuracy of transliterating Korean words written in Latin into Hangul characters in compound syllables.
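
The idea of labelling each consonant as the initial or final of a syllable can be illustrated with a small deterministic state machine. The sketch below is not the authors' rule set, which the abstract does not list. It assumes Revised Romanization spellings, greedy longest-match tokenisation, and a hypothetical rule that "ng" is always a syllable final; the names VOWELS, CONSONANTS, tokenize and syllabify are illustrative only.

```python
# Illustrative sketch only: the unit inventories and rules are assumptions,
# not the rules published in the paper.
VOWELS = {"a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
          "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"}
CONSONANTS = {"g", "kk", "n", "d", "tt", "r", "l", "m", "b", "pp", "s", "ss",
              "ng", "j", "jj", "ch", "k", "t", "p", "h"}
# Longest units first, so that "yeo" wins over "ye" and "ng" over "n".
UNITS = sorted(VOWELS | CONSONANTS, key=len, reverse=True)


def tokenize(word):
    """Split a romanised word into vowel and consonant units (greedy)."""
    tokens, i = [], 0
    while i < len(word):
        for unit in UNITS:
            if word.startswith(unit, i):
                tokens.append(unit)
                i += len(unit)
                break
        else:
            raise ValueError(f"unrecognised letters at position {i}: {word!r}")
    return tokens


def syllabify(word):
    """Group units into syllables of the form (initial) vowel (final)."""
    syllables, current, state = [], [], "START"
    for tok in tokenize(word.lower()):
        is_vowel = tok in VOWELS
        if state in ("START", "ONSET"):
            if is_vowel:
                current.append(tok)
                state = "NUCLEUS"
            elif state == "START":
                current.append(tok)
                state = "ONSET"
            else:
                raise ValueError("two initial consonants in a row")
        elif state == "NUCLEUS":
            if is_vowel:
                # Vowel after vowel: the next syllable has no initial.
                syllables.append("".join(current))
                current = [tok]
            else:
                # Tentatively treat the consonant as a final.
                current.append(tok)
                state = "CODA"
        else:  # state == "CODA"
            if is_vowel and current[-1] != "ng":
                # The tentative final is really the next syllable's initial.
                onset = current.pop()
                syllables.append("".join(current))
                current = [onset, tok]
            else:
                # Either "ng" stays as a final, or a new initial begins.
                syllables.append("".join(current))
                current = [tok]
            state = "NUCLEUS" if is_vowel else "ONSET"
    if state == "ONSET":
        raise ValueError("word ends with a consonant that has no vowel")
    if current:
        syllables.append("".join(current))
    return syllables


print(syllabify("gamsahamnida"))
print(syllabify("annyeong"))
```

Under these assumptions a consonant between two vowels is relabelled from final to initial once the following vowel is seen, which is the kind of labelling decision the abstract describes. Greedy matching of "ng" is a known weak point of this sketch: a word such as "hanguk" is split as "hang-uk" rather than "han-guk", so a fuller rule set would need extra rules for such sequences.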

Published in JUTEI (Jurnal Terapan Teknologi Informasi)

ISSN: 2579-3675 (Print); 2579-5538 (Online)
Publisher: Universitas Kristen Duta Wacana
Country of publisher: Indonesia
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://jutei.ukdw.ac.id
