SoftwareX (May 2024)
Morpheme-based Korean text cohesion analyzer
Abstract
The fundamental difference between Korean and English text analysis lies in morpheme analysis. Existing Korean text analysis relies on English analysis tools and often yields inaccurate results because morpheme analysis is difficult. The primary reason is that existing morpheme analyzers depend on eojeol tokens, which makes it hard to capture the characteristics of Korean. We therefore introduce a Transformer-based morpheme analyzer that uses morpheme tokens to capture the inherent features of Korean sentences. We then integrate this morpheme analyzer into our Korean text analysis tool and offer it as a web service for efficient use.
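To make the distinction between eojeol tokens and morpheme tokens concrete, the following minimal sketch contrasts the two on one short sentence. It is an illustration only, not the Transformer-based analyzer described here: eojeols are approximated by whitespace splitting, and the morpheme analyses come from a small hand-written table (`TOY_ANALYSES`, a hypothetical name) that uses Sejong-style part-of-speech tags as an assumption.

```python
# Toy contrast between eojeol tokens and morpheme tokens.
# Assumption: eojeols are approximated as whitespace-delimited units.
# Assumption: TOY_ANALYSES is a hand-written stand-in for a real morpheme
# analyzer, using Sejong-style tags (NP, JX, NNG, JKB, VV, EF).

sentence = "나는 학교에 간다"  # "I go to school"

TOY_ANALYSES = {
    "나는": [("나", "NP"), ("는", "JX")],        # pronoun + auxiliary particle
    "학교에": [("학교", "NNG"), ("에", "JKB")],  # common noun + adverbial case particle
    "간다": [("가", "VV"), ("ㄴ다", "EF")],      # verb stem + sentence-final ending
}


def eojeol_tokens(text):
    """Split text into eojeols (whitespace-delimited units)."""
    return text.split()


def morpheme_tokens(text):
    """Expand each eojeol into (morpheme, tag) pairs via the toy table."""
    tokens = []
    for eojeol in eojeol_tokens(text):
        # Eojeols missing from the toy table are kept whole and tagged "UNK".
        tokens.extend(TOY_ANALYSES.get(eojeol, [(eojeol, "UNK")]))
    return tokens


if __name__ == "__main__":
    print(eojeol_tokens(sentence))
    print(morpheme_tokens(sentence))
```

In this example each eojeol bundles a content morpheme with a grammatical one, so the three eojeol tokens expand into six morpheme tokens. An eojeol-level vocabulary would treat "학교에" and "학교" as unrelated items, whereas the morpheme view exposes the shared noun.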