Journal of Intelligent Systems (Sep 2020)
Towards Developing a Comprehensive Tag Set for the Arabic Language
Abstract
This paper presents a comprehensive tag set as a fundamental component for developing an automated Word Class/Part-of-Speech (PoS) tagging system for the Arabic language. The aim is to develop a standard, comprehensive PoS tag set, based on PoS classes and Arabic inflectional morphology, from which linguists and Natural Language Processing (NLP) developers can extract richer linguistic information. The tag names in the developed tag set use terminology from traditional Arabic grammar rather than English grammar. The usability of the presented tag set was tested through manual tagging, which produced a set of tagged text that serves as a goal corpus against which the results obtained from the tagger are compared. The tagger achieved an average accuracy of 90% using the developed detailed tag set.
Keywords