Journal of ICT (Aug 2020)

CLASSIFICATION OF SHORT POSSESSIVE CLITIC PRONOUN NYA IN MALAY TEXT TO SUPPORT ANAPHOR CANDIDATE DETERMINATION

  • Noor Huzaimi@Karimah Mohd Noor,
  • Shahrul Azman Mohd Noah,
  • Mohd Juzaiddin Ab Aziz

DOI
https://doi.org/10.32890/jict2020.19.4.3
Journal volume & issue
Vol. 19, no. 4
pp. 513 – 532

Abstract

Anaphor candidate determination is an important step in anaphora resolution (AR) systems. Anaphors come in several types, one of which is the pronominal anaphor, an anaphor that involves a pronoun. In some cases, a pronoun is used without referring to any situation or entity in the text; such a pronoun is known as pleonastic. In Malay, this usually occurs with the pronoun nya. Pleonastic pronouns occur in every text and cause a severe problem for anaphora resolution systems. Identifying the pleonastic nya differs from identifying the pleonastic ‘it’ in English: syntactic patterns cannot be used because nya attaches to the end of a word. As an alternative, semantic classes are used to distinguish the pleonastic nya from the anaphoric nya. In this paper, automatic semantic tagging was used to determine the type of nya, which at the same time determines whether nya is an anaphor candidate. New algorithms and the MalayAR architecture were proposed. In terms of F-measure, detection of the clitic nya as a separate word achieved 100%, the clitic nya as a pleonastic achieved 88%, the clitic nya referring to humans achieved 94%, and the clitic nya referring to non-humans achieved 63%. The results showed that the proposed algorithms were acceptable for classifying the clitic nya as pleonastic, human-referring, or non-human-referring.
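
The two steps the abstract describes, separating nya from its host word and then using a semantic class to decide whether it is pleonastic, can be illustrated with a minimal Python sketch. This is not the authors' algorithm or the MalayAR architecture: the word lists, the semantic tags, and the function names `split_clitic` and `classify_nya` are all hypothetical, chosen only to show the general idea. Telling human from non-human referents would also need the surrounding context, which this sketch does not attempt.

```python
# Hypothetical toy data: none of these lists come from the paper.
# Words where "nya" is part of the root, not a clitic.
NON_CLITIC_WORDS = {"tanya", "hanya", "punya"}

# Assumed semantic tags for a few host words (illustrative only).
HOST_SEMANTIC_TAG = {
    "buku": "object",         # "bukunya"    ~ "his/her book"
    "ibu": "human",           # "ibunya"     ~ "his/her mother"
    "nampak": "evaluative",   # "nampaknya"  ~ "apparently"
    "sebenar": "evaluative",  # "sebenarnya" ~ "actually"
    "akhir": "temporal",      # "akhirnya"   ~ "finally"
}

# Assumption: hosts with these tags yield a non-referring (pleonastic) nya.
PLEONASTIC_TAGS = {"evaluative", "temporal"}


def split_clitic(token):
    """Return (host, "nya") if the token ends in a clitic nya, else (word, None)."""
    word = token.lower()
    if word.endswith("nya") and len(word) > 3 and word not in NON_CLITIC_WORDS:
        return word[:-3], "nya"
    return word, None


def classify_nya(token):
    """Label a token as no_clitic, unknown, pleonastic, or anaphor_candidate."""
    host, clitic = split_clitic(token)
    if clitic is None:
        return "no_clitic"
    tag = HOST_SEMANTIC_TAG.get(host)
    if tag is None:
        return "unknown"  # host word is missing from the toy lexicon
    if tag in PLEONASTIC_TAGS:
        return "pleonastic"
    return "anaphor_candidate"
```

For example, `classify_nya("Nampaknya")` yields `"pleonastic"` and `classify_nya("bukunya")` yields `"anaphor_candidate"`, while `classify_nya("hanya")` yields `"no_clitic"` because the word is on the exception list. A real system would replace the toy lexicon with an automatic semantic tagger, as the abstract indicates.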

Keywords