KKU Engineering Journal (Mar 2014)

Semantic-based technique for Thai documents plagiarism detection

  • Sorawat Prapanitisatian,
  • Kraisak Kesorn

Journal volume & issue
Vol. 41, no. 1
pp. 109 – 117

Abstract

Plagiarism is the act of taking another person's writing or ideas without citing the source. It is one of the major problems in educational institutions. A number of plagiarism detection tools are available on the Internet, but only a few of them work well. Typically, they rely on a simple detection method such as string matching. The main weakness of this method is that it cannot detect plagiarism when the author replaces some words with synonyms. This paper therefore presents a new technique for semantic-based plagiarism detection using Semantic Role Labeling (SRL) and term weighting. SRL is used to calculate semantic similarity. The main difference from the existing framework is that the terms in a sentence are weighted dynamically according to their roles in the sentence, e.g. subject, verb, or object. This lets the technique detect plagiarism more effectively than existing systems even when the terms in a sentence are reordered. The experimental results show that the proposed method detects plagiarized documents more effectively than the existing methods: Anti-kobpae, Turnit-in, and traditional Semantic Role Labeling.
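
The idea of role-dependent term weighting can be illustrated with a minimal Python sketch. It is not the authors' implementation: the role labels are assumed to come from some SRL tool, the weights in `ROLE_WEIGHTS` and the `synonyms` lookup are hypothetical placeholders, and a weighted Jaccard overlap stands in for whatever similarity measure the paper actually uses.

```python
from collections import defaultdict

# Hypothetical role weights; the abstract does not state the actual values.
ROLE_WEIGHTS = {"subject": 1.0, "verb": 1.5, "object": 1.0, "other": 0.5}


def role_weighted_similarity(sent_a, sent_b, synonyms=None):
    """Weighted Jaccard overlap between two role-labeled sentences.

    Each sentence is a list of (term, role) pairs, assumed to be the
    output of an SRL step. `synonyms` maps a term to a canonical form
    (an assumed stand-in for a real thesaurus lookup).
    """
    synonyms = synonyms or {}

    def to_bag(sentence):
        # Accumulate the role weight for each (canonical term, role) key.
        bag = defaultdict(float)
        for term, role in sentence:
            key = (synonyms.get(term, term), role)
            bag[key] += ROLE_WEIGHTS.get(role, ROLE_WEIGHTS["other"])
        return bag

    a, b = to_bag(sent_a), to_bag(sent_b)
    keys = a.keys() | b.keys()
    shared = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    total = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return shared / total if total else 0.0


# Toy sentences: same roles, reordered terms, one verb swapped for a synonym.
original = [("student", "subject"), ("copy", "verb"), ("essay", "object")]
reworded = [("essay", "object"), ("student", "subject"), ("duplicate", "verb")]

print(role_weighted_similarity(original, reworded))
print(role_weighted_similarity(original, reworded, {"duplicate": "copy"}))
```

Because terms are compared by role rather than by position, reordering alone does not lower the score, and mapping synonyms to a canonical form lets a reworded verb still match. Plain string matching would miss both cases.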

Keywords