Automatika (Jul 2024)

Ensemble machine learning technique-based plagiarism detection over opinions in social media

  • Sethu Vinayaga Vadivu,
  • Palanigurupackiam Nagaraj,
  • Bagavathi Ammai Shanmugam Murugan

DOI
https://doi.org/10.1080/00051144.2024.2326383
Journal volume & issue
Vol. 65, no. 3
pp. 983 – 991

Abstract

With the rapid growth of social media, many people now prefer to post their opinions on social media platforms rather than share them through radio, television or newspapers. These postings vary in size and include various titles and comments. Plagiarism, which occurs when existing work is rewritten or repeated, is increasing rapidly, and many existing approaches detect it by searching the internet. This paper aims to detect plagiarism in social media using four phases: data pre-processing, n-gram evaluation, similarity/distance computation analysis and plagiarism detection. Pre-processing covers data cleaning steps, such as the removal of redundant data, upper case letters, noise and irrelevant punctuation, followed by conversion of the text into vector form. The pre-processed data are then passed to n-gram evaluation to develop a posting attribution system. Finally, an ensemble support vector machine-based African vulture optimization (ESVM-AVO) approach is employed to detect plagiarism. The approach improves detection performance and reaches a high detection accuracy with very low execution time. A performance evaluation and a comparative analysis are carried out to assess the proposed system.
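The first three phases named in the abstract (pre-processing, n-gram evaluation, similarity computation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of word bigrams, the Jaccard measure and the fixed 0.5 threshold are all assumptions made for illustration, and the simple threshold stands in for the ESVM-AVO classifier, which the abstract does not specify in enough detail to reproduce.

```python
import re


def preprocess(text):
    """Lowercase the text, replace punctuation with spaces, split into tokens.

    Assumption: ASCII-only cleaning; the paper's exact cleaning rules
    are not given in the abstract.
    """
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return text.split()


def word_ngrams(tokens, n=2):
    """Return the set of word n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_plagiarized(post_a, post_b, n=2, threshold=0.5):
    """Flag a pair of posts whose n-gram overlap reaches the threshold.

    Hypothetical stand-in for the detection phase: a fixed threshold,
    not the ESVM-AVO approach described in the paper.
    """
    grams_a = word_ngrams(preprocess(post_a), n)
    grams_b = word_ngrams(preprocess(post_b), n)
    return jaccard_similarity(grams_a, grams_b) >= threshold
```

For instance, `looks_plagiarized("The quick brown fox!", "the quick brown fox")` flags the pair, because both posts reduce to the same bigrams after cleaning. In a fuller system the similarity scores would more plausibly serve as input features to a trained classifier than be compared against a hand-picked threshold.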

Keywords