Heliyon (Sep 2023)
Are paid tools worth the cost? A prospective cross-over study to find the right tool for plagiarism detection
Abstract
Background: The increasing pressure to publish research has led to a rise in plagiarism incidents, creating a need for effective plagiarism detection software. The importance of this study lies in the wide variation in cost among the available plagiarism detection options. By uncovering the advantages of low-cost or free alternatives, the study could help researchers access appropriate tools for plagiarism detection. This is the first study to compare four plagiarism detection tools and assess factors impacting their effectiveness in identifying plagiarism in AI-generated articles.
Methodology: A prospective cross-over study was conducted with the primary objective of comparing the Overall Similarity Index (OSI) reported by four plagiarism detection tools (iThenticate, Grammarly, Small SEO Tools, and DupliChecker) on AI-generated articles. ChatGPT was used to generate 100 articles, ten from each of ten general domains affecting various aspects of life. Each article was run through all four tools, and the OSI was recorded. The Flesch Reading Ease Score (FRES), Gunning Fog Index (GFI), and Flesch-Kincaid Grade Level (FKGL) were used to assess how factors such as article length and language complexity affect plagiarism detection.
Results: OSI varied significantly (p < 0.001) among the four tools, with Grammarly having the highest mean rank (3.56) and Small SEO Tools the lowest (1.67). Pairwise analyses revealed significant differences (p < 0.001) between all pairs except Small SEO Tools and DupliChecker. Word count correlated significantly with OSI for iThenticate (p < 0.05) but not for the other three tools. For DupliChecker, OSI correlated positively with FRES and negatively with GFI. FKGL correlated negatively with OSI for Small SEO Tools and DupliChecker.
Conclusion: Grammarly was unexpectedly the most effective of the four tools at detecting plagiarism in AI-generated articles. This could be because different tools draw on different data sources. The finding highlights the potential for researchers to use lower-cost plagiarism detection tools.
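For readers unfamiliar with the three readability measures named in the Methodology, the following is a minimal sketch of their standard published formulas. It is illustrative only: the abstract does not state which software the authors used to compute the scores, and the function names `count_syllables` and `readability`, the vowel-group syllable heuristic, and the three-syllable threshold for "complex" words are assumptions made here. Dedicated readability tools use more careful syllable and sentence detection, so their values will differ somewhat.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the authors' method):
    count runs of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text: str) -> dict:
    """Return FRES, FKGL and GFI for a piece of English text."""
    # Naive sentence split on ., ! and ? (abbreviations are not handled).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    syllables = [count_syllables(w) for w in words]
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(syllables) / len(words)
    # Gunning Fog treats words of three or more syllables as complex.
    complex_share = sum(1 for s in syllables if s >= 3) / len(words)

    return {
        # Flesch Reading Ease Score: higher means easier to read.
        "FRES": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # Flesch-Kincaid Grade Level: higher means harder to read.
        "FKGL": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
        # Gunning Fog Index: higher means harder to read.
        "GFI": 0.4 * (words_per_sentence + 100 * complex_share),
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat. Plagiarism detection remains difficult."
    for name, value in readability(sample).items():
        print(f"{name}: {value:.2f}")
```

FRES rises as text becomes easier to read, whereas FKGL and GFI rise as it becomes harder. This is why a positive correlation of OSI with FRES and negative correlations with GFI and FKGL, as reported for DupliChecker, point in the same direction.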