Journal of Nature and Science of Medicine (Apr 2025)

Assessing the Validity and Accuracy of Artificial Intelligence Technologies for Identifying Relevant Literature in Dentistry

  • Ameena M Siyad,
  • A. S. Akhila,
  • Subramaniam Ramanarayanan,
  • Shabil Mohamed Mustafa,
  • Jesline Merly James,
  • Priya Babu

DOI
https://doi.org/10.4103/jnsm.jnsm_170_24
Journal volume & issue
Vol. 8, no. 2
pp. 135–138

Abstract

Introduction: The health-care sector, across its numerous domains, has been employing artificial intelligence (AI) for a wide range of tasks of varying complexity. In research, AI has been widely used in scientific writing, and its use has been a matter of much debate in recent years. Two names in the AI world stand out: OpenAI’s ChatGPT and Microsoft Copilot. This study was conducted with the objective of assessing the validity of generative AI technologies for identifying relevant literature in dentistry.

Methods: This cross-sectional, observational study was conducted in March 2024. Both ChatGPT 3.5 and Microsoft Copilot were used to search dental scientific literature on cone-beam computed tomography and cone-beam volumetric tomography and their subdomains, including accuracy, advantages, limitations, and validity. Each reference was assessed on six components: (1) authors, (2) reference titles, (3) journal names, (4) publication years, (5) digital object identifiers, and (6) reference links. The accuracy of the reference citations was verified by searching the Medline, Web of Science, Scopus, and Google databases. The data obtained were coded, tabulated, and analyzed using the Statistical Package for the Social Sciences (SPSS) for Windows. Accuracy across the six components was summarized and expressed as frequencies and percentages, and accuracy was compared between ChatGPT and Copilot using the Chi-square test.

Results: Fifty-five unique references from ChatGPT and 47 from Copilot were analyzed for validity and accuracy. Of the titles provided, 30.90% from ChatGPT and 89.40% from Copilot were valid. Author details were correct in 29.10% of the ChatGPT references and 74.50% of the Copilot references. For journal names, the corresponding figures were 30.90% and 87.20%, respectively.

Conclusions: Based on our findings, using ChatGPT or Copilot as the sole resource for identifying references for literature reviews in dentistry is not currently recommended. Of the two AI platforms, Copilot was more accurate and valid, but further research is warranted.
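
The Methods report per-component accuracy as counts and percentages and compare the two tools with a Chi-square test. As an illustration only, and not taken from the paper, a minimal sketch of such a comparison in Python with SciPy might use 2x2 counts reconstructed from the reported title-validity percentages (roughly 17 of 55 ChatGPT references and 42 of 47 Copilot references):

    # Illustrative sketch: 2x2 Chi-square comparison of title validity between
    # ChatGPT and Copilot. Counts are approximations derived from the reported
    # percentages (30.90% of 55 vs. 89.40% of 47), not the authors' raw data.
    from scipy.stats import chi2_contingency

    # Rows: ChatGPT, Copilot; columns: valid titles, invalid titles (assumed counts)
    table = [[17, 38],
             [42, 5]]

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")

The same contingency-table approach would apply to the other five components (authors, journal names, publication years, DOIs, and reference links), one test per component.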

Keywords