Journal of Multidisciplinary Healthcare (Aug 2024)
Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control
Yan Wang,1 Lihua Liang,2,* Ran Li,2,* Yihua Wang,3 Changfu Hao1

1 Department of Child and Adolescent Health, School of Public Health, Zhengzhou University, Zhengzhou, Henan, People’s Republic of China
2 Primary and Secondary School Health Center, Zhengzhou Education Science Planning and Evaluation Center, Zhengzhou Municipal Education Bureau, Zhengzhou, Henan, People’s Republic of China
3 Institute of Science and Technology Information, Zhengzhou University, Zhengzhou, Henan, People’s Republic of China
* These authors contributed equally to this work

Correspondence: Yihua Wang, Institute of Science and Technology Information, Zhengzhou University, Science Avenue, Zhengzhou, Henan Province, 450001, People’s Republic of China, Email [email protected]; Changfu Hao, Department of Child and Adolescent Health, School of Public Health, Zhengzhou University, Science Avenue, Zhengzhou, Henan Province, 450001, People’s Republic of China, Email [email protected]

Abstract

Chatbots, which are based on large language models, are increasingly being used in public health. However, the effectiveness of chatbot responses has been debated, and their performance in myopia prevention and control has not been fully explored. This study aimed to evaluate the effectiveness of three well-known chatbots (ChatGPT, Claude, and Bard) in responding to public health questions about myopia.

Methods: Nineteen public health questions about myopia, covering three topics (policy, basics, and measures), were each answered by the three chatbots. After the order of the responses was shuffled, each chatbot response was independently rated by 4 raters for comprehensiveness, accuracy, and relevance.

Results: The study’s questions underwent reliability testing. The word counts of the responses differed significantly among the 3 chatbots; from most to least, the order was ChatGPT, Bard, and Claude. All 3 chatbots had a composite score above 4 out of 5. ChatGPT scored the highest in all aspects of the assessment.
However, all chatbots exhibited shortcomings, such as giving fabricated responses.

Conclusion: Chatbots have shown great potential in public health, with ChatGPT performing best. The future use of chatbots as a public health tool will require rapid development of standards for their use and monitoring, as well as continued research, evaluation, and improvement of chatbots.

Keywords: chatbot, large language model, public health, myopia