Evaluating evidence-based health information from generative AI using a cross-sectional study with laypeople seeking screening information

Felix G. Rebitschek; Alessandra Carella; Silja Kohlrausch-Pazin; Michael Zitzmann; Anke Steckelberg; Christoph Wilhelm

doi:10.1038/s41746-025-01752-6

npj Digital Medicine (Jun 2025)

Affiliations

Felix G. Rebitschek: Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam
Alessandra Carella: Department of Developmental and Social Psychology, University of Padova
Silja Kohlrausch-Pazin: Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam
Michael Zitzmann: Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam
Anke Steckelberg: Institute of Health, Midwifery and Nursing Science, Medical Faculty, Martin Luther University Halle-Wittenberg
Christoph Wilhelm: Harding Center for Risk Literacy, Faculty of Health Sciences Brandenburg, University of Potsdam

Journal volume & issue: Vol. 8, no. 1, pp. 1–8

Abstract

Large language models (LLMs) are used to seek health information. Guidelines for evidence-based health communication require the presentation of the best available evidence to support informed decision-making. We investigate the prompt-dependent guideline compliance of LLMs and evaluate a minimal behavioural intervention for boosting laypeople’s prompting. Study 1 systematically varied prompt informedness, topic, and LLM to evaluate compliance. Study 2 randomized 300 participants to three LLMs under standard or boosted prompting conditions. Blinded raters assessed the LLM responses with two instruments. Study 1 found that LLMs failed to meet evidence-based health communication standards and that response quality depended on prompt informedness. Study 2 revealed that laypeople frequently generated poor-quality responses. The simple boost improved response quality, though it remained below required standards. These findings underscore the inadequacy of LLMs as a standalone health communication tool. Integrating LLMs with evidence-based frameworks, enhancing their reasoning and interfaces, and teaching prompting are essential. Study Registration: German Clinical Trials Register (DRKS) (Reg. No.: DRKS00035228, registered on 15 October 2024).

Published in npj Digital Medicine

ISSN: 2398-6352 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.nature.com/npjdigitalmed/
