Exploring the Potential of Claude 3 Opus in Renal Pathological Diagnosis: Performance Evaluation

Xingyuan Li; Ke Liu; Yanlin Lang; Zhonglin Chai; Fang Liu

doi:10.2196/65033

JMIR Medical Informatics (Nov 2024)

DOI: https://doi.org/10.2196/65033
Journal volume & issue: Vol. 12, p. e65033

Abstract

Background: Artificial intelligence (AI) has shown great promise in assisting medical diagnosis, but its application in renal pathology remains limited.

Objective: We evaluated the performance of an advanced AI language model, Claude 3 Opus (Anthropic), in generating diagnostic descriptions for renal pathological images.

Methods: We carefully curated a dataset of 100 representative renal pathological images from the Diagnostic Atlas of Renal Pathology (3rd edition). The image selection aimed to cover a wide spectrum of common renal diseases, ensuring a balanced and comprehensive dataset. Claude 3 Opus generated diagnostic descriptions for each image, which were scored by 2 pathologists on clinical relevance, accuracy, fluency, completeness, and overall value.

Results: Claude 3 Opus achieved a high mean score in language fluency (3.86) but lower scores in clinical relevance (1.75), accuracy (1.55), completeness (2.01), and overall value (1.75). Performance varied across disease types. Interrater agreement was substantial for relevance (κ=0.627) and overall value (κ=0.589) and moderate for accuracy (κ=0.485) and completeness (κ=0.458).

Conclusions: Claude 3 Opus shows potential in generating fluent renal pathology descriptions but needs improvement in accuracy and clinical value. The AI’s performance varied across disease types. Addressing the limitations of single-source data and incorporating comparative analyses with other AI approaches are essential steps for future research. Further optimization and validation are needed for clinical applications.
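
The κ values reported in the abstract are interrater agreement statistics. As a rough illustration of how such a value can be computed for 2 raters, the following is a minimal Python sketch of unweighted Cohen's kappa. The function name `cohen_kappa` and the ratings are hypothetical, and the abstract does not state whether the authors used a weighted or unweighted variant, so this should be read as an assumption rather than as the study's method.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: fraction of items given the identical score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each score category, the product of the two
    # raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category for every item.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for 10 images (NOT data from the study).
pathologist_1 = [1, 2, 2, 3, 1, 2, 4, 1, 2, 3]
pathologist_2 = [1, 2, 3, 3, 1, 1, 4, 2, 2, 3]

kappa = cohen_kappa(pathologist_1, pathologist_2)
print(f"kappa = {kappa:.3f}")
```

For ordinal scores such as these, a weighted kappa that penalizes near-misses less than large disagreements is a common alternative. The unweighted form is shown here only because it is the simplest to follow.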

Published in JMIR Medical Informatics

ISSN: 2291-9694 (Online)
Publisher: JMIR Publications
Country of publisher: Canada
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://medinform.jmir.org
