Artificial intelligence in the GPs office: a retrospective study on diagnostic accuracy

Steindor Ellertsson; Hrafn Loftsson; Emil L. Sigurdsson

doi:10.1080/02813432.2021.1973255

Scandinavian Journal of Primary Health Care (Oct 2021)

Affiliations

Steindor Ellertsson: Primary Health Care Service of the Capital Area
Hrafn Loftsson: Department of Computer Science, Reykjavik University
Emil L. Sigurdsson: Primary Health Care Service of the Capital Area

Journal volume & issue: Vol. 39, no. 4, pp. 448–458

Abstract

Objective: Machine learning (ML) is expected to play an increasing role within primary health care (PHC) in coming years. No peer-reviewed studies exist that evaluate the diagnostic accuracy of ML models compared to general practitioners (GPs). The aim of this study was to evaluate the diagnostic accuracy of an ML classifier on primary headache diagnoses in PHC, compare its performance to GPs, and examine the most impactful signs and symptoms when making a prediction.

Design: A retrospective study on diagnostic accuracy, using electronic health records from the database of the Primary Health Care Service of the Capital Area (PHCCA) in Iceland.

Setting: Fifteen primary health care centers of the PHCCA.

Subjects: All patients that consulted a physician, from 1 January 2006 to 30 April 2020, and received one of the selected diagnoses.

Main outcome measures: Sensitivity, Specificity, Positive Predictive Value, Matthews Correlation Coefficient, Receiver Operating Characteristic (ROC) curve, and Area under the ROC curve (AUROC) score for primary headache diagnoses, as well as Shapley Additive Explanations (SHAP) values of the ML classifier.

Results: The classifier outperformed the GPs on all metrics except specificity. The SHAP values indicate that the classifier uses the same signs and symptoms (features) as a physician would, when distinguishing between headache diagnoses.

Conclusion: In a retrospective comparison, the diagnostic accuracy of the ML classifier for primary headache diagnoses is superior to GPs. According to SHAP values, the ML classifier relies on the same signs and symptoms as a physician when making a diagnostic prediction.

Keypoints

Little is known about the diagnostic accuracy of machine learning (ML) in the context of primary health care, despite its considerable potential to aid in clinical work.
This novel research sheds light on the diagnostic accuracy of ML in a clinical context, as well as the interpretation of its predictions.
If the vast potential of ML is to be utilized in primary health care, its performance, safety, and inner workings need to be understood by clinicians.
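
Four of the outcome measures named in the abstract (sensitivity, specificity, positive predictive value, and the Matthews Correlation Coefficient) follow directly from the counts of a binary confusion matrix. The Python sketch below shows their standard definitions, assuming a one-vs-rest view of a single diagnosis. The function name `diagnostic_metrics` and the example counts are hypothetical and chosen for illustration; they are not the authors' code or data from the study. AUROC and SHAP values need predicted probabilities and a fitted model, so they are not covered here.

```python
import math


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common diagnostic accuracy metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives for one diagnosis treated as the positive class.
    """

    def ratio(numerator: float, denominator: float) -> float:
        # Return NaN instead of raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    # Product of the four marginal totals, used in the MCC denominator.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true cases detected
        "specificity": ratio(tn, tn + fp),  # share of non-cases correctly excluded
        "ppv": ratio(tp, tp + fp),          # share of positive calls that are correct
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not results from the study.
    for name, value in diagnostic_metrics(tp=80, fp=10, tn=90, fn=20).items():
        print(f"{name}: {value:.3f}")
```

Unlike sensitivity or specificity alone, the Matthews Correlation Coefficient uses all four cells of the confusion matrix, which makes it a useful summary when the diagnoses are imbalanced.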

Published in Scandinavian Journal of Primary Health Care

ISSN: 0281-3432 (Print); 1502-7724 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Medicine: Public aspects of medicine
Website: https://www.tandfonline.com/journals/ipri
