Use of machine learning techniques for identifying ischemic stroke instead of the rule-based methods: a nationwide population-based study

Hyunsun Lim; Youngmin Park; Jung Hwa Hong; Ki-Bong Yoo; Kwon-Duk Seo

doi:10.1186/s40001-023-01594-6

European Journal of Medical Research (Jan 2024)

Affiliations

Hyunsun Lim: Department of Research and Analysis, National Health Insurance Service Ilsan Hospital
Youngmin Park: Department of Family Medicine, National Health Insurance Service Ilsan Hospital
Jung Hwa Hong: Department of Research and Analysis, National Health Insurance Service Ilsan Hospital
Ki-Bong Yoo: Division of Health Administration, Yonsei University
Kwon-Duk Seo: Department of Neurology, National Health Insurance Service Ilsan Hospital

DOI: https://doi.org/10.1186/s40001-023-01594-6
Journal volume & issue: Vol. 29, no. 1
pp. 1–9

Abstract

Background: Many studies have evaluated stroke using claims data; most of these studies have defined ischemic stroke using an operational definition following the rule-based method. Rule-based methods tend to overestimate the number of patients with ischemic stroke.

Objectives: We aimed to identify an appropriate algorithm for identifying stroke by applying machine learning (ML) techniques to claims data.

Methods: We obtained the data from the Korean National Health Insurance Service database, which is linked to the Ilsan Hospital database (n = 30,897). The performance of the prediction models (extreme gradient boosting [XGBoost] and gated recurrent unit [GRU]) was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision–recall curve (AUPRC), and calibration curves.

Results: In total, 30,897 patients were enrolled in this study, 3,145 of whom (10.18%) had ischemic stroke. XGBoost, a tree-based ML technique, achieved an AUROC of 94.46% and an AUPRC of 92.80%. GRU showed the highest accuracy (99.81%), precision (99.92%), and recall (99.69%).

Conclusions: We proposed recurrent neural network-based deep learning techniques to improve stroke phenotyping. These techniques can be expected to produce faster and more accurate results than the rule-based methods.
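The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the labels and scores below are invented toy values, the 0.5 decision threshold is an assumption, and the function names (`confusion_counts`, `accuracy_precision_recall`, `auroc`) are hypothetical helpers introduced only for illustration. AUROC is computed here through its pairwise-ranking interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case); AUPRC and calibration curves are left out for brevity.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy_precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Accuracy, precision, and recall from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This O(n_pos * n_neg) loop is fine
    for a toy example but too slow for a cohort of tens of thousands.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (invented): 1 = ischemic stroke, 0 = no ischemic stroke.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]

# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc, prec, rec = accuracy_precision_recall(y_true, y_pred)
auc = auroc(y_true, scores)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} AUROC={auc:.3f}")
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of measure.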

Published in European Journal of Medical Research

ISSN: 2047-783X (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://eurjmedres.biomedcentral.com
