Deep-learning-based natural-language-processing models to identify cardiovascular disease hospitalisations of patients with diabetes from routine visits’ text

Alessandro Guazzo; Enrico Longato; Gian Paolo Fadini; Mario Luca Morieri; Giovanni Sparacino; Barbara Di Camillo

doi:10.1038/s41598-023-45115-1

Scientific Reports (Nov 2023)

Affiliations

Alessandro Guazzo: Department of Information Engineering, University of Padova
Enrico Longato: Department of Information Engineering, University of Padova
Gian Paolo Fadini: Department of Medicine, University of Padova
Mario Luca Morieri: Department of Medicine, University of Padova
Giovanni Sparacino: Department of Information Engineering, University of Padova
Barbara Di Camillo: Department of Information Engineering, University of Padova

DOI: https://doi.org/10.1038/s41598-023-45115-1
Journal volume & issue: Vol. 13, no. 1, pp. 1–10

Abstract

Writing notes is the most widespread method of reporting clinical events. As a result, most of the information about a patient's disease history remains locked in free-form text. Natural language processing (NLP) offers a way to transform free-form text into structured data automatically. In the present work, electronic healthcare record data of patients with diabetes were used to develop deep-learning-based NLP models that automatically identify, within free-form text describing routine visits, the occurrence of hospitalisations related to cardiovascular diseases (CVDs), an outcome of diabetes. Four time windows of increasing expected difficulty were considered: infinite, 24 months, 12 months, and 6 months. Model performance was evaluated by means of the area under the precision-recall curve, as well as precision, recall, and F1-score after thresholding. Results showed that the proposed NLP approach was successful for both the infinite and 24-month windows, while, as expected, performance deteriorated with shorter time windows. Possible clinical applications of tools based on the proposed NLP approach include the retrospective filling of medical records with respect to a patient’s CVD history for epidemiological and research purposes as well as for clinical decision making.
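The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the labels, scores, and the 0.5 threshold below are made-up assumptions, and average precision is used here as one common step-wise estimate of the area under the precision-recall curve.

```python
def precision_recall_f1(y_true, y_score, threshold=0.5):
    """Precision, recall, and F1 after binarising scores at `threshold`.

    The default threshold of 0.5 is an illustrative assumption.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(y_true, y_score):
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision measured at the rank of each positive example.
    Assumes there are no tied scores.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    tp = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            total += tp / rank  # precision at this rank
    return total / n_pos


# Hypothetical data: 1 = the visit text mentions a CVD hospitalisation.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

precision, recall, f1 = precision_recall_f1(y_true, y_score)
auprc = average_precision(y_true, y_score)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} AUPRC~{auprc:.3f}")
```

The area under the precision-recall curve summarises performance across all thresholds and is often preferred to accuracy when positive cases are rare, whereas precision, recall, and F1 describe a single operating point.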

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
