Scientific Reports (Nov 2023)

Deep-learning-based natural-language-processing models to identify cardiovascular disease hospitalisations of patients with diabetes from routine visits’ text

  • Alessandro Guazzo,
  • Enrico Longato,
  • Gian Paolo Fadini,
  • Mario Luca Morieri,
  • Giovanni Sparacino,
  • Barbara Di Camillo

DOI
https://doi.org/10.1038/s41598-023-45115-1
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 10

Abstract

Read online

Abstract Writing notes is the most widespread method to report clinical events. Therefore, most of the information about the disease history of a patient remains locked behind free-form text. Natural language processing (NLP) provides a solution to automatically transform free-form text into structured data. In the present work, electronic healthcare records data of patients with diabetes were used to develop deep-learning based NLP models to automatically identify, within free-form text describing routine visits, the occurrence of hospitalisations related to cardiovascular disease (CVDs), an outcome of diabetes. Four possible time windows of increasing level of expected difficulty were considered: infinite, 24 months, 12 months, and 6 months. Model performance was evaluated by means of the area under the precision recall curve, as well as precision, recall, and F1-score after thresholding. Results showed that the proposed NLP approach was successful for both the infinite and 24-month windows, while, as expected, performance deteriorated with shorter time windows. Possible clinical applications of tools based on the proposed NLP approach include the retrospective filling of medical records with respect to a patient’s CVD history for epidemiological and research purposes as well as for clinical decision making.