Modeling of Electronic Health Records for Time-Variant Event Learning Beyond Bio-Markers—A Case Study in Prostate Cancer

J. Herp; Jan-Matthias Braun; M. L. Cantuaria; Ashkan Tashk; T. B. Pedersen; M. H. A. Poulsen; M. Krogh; E. S. Nadimi; S. P. Sheikh

doi:10.1109/ACCESS.2023.3272745

IEEE Access (Jan 2023)

Affiliations

J. Herp: Unit of Applied Artificial Intelligence and Data Science, The Maersk Mc-Kinney Møller Institute, Faculty of Engineering, University of Southern Denmark, Odense, Denmark
Jan-Matthias Braun: Unit of Applied Artificial Intelligence and Data Science, The Maersk Mc-Kinney Møller Institute, Faculty of Engineering, University of Southern Denmark, Odense, Denmark
M. L. Cantuaria: Unit of Applied Artificial Intelligence and Data Science, The Maersk Mc-Kinney Møller Institute, Faculty of Engineering, University of Southern Denmark, Odense, Denmark
Ashkan Tashk: Unit of Applied Artificial Intelligence and Data Science, The Maersk Mc-Kinney Møller Institute, Faculty of Engineering, University of Southern Denmark, Odense, Denmark
T. B. Pedersen: Department of Urology, Odense University Hospital, Odense, Denmark
M. H. A. Poulsen: Department of Urology, Odense University Hospital, Odense, Denmark
M. Krogh: Open, Odense University Hospital (OUH), Odense, Denmark
E. S. Nadimi: Unit of Applied Artificial Intelligence and Data Science, The Maersk Mc-Kinney Møller Institute, Faculty of Engineering, University of Southern Denmark, Odense, Denmark
S. P. Sheikh: Open, Odense University Hospital (OUH), Odense, Denmark

DOI: https://doi.org/10.1109/ACCESS.2023.3272745
Journal volume & issue: Vol. 11
pp. 50295 – 50309

Abstract

Electronic health records (EHR) of large populations constitute a vast untapped resource for data-driven diagnosis and disease progression. Using a set of $18\,529$ EHR, we develop a model capable of predicting future steps in a patient’s journey for prostate cancer (PC) and its metastases without relying on direct biomarker measurements. To this end, we 1) harmonise EHR without presumptions (events are sorted and grouped by fundamental a priori principles); 2) develop a new Long Short-Term Memory (LSTM) recurrent neural network node for learning temporal relations, on which we build an autoencoder-based model; and 3) derive a graph representation based on unsupervised $k$-means clustering of events related to PC in the autoencoder’s latent layer. We report $88\%$ prediction accuracy for the targeted metastasis-related events, and lower accuracies for more general events. The model gains interpretability through a graph representation illustrating the patient journey. Most importantly, we predict that $20\%$ of all PC-diagnosed patients will progress into metastatic disease one visit ahead of time. For the remaining patients we can predict the next step in their journey. We conclude that the model based on the new LSTM node provides a valuable tool for earlier diagnosis of life-threatening metastases and quality assurance of the procedure.
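
Step 3 of the abstract (clustering latent-layer event representations with $k$-means and reading the result as a graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the LSTM autoencoder is not reproduced, the 2-D "latent vectors" are made-up toy values, and the names `kmeans` and `transition_graph` are hypothetical. It only assumes that each patient visit has already been encoded as a vector and that graph edges count transitions between the clusters of consecutive visits.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; centroids start at the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid (Euclidean).
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


def transition_graph(journeys, k):
    """Cluster all visits, then count cluster-to-cluster steps within each journey."""
    flat = [vec for journey in journeys for vec in journey]
    labels, _ = kmeans(flat, k)
    edges = Counter()
    start = 0
    for journey in journeys:
        seq = labels[start:start + len(journey)]
        start += len(journey)
        # Consecutive visits of the same patient form a directed edge.
        edges.update(zip(seq, seq[1:]))
    return edges


# Toy latent vectors (hypothetical): two patients, three visits each.
journeys = [
    [(0.0, 0.1), (0.2, 0.0), (9.8, 10.0)],
    [(0.1, 0.2), (10.1, 9.9), (10.0, 10.2)],
]
edges = transition_graph(journeys, k=2)
for (src, dst), count in sorted(edges.items()):
    print(f"cluster {src} -> cluster {dst}: {count} transition(s)")
```

In a graph of this kind, an edge weight is simply the number of observed steps from one cluster of events to another, which is one plausible way to read a "patient journey" from clustered latent representations.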

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
