Clinical utility of automatic phenotype annotation in unstructured clinical notes: intensive care unit use

Jingqing Zhang; Luis Daniel Bolanos Trujillo; Ashwani Tanwar; Julia Ive; Vibhor Gupta; Yike Guo

doi:10.1136/bmjhci-2021-100519

BMJ Health & Care Informatics (Feb 2022)

Affiliations

Jingqing Zhang: Pangaea Data Limited, London, UK
Luis Daniel Bolanos Trujillo: Pangaea Data Limited, London, UK
Ashwani Tanwar: Pangaea Data Limited, London, UK
Julia Ive: Pangaea Data Limited, London, UK
Vibhor Gupta: Pangaea Data Limited, London, UK
Yike Guo: Pangaea Data Limited, London, UK

Journal volume & issue: Vol. 29, no. 1

Abstract

Objective: Clinical notes contain information that has not been documented elsewhere, including responses to treatment and clinical findings, which are crucial for predicting key outcomes in patients in acute care. In this study, we propose the automatic annotation of phenotypes from clinical notes as a method to capture essential information to predict outcomes in the intensive care unit (ICU). This information is complementary to typically used vital signs and laboratory test results.

Methods: In this study, we developed a novel phenotype annotation model to extract the phenotypical features of patients, which were then used as input features of predictive models to predict ICU patient outcomes. We demonstrated and validated this approach by conducting experiments on three ICU prediction tasks, including in-hospital mortality, physiological decompensation and length of stay (LOS) for over 24 000 patients using the Medical Information Mart for Intensive Care (MIMIC-III) dataset.

Results: The predictive models incorporating phenotypical information achieved 0.845 (area under the curve–receiver operating characteristic (AUC-ROC)) for in-hospital mortality, 0.839 (AUC-ROC) for physiological decompensation and 0.430 (kappa) for LOS, all of which consistently outperformed the baseline models using only vital signs and laboratory test results. Moreover, we conducted a thorough interpretability study showing that phenotypes provide valuable insights at both the patient and cohort levels.

Conclusion: The proposed approach demonstrates that phenotypical information complements traditionally used vital signs and laboratory test results and significantly improves the accuracy of outcome prediction in the ICU.
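
The abstract reports AUC-ROC for the two binary tasks and kappa for LOS but does not say how these were computed. As a rough illustration only, the sketch below shows one common way to compute both metrics in plain Python. The function names `auc_roc` and `cohen_kappa` are hypothetical, the toy labels and scores are invented for the example and are not from the study, and the kappa shown is the unweighted Cohen's kappa, which is an assumption (the study may use a weighted variant).

```python
from collections import Counter


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(truth, predicted):
    """Unweighted Cohen's kappa: agreement beyond what chance would give.
    Assumption: the study may instead use a weighted kappa for LOS buckets."""
    n = len(truth)
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    t_counts, p_counts = Counter(truth), Counter(predicted)
    expected = sum(t_counts[c] * p_counts[c] for c in t_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both label sets use a single identical class
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): binary outcome labels with predicted risk scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_roc(toy_labels, toy_scores)

# Toy data (invented): LOS bucket indices, true versus predicted.
toy_true_buckets = [0, 1, 2, 2, 1, 0]
toy_pred_buckets = [0, 1, 2, 1, 1, 0]
toy_kappa = cohen_kappa(toy_true_buckets, toy_pred_buckets)

print(f"AUC-ROC: {toy_auc:.3f}, kappa: {toy_kappa:.3f}")
```

An AUC-ROC of 0.5 corresponds to random ranking and a kappa of 0 to chance-level agreement, which gives some context for the reported values of 0.845, 0.839 and 0.430.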

Published in BMJ Health & Care Informatics

ISSN: 2632-1009 (Online)
Publisher: BMJ Publishing Group
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://informatics.bmj.com/
