Frontiers in Artificial Intelligence (May 2023)
Assessing longitudinal housing status using Electronic Health Record data: a comparison of natural language processing, structured data, and patient-reported history
Abstract
Introduction: Measuring long-term housing outcomes is important for evaluating the impacts of services for individuals with homeless experience. However, assessing long-term housing status using traditional methods is challenging. The Veterans Affairs (VA) Electronic Health Record (EHR) provides detailed data for a large population of patients with homeless experience and contains several indicators of housing instability, including structured data elements (e.g., diagnosis codes) and free-text clinical narratives. However, the validity of each of these data elements for measuring housing stability over time is not well studied.
Methods: We compared VA EHR indicators of housing instability, including information extracted from clinical notes using natural language processing (NLP), with patient-reported housing outcomes in a cohort of homeless-experienced Veterans.
Results: NLP achieved higher sensitivity and specificity than standard diagnosis codes for detecting episodes of unstable housing. Other structured data elements in the VA EHR showed promising performance, particularly when combined with NLP.
Discussion: Evaluation efforts and research studies assessing longitudinal housing outcomes should incorporate multiple sources of documentation to achieve optimal performance.
Keywords