JMIR Medical Informatics (Sep 2024)

Evaluating the Bias in Hospital Data: Automatic Preprocessing of Patient Pathways Algorithm Development and Validation Study

  • Laura Uhl,
  • Vincent Augusto,
  • Benjamin Dalmas,
  • Youenn Alexandre,
  • Paolo Bercelli,
  • Fanny Jardinaud,
  • Saber Aloui

DOI
https://doi.org/10.2196/58978
Journal volume & issue
Vol. 12
p. e58978

Abstract


Background: The optimization of patient care pathways is crucial for hospital managers in the context of a scarcity of medical resources. Assuming unlimited capacities, the pathway of a patient would be governed only by medical logic, so as to best meet the patient's needs. However, logistical limitations (eg, resources such as inpatient beds) are often associated with delayed treatments and may ultimately affect patient pathways. This is especially true for unscheduled patients, for instance when a patient in the emergency department needs to be admitted to another medical unit without disturbing the flow of planned hospitalizations.

Objective: In this study, we proposed a new framework to automatically detect activities in patient pathways that may be unrelated to patients' needs but rather induced by logistical limitations.

Methods: The scientific contribution lies in a method that transforms a database of historical pathways with bias into 2 databases: a labeled pathway database, where each activity is labeled as relevant (related to a patient's needs) or irrelevant (induced by logistical limitations), and a corrected pathway database, where each activity corresponds to the activity that would occur assuming unlimited resources. The labeling algorithm was assessed through medical expertise. In total, 2 case studies, one using process mining and one using discrete event simulation, quantified the impact of our method of preprocessing health care data.

Results: Focusing on unscheduled patient pathways, we collected data covering 12 months of activity at the Groupe Hospitalier Bretagne Sud in France. Our algorithm had 87% accuracy and demonstrated its usefulness for preprocessing traces and obtaining a clean database. The 2 case studies showed the importance of our preprocessing step before any analysis. The process graphs of the processed data had, on average, 40% (SD 10%) fewer variants than those of the raw data. The simulation revealed that 30% of the medical units had >1 bed difference in capacity between the processed and raw data.

Conclusions: Patient pathway data reflect the actual activity of hospitals, which is governed by both medical requirements and logistical limitations. Before using these data, these limitations should be identified and corrected. We anticipate that our approach can be generalized to obtain unbiased analyses of patient pathways for other hospitals.
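
To make the shape of the two derived databases described in the Methods more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the labeling rule, so the `Activity` fields (in particular `needed_unit`), the function names, and `toy_rule` are all hypothetical and chosen only for illustration. The sketch shows one raw pathway being turned into a labeled pathway (each activity tagged relevant or irrelevant) and a corrected pathway (each irrelevant activity replaced by the one assumed to occur with unlimited resources).

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Activity:
    unit: str         # medical unit where the patient actually stayed
    needed_unit: str  # HYPOTHETICAL field: unit assumed to match the patient's needs


Pathway = List[Activity]
Rule = Callable[[Activity], bool]


def label_pathway(pathway: Pathway, is_relevant: Rule) -> List[Tuple[Activity, str]]:
    """Tag each activity as 'relevant' or 'irrelevant' using the supplied rule."""
    return [(a, "relevant" if is_relevant(a) else "irrelevant") for a in pathway]


def correct_pathway(pathway: Pathway, is_relevant: Rule) -> Pathway:
    """Replace each irrelevant activity with the one assumed under unlimited resources."""
    return [a if is_relevant(a) else replace(a, unit=a.needed_unit) for a in pathway]


def toy_rule(a: Activity) -> bool:
    # ASSUMPTION: a stay is 'relevant' when it took place in the needed unit.
    # The real labeling rule is not given in the abstract.
    return a.unit == a.needed_unit


# Toy raw pathway: the second stay is in a unit the patient did not need.
raw = [
    Activity("emergency", "emergency"),
    Activity("surgery", "cardiology"),
    Activity("cardiology", "cardiology"),
]

labeled = label_pathway(raw, toy_rule)
corrected = correct_pathway(raw, toy_rule)

for activity, label in labeled:
    print(activity.unit, label)
print([a.unit for a in corrected])
```

Passing the rule in as a parameter keeps the two transformations independent of any particular labeling logic, which is the part the abstract leaves unspecified.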