WPOM : Working Papers on Operations Management (May 2021)

A data generator for covid-19 patients’ care requirements inside hospitals

  • Juan A. Marin-Garcia,
  • Angel Ruiz,
  • Maheut Julien,
  • Jose P. Garcia-Sabater

Journal volume & issue
Vol. 12, no. 1
pp. 76 – 115



This paper presents the generation of a plausible data set describing the care needs of COVID-19 patients with severe or critical symptoms. The possible stages of the illness were proposed in light of medical knowledge as of January 2021. The parameters of the data set were customized to fit the population of the Valencia region (Spain), with approximately 2.5 million inhabitants, and were based on the evolution of the pandemic between September 2020 and March 2021, a period that included two complete waves. Contrary to expectation, and despite European and national transparency laws (BOE-A2013-12887, 2013; European Parliament and Council of the European Union, 2019), actual COVID-19 pandemic data, at least in Spain, took considerable time to be updated and made available (usually a week or more). Moreover, some data needed to develop and validate hospital bed management models were not publicly accessible, for one of several reasons: the data were not collected; public agencies held them in their databases but did not release them; the data were published only as processed indicators rather than as raw data; or they were published in a format that was difficult to process (e.g., PDF image documents rather than CSV tables). Despite the potential of hospital information systems, some data were still not adequately captured within these systems. Furthermore, the data collected in a hospital depend on the strategies and practices specific to that hospital or health system. This limits the generalization of "real" data and encourages working with "realistic" or plausible data that are free of interactions with local variables or decisions (Gunal, 2012; Marin-Garcia et al., 2020). In addition, one can parameterize the model and define the data structure needed to run it without waiting until the real data become available.
Plausible data sets can be generated from publicly available information and, later, when real data become available, the accuracy of the model can be evaluated (Garcia-Sabater and Maheut, 2021). This work opens lines of future research, both theoretical and practical. From a theoretical point of view, it would be interesting to develop machine learning tools that, by analyzing specific data samples from real hospitals, can identify the parameters necessary for the automatic prototyping of generators adapted to each hospital. Regarding applied research, the formalism proposed for generating plausible patients is clearly not limited to patients affected by SARS-CoV-2 infection. The generation of heterogeneous patients can represent the needs of a specific population and serve as a basis for studying complex health service delivery systems.