Scientific Data (May 2024)

BOLD: Blood-gas and Oximetry Linked Dataset

  • João Matos
  • Tristan Struja
  • Jack Gallifant
  • Luis Nakayama
  • Marie-Laure Charpignon
  • Xiaoli Liu
  • Nicoleta Economou-Zavlanos
  • Jaime S. Cardoso
  • Kimberly S. Johnson
  • Nrupen Bhavsar
  • Judy Gichoya
  • Leo Anthony Celi
  • An-Kwok Ian Wong

DOI
https://doi.org/10.1038/s41597-024-03225-z
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 13

Abstract

Pulse oximeters measure peripheral arterial oxygen saturation (SpO2) noninvasively, while the gold standard (SaO2) requires arterial blood gas measurement. Pulse oximeter performance is known to differ across racial and ethnic groups. BOLD is a dataset that aims to underscore the importance of addressing biases in pulse oximetry accuracy, which disproportionately affect darker-skinned patients. The dataset was created by harmonizing three Electronic Health Record databases (MIMIC-III, MIMIC-IV, eICU-CRD) comprising Intensive Care Unit stays of US patients. Paired SpO2 and SaO2 measurements were time-aligned and combined with various other sociodemographic and clinical parameters to provide a detailed representation of each patient. BOLD includes 49,099 paired measurements, each taken within a 5-minute window and with oxygen saturation levels between 70–100%. Minority racial and ethnic groups account for approximately 25% of the data, a proportion seldom achieved in previous studies. The codebase is publicly available. Given the prevalent use of pulse oximeters in the hospital and at home, we hope that BOLD will be leveraged to develop debiasing algorithms that can result in more equitable healthcare solutions.
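The time alignment described in the abstract can be illustrated with a minimal sketch. This is not the authors' published code: the function name `pair_measurements`, the tuple layout, and the choice to match each SaO2 value with the most recent SpO2 reading taken at or before it are assumptions made for illustration. Only the 5-minute window and the 70–100% saturation range come from the abstract.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)      # pairing window stated in the abstract
SAT_MIN, SAT_MAX = 70.0, 100.0     # saturation range stated in the abstract


def pair_measurements(spo2, sao2, window=WINDOW):
    """Pair SaO2 values with a nearby SpO2 reading for one patient.

    spo2, sao2: lists of (timestamp, value) tuples.
    Assumption (hypothetical rule): each SaO2 is matched with the latest
    SpO2 recorded at or before it, if that reading lies within `window`.
    Returns a list of (sao2_time, sao2_value, spo2_time, spo2_value).
    """
    spo2 = sorted(spo2)
    spo2_times = [t for t, _ in spo2]
    pairs = []
    for t_abg, sa in sorted(sao2):
        # Index of the latest SpO2 reading at or before the blood gas time.
        i = bisect_right(spo2_times, t_abg) - 1
        if i < 0:
            continue  # no earlier SpO2 reading exists
        t_ox, sp = spo2[i]
        if t_abg - t_ox > window:
            continue  # reading is too old to count as paired
        if not (SAT_MIN <= sa <= SAT_MAX and SAT_MIN <= sp <= SAT_MAX):
            continue  # either value falls outside the 70-100% range
        pairs.append((t_abg, sa, t_ox, sp))
    return pairs


# Toy data, invented for illustration only.
t0 = datetime(2024, 1, 1, 12, 0)
spo2 = [
    (t0, 97.0),
    (t0 + timedelta(minutes=10), 95.0),
    (t0 + timedelta(minutes=30), 65.0),
]
sao2 = [
    (t0 + timedelta(minutes=3), 94.0),   # 3 min after an SpO2 reading: kept
    (t0 + timedelta(minutes=20), 93.0),  # nearest SpO2 is 10 min old: dropped
    (t0 + timedelta(minutes=31), 80.0),  # paired SpO2 of 65% is out of range: dropped
]
pairs = pair_measurements(spo2, sao2)
```

In this toy example only the first blood gas value survives both filters. The difference SpO2 minus SaO2 for a surviving pair (here 3.0 percentage points) is the kind of per-pair quantity that a study of pulse oximeter bias would then compare across patient groups.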