PLOS Digital Health (Oct 2023)

A scoping review of the landscape of health-related open datasets in Latin America.

  • David Restrepo,
  • Justin Quion,
  • Constanza Vásquez-Venegas,
  • Cleva Villanueva,
  • Leo Anthony Celi,
  • Luis Filipe Nakayama

DOI: https://doi.org/10.1371/journal.pdig.0000368
Journal volume & issue: Vol. 2, no. 10, p. e0000368

Abstract

Artificial intelligence (AI) algorithms have the potential to revolutionize healthcare, but their successful translation into clinical practice has been limited. One crucial factor is the data used to train these algorithms, which must be representative of the population. However, most healthcare databases are derived from high-income countries, leading to non-representative models and potentially exacerbating health inequities. This review focuses on the landscape of health-related open datasets in Latin America, aiming to identify existing datasets, examine data-sharing frameworks, techniques, platforms, and formats, and identify best practices in the region. The review found 61 datasets from 23 countries, with the DATASUS dataset from Brazil accounting for the majority of articles. The analysis revealed a dearth of datasets created by the authors themselves, indicating a reliance on existing open datasets. The findings underscore the importance of promoting open data in Latin America. We provide recommendations for enhancing data sharing in the region.