PLOS Digital Health (Oct 2023)
A scoping review of the landscape of health-related open datasets in Latin America.
Abstract
Artificial intelligence (AI) algorithms have the potential to revolutionize healthcare, but their successful translation into clinical practice has been limited. One crucial factor is the data used to train these algorithms, which must be representative of the population. However, most healthcare databases are derived from high-income countries, leading to non-representative models and potentially exacerbating health inequities. This review focuses on the landscape of health-related open datasets in Latin America, aiming to identify existing datasets, examine data-sharing frameworks, techniques, platforms, and formats, and identify best practices in Latin America. The review found 61 datasets from 23 countries, with the DATASUS dataset from Brazil contributing to the majority of articles. The analysis revealed a dearth of datasets created by the authors themselves, indicating a reliance on existing open datasets. The findings underscore the importance of promoting open data in Latin America. We provide recommendations for enhancing data sharing in the region.