BMC Public Health (Jun 2024)

Accuracy, potential, and limitations of probabilistic record linkage in identifying deaths by gender identity and sexual orientation in the state of Rio De Janeiro, Brazil

  • Ricardo de Mattos Russo Rafael,
  • Kleison Pereira da Silva,
  • Helena Gonçalves de Souza Santos,
  • Davi Gomes Depret,
  • Jaime Alonso Caravaca-Morera,
  • Karen Marie Lucas Breda

DOI
https://doi.org/10.1186/s12889-024-19002-x
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 13

Abstract

Background: Globally, counting deaths by gender identity and sexual orientation has been a challenge for health systems. In most cases, non-governmental organizations have taken on this work. Despite their efforts to generate information, the scarcity of official data significantly limits the formulation of policies and actions guided by population needs. This manuscript therefore aims to evaluate the accuracy, potential, and limits of probabilistic record linkage for yielding information on deaths according to gender identity and sexual orientation in the State of Rio de Janeiro.

Methods: This study evaluated the accuracy of probabilistic record linkage for obtaining information on deaths according to gender and sexual orientation. Data from two information systems, covering June 15, 2015 to December 31, 2020, were used. We constructed nine probabilistic linkage strategies and identified the performance and cutoff points of the best strategy.

Results: The best strategy used logical blocking on first and last names, with birthdate and mother's name in the pairing step. From a population base of 80,178 records, 1556 deaths were retrieved. With an area under the curve of 0.979, this strategy showed 93.26% accuracy, 98.46% sensitivity, and 90.04% specificity at a cutoff point of ≥ 17.9 on the linkage score. Adopting the cutoff point optimized the manual review phase, identifying 2259 (90.04%) of the 2509 false pairs and 1532 (98.46%) of the 1556 true pairs.

Conclusion: With possible strategies for probabilistic record linkage identified, retrieving mortality information according to sexual and gender markers has become feasible. Public policies that consider the LGBTQ+ population, when formulated from information generated in the daily routine of health services, more closely reflect the reality experienced by these population groups.
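The accuracy figures reported in the Results can be checked from the pair counts alone. The sketch below is a minimal illustration and not the authors' code. It assumes that the manually reviewed set consisted of exactly the 1556 true pairs and 2509 false pairs mentioned in the abstract, and that a pair counts as "identified" when the cutoff classifies it correctly. The function name `linkage_metrics` and its parameter names are hypothetical.

```python
def linkage_metrics(true_pairs, true_pairs_found, false_pairs, false_pairs_found):
    """Return sensitivity, specificity, and accuracy (as percentages).

    Assumption (not stated in the source): the reviewed set is exactly
    true_pairs + false_pairs, and "found" means correctly classified
    by the score cutoff.
    """
    tp = true_pairs_found                # true pairs at or above the cutoff
    fn = true_pairs - true_pairs_found   # true pairs missed by the cutoff
    tn = false_pairs_found               # false pairs below the cutoff
    fp = false_pairs - false_pairs_found # false pairs wrongly accepted

    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    accuracy = 100 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts taken from the abstract (cutoff point >= 17.9).
sensitivity, specificity, accuracy = linkage_metrics(
    true_pairs=1556, true_pairs_found=1532,
    false_pairs=2509, false_pairs_found=2259,
)
print(f"sensitivity={sensitivity:.2f}%  "
      f"specificity={specificity:.2f}%  "
      f"accuracy={accuracy:.2f}%")
```

Under these assumptions the three values agree with the 98.46%, 90.04%, and 93.26% stated in the abstract, which suggests that the reported accuracy is the share of all 4065 reviewed pairs classified correctly.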

Keywords