Shanghai yufang yixue (Jan 2024)

Application of Bayesian probabilistic linkage model in birth and death data linking

  • YU Huiting,
  • CAI Renzhi,
  • LIN Weixiao,
  • NI Jingyi,
  • QIAN Naisi,
  • XIA Tian,
  • WU Fan

DOI
https://doi.org/10.19428/j.cnki.sjpm.2024.23137
Journal volume & issue
Vol. 36, no. 1
pp. 98 – 103

Abstract

Objective: To elucidate the principles and methods of the Bayesian probabilistic linkage model, and to demonstrate the effect of applying the model to the linkage of birth and death data.

Methods: Data on 199 025 infants born in 2017 and 1 512 infants who died in 2017 and 2018 were collected from the Shanghai birth and death registration system. After cleaning, the data were divided into monthly blocks and fully linked. The Jaro-Winkler algorithm and Euclidean distance were used to measure the similarity of the fields used for matching. A Bayesian probabilistic linkage model was constructed, and the linkage performance was evaluated using a confusion matrix.

Results: The Bayesian probabilistic linkage model effectively linked the infant birth and death data, revealing that 36.71% of infants who died in Shanghai were born outside the city and that the probability of infant death was 2.6‰. The confusion matrix for the test set showed a recall of 0.86, a precision of 0.76, and an F-score of 0.81.

Conclusion: In practical application, Bayesian probabilistic linkage demonstrates good model performance and enables the establishment of birth-death cohorts that more accurately reflect the true level of infant mortality. Using this technique to integrate data from different departments can effectively improve research efficiency in the field of public health.
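The abstract does not give the model's details, so the following is only a minimal Python sketch of the general technique it names, not the authors' implementation. It assumes a naive-Bayes (Fellegi-Sunter style) formulation in which fields agree or disagree independently given match status. The function names, the similarity thresholds, the prior, and the per-field probabilities m and u are all hypothetical values chosen for illustration.

```python
import math


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        # Look for an unmatched, equal character within the sliding window.
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro-Winkler similarity: Jaro plus a bonus for a shared prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def euclidean_similarity(a, b) -> float:
    """Map Euclidean distance into (0, 1]; this scaling is an assumption."""
    return 1.0 / (1.0 + math.dist(a, b))


def match_posterior(agreements, m, u, prior: float) -> float:
    """Posterior probability that a record pair is a true match.

    m[i] = P(field i agrees | match), u[i] = P(field i agrees | non-match).
    Fields are assumed conditionally independent (an assumption of this sketch).
    """
    p_match = prior
    p_non = 1.0 - prior
    for agrees, m_i, u_i in zip(agreements, m, u):
        p_match *= m_i if agrees else 1.0 - m_i
        p_non *= u_i if agrees else 1.0 - u_i
    return p_match / (p_match + p_non)


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical record pair: a name field and a numeric field (e.g. birth weight).
    name_sim = jaro_winkler("MARTHA", "MARHTA")
    weight_sim = euclidean_similarity([3200.0], [3200.0])
    agreements = [name_sim >= 0.9, weight_sim >= 0.5]  # hypothetical thresholds
    post = match_posterior(agreements, m=[0.95, 0.90], u=[0.05, 0.10], prior=0.01)
    print(f"name similarity={name_sim:.4f}, posterior={post:.3f}")
    print(f"F-score={f_score(0.76, 0.86):.2f}")
```

As a consistency check, applying f_score to the reported precision (0.76) and recall (0.86) gives 0.81 after rounding, which agrees with the F-score stated in the Results.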

Keywords