Mathematics (Jun 2022)

Application of Data Science for Cluster Analysis of COVID-19 Mortality According to Sociodemographic Factors at Municipal Level in Mexico

  • Joaquín Pérez-Ortega,
  • Nelva Nely Almanza-Ortega,
  • Kirvis Torres-Poveda,
  • Gerardo Martínez-González,
  • José Crispín Zavala-Díaz,
  • Rodolfo Pazos-Rangel

DOI
https://doi.org/10.3390/math10132167
Journal volume & issue
Vol. 10, no. 13
p. 2167

Abstract

Mexico is among the five countries with the largest number of reported deaths from COVID-19, and the mortality rates associated with infection vary across the country because of structural population factors. This study analyzes clusters of COVID-19 mortality rates at the municipal level in Mexico from a Data Science perspective. To this end, we present a new application that uses a hybrid machine learning algorithm to generate clusters of municipalities with similar values of sociodemographic indicators and mortality rates. To provide a systematic framework, we applied an extension of the International Business Machines Corporation (IBM) methodology called Batch Foundation Methodology for Data Science (FMDS). The study used 1,086,743 death certificates from 2020, along with other official data. The analysis identified two key indicators related to COVID-19 mortality at the municipal level: population density and the percentage of the population living in poverty. Based on these indicators, 16 municipality clusters were determined. Among the main results, clusters with high mortality rates had high population density and low poverty levels, whereas clusters with low population density and high poverty levels had low mortality rates. Finally, we believe that the patterns found, expressed as clusters of municipalities with similar characteristics, can help health authorities make decisions on disease prevention and control, reinforce public health measures, and optimize the distribution of resources to reduce hospitalizations and mortality.
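
As a rough illustration of grouping municipalities by population density, poverty percentage, and mortality rate, the following minimal sketch runs a plain k-means on standardized indicators. It is not the hybrid algorithm used in the study, which the abstract does not specify. The six rows are invented toy values, not data from the study, and every function name, along with the deterministic farthest-point seeding, is an assumption made for this example.

```python
from statistics import mean

# Hypothetical toy rows, NOT study data:
# (population density per km^2, % population in poverty, deaths per 100,000)
municipalities = [
    (5000, 20, 150), (6000, 25, 170), (5500, 22, 160),  # dense, low poverty
    (20, 80, 20), (30, 75, 25), (25, 85, 15),           # sparse, high poverty
]

def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, mus)]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_seeds(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from chosen seeds."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return [list(s) for s in seeds]

def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style k-means; returns (labels, centroids)."""
    centroids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids

labels, centroids = kmeans(standardize(municipalities), k=2)

# Summarize each cluster in the original units for interpretation.
for j in sorted(set(labels)):
    rows = [m for m, lab in zip(municipalities, labels) if lab == j]
    density, poverty, mortality = (mean(c) for c in zip(*rows))
    print(f"cluster {j}: density={density:.0f}, poverty={poverty:.1f}%, "
          f"mortality={mortality:.0f} per 100k")
```

Standardizing first keeps population density, whose raw values are orders of magnitude larger than the other two indicators, from dominating the distance computation.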

Keywords