Acta Médica Portuguesa (Oct 2024)

Logistic Regression: Limitations in the Estimation of Measures of Association with Binary Health Outcomes

  • Lara Pinheiro-Guedes,
  • Clarisse Martinho,
  • Maria Rosário O. Martins

DOI
https://doi.org/10.20344/amp.21435
Journal volume & issue
Vol. 37, no. 10

Abstract

Introduction: Logistic regression models are frequently used to estimate measures of association between an exposure, health determinant or intervention, and a binary outcome. However, when the outcome is frequent (> 10%), the odds ratios these models produce might be biased estimates of relative risks and prevalence ratios. Despite the availability of several alternatives, many researchers still rely on logistic regression, and a consensus is yet to be reached. We aimed to compare the estimates and goodness-of-fit of logistic, log-binomial and robust Poisson regression models in cross-sectional studies involving frequent binary outcomes.

Methods: Two cross-sectional studies were conducted. Study 1 was a nationally representative study on the impact of air pollution on mental health. Study 2 was a local study on immigrants’ access to urgent healthcare services. Odds ratios (OR) were obtained through logistic regression, and prevalence ratios (PR) through log-binomial and robust Poisson regression models. Confidence intervals (CI), their ranges, and standard errors (SE) were also computed, along with each model's relative goodness-of-fit, assessed through the Akaike Information Criterion (AIC) when applicable.

Results: In Study 1, the OR (95% CI) was 1.015 (0.970 - 1.063), while the PR (95% CI) obtained through the robust Poisson model was 1.012 (0.979 - 1.045). The log-binomial regression model did not converge in this study. In Study 2, the OR (95% CI) was 1.584 (1.026 - 2.446), the PR (95% CI) for the log-binomial model was 1.217 (0.978 - 1.515), and the PR (95% CI) for the robust Poisson model was 1.130 (1.013 - 1.261). In both studies, the 95% CI were wider and the SE higher for the OR than for the PR. However, in Study 2, the AIC value was lower for the logistic regression model.

Conclusion: The OR overestimated the PR, with wider 95% CI and higher SE. The overestimation was greater when the study outcome was more prevalent, in line with previous studies. In Study 2, logistic regression was the model with the best fit, illustrating the need to consider multiple criteria when selecting the most appropriate statistical model for each study. Employing logistic regression models by default might lead to misinterpretations. Robust Poisson models are viable alternatives in cross-sectional studies with frequent binary outcomes, avoiding the non-convergence of log-binomial models.
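
The claim that the OR overestimates the PR more strongly as the outcome becomes more prevalent can be illustrated with simple arithmetic. The sketch below is not taken from the study: the prevalences, the fixed PR of 1.2, and the function names are hypothetical values chosen only for illustration. It holds the PR constant and shows how the corresponding OR moves away from it as the prevalence in the unexposed group rises.

```python
# Illustrative sketch only: all prevalences below are hypothetical,
# not data from Study 1 or Study 2.

def prevalence_ratio(p_exposed, p_unexposed):
    """PR: ratio of outcome prevalence in exposed vs. unexposed."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """OR: ratio of outcome odds in exposed vs. unexposed."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


FIXED_PR = 1.2  # assumed "true" prevalence ratio, kept constant throughout

results = []  # (prevalence in unexposed, corresponding OR)
for p_unexposed in (0.01, 0.10, 0.30, 0.60):
    p_exposed = FIXED_PR * p_unexposed
    results.append((p_unexposed, odds_ratio(p_exposed, p_unexposed)))

for p_unexposed, or_value in results:
    print(f"prevalence in unexposed = {p_unexposed:.2f}: "
          f"PR = {FIXED_PR:.3f}, OR = {or_value:.3f}")
```

With a rare outcome the OR stays close to the PR, while with a frequent outcome it is markedly larger, which is why reading an OR as if it were a PR can exaggerate the apparent association.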

Keywords