EClinicalMedicine (Aug 2022)

Effect of imbalanced sampling and missing data on associations between gender norms and risk of adolescent HIV

  • Ribhav Gupta,
  • Safa Abdalla,
  • Valerie Meausoone,
  • Nikitha Vicas,
  • Iván Mejía-Guevara,
  • Ann M. Weber,
  • Beniamino Cislaghi,
  • Gary L. Darmstadt

Journal volume & issue
Vol. 50
p. 101513

Abstract

Summary

Background: Despite strides towards gender equality, inequalities persist or remain unstudied, potentially because of data gaps. Although key data gaps have been mapped, their effects remain unknown. This study provides a framework to measure the effects of gender- and age-imbalanced sampling and missing covariate data on gender-health research. The framework is demonstrated using a previously studied pathway for the effects of pre-marital sex norms among adults on adolescent HIV risk.

Methods: After identifying gender-age-imbalanced Demographic and Health Survey (DHS) datasets, we resampled responses and restricted covariate data from a relatively complete, balanced dataset derived from the 2007 Zambian DHS to replicate imbalanced gender-age sampling and covariate missingness. Differences in model outcomes due to sampling were measured using tests for interaction. Missing covariate effects were measured by comparing fully-adjusted and reduced model fitness.

Findings: We simulated data from 25 DHS surveys across 20 countries from 2005 to 2014 on four sex-stratified models for pathways of adult attitude-behaviour discordance regarding pre-marital sex and adolescent risk of HIV. On average, across gender-age-imbalanced surveys, males comprised 29.6% of responses compared to 45.3% in the gender-balanced dataset. Gender-age-imbalanced sampling significantly affected regression coefficients in 40% of model-scenarios (N = 40 of 100) and biased relative-risk estimates away from gender-age-balanced sampling outcomes in 46% (N = 46) of model-scenarios. Model fitness was robust to covariate removal, with minor effects on male HIV models. No consistent trends were observed between sampling distribution and risk of biased outcomes.

Interpretation: Gender-health model outcomes may be affected by sampling gender-age-imbalanced data and, less so, by missing covariates. Although occasionally attenuated, the effect magnitude of gender-age-imbalanced sampling is variable and may mask true associations, thus misinforming policy dialogue. We recommend that future surveys improve balanced gender-age sampling to promote research reliability.

Funding: Bill & Melinda Gates Foundation grant OPP1140262 to Stanford University.
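
The resampling step described under Methods can be illustrated with a minimal sketch. This is not the authors' code: the function name `resample_to_target`, the record fields `sex` and `age_group`, the age bands, and the target shares are all hypothetical, and stratified sampling with replacement is only one plausible way to make a balanced dataset mimic an imbalanced gender-age distribution.

```python
import random


def resample_to_target(records, target_shares, n, seed=0):
    """Draw about n records with replacement so that each (sex, age_group)
    stratum makes up its target share of the sample.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)

    # Group the balanced source records by gender-age stratum.
    by_stratum = {}
    for rec in records:
        by_stratum.setdefault((rec["sex"], rec["age_group"]), []).append(rec)

    sample = []
    for stratum, share in target_shares.items():
        pool = by_stratum[stratum]
        k = round(share * n)  # number of draws for this stratum
        sample.extend(rng.choices(pool, k=k))  # sampling with replacement
    return sample


# Hypothetical balanced source: 250 records in each of four strata.
strata = [(s, a) for s in ("male", "female") for a in ("15-24", "25-49")]
balanced = [
    {"id": i, "sex": sex, "age_group": age}
    for i, (sex, age) in enumerate(strata * 250)
]

# Hypothetical imbalanced target distribution (shares sum to 1).
target = {
    ("male", "15-24"): 0.10,
    ("male", "25-49"): 0.20,
    ("female", "15-24"): 0.30,
    ("female", "25-49"): 0.40,
}

imbalanced = resample_to_target(balanced, target, n=1000)
male_share = sum(r["sex"] == "male" for r in imbalanced) / len(imbalanced)
```

Fitting the same model to `balanced` and to `imbalanced` and comparing the coefficients would then correspond loosely to the sampling comparison the abstract describes; the interaction tests themselves are not sketched here.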

Keywords