Genetics Selection Evolution (Nov 2011)

Simulation study for analysis of binary responses in the presence of extreme case problems

  • Rekaya Romdhane,
  • Sapp Robyn L,
  • Hay El H,
  • Davis Ryan,
  • Bertrand Joseph K

DOI
https://doi.org/10.1186/1297-9686-43-41
Journal volume & issue
Vol. 43, no. 1
p. 41

Abstract

Background
Estimates of variance components for binary responses in the presence of extreme case problems tend to be biased due to an under-identified likelihood. The bias persists even when a normal prior is used for the fixed effects.

Methods
A simulation study was carried out to investigate methods for the analysis of binary responses with extreme case problems. A linear mixed model that included a fixed effect and random effects of sire and residual on the liability scale was used to generate binary data. Five simulation scenarios were conducted based on varying percentages of extreme case problems, with true values of heritability equal to 0.07 and 0.17. Five replicates of each dataset were generated and analyzed with a generalized prior (g-prior) of varying weight.

Results
Point estimates of sire variance using a normal prior were severely biased when the percentage of extreme case problems was greater than 30%. Depending on the percentage of extreme case problems, a normal prior overestimated the sire variance by 36 to 102% for a heritability of 0.17 and by 25 to 105% for a heritability of 0.07. When a g-prior was used, the bias was reduced and even eliminated, depending on the percentage of extreme case problems and the weight assigned to the g-prior. The lowest Pearson correlations between true and estimated fixed effects were obtained when a normal prior was used. When a 15% g-prior was used instead of a normal prior with a heritability equal to 0.17, Pearson correlations between true and estimated fixed effects increased by 11, 20, 23, 27, and 60% for 5, 10, 20, 30 and 75% of extreme case problems, respectively. Within each dataset, however, these correlations were similar whether a 5, 10, or 15% g-prior was used. This indicates that a model with a g-prior provides a more adequate estimation of fixed effects.

Conclusions
The results suggest that when analyzing binary data with extreme case problems, bias in the estimation of variance components could be eliminated, or at least significantly reduced, by using a g-prior.
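
The data-generating step described in the Methods can be illustrated with a minimal sketch. It is not the authors' code. It assumes a threshold (probit) sire model with residual variance fixed at 1 on the liability scale, the usual sire-model relation h2 = 4 * s2 / (s2 + e2), and a threshold at zero. It also assumes that an "extreme case" means a fixed-effect level whose records all fall in the same response category. The function names, the number of levels, sires and records, and the distribution of the fixed-effect levels are all hypothetical choices made for illustration.

```python
import numpy as np


def sire_variance(h2, residual_var=1.0):
    """Sire variance implied by h2 = 4*s2 / (s2 + e2) (assumed sire-model relation)."""
    return h2 * residual_var / (4.0 - h2)


def simulate_binary(h2, n_levels=50, n_sires=100, n_records=2000, seed=0):
    """Generate binary records from a liability-scale sire model (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    s2 = sire_variance(h2)
    # Hypothetical fixed-effect levels; the abstract does not state how they were drawn.
    beta = rng.normal(0.0, 1.0, n_levels)
    sire = rng.normal(0.0, np.sqrt(s2), n_sires)
    level_id = rng.integers(0, n_levels, n_records)
    sire_id = rng.integers(0, n_sires, n_records)
    # Liability = fixed effect + sire effect + residual (residual variance 1).
    liability = beta[level_id] + sire[sire_id] + rng.normal(0.0, 1.0, n_records)
    # Assumed threshold at zero: the response is 1 when liability exceeds it.
    y = (liability > 0.0).astype(int)
    return y, level_id, sire_id


def extreme_case_fraction(y, level_id, n_levels):
    """Fraction of observed fixed-effect levels whose records are all 0 or all 1."""
    counts = np.bincount(level_id, minlength=n_levels)
    ones = np.bincount(level_id, weights=y, minlength=n_levels)
    observed = counts > 0
    extreme = observed & ((ones == 0) | (ones == counts))
    return extreme.sum() / observed.sum()


if __name__ == "__main__":
    for h2 in (0.07, 0.17):
        y, level_id, _ = simulate_binary(h2)
        frac = extreme_case_fraction(y, level_id, n_levels=50)
        print(f"h2={h2}: sire variance={sire_variance(h2):.4f}, "
              f"extreme-case levels={frac:.2%}")
```

In this sketch the share of extreme-case levels is only measured, not controlled. Reproducing the five scenarios would require a mechanism for fixing that percentage, which the abstract does not describe.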