BMC Genomics (Jul 2019)
Estimation of a significance threshold for genome-wide association studies
Abstract
Background
Selecting an appropriate statistical significance threshold in genome-wide association studies is critical for separating true associations from false positives while limiting false negatives. Several multiple-testing correction methods have been developed to determine the significance threshold; however, these methods may be overly conservative and may increase the number of false negatives. Here, we developed an empirical formula that determines the statistical significance threshold from the marker-based heritability of the trait. To develop the formula, we used 45 simulated traits in soybean, maize, and rice that varied in both broad-sense heritability and the number of QTLs.
Results
The formula was derived from a regression equation with one independent variable, marker-based heritability, and one response variable, the −log10(P) value. For all species, the threshold −log10(P) values increased as both marker-based and broad-sense heritability increased. Higher broad-sense heritability in these crops resulted in higher significance threshold values. Among the crop species, maize, which has lower linkage disequilibrium, had higher significance threshold values than soybean and rice.
Conclusions
Our formula was less conservative and identified more true positive associations than the false discovery rate and Bonferroni correction methods.
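For orientation, the two baseline methods named in the abstract can be expressed on the same −log10(P) scale. The sketch below is illustrative only: the function names, the default alpha of 0.05, and the use of the Benjamini-Hochberg procedure as the false discovery rate method are assumptions, not details taken from the paper. The heritability-based formula itself is not reproduced because the abstract does not state its coefficients.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Return -log10 of the Bonferroni per-test cutoff alpha / n_tests."""
    return -math.log10(alpha / n_tests)


def bh_threshold(p_values, alpha=0.05):
    """Return -log10 of the Benjamini-Hochberg critical value alpha * k / m,
    where k is the largest rank whose sorted p-value is <= alpha * k / m.
    Returns None when no test is rejected."""
    m = len(p_values)
    largest_rank = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= alpha * rank / m:
            largest_rank = rank
    if largest_rank == 0:
        return None
    return -math.log10(alpha * largest_rank / m)
```

Because the Benjamini-Hochberg critical value is never smaller than alpha / m, its threshold on the −log10(P) scale is never higher than the Bonferroni threshold for the same set of tests.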
Keywords