BMC Medical Research Methodology (Feb 2022)

Robust meta-analysis for large-scale genomic experiments based on an empirical approach

  • Sinjini Sikdar

DOI
https://doi.org/10.1186/s12874-022-01530-y
Journal volume & issue
Vol. 22, no. 1
pp. 1–12

Abstract


Background: Recent high-throughput technologies have opened avenues for the simultaneous analysis of thousands of genes. With a multitude of public databases available, one can easily access results from multiple genomic studies, where each study comprises significance testing results for thousands of genes. Researchers commonly combine the genomic information from these studies through meta-analysis. Because the number of genes involved is very large, the classical meta-analysis approaches need to be updated to acknowledge this large-scale aspect of the data.

Methods: In this article, we discuss how applying the standard theoretical null distributional assumptions of classical meta-analysis methods, such as Fisher’s p-value combination and Stouffer’s Z, can lead to incorrect significance testing results, and we propose a robust meta-analysis method that empirically modifies the individual test statistics and p-values before combining them.

Results: Our proposed meta-analysis method performs best in significance testing among several meta-analysis approaches, especially in the presence of hidden confounders, as shown through a wide variety of simulation studies and real genomic data analysis.

Conclusion: The proposed meta-analysis method produces superior meta-analysis results compared to the standard p-value combination approaches for large-scale simultaneous testing in genomic experiments. This is particularly useful in studies with a large number of genes, where the standard meta-analysis approaches can result in gross false discoveries due to the presence of unobserved confounding variables.
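For readers unfamiliar with the two classical combination rules named in the abstract, the following is a minimal Python sketch of Fisher's p-value combination and Stouffer's Z under the theoretical null. The function names are hypothetical. The third function, `empirical_null_rescale`, is only an illustrative stand-in for the idea of empirically adjusting each study's statistics before combining them: it recentres and rescales z-scores with the median and MAD, which is an assumption made here and not the estimator used in the article.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()  # theoretical N(0, 1) null


def fisher_combine(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under the theoretical null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # Fisher statistic divided by 2
    # Closed-form chi-square survival function, valid for even degrees of freedom (2k).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer_combine(zscores):
    """Stouffer's Z: sum(z) / sqrt(k) is N(0, 1) under the theoretical null (two-sided p)."""
    z = sum(zscores) / math.sqrt(len(zscores))
    return z, 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def empirical_null_rescale(zscores):
    """Recentre and rescale one study's z-scores across genes.

    ASSUMPTION: median/MAD is a simple robust stand-in for empirical null
    estimation; it is not the procedure proposed in the article.
    """
    center = median(zscores)
    mad = median(abs(z - center) for z in zscores)
    if mad == 0:
        raise ValueError("MAD is zero; cannot rescale")
    scale = 1.4826 * mad  # makes the MAD consistent with the SD for normal data
    return [(z - center) / scale for z in zscores]
```

In this sketch, each study's gene-level z-scores would be passed through `empirical_null_rescale` first, and the adjusted values for a given gene would then be combined across studies with `stouffer_combine` (or converted to p-values and passed to `fisher_combine`). If hidden confounders inflate the spread of a study's null z-scores, the rescaling shrinks them back toward unit scale before they are combined.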

Keywords