Royal Society Open Science (Aug 2024)

The assessment of replicability using the sum of p-values

  • Leonhard Held,
  • Samuel Pawel,
  • Charlotte Micheloud

DOI
https://doi.org/10.1098/rsos.240149
Journal volume & issue
Vol. 11, no. 8

Abstract

Statistical significance of both the original and the replication study is a commonly used criterion to assess replication attempts, also known as the two-trials rule in drug development. However, replication studies are sometimes conducted even though the original study is non-significant, in which case Type-I error rate control across both studies is no longer guaranteed. We propose an alternative method to assess replicability using the sum of p-values from the two studies. The approach provides a combined p-value and can be calibrated to control the overall Type-I error rate at the same level as the two-trials rule but allows for replication success even if the original study is non-significant. The unweighted version requires a less restrictive level of significance at replication if the original study is already convincing, which facilitates sample size reductions of up to 10%. Downweighting the original study accounts for possible bias and requires a more stringent significance level and larger sample sizes at replication. Data from four large-scale replication projects are used to illustrate and compare the proposed method with the two-trials rule, meta-analysis and Fisher’s combination method.
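As a rough illustration of the unweighted version, the sketch below combines two p-values through their sum and compares the result with the two-trials rule. It rests on assumptions the abstract does not spell out: both p-values are one-sided, independent and uniformly distributed under the null hypothesis, so that their sum follows the distribution of a sum of two uniform variables, and the per-study level is alpha = 0.025. The function names are hypothetical, and the weighted version is not covered.

```python
import math


def sum_p_combined(p_o: float, p_r: float) -> float:
    """Combined p-value from the sum of two p-values.

    Assumption: under the null, p_o and p_r are independent U(0, 1),
    so this is the CDF of the sum of two uniforms evaluated at p_o + p_r.
    """
    s = p_o + p_r
    if s <= 1.0:
        return s * s / 2.0
    return 1.0 - (2.0 - s) ** 2 / 2.0


def sum_rule_success(p_o: float, p_r: float, alpha: float = 0.025) -> bool:
    """Success if the combined p-value is at most alpha**2, the overall
    Type-I error rate of the two-trials rule (alpha is an assumed level)."""
    return sum_p_combined(p_o, p_r) <= alpha ** 2


def two_trials_success(p_o: float, p_r: float, alpha: float = 0.025) -> bool:
    """Two-trials rule: both studies significant at level alpha."""
    return p_o <= alpha and p_r <= alpha


def sum_threshold(alpha: float = 0.025) -> float:
    """Largest p_o + p_r that still counts as success: sqrt(2) * alpha."""
    return math.sqrt(2.0) * alpha


if __name__ == "__main__":
    # Illustrative p-value pairs, not data from the replication projects.
    for p_o, p_r in [(0.03, 0.004), (0.02, 0.02), (0.001, 0.03)]:
        print(p_o, p_r,
              "sum rule:", sum_rule_success(p_o, p_r),
              "two-trials:", two_trials_success(p_o, p_r))
```

Under these assumptions the success region is p_o + p_r <= sqrt(2) * alpha, about 0.035 for alpha = 0.025. A non-significant original study (for example p_o = 0.03) can therefore still lead to success if the replication p-value is small enough, and a convincing original study leaves more room for the replication p-value than the fixed threshold alpha would.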

Keywords