European Radiology Experimental (Jul 2024)
Sample size calculation for data reliability and diagnostic performance: a go-to review
Abstract
Sample size, namely the number of subjects that should be included in a study to reach the desired endpoint and statistical power, is a fundamental concept of scientific research. Sample size must be planned a priori and tailored to the main endpoint of the study. This avoids including too many subjects, which may expose them to additional risks while wasting time and resources, or too few subjects, which means the study fails to achieve its purpose. We offer a simple, go-to review of methods for sample size calculation for studies concerning data reliability (repeatability/reproducibility) and diagnostic performance. For studies concerning data reliability, we considered Cohen’s κ or intraclass correlation coefficient (ICC) for hypothesis testing, estimation of Cohen’s κ or ICC, and Bland-Altman analyses. With regard to diagnostic performance, we considered accuracy or sensitivity/specificity versus reference standards, the comparison of diagnostic performances, and the comparison of areas under the receiver operating characteristic curve. Finally, we considered the special cases of dropouts or retrospective case exclusions, multiple endpoints, lack of prior data estimates, and the selection of unusual thresholds for α and β errors. For the most frequent cases, we provide examples of software freely available on the Internet.
Relevance statement
Sample size calculation is a fundamental factor influencing the quality of studies on repeatability/reproducibility and diagnostic performance in radiology.
Key points
• Sample size is a concept related to precision and statistical power.
• It has ethical implications, especially when patients are exposed to risks.
• Sample size should always be calculated before starting a study.
• This review offers simple, go-to methods for sample size calculations.
Keywords