PLoS ONE (Jan 2021)

A computational reproducibility study of PLOS ONE articles featuring longitudinal data analyses.

  • Heidi Seibold,
  • Severin Czerny,
  • Siona Decke,
  • Roman Dieterle,
  • Thomas Eder,
  • Steffen Fohr,
  • Nico Hahn,
  • Rabea Hartmann,
  • Christoph Heindl,
  • Philipp Kopper,
  • Dario Lepke,
  • Verena Loidl,
  • Maximilian Mandl,
  • Sarah Musiol,
  • Jessica Peter,
  • Alexander Piehler,
  • Elio Rojas,
  • Stefanie Schmid,
  • Hannah Schmidt,
  • Melissa Schmoll,
  • Lennart Schneider,
  • Xiao-Yin To,
  • Viet Tran,
  • Antje Völker,
  • Moritz Wagner,
  • Joshua Wagner,
  • Maria Waize,
  • Hannah Wecker,
  • Rui Yang,
  • Simone Zellner,
  • Malte Nalenz

DOI
https://doi.org/10.1371/journal.pone.0251194
Journal volume & issue
Vol. 16, no. 6
p. e0251194

Abstract

Computational reproducibility is a cornerstone of sound and credible research. In complex statistical analyses, such as the analysis of longitudinal data, reproducing results is far from simple, especially if no source code is available. In this work we aimed to reproduce the analyses of longitudinal data in 11 articles published in PLOS ONE. Inclusion criteria were the availability of data and author consent. We investigated the types of methods and software used and whether we were able to reproduce the data analysis using open source software. Most articles provided overview tables and simple visualisations. Generalised Estimating Equations (GEEs) were the most popular statistical models among the selected articles. Only one article used open source software and only one published part of the analysis code. Reproduction was difficult in most cases and required reverse engineering of results or contacting the authors. For three articles we were not able to reproduce the results, and for another two we could reproduce only parts of them. For all but two articles we had to contact the authors to be able to reproduce the results. Our main lesson is that reproducing papers is difficult if no code is supplied, which places a high burden on those conducting the reproductions. Open data policies in journals are good, but to truly boost reproducibility we suggest adding open code policies.