Analyzing Clustered Data: Why and How to Account for Multiple Observations Nested within a Study Participant?

Erika L Moen; Catherine J Fricano-Kugler; Bryan W Luikart; A James O'Malley

doi:10.1371/journal.pone.0146721

PLoS ONE (Jan 2016)

DOI: https://doi.org/10.1371/journal.pone.0146721
Journal volume & issue: Vol. 11, no. 1
p. e0146721

Abstract

A conventional study design among medical and biological experimentalists involves collecting multiple measurements from a study subject. For example, experiments utilizing mouse models in neuroscience often involve collecting multiple neuron measurements per mouse to increase the number of observations without requiring a large number of mice. This leads to a form of statistical dependence referred to as clustering. Inappropriate analyses of clustered data have resulted in several recent critiques of neuroscience research that suggest the bar for statistical analyses within the field is set too low. We compare naïve analytical approaches to marginal, fixed-effect, and mixed-effect models and provide guidelines for when each of these models is most appropriate based on study design. We demonstrate the influence of clustering on a between-mouse treatment effect, a within-mouse treatment effect, and an interaction effect between the two. Our analyses demonstrate that these statistical approaches can give substantially different results, primarily when the analyses include a between-mouse treatment effect. In an analysis that is novel from a neuroscience perspective, we also refine the mixed-effect approach by adapting an advanced modeling technique used in social science research: we include an aggregate mouse-level counterpart to a within-mouse (neuron-level) treatment as an additional predictor, and we show that this yields more informative results. Based on these findings, we emphasize the importance of appropriately analyzing clustered data, and we intend this work to serve as a resource for deciding which approach best suits a given study.
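
To make the clustering problem in the abstract concrete, the following is a minimal sketch in Python, not the authors' analysis. It simulates neurons nested within mice under a between-mouse treatment with no true effect, then compares a naïve analysis (every neuron treated as independent) with a simple cluster-aware stand-in (one mean per mouse). Aggregating to mouse means is used here only because it needs nothing beyond the standard library; it is not one of the marginal, fixed-effect, or mixed-effect models the paper compares. All function names, sample sizes, and variance values are hypothetical choices made for illustration.

```python
import random
import statistics
from math import sqrt


def simulate(n_mice=20, n_neurons=20, mouse_sd=1.0, neuron_sd=1.0,
             effect=0.0, seed=0):
    """Return one (treated, neuron_values) pair per mouse (hypothetical data)."""
    rng = random.Random(seed)
    data = []
    for mouse in range(n_mice):
        treated = mouse % 2  # alternate mice between control (0) and treatment (1)
        mouse_shift = rng.gauss(0.0, mouse_sd)  # shared by every neuron in this mouse
        values = [effect * treated + mouse_shift + rng.gauss(0.0, neuron_sd)
                  for _ in range(n_neurons)]
        data.append((treated, values))
    return data


def diff_and_se(control, treated):
    """Difference in means and its standard error (unequal-variance form)."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = sqrt(statistics.variance(control) / len(control)
              + statistics.variance(treated) / len(treated))
    return diff, se


def naive_analysis(data):
    """Pool all neurons and ignore which mouse they came from."""
    control = [v for t, vals in data if t == 0 for v in vals]
    treated = [v for t, vals in data if t == 1 for v in vals]
    return diff_and_se(control, treated)


def aggregated_analysis(data):
    """Reduce each mouse to its mean, so the mouse is the unit of analysis."""
    control = [statistics.mean(vals) for t, vals in data if t == 0]
    treated = [statistics.mean(vals) for t, vals in data if t == 1]
    return diff_and_se(control, treated)


def intraclass_correlation(data):
    """One-way ANOVA estimate of the ICC for equal-sized clusters.

    Treatment group is ignored, which is reasonable here only because the
    simulated treatment effect is zero.
    """
    k = len(data)        # number of mice
    m = len(data[0][1])  # neurons per mouse
    means = [statistics.mean(vals) for _, vals in data]
    grand = statistics.mean(means)
    msb = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    msw = sum((v - mu) ** 2
              for (_, vals), mu in zip(data, means) for v in vals) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)


if __name__ == "__main__":
    data = simulate()
    print("naive (diff, se):     ", naive_analysis(data))
    print("aggregated (diff, se):", aggregated_analysis(data))
    print("estimated ICC:        ", intraclass_correlation(data))
```

With equal numbers of neurons per mouse, both analyses estimate the same difference in means; what changes is the standard error. Under these assumed settings the naïve standard error is expected to be noticeably smaller, which is how ignoring clustering can overstate the evidence for a between-mouse treatment effect.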

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/