PLoS ONE (Jan 2021)

Influence of number of individuals and observations per individual on a model of community structure.

  • Julia Sunga,
  • Quinn M R Webber,
  • Hugh G Broders

DOI
https://doi.org/10.1371/journal.pone.0252471
Journal volume & issue
Vol. 16, no. 6
p. e0252471

Abstract

Social network analysis is increasingly applied to understand animal groups. However, it is rarely feasible to observe every interaction among all individuals in natural populations. Studies have assessed how missing information affects estimates of individual network positions, but less attention has been paid to metrics that characterize overall network structure, such as modularity, clustering coefficient, and density. In groups displaying fission-fusion dynamics, where subgroups break apart and rejoin in changing conformations, missing information may affect estimates of global network structure differently than in groups with distinctly separated communities, because of the influence single individuals can have on the connectivity of the network. Using a bat maternity group showing fission-fusion dynamics, we quantify the effect of missing data on global network measures, including community detection. In our system, estimating the number of communities was less reliable than detecting community structure. Further, reliably assorting individual bats into communities required fewer individuals and fewer observations per individual than estimating the number of communities did. Specifically, our metrics of global network structure (i.e., graph density, clustering coefficient, Rcom) approached the 'real' values with increasing numbers of observations per individual and, as the number of individuals included increased, the variance in these estimates decreased. Consistent with previous studies, we recommend prioritizing more observations per individual over including more individuals when resources are limited. We recommend caution when drawing conclusions about animal social networks when a substantial number of individuals or observations are missing and, when possible, suggest subsampling large datasets to observe how estimates are influenced by sampling intensity.
Our study serves as an example of the reliability, or lack thereof, of global network measures with missing information, but further work is needed to determine how estimates will vary with different data collection methods, network structures, and sampling periods.
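
The subsampling recommendation can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not describe their implementation, so the data layout (pairs of individual and roosting event), the rule that two individuals are linked if they were ever recorded in the same event, and the function names graph_density, subsample, and density_by_effort are all assumptions made for illustration. The sketch covers only graph density, one of the global metrics named above.

```python
import random
from collections import defaultdict
from itertools import combinations


def graph_density(observations):
    """Density of a co-occurrence network built from (individual, event) pairs.

    Assumption: two individuals share an edge if they appear in the same event.
    """
    members = defaultdict(set)  # event -> individuals recorded together
    individuals = set()
    for ind, event in observations:
        members[event].add(ind)
        individuals.add(ind)
    edges = set()
    for group in members.values():
        edges.update(combinations(sorted(group), 2))
    n = len(individuals)
    possible = n * (n - 1) / 2
    return len(edges) / possible if possible else 0.0


def subsample(observations, n_individuals, obs_per_individual, rng):
    """Keep a random subset of individuals and of each one's observations."""
    by_ind = defaultdict(list)
    for ind, event in observations:
        by_ind[ind].append(event)
    chosen = rng.sample(sorted(by_ind), n_individuals)
    kept = []
    for ind in chosen:
        events = by_ind[ind]
        k = min(obs_per_individual, len(events))
        kept.extend((ind, e) for e in rng.sample(events, k))
    return kept


def density_by_effort(observations, n_individuals, obs_levels, reps=100, seed=0):
    """Density estimates at each sampling intensity, over repeated subsamples."""
    rng = random.Random(seed)
    return {
        k: [
            graph_density(subsample(observations, n_individuals, k, rng))
            for _ in range(reps)
        ]
        for k in obs_levels
    }


# Hypothetical toy data: (individual, event) records.
toy = [("a", 1), ("b", 1), ("c", 1), ("a", 2), ("b", 2), ("c", 3), ("d", 3)]
full_density = graph_density(toy)
estimates = density_by_effort(toy, n_individuals=4, obs_levels=[1, 2], reps=50)
```

Comparing the spread of estimates[k] against full_density at each level of k gives a rough picture of how sensitive the metric is to sampling intensity in a given dataset.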