Methods in Ecology and Evolution (Sep 2024)

Clustering and unconstrained ordination with Dirichlet process mixture models

  • Christian Stratton,
  • Andrew Hoegh,
  • Thomas J. Rodhouse,
  • Jennifer L. Green,
  • Katharine M. Banner,
  • Kathryn M. Irvine

DOI
https://doi.org/10.1111/2041-210X.14389
Journal volume & issue
Vol. 15, no. 9
pp. 1720 – 1732

Abstract

Assessment of similarity in species composition or abundance across sampled locations is a common goal in multi‐species monitoring programs. Existing ordination techniques provide a framework for clustering sample locations based on species composition by projecting high‐dimensional community data into a low‐dimensional, latent ecological gradient representing species composition. However, these techniques require specification of the number of distinct ecological communities present in the latent space, which can be difficult to determine in advance. We develop an ordination model capable of simultaneous clustering and ordination that allows for estimation of the number of clusters present in the latent ecological gradient. This model draws latent coordinates for each sample location from a Dirichlet process mixture model, affording researchers probabilistic statements about the number of clusters present in the latent ecological gradient. The model is compared to existing methods for simultaneous clustering and ordination via simulation and applied to two empirical datasets; JAGS code to fit the proposed model is provided in an appendix. The first dataset concerns presence‐absence records of fish in the Doubs River in eastern France and the second dataset describes presence‐absence records of plant species in Craters of the Moon National Monument and Preserve (CRMO) in Idaho, USA. Results from both analyses align with existing ecological gradients at each location. Development of the Dirichlet process ordination model provides wildlife managers with data‐driven inferences about the number of distinct communities present across monitored locations, allowing for more cost‐effective monitoring and reliable decision‐making for conservation management.
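To illustrate the idea of drawing latent site coordinates from a Dirichlet process mixture, the following is a minimal Python sketch that simulates from such a prior using the Chinese restaurant process representation. It is not the authors' JAGS model and performs no model fitting. The function names, the concentration parameter `alpha`, the two-dimensional Gaussian base measure, and the standard deviations are all assumptions chosen for illustration only.

```python
import random


def crp_assignments(n_sites, alpha, rng):
    """Assign sites to clusters with the Chinese restaurant process.

    Site i joins existing cluster k with probability proportional to the
    number of sites already in k, or opens a new cluster with probability
    proportional to alpha. The number of clusters is therefore random.
    """
    assignments = []
    counts = []  # counts[k] = number of sites currently in cluster k
    for _ in range(n_sites):
        weights = counts + [alpha]  # last entry is the "new cluster" option
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments


def simulate_latent_coordinates(n_sites=30, alpha=1.0, seed=1):
    """Simulate 2-D latent coordinates from a DP mixture prior (illustrative)."""
    rng = random.Random(seed)
    z = crp_assignments(n_sites, alpha, rng)
    n_clusters = max(z) + 1
    # Assumed base measure: cluster centres drawn from N(0, 3^2) on each axis.
    centres = [(rng.gauss(0, 3), rng.gauss(0, 3)) for _ in range(n_clusters)]
    # Assumed within-cluster spread: N(centre, 0.5^2) on each axis.
    coords = [
        (rng.gauss(centres[k][0], 0.5), rng.gauss(centres[k][1], 0.5))
        for k in z
    ]
    return z, coords


z, coords = simulate_latent_coordinates()
print("clusters occupied:", len(set(z)))
```

In this sketch, larger values of `alpha` tend to produce more occupied clusters. In a fitted model, the posterior distribution of the number of occupied clusters is what would support probabilistic statements about how many communities are present.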

Keywords