PLoS ONE (Jan 2018)
Quantifying cell densities and biovolumes of phytoplankton communities and functional groups using scanning flow cytometry, machine learning and unsupervised clustering.
Abstract
Scanning flow cytometry (SFCM) is characterized by the measurement of time-resolved pulses of fluorescence and scattering, enabling the high-throughput quantification of phytoplankton morphology and pigmentation. Quantifying variation at the single cell and colony level improves our ability to understand dynamics in natural communities. Automated high-frequency monitoring of these communities is presently limited by the absence of repeatable, rapid protocols to analyse SFCM datasets, where images of individual particles are not available. Here we demonstrate a repeatable, semi-automated method to (1) rapidly clean SFCM data from a phytoplankton community by removing signals that do not belong to live phytoplankton cells, (2) classify individual cells into trait clusters that correspond to functional groups, and (3) quantify the biovolumes of individual cells, the total biovolume of the whole community and the total biovolumes of the major functional groups. Our method involves the development of training datasets using lab cultures, the use of an unsupervised clustering algorithm to identify trait clusters, and machine learning tools (random forests) to (1) evaluate variable importance, (2) classify data points, and (3) estimate biovolumes of individual cells. We provide example datasets and R code for our analytical approach that can be adapted for analysis of datasets from other flow cytometers or scanning flow cytometers.