Towards operational phytoplankton recognition with automated high-throughput imaging, near-real-time data processing, and convolutional neural networks

Kaisa Kraft; Otso Velhonoja; Tuomas Eerola; Sanna Suikkanen; Timo Tamminen; Lumi Haraguchi; Pasi Ylöstalo; Sami Kielosto; Milla Johansson; Lasse Lensu; Heikki Kälviäinen; Heikki Haario; Jukka Seppälä

doi:10.3389/fmars.2022.867695

Frontiers in Marine Science (Sep 2022)

Affiliations

Kaisa Kraft: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Otso Velhonoja: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Tuomas Eerola: Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland
Sanna Suikkanen: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Timo Tamminen: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Lumi Haraguchi: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Pasi Ylöstalo: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Sami Kielosto: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland
Milla Johansson: Finnish Meteorological Institute, Helsinki, Finland
Lasse Lensu: Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland
Heikki Kälviäinen: Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland
Heikki Haario: Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland
Jukka Seppälä: Finnish Environment Institute, Marine Research Centre, Helsinki, Finland

DOI: https://doi.org/10.3389/fmars.2022.867695
Journal volume & issue: Vol. 9

Abstract

Plankton communities form the basis of aquatic ecosystems, and elucidating their role in increasingly important environmental issues is a persistent research question. Recent technological advances in automated microscopic imaging, together with cloud platforms for high-performance computing, have created possibilities for collecting and processing detailed high-frequency data on planktonic communities, opening new horizons for testing core hypotheses in aquatic ecosystems. Analyzing continuous streams of big data calls for the development and deployment of novel computer vision and machine learning systems. The implementation of these analysis systems is not always straightforward with regard to operationality, and issues regarding data flows, computing, and data treatment need to be considered. We created a data pipeline for automated near-real-time classification of phytoplankton during remote deployment of an imaging flow cytometer (Imaging FlowCytobot, IFCB). A convolutional neural network (CNN) is used to classify continuous imaging data, with probability thresholds used to filter out images not belonging to our existing classes. The automated data flow and classification system were used to monitor dominant species of filamentous cyanobacteria on the coast of Finland during summer 2021. We demonstrate that good phytoplankton recognition can be achieved with transfer learning utilizing a relatively shallow, publicly available, pre-trained CNN model and fine-tuning it with community-specific phytoplankton images (overall F1-score of 0.95 for a test set of our labeled image data complemented with a 50% unclassifiable image portion). This enables both fast training and low computing resource requirements for model deployment, making the system easy to modify and applicable in a wide range of situations. The system performed well when used to classify a natural phytoplankton community over different seasons (overall F1-score of 0.82 for our evaluation data set). Furthermore, we address the key challenges of image classification for varying planktonic communities and analyze the practical implications of confused classes. We published our labeled image data set of the Baltic Sea phytoplankton community for the training of image recognition models (~63000 images in 50 classes) to accelerate implementation of imaging systems for other brackish and freshwater communities. Our evaluation data set, 59 fully annotated samples of natural communities throughout an annual cycle, is also available for model testing purposes (~150000 images).
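
The abstract describes filtering CNN outputs with probability thresholds so that images outside the known classes are set aside, and it reports results as F1-scores. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes the network emits one raw score (logit) per class and that a threshold is defined for each class. The class names, threshold values, and function names are hypothetical placeholders, and the averaging used for the overall F1-scores quoted above is not specified here.

```python
import math


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(logits, class_names, thresholds,
                            reject_label="unclassified"):
    """Return (label, probability) for one image.

    The most probable class is accepted only if its probability reaches
    that class's threshold; otherwise the image gets the reject label.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    name = class_names[best]
    if probs[best] >= thresholds[name]:
        return name, probs[best]
    return reject_label, probs[best]


def f1_score(true_labels, predicted_labels, target):
    """F1-score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical class names and thresholds, for illustration only.
CLASS_NAMES = ["class_a", "class_b", "class_c"]
THRESHOLDS = {"class_a": 0.9, "class_b": 0.9, "class_c": 0.9}

confident = classify_with_threshold([4.0, 0.0, 0.0], CLASS_NAMES, THRESHOLDS)
ambiguous = classify_with_threshold([1.0, 0.8, 0.9], CLASS_NAMES, THRESHOLDS)
```

In this sketch the first image is accepted because its top probability clears the threshold, while the second is rejected because no class dominates. Raising a class's threshold trades recall for precision in that class, which is one reason per-class values may be preferable to a single global cutoff.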

Published in Frontiers in Marine Science

ISSN: 2296-7745 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Natural history (General): General. Including nature conservation, geographical distribution
Website: https://www.frontiersin.org/journals/marine-science
