Frontiers in Marine Science (Nov 2019)
SHiPCC—A Sea-going High-Performance Compute Cluster for Image Analysis
Abstract
Marine image analysis faces a multitude of challenges: data sets easily reach terabyte scale; the underwater visual signal is often impaired to the point where its information content becomes negligible; and human interpreters are scarce and, because of the annotation effort involved, can only focus on subsets of the available data. Solutions to speed up the analysis process have been presented in the literature in the form of semi-automation with artificial intelligence methods such as machine learning. However, the algorithms employed to automate the analysis commonly rely on large-scale compute infrastructure, and so far such infrastructure has only been available on shore. Here, a mobile compute cluster is presented that brings big image data analysis capabilities out to sea. The Sea-going High-Performance Compute Cluster (SHiPCC) units are mobile, robustly designed to operate with electrically impure ship-based power supplies, and based on off-the-shelf computer hardware. Each unit comprises up to eight compute nodes with graphics processing units for efficient image analysis and internal storage to manage the big image data sets. The first SHiPCC unit has been successfully deployed at sea. It allowed us to extract semantic and quantitative information from a terabyte-sized image data set within 1.5 h (a relative speedup of 97% compared to a single four-core CPU computer). Having such compute capability available at sea makes it possible to include image-derived information in the cruise research plan, for example by determining promising sampling locations. The SHiPCC units are envisioned to generally improve the relevance and importance of optical imagery for marine science.
Keywords