BMC Biology (Nov 2022)
Synthetic Micrographs of Bacteria (SyMBac) allows accurate segmentation of bacterial cells using deep neural networks
Abstract
Abstract Background Deep-learning–based image segmentation models are required for accurate processing of high-throughput timelapse imaging data of bacterial cells. However, the performance of any such model strictly depends on the quality and quantity of training data, which is difficult to generate for bacterial cell images. Here, we present a novel method of bacterial image segmentation using machine learning models trained with Synthetic Micrographs of Bacteria (SyMBac). Results We have developed SyMBac, a tool that allows for rapid, automatic creation of arbitrary amounts of training data, combining detailed models of cell growth, physical interactions, and microscope optics to create synthetic images which closely resemble real micrographs, and is capable of training accurate image segmentation models. The major advantages of our approach are as follows: (1) synthetic training data can be generated virtually instantly and on demand; (2) these synthetic images are accompanied by perfect ground truth positions of cells, meaning no data curation is required; (3) different biological conditions, imaging platforms, and imaging modalities can be rapidly simulated, meaning any change in one’s experimental setup no longer requires the laborious process of manually generating new training data for each change. Deep-learning models trained with SyMBac data are capable of analysing data from various imaging platforms and are robust to drastic changes in cell size and morphology. Our benchmarking results demonstrate that models trained on SyMBac data generate more accurate cell identifications and precise cell masks than those trained on human-annotated data, because the model learns the true position of the cell irrespective of imaging artefacts. We illustrate the approach by analysing the growth and size regulation of bacterial cells during entry and exit from dormancy, which revealed novel insights about the physiological dynamics of cells under various growth conditions. Conclusions The SyMBac approach will help to adapt and improve the performance of deep-learning–based image segmentation models for accurate processing of high-throughput timelapse image data.
Keywords