BMC Biology (Mar 2022)
Assignment of unimodal probability distribution models for quantitative morphological phenotyping
Abstract
Background Cell morphology is a complex and integrative readout, and therefore, an attractive measurement for assessing the effects of genetic and chemical perturbations to cells. Microscopic images provide rich information on cell morphology; therefore, subjective morphological features are frequently extracted from digital images. However, measured datasets are fundamentally noisy; thus, estimation of the true values is an ultimate goal in quantitative morphological phenotyping. Ideal image analyses require precision, such as proper probability distribution analyses to detect subtle morphological changes, recall to minimize artifacts due to experimental error, and reproducibility to confirm the results.
Results Here, we present UNIMO (UNImodal MOrphological data), a reliable pipeline for precise detection of subtle morphological changes by assigning unimodal probability distributions to morphological features of the budding yeast cells. By defining the data type, followed by validation using the model selection method, examination of 33 probability distributions revealed nine best-fitting probability distributions. The modality of the distribution was then clarified for each morphological feature using a probabilistic mixture model. Using a reliable and detailed set of experimental log data of wild-type morphological replicates, we considered the effects of confounding factors. As a result, most of the yeast morphological parameters exhibited unimodal distributions that can be used as basic tools for powerful downstream parametric analyses. The power of the proposed pipeline was confirmed by reanalyzing morphological changes in non-essential yeast mutants and detecting 1284 more mutants with morphological defects compared with a conventional approach (Box–Cox transformation). Furthermore, the combined use of canonical correlation analysis permitted global views on the cellular network as well as new insights into possible gene functions.
Conclusions Based on statistical principles, we showed that UNIMO offers better predictions of the true values of morphological measurements. We also demonstrated how these concepts can provide biologically important information. This study draws attention to the necessity of employing a proper approach to do more with less.
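As an illustration of the general idea of assigning a probability distribution to a feature by model selection, the following minimal sketch compares two candidate distributions for a positive-valued measurement. It is not the UNIMO implementation. The choice of the Akaike information criterion (AIC), the restriction to two candidates (Gaussian and log-normal) and all function names are assumptions made for this example; the abstract states only that 33 distributions were examined using a model selection method.

```python
import math
import random


def gaussian_loglik(xs):
    """Maximized Gaussian log-likelihood (closed-form MLE of mean and variance)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # MLE uses 1/n, not 1/(n-1)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(xs):
    """Maximized log-normal log-likelihood; requires strictly positive data."""
    logs = [math.log(x) for x in xs]
    # Log-normal density of x equals the normal density of log(x) divided by x.
    return gaussian_loglik(logs) - sum(logs)


def aic(loglik, n_params):
    """Akaike information criterion; lower is better (assumed criterion)."""
    return 2 * n_params - 2 * loglik


def select_model(xs):
    """Return the name of the lowest-AIC candidate and all AIC scores."""
    scores = {
        "gaussian": aic(gaussian_loglik(xs), 2),
        "lognormal": aic(lognormal_loglik(xs), 2),
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical usage with simulated, right-skewed "cell size" values.
rng = random.Random(0)
sizes = [rng.lognormvariate(0.0, 0.8) for _ in range(2000)]
best, scores = select_model(sizes)
```

A fuller pipeline of the kind described above would extend the candidate set according to the data type of each feature (for example, ratios or non-negative values) and would separately test each feature for unimodality with a mixture model; neither step is shown in this sketch.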
Keywords