BMC Bioinformatics (Oct 2022)
Single object profiles regression analysis (SOPRA): a novel method for analyzing high-content cell-based screens
Abstract
Background
High-content screening (HCS) experiments generate complex data from multiple object features for each cell within a treated population. Usually, these data are analyzed using population-averaged values of the features of interest, which increases the number of false positives and the need for intensive follow-up validation. Therefore, there is a strong need for novel approaches that predict hits reproducibly by identifying significantly altered cell populations.

Results
Here we describe SOPRA, a workflow for analyzing image-based HCS data based on regression analysis of non-averaged object features from cell populations, which can be run on hundreds of samples using different cell features. Following plate-wise normalization, the values are counted within predetermined binning intervals, generating a unique frequency distribution profile (histogram) for each population, which is then normalized to control populations (control-based normalization). These control-normalized frequency distribution profiles are analyzed using the Bioconductor R package maSigPro. Although maSigPro was originally developed to analyze time profiles, it also identifies statistically significantly altered frequency distributions when integrated into the SOPRA workflow. Finally, significantly changed profiles can be used to generate a heatmap from which altered cell populations with similar phenotypes can be identified, enabling the detection of siRNAs and compounds with the same ‘on-target’ profile and reducing the number of false positive hits.

Conclusions
SOPRA is a novel analysis workflow for the detection of statistically significant normalized frequency distribution profiles of cellular features generated in high-throughput RNAi screens. To validate the SOPRA software workflow, we used a screen for cell cycle progression, in which we were able to identify such profiles for siRNA-mediated gene perturbations and chemical inhibitors of different cell cycle stages.
The SOPRA software is freely available from GitHub.
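
The binning and control-based normalization steps described in the Results can be illustrated with a minimal Python sketch. This is not the SOPRA implementation: the function names, the median-based plate-wise scaling, and the choice to normalize by subtracting the control profile from the sample profile are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def platewise_normalize(values, plate_control_values):
    """Scale single-cell values by the median of the plate's control cells.

    Assumption: median scaling is used here only as one plausible
    plate-wise normalization; the abstract does not state the method.
    """
    return np.asarray(values, dtype=float) / np.median(plate_control_values)


def frequency_profile(values, bin_edges):
    """Count values in predetermined bins and return relative frequencies.

    Values outside the outermost bin edges are ignored by np.histogram.
    """
    counts, _ = np.histogram(values, bins=bin_edges)
    total = counts.sum()
    if total == 0:
        return np.zeros(len(counts))
    return counts / total


def control_normalized_profile(sample_values, control_values, bin_edges):
    """Difference between sample and control frequency profiles.

    Assumption: subtraction is a hypothetical stand-in for the
    control-based normalization mentioned in the abstract.
    """
    sample = frequency_profile(sample_values, bin_edges)
    control = frequency_profile(control_values, bin_edges)
    return sample - control


# Hypothetical single-cell feature values (for example, a DNA content
# feature) for one control well and one treated well on the same plate.
bin_edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
control_cells = np.array([0.5, 0.5, 1.5, 1.5])
treated_cells = np.array([0.5, 1.5, 2.5, 2.5])

profile = control_normalized_profile(treated_cells, control_cells, bin_edges)
print(profile)
```

In this toy example, positive entries mark bins that are enriched in the treated population relative to the control, and negative entries mark depleted bins. One such profile per sample, with the bin index playing the role that time plays in a time course, is the kind of input that a profile regression tool such as maSigPro could then test for significant deviation.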
Keywords