Advanced Intelligent Systems (Mar 2022)
Biological Characteristics of Cell Similarity Measure
Abstract
Similarity measures play an important role in many fields of data analysis. However, current cell similarity measures make poor use of the characteristics of the data. Herein, a novel similarity measure named segment weighting similarity (SWS) is developed for the analysis of single‐cell Raman spectra. SWS segments the spectra according to the biological characteristics of cells and quantifies the significant factors in each region, which increases the contribution of intrinsic biological features and reduces noise. Similarity heat maps of SWS and three traditional similarity measures (cosine similarity, Pearson correlation coefficient, and Euclidean distance) show that SWS has high accuracy and low bias in distinguishing cell spectra. K‐nearest‐neighbor classifiers achieve an identification accuracy, sensitivity, and specificity of 0.852, 0.853, and 0.965, respectively. The purity of the clustering model can increase by 0.31 in some K‐means and spectral clustering tasks. The classification and clustering results demonstrate that SWS is more effective than the common similarity measures. Built on the basic data and intrinsic biological characteristics, SWS offers a new idea and formulation for similarity measurement in most Raman and infrared technologies and has great potential for enhancing the performance of machine learning algorithms.
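The general idea of segmenting spectra and weighting each region can be illustrated with a minimal sketch. This is not the SWS formula itself, which the abstract does not state. It assumes that each segment is scored with plain cosine similarity and that the per-segment scores are combined with normalized weights. The function names, the `boundaries` split points, and the `weights` values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Standard cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def segment_weighted_similarity(a, b, boundaries, weights):
    """Illustrative segment-weighted similarity (an assumption, not the paper's SWS).

    boundaries: indices at which both spectra are split into segments
                (hypothetical stand-in for biologically motivated regions).
    weights:    one non-negative weight per segment (hypothetical values);
                they are normalized to sum to 1 before use.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("spectra must have the same length")

    weights = np.asarray(weights, dtype=float)
    segments_a = np.split(a, boundaries)
    segments_b = np.split(b, boundaries)
    if len(segments_a) != len(weights):
        raise ValueError("need exactly one weight per segment")

    weights = weights / weights.sum()
    # Weighted average of the per-segment cosine similarities.
    return float(sum(w * cosine_similarity(sa, sb)
                     for w, sa, sb in zip(weights, segments_a, segments_b)))


# Toy spectra of length 12, split into three segments of four points each.
rng = np.random.default_rng(0)
x = rng.random(12) + 0.1
y = rng.random(12) + 0.1
score = segment_weighted_similarity(x, y, boundaries=[4, 8], weights=[1.0, 2.0, 1.0])
```

In this sketch a larger weight makes a region count more toward the final score, which is one simple way to emphasize informative bands over noisy ones. A segment whose values are all zero would produce an undefined cosine similarity, so real spectra would need that case handled.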
Keywords