Taiyuan Ligong Daxue xuebao (Sep 2023)

Classification Method of Pancreatic Single Cells Based on Improved Large Margin Nearest Neighbor

  • Ziyi XI,
  • Jiayu LU,
  • Zhuo CHEN,
  • Jie XIANG,
  • Bin WANG

DOI
https://doi.org/10.16355/j.tyut.1007-9432.2023.05.008
Journal volume & issue
Vol. 54, no. 5
pp. 812 – 819

Abstract

Purpose: Cell type identification is a key step in the analysis of single-cell RNA sequencing (scRNA-seq) data. Methods: To address the low classification accuracy on scRNA-seq data and the inadequate measurement of distance characteristics for each cell type, a metric learning method that combines Large Margin Nearest Neighbor (LMNN) with Multi-Similarity Loss (MSL), termed MSL-LMNN, is proposed to adapt LMNN to single-cell classification. MSL measures similarity from multiple perspectives. It addresses a weakness of the triplet loss used in LMNN, which makes poor use of the relationships between sample pairs when training samples are few, and thereby improves single-cell classification. Findings: Experiments on the pancreatic single-cell datasets baron_human and segerstolpe show that MSL-LMNN achieves higher classification accuracy than mainstream metric learning methods, and that MSL-LMNN combined with Random Forest improves on existing single-cell classification methods, reaching an accuracy of 0.96. Conclusions: The proposed MSL-LMNN can accurately and effectively identify cell types in pancreatic single-cell sequencing data and has practical application value.
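
The abstract does not give the loss formula, so the following is only a minimal sketch of the commonly cited formulation of Multi-Similarity Loss on cosine similarities, not the authors' implementation. The function name `multi_similarity_loss` and the hyperparameter values `alpha`, `beta` and `lam` are illustrative assumptions. The pair-mining step of the published loss and its integration into LMNN are omitted.

```python
import numpy as np

def multi_similarity_loss(embeddings, labels, alpha=2.0, beta=50.0, lam=0.5):
    """Multi-Similarity loss over a batch (illustrative sketch, no pair mining).

    alpha, beta and lam are assumed example values, not taken from the paper.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Cosine similarity between every pair of samples.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T

    # Positive pairs share a label (self-pairs excluded); negatives do not.
    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)
    neg_mask = ~same

    # Positive term penalizes same-class pairs whose similarity falls below lam.
    pos_term = np.log1p(np.sum(np.exp(-alpha * (sim - lam)) * pos_mask, axis=1)) / alpha
    # Negative term penalizes different-class pairs whose similarity exceeds lam.
    neg_term = np.log1p(np.sum(np.exp(beta * (sim - lam)) * neg_mask, axis=1)) / beta

    return float(np.mean(pos_term + neg_term))
```

Because every anchor contributes all of its positive and negative pairs at once, rather than one triplet at a time, a loss of this form uses more of the pairwise relationships in a small batch, which is the property the abstract attributes to MSL.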

Keywords