Computational and Structural Biotechnology Journal (Jan 2021)
SCSit: A high-efficiency preprocessing tool for single-cell sequencing data from SPLiT-seq
Abstract
SPLiT-seq provides a low-cost platform for generating single-cell data by labeling the cellular origin of RNA through four rounds of combinatorial barcoding. However, an automatic and rapid method for preprocessing and classifying single-cell sequencing (SCS) data from SPLiT-seq, one that directly identifies and labels combinatorially barcoded reads and distinguishes special cell sequencing data, is currently lacking. Here, we develop SCSit, a high-efficiency preprocessing tool for SCS data from SPLiT-seq. SCSit directly identifies the combinatorial barcodes and UMIs of cell types, obtains more labeled reads, and markedly increases the amount of SCS data retained, owing to its exact alignment of insertions and deletions. Compared with the original method used in SPLiT-seq, the consistency of reads identified by SCSit increases to 97%, and the number of mapped reads is twice that of the original method. Furthermore, the runtime of SCSit is less than 10% of that of the original method. SCSit can accurately and rapidly analyze raw SPLiT-seq data to obtain labeled reads, and it effectively improves the single-cell data obtained from the SPLiT-seq platform. The data and source code of SCSit are available on GitHub at https://github.com/shang-qian/SCSit.
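
The idea of identifying a barcode while tolerating insertions and deletions can be illustrated with a minimal sketch. This is not SCSit's implementation: the function names `edit_distance` and `match_barcode`, the example whitelist, and the distance threshold of 1 are all hypothetical and chosen only for illustration. The sketch compares an observed barcode segment with each whitelisted barcode using Levenshtein distance, which counts substitutions, insertions, and deletions, and accepts the closest barcode only when it is unique.

```python
from typing import Optional, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: substitutions, insertions, and deletions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def match_barcode(observed: str,
                  whitelist: Sequence[str],
                  max_dist: int = 1) -> Optional[str]:
    """Return the unique closest whitelist barcode within max_dist, else None."""
    scored = sorted((edit_distance(observed, bc), bc) for bc in whitelist)
    if not scored or scored[0][0] > max_dist:
        return None  # nothing close enough
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # two barcodes are equally close: ambiguous
    return scored[0][1]


# Hypothetical 8-nt whitelist, for illustration only.
WHITELIST = ["AACGTGAT", "CATCAAGT", "GTCTGTCA"]

exact = match_barcode("AACGTGAT", WHITELIST)       # exact match
deletion = match_barcode("AACGGAT", WHITELIST)     # one base deleted
insertion = match_barcode("AACGTTGAT", WHITELIST)  # one base inserted
unmatched = match_barcode("TTTTTTTT", WHITELIST)   # too far from every barcode
```

A Hamming-distance comparison would reject the reads with a deleted or inserted base, because every downstream position would appear mismatched; an edit-distance comparison keeps them, which is one way that indel-aware matching can raise the number of labeled reads.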