Hybrid Spectral Library Combining DIA-MS Data and a Targeted Virtual Library Substantially Deepens the Proteome Coverage
Ronghui Lou, Pan Tang, Kang Ding, Shanshan Li, Cuiping Tian, Yunxia Li, Suwen Zhao, Yaoyang Zhang, Wenqing Shui
Affiliations
Ronghui Lou
iHuman Institute, ShanghaiTech University, Shanghai 201210, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; University of Chinese Academy of Sciences, Beijing 100049, China
Pan Tang
iHuman Institute, ShanghaiTech University, Shanghai 201210, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; University of Chinese Academy of Sciences, Beijing 100049, China
Kang Ding
iHuman Institute, ShanghaiTech University, Shanghai 201210, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; University of Chinese Academy of Sciences, Beijing 100049, China
Shanshan Li
iHuman Institute, ShanghaiTech University, Shanghai 201210, China
Cuiping Tian
iHuman Institute, ShanghaiTech University, Shanghai 201210, China
Yunxia Li
Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 201210, China
Suwen Zhao
iHuman Institute, ShanghaiTech University, Shanghai 201210, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; Corresponding author
Yaoyang Zhang
Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 201210, China; Corresponding author
Wenqing Shui
iHuman Institute, ShanghaiTech University, Shanghai 201210, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; Corresponding author
Summary: Data-independent acquisition mass spectrometry (DIA-MS) is a powerful technique that enables relatively deep proteomic profiling with superior quantification reproducibility. DIA data mining predominantly relies on a spectral library of sufficient proteome coverage that, in most cases, is built on data-dependent acquisition-based analysis of the same sample. To expand the proteome coverage for a pre-determined protein family, we report the construction of a hybrid spectral library that supplements a DIA experiment-derived library with a protein family-targeted virtual library predicted by deep learning. Leveraging this DIA hybrid library substantially deepens the coverage of three transmembrane protein families (G protein-coupled receptors [GPCRs], ion channels, and transporters) in mouse brain tissues, with increases in protein identification of 37%–87% and in peptide identification of 58%–161%. Moreover, of the 412 novel GPCR peptides exclusively identified with the DIA hybrid library strategy, 53.6% were validated as present in mouse brain tissues based on orthogonal experimental measurement.
Subject Areas: Analytical Chemistry, Biological Sciences, Classification of Proteins, Proteomics