Proteomic datasets of HeLa and SiHa cell lines acquired by DDA-PASEF and diaPASEF
Zelu Huang,
Weijia Kong,
Bertrand Jernhan Wong,
Huanhuan Gao,
Tiannan Guo,
Xianming Liu,
Xiaoxian Du,
Limsoon Wong,
Wilson Wen Bin Goh
Affiliations
Zelu Huang
School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore
Weijia Kong
School of Biological Sciences, Nanyang Technological University, Singapore; Department of Computer Science, National University of Singapore, Singapore
Bertrand Jernhan Wong
School of Biological Sciences, Nanyang Technological University, Singapore
Huanhuan Gao
Zhejiang Provincial Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Zhejiang, China; Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Zhejiang, China
Tiannan Guo
Zhejiang Provincial Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Zhejiang, China; Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Zhejiang, China
Xianming Liu
Bruker (Beijing) Scientific Technology Co., Ltd, Shanghai, China
Xiaoxian Du
Bruker (Beijing) Scientific Technology Co., Ltd, Shanghai, China
Limsoon Wong
Department of Computer Science, National University of Singapore, Singapore; Corresponding author at: School of Biological Sciences and Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore.
Wilson Wen Bin Goh
School of Biological Sciences, Nanyang Technological University, Singapore; Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; Corresponding author at: School of Biological Sciences, Nanyang Technological University, Singapore.
We present four proteomic profiling datasets of HeLa and SiHa cell lines associated with the research described in the paper “PROTREC: A probability-based approach for recovering missing proteins based on biological networks” [1]. Each cell line was profiled by two data acquisition methods: Data-Dependent Acquisition with Parallel Accumulation-Serial Fragmentation (DDA-PASEF), and Parallel Accumulation-Serial Fragmentation combined with data-independent acquisition (diaPASEF) [2,3]. Protein assembly was performed after searching against the Swiss-Prot Human database, using Peaks Studio for the DDA datasets and Spectronaut for the DIA datasets. The assembled results contain the PSMs, peptides and proteins identified above threshold for each HeLa and SiHa sample. Coverage-wise, approximately 6,090 and 7,298 proteins were quantified by DDA-PASEF for the HeLa and SiHa samples, while 13,339 and 8,773 proteins were quantified by diaPASEF for the HeLa and SiHa samples, respectively. Consistency-wise, diaPASEF has fewer missing values (∼2%) than its DDA counterparts (∼5–7%). The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the iProX partner repository [4] with the dataset identifier PXD029773.
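The missing-value percentages quoted above can be illustrated with a minimal sketch. It assumes one common definition, namely the share of empty cells in a protein-by-sample quantification matrix; the abstract does not state the exact definition used, so this is an assumption. The function name `missing_fraction` and the toy values are hypothetical and are not taken from the deposited data.

```python
import math
from typing import Optional, Sequence


def missing_fraction(matrix: Sequence[Sequence[Optional[float]]]) -> float:
    """Return the fraction of missing cells in a protein-by-sample matrix.

    A cell counts as missing if it is None or NaN (assumed convention).
    """
    total = 0
    missing = 0
    for row in matrix:
        for value in row:
            total += 1
            if value is None or (isinstance(value, float) and math.isnan(value)):
                missing += 1
    if total == 0:
        raise ValueError("matrix is empty")
    return missing / total


# Toy matrix: 4 proteins (rows) x 3 replicate runs (columns).
# The intensities are made up purely for illustration.
toy = [
    [10.2, 10.5, 10.1],
    [8.7, None, 8.9],
    [12.0, 12.3, float("nan")],
    [9.4, 9.6, 9.5],
]
toy_missing = missing_fraction(toy)
print(f"{toy_missing:.1%} of cells are missing")
```

Applied separately to a DDA-PASEF matrix and a diaPASEF matrix exported from the search software, the same calculation would give comparable per-method percentages, provided both exports mark missing values in the same way.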