Data in Brief (Apr 2022)

DIA proteomics data from a UPS1-spiked E.coli protein mixture processed with six software tools

  • Clarisse Gotti,
  • Florence Roux-Dalvai,
  • Charles Joly-Beauparlant,
  • Loïc Mangnier,
  • Mickaël Leclercq,
  • Arnaud Droit

Journal volume & issue
Vol. 41
p. 107829

Abstract

In this article, we provide a proteomic reference dataset that was initially generated to benchmark software tools for Data-Independent Acquisition (DIA) analysis. This large dataset includes 96 DIA .raw files acquired from a complex proteomic standard composed of an E. coli protein background spiked with 8 different concentrations of 48 human proteins (UPS1, Sigma). These 8 samples were analyzed in triplicate on an Orbitrap mass spectrometer with 4 different DIA window schemes. We also provide the spectral libraries and FASTA file used for their analysis, and the software outputs of the six tools used in this study: DIA-NN, Spectronaut, ScaffoldDIA, DIA-Umpire, Skyline and OpenSWATH. The dataset also contains post-processed quantification tables in which the peptides and proteins have been validated, their intensities normalized and the missing values imputed with a noise value. All the files are available on ProteomeXchange. Altogether, these files represent the most comprehensive DIA reference dataset acquired on an Orbitrap instrument published to date. It will be a useful resource for proteomic scientists who wish to assess the performance of DIA software tools or test their processing pipelines, for software developers improving their tools or developing new ones, and for students training in proteomics data analysis.
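
The figure of 96 .raw files follows from the acquisition design described above: 8 spike-in concentrations, each injected in triplicate, under each of 4 DIA window schemes. The short Python sketch below enumerates that design as a run manifest. It is only an illustration: the abstract gives the counts but not the concentration values or the window-scheme names, so the labels `conc_1` … `conc_8` and `scheme_1` … `scheme_4` are hypothetical placeholders, not the names used in the deposited files.

```python
from itertools import product

# Hypothetical labels: the abstract states only the counts, not the actual
# concentration values or DIA window-scheme names.
concentrations = [f"conc_{i}" for i in range(1, 9)]    # 8 UPS1 spike-in levels
replicates = [1, 2, 3]                                 # triplicate injections
window_schemes = [f"scheme_{i}" for i in range(1, 5)]  # 4 DIA window schemes

# One acquisition per (scheme, concentration, replicate) combination.
design = [
    {"scheme": s, "concentration": c, "replicate": r}
    for s, c, r in product(window_schemes, concentrations, replicates)
]

print(len(design))  # 96
```

A manifest of this kind can be joined to the real file names once they are known, for example to check that no acquisition is missing before comparing software outputs.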

Keywords