Scientific Reports (Jun 2023)
Quantifiable peptide library bridges the gap for proteomics based biomarker discovery and validation on breast cancer
Abstract
Mass spectrometry (MS)-based proteomics is widely used for biomarker discovery. However, most biomarker candidates from discovery are often discarded during validation. Such discrepancies between biomarker discovery and validation arise from several factors, chiefly differences in analytical methodology and experimental conditions. Here, we generated a peptide library that allows biomarkers to be discovered under the same settings as the validation process, thereby making the transition from discovery to validation more robust and efficient. Construction of the peptide library began with a list of 3393 proteins detectable in blood, compiled from public databases. For each protein, surrogate peptides favorable for detection by mass spectrometry were selected and synthesized. A total of 4683 synthesized peptides were spiked into neat serum and plasma samples to assess their quantifiability within a 10 min liquid chromatography-MS/MS run. This led to the PepQuant library, which is composed of 852 quantifiable peptides that cover 452 human blood proteins. Using the PepQuant library, we discovered 30 candidate biomarkers for breast cancer. Among the 30 candidates, nine biomarkers (FN1, VWF, PRG4, MMP9, CLU, PRDX6, PPBP, APOC1, and CHL1) were validated. By combining the quantification values of these markers, we built a machine learning model for predicting breast cancer that achieved an average area under the receiver operating characteristic curve (AUC) of 0.9105.
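For readers less familiar with the reported metric, the following is a minimal sketch of how an area under the ROC curve can be computed from class labels and model scores. It uses the rank-based (Mann-Whitney) formulation and is not the authors' pipeline; the function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = case, 0 = control; scores are model outputs.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. An "average" AUC such as the one reported would typically be the mean of this quantity over repeated train/test splits, though the exact averaging scheme is not stated in the abstract.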