Nat Methods. 2017 Sep;14(9):903-908.
doi: 10.1038/nmeth.4390. Epub 2017 Aug 7.

PECAN: library-free peptide detection for data-independent acquisition tandem mass spectrometry data

Ying S Ting et al. Nat Methods. 2017 Sep.

Abstract

Data-independent acquisition (DIA) is an emerging mass spectrometry (MS)-based technique for unbiased and reproducible measurement of protein mixtures. DIA tandem mass spectrometry spectra are often highly multiplexed, containing product ions from multiple cofragmenting precursors. Detecting peptides directly from DIA data is therefore challenging; most DIA data analyses require spectral libraries. Here we present PECAN (http://pecan.maccosslab.org), a library-free, peptide-centric tool that robustly and accurately detects peptides directly from DIA data. PECAN reports evidence of detection based on product ion scoring, which enables detection of low-abundance analytes with poor precursor ion signal. We demonstrate the chromatographic peak picking accuracy and peptide detection capability of PECAN, and we further validate its detection with data-dependent acquisition and targeted analyses. Lastly, we used PECAN to build a plasma proteome library from DIA data and to query known sequence variants.

Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The MacCoss Lab at the University of Washington has a sponsored research agreement with Thermo Fisher Scientific, the manufacturer of the instrumentation used in this research. Additionally, MJM is a paid consultant for Thermo Fisher Scientific.

Figures

Figure 1. Overview of PECAN workflow
PECAN takes DIA data, peptides of interest, and a background proteome database as inputs, and outputs evidence of detection with auxiliary scores for every query peptide and PECAN generated decoy peptide. Percolator uses PECAN output to train a classifier to distinguish correct and incorrect evidence, and then outputs confident peptide and protein detection with estimated FDR.
Figure 2. PECAN peak picking performance on SIS dataset
The percentage of total correct SIS peaks (a) and the number of SIS peaks (b) reported by PECAN prior to FDR control, from three replicates combined. Panels (c) and (d) show the same measures, respectively, after the PECAN-reported evidence of detection was subjected to peptide-level FDR control per measurement at q-value < 0.01 by Percolator.
Figure 3. Validate PECAN detection with GST-fusion proteins
Comparative analysis of peptide detection from DIA and DDA data from HeLa protein digest. Peptide (a) and protein (b) comparison of PECAN-DIA detection and Comet-DDA identification. (c) SRM validation workflow for a set of analytical standards synthesized using in vitro transcription translation (IVTT). (d) Comparative analysis of retention time of HeLa peptides detected by PECAN from DIA data and IVTT peptides detected from SRM.
Figure 4. Deep proteome measurement with gas phase fractionation
Comparison of peptides (a) and proteins (b) detected by PECAN from 1xGPF, 2xGPF, and 4xGPF DIA data when queried with the human UniProt Swiss-Prot database. (c) Retention time comparison of 12,952 PECAN-detected peptides from 1xGPF and 2xGPF relative to 4xGPF. (d) Number of peptides detected by PECAN, DIA-Umpire, or both from the three GPF DIA datasets.
Figure 5. Natural variants in the plasma library data
Full-length canonical sequences of Serotransferrin (a) and Apolipoprotein A-1 (b) were obtained from the human UniProt Swiss-Prot database, accession numbers P02787 and P02647, respectively. Blue boxes represent PECAN-detected peptides from the plasma library data when queried with canonical sequences. Red boxes represent PECAN-detected variant-specific peptides from the plasma library data when queried with variant-specific tryptic peptides from 3,714 variants.
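
The captions for Figures 1 and 2 refer to decoy peptides and to FDR control at q-value < 0.01. As a rough illustration of the general target-decoy idea, the following Python sketch estimates q-values from a list of scored target and decoy peptides. It is a simplified stand-in and not the procedure used by PECAN or Percolator (Percolator additionally trains a classifier to produce the scores). The function names `target_decoy_qvalues` and `accepted_targets`, the input layout, and the example scores are all hypothetical.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    scored: List[Tuple[float, bool]]
) -> List[Tuple[float, bool, float]]:
    """Estimate q-values by simple target-decoy competition.

    scored: (score, is_decoy) pairs, where a higher score means stronger
    evidence. Returns (score, is_decoy, q_value) sorted by descending score.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)

    # FDR estimate at each score threshold: decoys accepted / targets accepted.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(min(running_min, 1.0))
    qvals.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


def accepted_targets(
    scored: List[Tuple[float, bool]], threshold: float = 0.01
) -> List[float]:
    """Scores of target peptides whose estimated q-value is below threshold."""
    return [
        s for s, d, q in target_decoy_qvalues(scored) if not d and q < threshold
    ]


# Hypothetical scores: four targets and one decoy.
example = [(10.0, False), (9.0, False), (8.0, False), (7.0, True), (6.0, False)]
print(accepted_targets(example))
```

In this toy input the three targets that outscore the only decoy are kept, while the target ranked below the decoy is rejected at the 0.01 threshold.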
