Anal Chem. 2017 Mar 21;89(6):3747-3753.
doi: 10.1021/acs.analchem.7b00130. Epub 2017 Mar 8.

Comprehensive de Novo Peptide Sequencing from MS/MS Pairs Generated through Complementary Collision Induced Dissociation and 351 nm Ultraviolet Photodissociation

Andrew P Horton et al. Anal Chem.

Abstract

We describe a strategy for de novo peptide sequencing based on matched pairs of tandem mass spectra (MS/MS) obtained by collision induced dissociation (CID) and 351 nm ultraviolet photodissociation (UVPD). Each precursor ion is isolated twice with the mass spectrometer switching between CID and UVPD activation modes to obtain a complementary MS/MS pair. To interpret these paired spectra, we modified the UVnovo de novo sequencing software to automatically learn from and interpret fragmentation spectra, provided a representative set of training data. This machine learning procedure, using random forests, synthesizes information from one or multiple complementary spectra, such as the CID/UVPD pairs, into peptide fragmentation site predictions. In doing so, the burden of fragmentation model definition shifts from programmer to machine and opens up the model parameter space for inclusion of nonobvious features and interactions. This spectral synthesis also serves to transform distinct types of spectra into a common representation for subsequent activation-independent processing steps. Then, independent from precursor activation constraints, UVnovo's de novo sequencing procedure generates and scores sequence candidates for each precursor. We demonstrate the combined experimental and computational approach for de novo sequencing using whole cell E. coli lysate. In benchmarks on the CID/UVPD data, UVnovo assigned correct full-length sequences to 83% of the spectral pairs of doubly charged ions with high-confidence database identifications. Considering only top-ranked de novo predictions, 70% of the pairs were deciphered correctly. This de novo sequencing performance exceeds that of PEAKS and PepNovo on the CID spectra and that of UVnovo on CID or UVPD spectra alone. As presented here, the methods for paired CID/UVPD spectral acquisition and interpretation constitute a powerful workflow for high-throughput and accurate de novo peptide sequencing.
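The abstract does not spell out which features the random forest consumes, so the following is only a minimal sketch of how a matched CID/UVPD pair could be turned into one feature vector per candidate fragmentation site. The function names, the four-value feature layout, and the 0.02 m/z tolerance are hypothetical; the sketch also assumes singly charged b and y fragments and ignores any N-terminal tag mass.

```python
from bisect import bisect_left

PROTON = 1.007276  # mass of a proton in Da


def peak_intensity(spectrum, mz, tol=0.02):
    """Highest intensity within +/- tol of mz, or 0.0 if no peak is there.

    spectrum is a list of (mz, intensity) tuples sorted by mz.
    """
    mzs = [peak[0] for peak in spectrum]
    i = bisect_left(mzs, mz - tol)
    best = 0.0
    while i < len(spectrum) and spectrum[i][0] <= mz + tol:
        best = max(best, spectrum[i][1])
        i += 1
    return best


def site_features(prefix_mass, peptide_mass, cid, uvpd, tol=0.02):
    """Hypothetical feature vector for one candidate fragmentation site.

    prefix_mass is the summed residue mass of the N-terminal side of the
    site; peptide_mass is the neutral monoisotopic peptide mass.
    """
    b_mz = prefix_mass + PROTON                 # singly charged b ion
    y_mz = peptide_mass - prefix_mass + PROTON  # complementary y ion
    return [
        peak_intensity(cid, b_mz, tol),
        peak_intensity(cid, y_mz, tol),
        peak_intensity(uvpd, b_mz, tol),
        peak_intensity(uvpd, y_mz, tol),
    ]
```

A classifier trained on vectors like these, labeled by whether the site is a true backbone cleavage in a database-identified peptide, would then see evidence from both activation modes at once, which is the general idea behind synthesizing the pair into a single set of site predictions.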

Figures

Figure 1
Example CID/UVPD pair for peptide K[AMCA, carbamyl]AAITAEIR and synthetic spectra derived from the pair. (a) CID (NCE 35) and (b) UVPD (3 mJ per pulse, 15 pulses). (c) A random forest algorithm merges both into a single synthetic spectrum. Peaks scoring below 0.5 are shaded gray. (d) A hidden Markov model assigns a probability to each possible fragmentation site. The precursor is labeled with an asterisk.
Figure 2
Overlap in de novo sequencing results from the E. coli lysate benchmarking dataset of 4,616 doubly charged CID/UVPD pairs. (a) Comparison of CID spectra identification between UVnovo-CID, PEAKS, and PepNovo. This considers all de novo candidates for each spectrum. (b-d) Only the top-ranked de novo prediction for each spectrum or CID/UVPD pair is included.
Figure 3
Counts and overlap of spectra that were identified correctly from the E. coli benchmark set. (a) UVnovo identifications from the paired, CID-only, and UVPD-only spectra. (b) Paired spectra UVnovo identifications and CID spectra identified through PEAKS and PepNovo. Numbers shown include correct de novo predictions of any rank.
Figure 4
Cumulative fraction of correct de novo sequences by descending prediction rank on paired (UVnovo) and individual (UVnovo, PEAKS, and PepNovo) spectra. UVnovo interpretation of paired CID/UVPD spectra outperforms that using only the CID or UVPD subset of spectra. The dataset contains 4,616 charge 2+ paired spectra from E. coli lysate. Correct sequence predictions match the full-length SEQUEST PSM with no gaps allowed. I/L and F/M(oxidation) residue assignments are treated as equivalent.
Figure 5
Fraction of sequences of each length with a correct de novo prediction.
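
The correctness criterion in the Figure 4 caption (full-length match, no gaps, with I/L and F/oxidized-M treated as equivalent) and the cumulative fraction by rank can be sketched as follows. This is an illustration rather than the authors' evaluation code: the function names are hypothetical, and writing oxidized methionine as a lowercase `m` is an assumed notation.

```python
def normalize(seq):
    """Collapse residues that the benchmark treats as equivalent.

    Assumption: oxidized Met is written as lowercase 'm'. Leu and Ile are
    isobaric, and Phe and oxidized Met have nearly the same mass.
    """
    return seq.replace("I", "L").replace("m", "F")


def first_correct_rank(candidates, reference):
    """1-based rank of the first candidate matching the reference, else None.

    candidates is ordered from best to worst score; a match must cover the
    full length of the reference sequence.
    """
    ref = normalize(reference)
    for rank, cand in enumerate(candidates, start=1):
        if normalize(cand) == ref:
            return rank
    return None


def cumulative_fraction_correct(ranks, max_rank):
    """Fraction of spectra with a correct prediction at rank <= k, k = 1..max_rank."""
    total = len(ranks)
    if total == 0:
        return [0.0] * max_rank
    return [
        sum(1 for r in ranks if r is not None and r <= k) / total
        for k in range(1, max_rank + 1)
    ]
```

For example, `cumulative_fraction_correct([1, 2, None, 1], 2)` gives `[0.5, 0.75]`: half of the four spectra are correct at rank 1 and three quarters within the top two. The first value corresponds to the "top-ranked only" figure in the abstract, and the value at the largest rank to the "any rank" figure.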


