J Proteome Res. 2016 Sep 2;15(9):2891-9.
doi: 10.1021/acs.jproteome.5b00996. Epub 2016 Aug 23.

N-Terminal Peptide Detection with Optimized Peptide-Spectrum Matching and Streamlined Sequence Libraries

Brynne E Lycette et al. J Proteome Res. 2016.

Abstract

We identified tryptic peptides in yeast cell lysates that map to translation initiation sites downstream of the annotated start sites using the peptide-spectrum matching algorithms OMSSA and Mascot. To increase the accuracy of peptide-spectrum matching, both algorithms were run using several standardized parameter sets, and Mascot was run utilizing a, b, and y ions from collision-induced dissociation. A large fraction (22%) of the detected N-terminal peptides mapped to translation initiation downstream of the annotated initiation sites. Expression of several truncated proteins from downstream initiation in the same reading frame as the full-length protein (frame 1) was verified by western analysis. To facilitate analysis of the larger proteome of Drosophila, we created a streamlined sequence library from which all duplicated trypsin fragments had been removed. OMSSA assessment using this "stripped" library revealed 171 peptides that map to downstream translation initiation sites, 76% of which are in the same reading frame as the full-length annotated proteins, although some are in different reading frames creating new protein sequences not in the annotated proteome. Sequences surrounding implicated downstream AUG start codons are associated with nucleotide preferences with a pronounced three-base periodicity N1^G2^A3.

Keywords: Mascot; OMSSA; peptide mass spectrometry; streamlined sequence libraries.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1.
Mascot is significantly more accurate when screening a, b, and y ions. In assessment of conformance to parent proteins before trypsin digestion, all 4560 peptides detected by Mascot screening for a, b, and y CID ions were also detected when screening for only b and y ions. However, the 264 additional peptides detected only in the b/y screen had very poor conformance to parent protein MWs (62.9%; arrow). Bootstrap analysis with 2000 samples of 264 peptides (with replacement) from the a/b/y screen revealed 99% confidence limits between 79.5 and 90.9% (broken arrows). This confirms that the poor b/y conformance of 62.9% is significantly lower than the a/b/y conformance.
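The bootstrap comparison in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the per-peptide conformance flags are not given in the source, so the population below is hypothetical (4560 peptides with an assumed 85% conformance), and the function name `bootstrap_limits` is invented for the example. Only the sample size (264), the number of samples (2000), the 99% level, and the 62.9% value come from the caption.

```python
import random

def bootstrap_limits(conforms, sample_size=264, n_samples=2000,
                     confidence=0.99, seed=0):
    """Percentile bootstrap limits (in %) for the conformance of random subsets."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_samples):
        # Draw sample_size peptides with replacement; record the percent that conform.
        sample = rng.choices(conforms, k=sample_size)
        rates.append(100.0 * sum(sample) / sample_size)
    rates.sort()
    tail = (1.0 - confidence) / 2.0
    lo_index = int(tail * n_samples)
    hi_index = min(n_samples - 1, int((1.0 - tail) * n_samples))
    return rates[lo_index], rates[hi_index]

# Hypothetical a/b/y population: 3876 conforming and 684 non-conforming peptides.
population = [True] * 3876 + [False] * 684
low, high = bootstrap_limits(population)

by_only_conformance = 62.9  # value reported in the caption for the 264 b/y-only peptides
print(f"99% limits: {low:.1f}% to {high:.1f}%")
print("b/y-only conformance falls below the lower limit:", by_only_conformance < low)
```

If the b/y-only conformance falls below the lower limit, a random draw of 264 a/b/y peptides would rarely conform that poorly, which is the reasoning the caption uses.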
Figure 2.
Western analysis of frame-1 truncated proteins. DownPeptide genes with predicted frame-1 truncated proteins were tested in TAP-epitope-tag expression lines (DOT1, SKN7, GCS1). Cells were grown under various conditions: normal (n), stationary (s), synthetic minimal medium with glucose (SMD), starved in synthetic minimal medium lacking nitrogen (SD-N), or heat-treated (37 °C, 15 min). Five experiments are illustrated in which truncated and full-length proteins (arrows) were detected under different conditions. A truncated protein of DOT1 was also previously reported. Note that both the annPeptide and downPeptide were detected by MS/MS for DOT1, but only the downPeptide was detected by MS/MS for SKN7. The downPeptide for GCS1 was detected with only one OMSSA parameter set. The detected truncated and full-length proteins ran appropriately according to MW size markers (not shown).
Figure 3.
Stripped sequence libraries facilitate MS/MS analysis of large proteomes. Sequence libraries stripped of duplicated trypsin fragments give much smaller search spaces for PSM algorithms. The mRNAs from alternative splicing produce proteins with high sequence redundancy, most of which is removed when duplicated trypsin fragments are removed from the proteins of the sequence library. Alternative transcription start sites are shown (*).
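One way to build such a stripped library is sketched below. The source does not state the cleavage rule or how the stripped library is stored, so both are assumptions here: the sketch cleaves after K or R unless the next residue is P (a common in-silico trypsin rule), keeps each fragment only the first time it appears, and uses two made-up isoform sequences. The function names are hypothetical. Splitting on an empty match requires Python 3.7 or later.

```python
import re

def trypsin_digest(sequence):
    """Split a protein after K or R, except before P (assumed cleavage rule)."""
    return [frag for frag in re.split(r"(?<=[KR])(?!P)", sequence) if frag]

def strip_library(proteins):
    """Keep each tryptic fragment only the first time it appears in the library."""
    seen = set()
    stripped = {}
    for name, sequence in proteins.items():
        unique = []
        for fragment in trypsin_digest(sequence):
            if fragment not in seen:
                seen.add(fragment)
                unique.append(fragment)
        stripped[name] = unique
    return stripped

# Hypothetical splice isoforms that differ in a single internal fragment.
library = {
    "isoform_A": "MKAAAKCCCRDDDK",
    "isoform_B": "MKAAAKEEERDDDK",
}
stripped = strip_library(library)
before = sum(len(trypsin_digest(seq)) for seq in library.values())
after = sum(len(frags) for frags in stripped.values())
print(stripped)
print(f"fragments before: {before}, after: {after}")
```

In this toy case the second isoform contributes only the fragment that distinguishes it from the first, which is how redundancy from alternative splicing shrinks the search space.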
Figure 4.
Lengths of detected frame-2 and -3 downORFs compared with all frame-2 and -3 ORFs initiated within 100 nucleotides downstream of the annAUG (and ≥15 nucleotides long). The ORFs for downPeptides detected in yeast (A) are longer than expected by random selection (chi-square goodness of fit p < 0.01). Although generally longer, the Drosophila downORFs (B) are not significantly longer by chi-square test. Illustrated are ORF lengths for downPeptides detected by OMSSA or Mascot with the standard yeast library (A; 5% FDR) and OMSSA with the stripped Drosophila library (B; 2% FDR).
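A chi-square goodness-of-fit comparison of this kind can be sketched as below. The length bins, the counts, and the background fractions are all hypothetical, since the caption does not report them; only the test type and the p < 0.01 threshold come from the source. The critical values are the standard upper 1% points of the chi-square distribution.

```python
def chi_square_gof(observed_counts, expected_fractions):
    """Chi-square goodness-of-fit statistic against expected bin fractions."""
    total = sum(observed_counts)
    statistic = 0.0
    for observed, fraction in zip(observed_counts, expected_fractions):
        expected = total * fraction
        statistic += (observed - expected) ** 2 / expected
    return statistic

# Upper 1% critical values of the chi-square distribution, keyed by degrees of freedom.
CRITICAL_0_01 = {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277}

# Hypothetical ORF-length bins (short, medium, long).
all_orf_fractions = [0.5, 0.3, 0.2]   # assumed share of all frame-2/-3 ORFs per bin
detected_counts = [8, 10, 22]         # assumed counts of detected downORFs per bin

statistic = chi_square_gof(detected_counts, all_orf_fractions)
degrees_of_freedom = len(detected_counts) - 1
significant = statistic > CRITICAL_0_01[degrees_of_freedom]
print(f"chi-square = {statistic:.2f}, significant at p < 0.01: {significant}")
```

Bins are kept wide enough that every expected count is at least 5, the usual rule of thumb for this test.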
Figure 5.
(A) Sequences flanking the implicated frame-1 downAUGs of Drosophila downPeptide genes have a pronounced 3-nucleotide periodicity with depression of G at position 2 and A at position 3 of the codons. Average nucleotide frequencies of aligned sequences are shown relative to the downAUG at positions 1, 2, and 3. (B) Frequencies of G are depressed at the second nucleotide of codons downstream of the AUG of frame-1 and frame-2 downPeptide ORFs (*) despite the frame-2 ORF being out of frame with the overlapping frame-1 ORF. (C) Average deviations from background, measured as log2(freq_obs/freq_background), are illustrated for codon positions 1, 2, and 3 for windows upstream (positions -20 to -1) and downstream (positions +4 to +20) of the start codons of frame-1 and frame-2 downPeptide ORFs (based on background nucleotide frequencies in ORFs; f_A: 25.6%, f_C: 27.1%, f_G: 26.8%, f_U: 20.5%). Compared with random samples of 1000 downAUGs, frame-1 downPeptides show G2 and A3 depression at positions 2 and 3 of codons upstream and downstream of the start codon (*). Frame-2 downPeptides show G2 depression (*) downstream of the start codon at positions corresponding to the wobble position of frame-1. The G2 and A3 depressions (*) are significant by bootstrap analysis (p < 0.01). Only downPeptides (2% FDR) with downAUGs > 20 nucleotides downstream of the annAUG were used in this analysis.
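The deviation measure in panel C, the log2 ratio of observed to background frequency at each codon position, can be sketched as follows. The background frequencies are the ones quoted in the caption; the function name and the example sequences are hypothetical, and the sketch assumes each input window already starts at codon position 1 of the reading frame of interest.

```python
import math

# Background nucleotide frequencies in ORFs, as quoted in the caption.
BACKGROUND = {"A": 0.256, "C": 0.271, "G": 0.268, "U": 0.205}

def codon_position_deviation(sequences, base, codon_position):
    """log2(observed / background) frequency of `base` at codon position 1, 2, or 3.

    Each sequence is assumed to begin at codon position 1 of the frame.
    """
    hits = 0
    total = 0
    for seq in sequences:
        for i in range(codon_position - 1, len(seq), 3):
            total += 1
            hits += seq[i] == base
    if hits == 0:
        return float("-inf")  # base never observed at this position
    return math.log2((hits / total) / BACKGROUND[base])

# Hypothetical in-frame window with G at one of its four second positions.
windows = ["ACUAGUACUACU"]
g2 = codon_position_deviation(windows, "G", 2)
print(f"G at codon position 2: {g2:+.2f}")
```

Negative values correspond to depression relative to background, positive values to enrichment. Testing significance, as the caption does with bootstrap samples of downAUGs, would require the full set of aligned sequences.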
