ACS Comb Sci. 2017 Nov 13;19(11):694-701.
doi: 10.1021/acscombsci.7b00109. Epub 2017 Sep 29.

Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries

Alexander A Vinogradov et al. ACS Comb Sci. 2017.

Abstract

A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

Keywords: de novo sequencing; one-bead one-compound libraries; shotgun nanoliquid chromatography; synthetic peptide mixtures; tandem mass spectrometry.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Five-step experimental workflow for the proposed library decoding strategy enables automated sequence assignment and data analysis.
Figure 2
Spectra with incomplete b/y-fragmentation ladders can be unambiguously assigned using the alternating monomer set (AMS) principle. Displayed are merged, preprocessed CID and HCD spectra for 491.243 Da/e precursor ions. The first candidate sequence does not match the library design pattern because the b3/y7 ion pair is not observed in the spectrum, so the assignment is unreliable. The mass difference of 186.064 Da between the observed fragment ions corresponds to four different dipeptides (Gly-Glu, Glu-Gly, Ala-Asp, and Asp-Ala), but only one of them (Asp-Ala) matches the parent design, which makes an unambiguous assignment possible. ss1: monomer subset 1, ss2: monomer subset 2, cr: constant region.
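The disambiguation described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' software. The residue masses are standard monoisotopic values; the function names and the two monomer subsets are hypothetical, chosen only so that the example reproduces the outcome stated in the caption (the actual subsets of the library are not given on this page).

```python
from itertools import product

# Standard monoisotopic residue masses (Da) for the four amino acids
# named in the Figure 2 caption.
RESIDUE_MASS = {
    "G": 57.02146,   # Gly
    "A": 71.03711,   # Ala
    "D": 115.02694,  # Asp
    "E": 129.04259,  # Glu
}


def dipeptides_matching(gap_da, tol_da=0.01, residues=RESIDUE_MASS):
    """Return every ordered residue pair whose summed mass fits the gap."""
    return [
        first + second
        for first, second in product(residues, repeat=2)
        if abs(residues[first] + residues[second] - gap_da) <= tol_da
    ]


def filter_by_design(candidates, first_allowed, second_allowed):
    """Keep only candidates whose residues fit the subsets allowed at
    the two consecutive positions spanned by the gap."""
    return [
        c for c in candidates
        if c[0] in first_allowed and c[1] in second_allowed
    ]


# HYPOTHETICAL monomer subsets for the two positions (assumption, not
# taken from the paper): disjoint sets, as an alternating design implies.
SUBSET_AT_FIRST_POSITION = {"D", "E", "G"}
SUBSET_AT_SECOND_POSITION = {"A"}

candidates = dipeptides_matching(186.064)
consistent = filter_by_design(
    candidates, SUBSET_AT_FIRST_POSITION, SUBSET_AT_SECOND_POSITION
)
print(sorted(candidates))  # the four isobaric dipeptides
print(consistent)          # the single design-consistent assignment
```

Under these assumed subsets, mass alone leaves four candidates (Gly-Glu, Glu-Gly, Ala-Asp, Asp-Ala), while the positional constraint leaves only Asp-Ala.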
Figure 3
Alternating monomer set (AMS) increases confidence in sequencing results. All data are from the 660-bead sample of library 1 peptides (587 unique sequences identified). (A) Color-coded positional amino acid frequency is shown on the left, and the mean amino acid frequencies are plotted on the right. Each nonzero cell in the matrix has the expected value of 0.125, and the observed values map closely to it. (B) Sequencing quality scatterplot for the unfiltered PEAKS output. (C) Sequencing quality scatterplot for the final filtered data set. Most peptides with low sequencing score and/or large assignment errors are removed during the postsequencing filtration. (D) Precision-recall curves for different data filtration methods. ALC thresholds (0–99) are applied to the reference and method 1 and 2 data sets, and the corresponding precision and recall values are calculated for each data set. Method 1 is inferior to AMS in both precision and recall. (E) Total number of sequences recovered as a function of sequencing score for different data filtration methods. Results diverge in the region of medium (50–85) sequencing scores.
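The threshold sweep behind panel D can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis pipeline: the function name, the toy data, and the convention that an empty selection has precision 1.0 are all hypothetical, and each assignment is assumed to carry an ALC score plus a flag saying whether it agrees with a reference data set.

```python
def precision_recall_curve(assignments, thresholds=range(0, 100)):
    """Sweep ALC thresholds and return (threshold, precision, recall) tuples.

    `assignments` is a list of (alc_score, is_correct) pairs. An
    assignment is retained at threshold t when alc_score >= t.
    """
    total_correct = sum(1 for _, ok in assignments if ok)
    curve = []
    for t in thresholds:
        kept = [ok for alc, ok in assignments if alc >= t]
        true_positives = sum(kept)
        # Assumed convention: precision is 1.0 when nothing is retained.
        precision = true_positives / len(kept) if kept else 1.0
        recall = true_positives / total_correct if total_correct else 0.0
        curve.append((t, precision, recall))
    return curve


# HYPOTHETICAL toy data: (ALC score, matches reference?)
toy = [
    (95, True), (88, True), (72, True), (60, False),
    (55, True), (40, False), (20, False),
]

curve = precision_recall_curve(toy)
for t, p, r in curve[::33]:  # print a few points along the sweep
    print(f"ALC >= {t:2d}: precision={p:.2f} recall={r:.2f}")
```

Raising the threshold trades recall for precision; comparing such curves across filtration methods is the kind of comparison the caption describes for the reference, method 1, and method 2 data sets.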
Figure 4
Proposed library decoding workflow yields reliable results. Sequences assigned to spectra from a library analysis were resynthesized and subjected to analogous analytical conditions. Original library-derived spectra are shown in blue; spectra of authentic peptides are displayed in red. (A) Overlaid raw CID fragmentation spectra (collision energy = 20.6 eV, precursor ion: 710.82 Da/e) for GCβFLDEVEFPHG peptide (β = β-alanine). (B) Overlaid raw CID fragmentation spectra (collision energy = 21.1 eV, precursor ion: 654.77 Da/e) for GCβFADASEFPHG peptide.
Figure 5
Approximately 600 peptides/hour can be decoded while keeping the sequence identification rate above 0.85. (A) Sequence identification rate as a function of sample complexity. Recall values decline gradually as samples become more complex. (B) Analysis of a 1600-bead library sample (marked red in panel A) under different nLC conditions. Extending the gradient time improves sequence recall.
Figure 6
Libraries of peptides with multiple unnatural amino acids can be successfully decoded using the proposed strategy. (A, B) Merged, postprocessed CID and HCD spectra for a 615.84 Da/e precursor ion and a 499.78 Da/e precursor ion, with corresponding assignments and decoded sequences. One-letter codes for the unnatural amino acids, highlighted in green, are shown on the right.
