Automated de novo sequencing of proteins by tandem high-resolution mass spectrometry

D M Horn¹, R A Zubarev, F W McLafferty

PMID: 10984529
PMCID: PMC27020
DOI: 10.1073/pnas.97.19.10313

D M Horn et al. Proc Natl Acad Sci U S A. 2000 Sep 12;97(19):10313-7.

Affiliation

¹ Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA.

Abstract

A de novo sequencing program for proteins is described that uses tandem MS data from electron capture dissociation and collisionally activated dissociation of electrosprayed protein ions. Computer automation is used to convert the fragment ion mass values derived from these spectra into the most probable protein sequence, without distinguishing Leu/Ile. Minimum human input is necessary for the data reduction and interpretation. No extra chemistry is necessary to distinguish N- and C-terminal fragments in the mass spectra, as this is determined from the electron capture dissociation data. With parts-per-million mass accuracy (now available by using higher field Fourier transform MS instruments), the complete sequences of ubiquitin (8.6 kDa) and melittin (2.8 kDa) were predicted correctly by the program. The data available also provided 91% of the cytochrome c (12.4 kDa) sequence (essentially complete except for the tandem MS-resistant region K(13)-V(20) that contains the cyclic heme). Uncorrected mass values from a 6-T instrument still gave 86% of the sequence for ubiquitin, except for distinguishing Gln/Lys. Extensive sequencing of larger proteins should be possible by applying the algorithm to pieces of approximately 10-kDa size, such as products of limited proteolysis.
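
The dependence on mass accuracy noted in the abstract can be illustrated with a minimal sketch of mass-ladder reading. This is not the authors' program. It assumes a list of neutral masses for consecutive fragments of a single ion type, and it assigns a residue to each mass difference within a parts-per-million tolerance. The names `RESIDUE_MASS`, `candidates`, and `read_ladder`, the 1000.0 Da starting mass, and the test peptide are hypothetical. The residue values are standard monoisotopic masses, with `L` standing for Leu or Ile because the two are isobaric.

```python
# Standard monoisotopic residue masses (Da), ordered by mass.
# "L" stands for Leu or Ile, which have identical mass.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def candidates(delta, abs_tol):
    """Return residues whose mass lies within abs_tol (Da) of delta."""
    return [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= abs_tol]


def read_ladder(masses, ppm):
    """Assign a residue to each gap between consecutive fragment masses.

    Assumption: the error on a mass difference is bounded by the sum of
    the ppm errors of the two fragment masses that define it.
    """
    masses = sorted(masses)
    calls = []
    for lo, hi in zip(masses, masses[1:]):
        tol = ppm * 1e-6 * (lo + hi)
        hits = candidates(hi - lo, tol)
        if not hits:
            calls.append("?")            # gap matches no single residue
        elif len(hits) == 1:
            calls.append(hits[0])
        else:
            calls.append("[" + "".join(hits) + "]")  # ambiguous call
    return calls


def make_ladder(peptide, start=1000.0):
    """Build a hypothetical fragment ladder by adding residues to start."""
    ladder = [start]
    for aa in peptide:
        ladder.append(ladder[-1] + RESIDUE_MASS[aa])
    return ladder


if __name__ == "__main__":
    ladder = make_ladder("GKQL")
    print(read_ladder(ladder, ppm=2))    # tight tolerance
    print(read_ladder(ladder, ppm=20))   # loose tolerance
```

In this sketch Gln and Lys differ by only about 0.036 Da, so they are resolved at a 2 ppm tolerance but merge into an ambiguous call at 20 ppm for fragments above 1 kDa. This is consistent with the abstract's remark that uncorrected mass values did not distinguish Gln/Lys.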

Figures

**Figure 1**
Mass values yielding the complete sequence for ubiquitin. Larger type are “golden” complementary sets; those with a bold vertical bar were identified by a complementary y mass. Mass values in italics are in error by >2 ppm. Values in smaller type completed gaps in the sequence.

**Figure 2**
Correct (bold letters) and predicted sequences for ubiquitin without and with error correction. Underlined and italicized letters indicate incorrect predictions.

**Figure 3**
Predicted sequences for melittin at different error tolerances.

**Figure 4**
Cytochrome c fragmentations (92/103 bonds cleaved), with golden sets indicated by bold vertical bars.

**Figure 5**
Correct (bold letters) and predicted sequences for cytochrome c.
