Proc Natl Acad Sci U S A. 2000 Sep 12;97(19):10313-7.
doi: 10.1073/pnas.97.19.10313.

Automated de novo sequencing of proteins by tandem high-resolution mass spectrometry

D M Horn et al. Proc Natl Acad Sci U S A. 2000.

Abstract

A de novo sequencing program for proteins is described that uses tandem MS data from electron capture dissociation and collisionally activated dissociation of electrosprayed protein ions. Computer automation is used to convert the fragment ion mass values derived from these spectra into the most probable protein sequence, without distinguishing Leu/Ile. Minimum human input is necessary for the data reduction and interpretation. No extra chemistry is necessary to distinguish N- and C-terminal fragments in the mass spectra, as this is determined from the electron capture dissociation data. With parts-per-million mass accuracy (now available by using higher field Fourier transform MS instruments), the complete sequences of ubiquitin (8.6 kDa) and melittin (2.8 kDa) were predicted correctly by the program. The data available also provided 91% of the cytochrome c (12.4 kDa) sequence (essentially complete except for the tandem MS-resistant region K(13)-V(20) that contains the cyclic heme). Uncorrected mass values from a 6-T instrument still gave 86% of the sequence for ubiquitin, except for distinguishing Gln/Lys. Extensive sequencing of larger proteins should be possible by applying the algorithm to pieces of approximately 10-kDa size, such as products of limited proteolysis.
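The core idea in the abstract, reading a sequence from the mass differences between successive fragments, can be illustrated with a minimal Python sketch. This is not the authors' program. The function names, the arbitrary starting mass of 1000.0 Da, and the simplification that the tolerance on a mass difference equals the ppm tolerance applied to the larger fragment mass are all assumptions made for illustration. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (standard values).
# Leu (L) and Ile (I) are isomers and share the same mass.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def candidates(delta, ref_mass, ppm):
    """Residues whose mass matches `delta` within a tolerance of
    `ppm` parts per million of `ref_mass` (an illustrative simplification)."""
    tol = ref_mass * ppm * 1e-6
    return sorted(r for r, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)


def read_ladder(masses, ppm):
    """For each pair of adjacent fragment masses, list the residues
    that could explain the mass difference."""
    ms = sorted(masses)
    return [candidates(hi - lo, hi, ppm) for lo, hi in zip(ms, ms[1:])]


# Hypothetical fragment ladder: an arbitrary 1000.0 Da fragment
# extended by K, then Q, then L.
ladder = [1000.0]
for residue in "KQL":
    ladder.append(ladder[-1] + RESIDUE_MASS[residue])

tight = read_ladder(ladder, ppm=5)    # [['K'], ['Q'], ['I', 'L']]
loose = read_ladder(ladder, ppm=50)   # [['K', 'Q'], ['K', 'Q'], ['I', 'L']]
```

In this toy ladder, a 5 ppm tolerance separates Lys from Gln (their residue masses differ by about 0.036 Da), whereas a 50 ppm tolerance does not. Leu and Ile remain indistinguishable at any tolerance, which is consistent with the limitations stated in the abstract.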

Figures

Figure 1
Mass values yielding the complete sequence for ubiquitin. Larger type are “golden” complementary sets; those with a bold vertical bar were identified by a complementary y mass. Mass values in italics are in error by >2 ppm. Values in smaller type completed gaps in the sequence.
Figure 2
Correct (bold letters) and predicted sequences for ubiquitin without and with error correction. Underlined and italicized letters indicate incorrect predictions.
Figure 3
Predicted sequences for melittin at different error tolerances.
Figure 4
Cytochrome c fragmentations (92/103 bonds cleaved), with golden sets indicated by bold vertical bars.
Figure 5
Correct (bold letters) and predicted sequences for cytochrome c.
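
The figure captions refer to complementary sets of fragment masses and to mass errors expressed in ppm. A minimal Python sketch of both ideas follows. It is an illustration under stated assumptions, not the published algorithm: the function names are hypothetical, all mass values are invented, and the `offset` parameter stands in for whatever constant relates a given pair of ion types to the molecular mass under a given convention.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def is_complementary(m_a, m_b, parent_mass, ppm, offset=0.0):
    """True if two fragment masses sum to parent_mass + offset within `ppm`.
    `offset` is an assumption that depends on ion types and mass convention."""
    return abs(ppm_error(m_a + m_b, parent_mass + offset)) <= ppm


def complementary_pairs(masses, parent_mass, ppm, offset=0.0):
    """All pairs of fragment masses that pass the complementarity check."""
    ms = sorted(masses)
    return [
        (a, b)
        for i, a in enumerate(ms)
        for b in ms[i + 1:]
        if is_complementary(a, b, parent_mass, ppm, offset)
    ]


# Invented values: an 8000.0 Da molecule and four fragment masses.
parent = 8000.0
fragments = [1500.0, 6500.01, 3000.0, 5000.1]

strict = complementary_pairs(fragments, parent, ppm=2)
# [(1500.0, 6500.01)]  (sum is off by 1.25 ppm)
relaxed = complementary_pairs(fragments, parent, ppm=20)
# [(1500.0, 6500.01), (3000.0, 5000.1)]  (second sum is off by 12.5 ppm)
```

Tightening the tolerance reduces the number of accidental pairings, which is one reason ppm-level mass accuracy matters for this kind of analysis.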