Review
Trends Biochem Sci. 2020 Jan;45(1):76-89.
doi: 10.1016/j.tibs.2019.09.005. Epub 2019 Oct 30.

Strategies for Development of a Next-Generation Protein Sequencing Platform

Nicholas Callahan et al. Trends Biochem Sci. 2020 Jan.

Abstract

Proteomic analysis can be a critical bottleneck in cellular characterization. The current paradigm relies primarily on mass spectrometry of peptides and affinity reagents (i.e., antibodies), both of which require a priori knowledge of the sample. An unbiased protein sequencing method, with a dynamic range that covers the full range of protein concentrations in proteomes, would revolutionize the field of proteomics, allowing a more facile characterization of novel gene products and subcellular complexes. To this end, several new platforms based on single-molecule protein-sequencing approaches have been proposed. This review summarizes four of these approaches, highlighting advantages, limitations, and challenges for each method towards advancing as a core technology for next-generation protein sequencing.

Keywords: peptide sequencing; proteomics; single-molecule analysis.

Figures

Figure 1.
Current protein sequencing paradigm. After a protein of interest (grey) is purified, separated samples are digested with different proteases to yield a collection of peptides (Step 1). The peptides are then identified using a combination of HPLC (Step 2) and mass spectrometry (Step 3). The sequences of the digestion products are then used to computationally assemble the full-length protein sequence (Step 4).
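The computational assembly in Step 4 can be illustrated with a minimal sketch. It assumes a simple greedy overlap merge, which is a stand-in technique and not necessarily what production proteomics software does; the peptide strings, the function names, and the `min_len` threshold are all hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 when the shared stretch is shorter than min_len (an assumed cutoff).
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(peptides, min_len=3):
    """Greedily merge the pair of fragments with the longest overlap until none remain."""
    # Drop duplicates and peptides fully contained in another peptide.
    frags = sorted(
        p for p in set(peptides)
        if not any(p != q and p in q for q in peptides)
    )
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps; return the unmerged contigs
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical overlapping peptides, as might come from digests with different proteases.
print(assemble(["YIAKQRQIS", "MKTAYIAK", "QISFVKSHF"]))
```

Overlaps only exist when the digests cut at different positions, which is why the paradigm uses more than one protease.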
Figure 2.
Proposed sub-nanogap sequencing. The protein sample (grey) is denatured in SDS (green circles; Step 1) and injected into a microfluidic cell for electrophoresis (Step 2, right). The current drives the denatured protein (grey squares with single-letter amino acid abbreviations and N- and C-termini labeled) through a biconical pore structure (blue) that is 10 Å thick (Step 2, left). Each residue in the protein chain transiently interacts with the “waist” of the biconical pore, creating a unique step in the current over time (Step 3). The magnitude of each step is determined by the combined volume of amino acids in the pore.
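The last sentence of the caption can be sketched as a sliding-window sum. This is only an illustration: the number of residues occupying the pore at once (`residues_in_pore`) is an assumption, the function name is hypothetical, and the volume table lists a few approximate, rounded residue volumes rather than a calibrated set.

```python
# Approximate residue volumes in cubic angstroms (rounded values, for
# illustration only; only a handful of residues are listed).
RESIDUE_VOLUME_A3 = {"G": 60.1, "A": 88.6, "L": 166.7, "K": 168.6, "W": 227.8}


def pore_volume_trace(sequence, residues_in_pore=2):
    """Combined volume of the residues inside the pore at each translocation step.

    residues_in_pore is an assumed window size, not a measured property of the pore.
    """
    trace = []
    for start in range(len(sequence) - residues_in_pore + 1):
        window = sequence[start:start + residues_in_pore]
        trace.append(round(sum(RESIDUE_VOLUME_A3[r] for r in window), 1))
    return trace


print(pore_volume_trace("GALKW"))
```

Because only the summed volume is modeled here, windows such as "GA" and "AG" give the same value, so in this simplified picture a single step does not identify the residues on its own.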
Figure 3.
Proposed use of recognition tunneling in sequencing. Proteins (grey) are digested sequentially by either chemical degradation or a peptidase (Step 1), and cleaved residues are collected by flow (Step 2). Each fraction is then analyzed by recognition tunneling spectroscopy (Step 3). In this process, the free amino acids (grey squares) pass between a palladium-plated probe and a substrate, interacting with the 4(5)-substituted-1H-imidazole-2-carboxamide (ICA, green sphere) functional groups on the probe and substrate. The binding event generates a unique current trace from the interaction of the amino acid with the ICA; fractions are described by collections of current traces (Step 4). The sub-populations of current traces in each fraction allow quantification of residue-level sub-populations in the protein sample.
Figure 4.
Proposed image-based ClpXP sequencing. The purified protein sample (grey) is labeled with FRET acceptors (orange) at the N-terminal amine and at lysine and cysteine residues (Step 1), and a ClpX initiation tag is added to the C-terminus. Labeled proteins are flowed over a slide with donor-tagged ClpXP (green and blue), attached to the slide by a biotin-streptavidin bond (Step 2, right). When the acceptor on the lysine or the N-terminus enters ClpP, a FRET signal is observed due to the acceptor nearing the donor (green sphere; Step 2, left). After the protein is digested, the total time between signals is used to estimate the chain length and protein sequence (Step 3).
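Step 3 can be sketched as a conversion from time intervals to residue spacings. This assumes a constant translocation rate, which is a simplification; the function name, the `residues_per_second` parameter, and the example values are hypothetical and not taken from the source.

```python
def residue_gaps(signal_times_s, residues_per_second):
    """Estimate the number of residues between consecutive FRET signals.

    Assumes the protein is translocated at a constant rate, so that
    spacing = rate * elapsed time. Both inputs are illustrative.
    """
    return [
        round((later - earlier) * residues_per_second)
        for earlier, later in zip(signal_times_s, signal_times_s[1:])
    ]


# Hypothetical signal times (seconds) and a hypothetical rate.
print(residue_gaps([0.0, 0.5, 2.0], residues_per_second=10.0))
```

Since the initiation tag sits at the C-terminus, the spacings in this sketch would be read from the C-terminal end toward the N-terminus, with the N-terminal label giving the final signal.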
Figure 5.
Proposed image-based Edman sequencing. (Scheme A) A protein sample (grey) is digested (Step 1), then labeled with fluorescent tags (red and green spheres, respectively) at lysine and cysteine residues (Step 2). The labeled peptides are immobilized on an amine-functionalized slide by a peptide bond (Step 3). The fluorescent signal from each peptide is quantified prior to a round of Edman degradation. A step down in quantified signal indicates the elimination of a lysine or cysteine (Step 4). (Scheme B) In an alternative protocol, peptides are not directly labeled before attachment to the slide (Steps 1-2). Instead, labeled N-terminal amino-acid binding (NAAB) proteins are used to identify the N-terminal amino acid between each round of degradation (Step 3).
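The readout in Scheme A, Step 4 can be sketched as follows. This is an idealized model that assumes every Edman cycle removes exactly one N-terminal residue and that each labeled residue contributes one unit of signal; the peptide and the function names are hypothetical.

```python
def fluorescence_trace(peptide, labeled_residue):
    """Number of dye-bearing residues remaining after 0, 1, 2, ... Edman cycles."""
    remaining = peptide.count(labeled_residue)
    trace = [remaining]
    for residue in peptide:  # each cycle removes the current N-terminal residue
        if residue == labeled_residue:
            remaining -= 1
        trace.append(remaining)
    return trace


def step_down_cycles(trace):
    """Cycle numbers (1-based) at which the signal drops."""
    return [c for c in range(1, len(trace)) if trace[c] < trace[c - 1]]


peptide = "AKCGK"  # hypothetical peptide
for residue in ("K", "C"):
    trace = fluorescence_trace(peptide, residue)
    print(residue, trace, step_down_cycles(trace))
```

In this model the cycles at which each dye channel steps down give the positions of lysine and cysteine in the peptide, which is a partial sequence pattern and not a full residue-by-residue read.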
