Review

Trends Biochem Sci. 2020 Jan;45(1):76-89.

doi: 10.1016/j.tibs.2019.09.005. Epub 2019 Oct 30.

Strategies for Development of a Next-Generation Protein Sequencing Platform

Nicholas Callahan¹, Jennifer Tullman², Zvi Kelman³, John Marino²

Affiliations

¹ Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, and University of Maryland, Rockville, MD 20850, USA. Electronic address: callahann@ibbr.umd.edu.
² Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, and University of Maryland, Rockville, MD 20850, USA.
³ Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, and University of Maryland, Rockville, MD 20850, USA; Biomolecular Labeling Laboratory, Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA.

PMID: 31676211
PMCID: PMC7373172
DOI: 10.1016/j.tibs.2019.09.005

Nicholas Callahan et al. Trends Biochem Sci. 2020 Jan.

. 2020 Jan;45(1):76-89.

doi: 10.1016/j.tibs.2019.09.005. Epub 2019 Oct 30.

Authors

Nicholas Callahan¹, Jennifer Tullman², Zvi Kelman³, John Marino²

Affiliations

¹ Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, and University of Maryland, Rockville, MD 20850, USA. Electronic address: callahann@ibbr.umd.edu.
² Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, and University of Maryland, Rockville, MD 20850, USA.
³ Institute for Bioscience and Biotechnology Research, National Institute of Standards and Technology, and University of Maryland, Rockville, MD 20850, USA; Biomolecular Labeling Laboratory, Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA.

PMID: 31676211
PMCID: PMC7373172
DOI: 10.1016/j.tibs.2019.09.005

Abstract

Proteomic analysis can be a critical bottleneck in cellular characterization. The current paradigm relies primarily on mass spectrometry of peptides and affinity reagents (i.e., antibodies), both of which require a priori knowledge of the sample. An unbiased protein sequencing method, with a dynamic range that covers the full range of protein concentrations in proteomes, would revolutionize the field of proteomics, allowing a more facile characterization of novel gene products and subcellular complexes. To this end, several new platforms based on single-molecule protein-sequencing approaches have been proposed. This review summarizes four of these approaches, highlighting advantages, limitations, and challenges for each method towards advancing as a core technology for next-generation protein sequencing.

Keywords: peptide sequencing; proteomics; single-molecule analysis.

Figures

**Figure 1.**
Current protein sequencing paradigm. After a protein of interest (grey) is purified, separate samples are digested with different proteases to yield a collection of peptides (Step 1). The peptides are then identified using a combination of HPLC (Step 2) and mass spectrometry (Step 3). The sequences of the digestion products are then used to computationally assemble the full-length protein sequence (Step 4).
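
The assembly in Step 4 can be pictured as merging peptides whose ends overlap because different proteases cut at different sites. The following is a minimal sketch of a greedy overlap merge, not the method used by any particular proteomics package; the function names, the `min_overlap` threshold, and the toy peptides are assumptions made for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(peptides, min_overlap=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Hypothetical helper: stops when no pair overlaps by at least
    `min_overlap` residues and returns the remaining contigs.
    """
    # Remove exact duplicates, then peptides fully contained in another one.
    frags = list(dict.fromkeys(peptides))
    frags = [p for p in frags if not any(p != q and p in q for q in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k < min_overlap:
            break
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Toy peptides standing in for the products of different digests.
toy_peptides = ["MKTAYIAK", "YIAKQRQIS", "QISFVKSH", "VKSHFSRQ"]
contigs = greedy_assemble(toy_peptides)
```

Greedy merging is simple but can mis-assemble repetitive sequences; it is shown here only to make the overlap idea concrete.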

**Figure 2.**
Proposed sub-nanogap sequencing. The protein sample (grey) is denatured in SDS (green circles; Step 1) and injected into a microfluidic cell for electrophoresis (Step 2, right). The current drives the denatured protein (grey squares with single-letter amino acid abbreviations and N- and C-termini labeled) through a biconical pore structure (blue) that is 10 Å thick (Step 2, left). Each residue in the protein chain transiently interacts with the “waist” of the biconical pore, creating a unique step in the current over time (Step 3). The magnitude of each step is determined by the combined volume of amino acids in the pore.
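
The last sentence of the caption implies that the signal at each step depends on a sliding window of residues rather than on a single residue. The sketch below computes the summed volume in such a window; the number of residues occupying the pore at once (`residues_in_pore`) and the approximate volumes in the lookup table are assumptions chosen for illustration, and no conversion from volume to current is attempted.

```python
# Approximate residue volumes in cubic angstroms for a few amino acids.
# Illustrative inputs only; they are not values taken from the review.
RESIDUE_VOLUME_A3 = {"G": 60.1, "A": 88.6, "S": 89.0, "W": 227.8}


def pore_occupancy_trace(sequence, residues_in_pore=2):
    """Summed volume of the residues inside the pore at each translocation step."""
    volumes = [RESIDUE_VOLUME_A3[aa] for aa in sequence]
    return [
        sum(volumes[i:i + residues_in_pore])
        for i in range(len(volumes) - residues_in_pore + 1)
    ]


# Each entry corresponds to one step in the current trace.
trace = pore_occupancy_trace("GAWS", residues_in_pore=2)
```

Because neighbouring windows share residues, different sequences can produce similar steps, which is one reason decoding such a trace is harder than reading one residue at a time.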

**Figure 3.**
Proposed use of recognition tunneling in sequencing. Proteins (grey) are digested sequentially by either chemical degradation or a peptidase (Step 1), and cleaved residues are collected by flow (Step 2). Each fraction is then analyzed by recognition tunneling spectroscopy (Step 3). In this process, the free amino acids (grey squares) pass between a palladium-plated probe and a substrate, interacting with the 4(5) substituted-1-H-imidazole-2-carboxamide (ICA, green sphere) functional groups on the probe and substrate. The binding event generates a unique current trace from the interaction of the amino acid with the ICA; fractions are described by collections of current traces (Step 4). The sub-populations of current traces in each fraction allow quantification of residue-level sub-populations in the protein sample.

**Figure 4.**
Proposed image-based ClpXP sequencing. The purified protein sample (grey) is labeled with FRET acceptors (orange) at the N-terminal amine and at lysine and cysteine residues (Step 1), and a ClpX initiation tag is added to the C-terminus. Labeled proteins are flowed over a slide with donor-tagged ClpXP (green and blue), attached to the slide by a biotin-streptavidin bond (Step 2, right). When the acceptor on the lysine or the N-terminus enters ClpP, a FRET signal is observed due to the acceptor nearing the donor (green sphere; Step 2, left). After the protein is digested, the total time between signals is used to estimate the chain length and protein sequence (Step 3).
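
Step 3 converts the time between FRET signals into distances along the chain. A minimal sketch of that conversion follows, assuming a constant translocation rate; the rate value, the function name, and the example timestamps are hypothetical and are not taken from the review.

```python
def residues_between_signals(signal_times_s, rate_res_per_s):
    """Estimate the residue spacing between consecutive labeled positions.

    Assumes translocation proceeds at a constant `rate_res_per_s`
    (residues per second), which is a simplification.
    """
    gaps = [t2 - t1 for t1, t2 in zip(signal_times_s, signal_times_s[1:])]
    return [round(gap * rate_res_per_s) for gap in gaps]


# Three FRET bursts; the last one would correspond to the labeled
# N-terminus, since the initiation tag sits at the C-terminus.
spacings = residues_between_signals([0.0, 2.0, 5.0], rate_res_per_s=10)
labeled_span = sum(spacings)
```

The result is a pattern of spacings between labeled residues rather than a full sequence, so identification would rely on matching that pattern against candidate proteins.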

**Figure 5.**
Proposed image-based Edman sequencing. (Scheme A) A protein sample (grey) is digested (Step 1), then labeled with fluorescent tags at lysine and cysteine residues (red and green spheres, respectively; Step 2). The labeled peptides are immobilized on an amine-functionalized slide by a peptide bond (Step 3). The fluorescent signal from each peptide is quantified prior to a round of Edman degradation. A step down in quantified signal indicates the elimination of a lysine or cysteine (Step 4). (Scheme B) In an alternative protocol, peptides are not directly labeled before attachment to the slide (Steps 1-2). Instead, labeled N-terminal amino-acid binding (NAAB) proteins are used to identify the N-terminal amino acid between each round of degradation (Step 3).
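
In Scheme A, the readout for a single peptide is an intensity series per color channel, and each step down marks the cycle at which a labeled residue was removed. The sketch below turns two such series into a partial sequence pattern; the function names, the single-fluorophore intensity `unit`, the 50% threshold, and the example intensities are assumptions made for illustration.

```python
def drop_cycles(intensities, unit=1.0):
    """Return the 1-based Edman cycles at which the signal steps down.

    `intensities[0]` is measured before any degradation and
    `intensities[n]` after cycle n. A drop larger than half of one
    fluorophore `unit` is counted as the loss of one labeled residue.
    """
    return [
        cycle
        for cycle in range(1, len(intensities))
        if intensities[cycle - 1] - intensities[cycle] > 0.5 * unit
    ]


def partial_pattern(lys_trace, cys_trace, unit=1.0):
    """Build a pattern such as 'CKxK', with 'x' for unlabeled positions."""
    lys = set(drop_cycles(lys_trace, unit))
    cys = set(drop_cycles(cys_trace, unit))
    n_cycles = len(lys_trace) - 1
    return "".join(
        "K" if pos in lys else "C" if pos in cys else "x"
        for pos in range(1, n_cycles + 1)
    )


# Hypothetical traces for one immobilized peptide over four cycles.
lysine_channel = [2.0, 2.0, 1.0, 1.0, 0.0]
cysteine_channel = [1.0, 0.0, 0.0, 0.0, 0.0]
pattern = partial_pattern(lysine_channel, cysteine_channel)
```

A pattern like this identifies only the positions of lysine and cysteine, so the peptide would still need to be matched against a reference proteome to infer its full sequence.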

Grants and funding

9999-NIST/ImNIST/Intramural NIST DOC/United States
