Review
ACS Nano. 2023 Sep 12;17(17):16369-16395.
doi: 10.1021/acsnano.3c05628. Epub 2023 Jul 25.

Engineering Biological Nanopore Approaches toward Protein Sequencing

Xiaojun Wei et al. ACS Nano. 2023.

Abstract

Biotechnological innovations have vastly improved the capacity to perform large-scale protein studies, while the methods we have for identifying and quantifying individual proteins are still inadequate to perform protein sequencing at the single-molecule level. Nanopore-inspired systems devoted to understanding how single molecules behave have been extensively developed for applications in genome sequencing. These nanopore systems are emerging as prominent tools for protein identification, detection, and analysis, suggesting realistic prospects for novel protein sequencing. This review summarizes recent advances in biological nanopore sensors toward protein sequencing, from the identification of individual amino acids to the controlled translocation of peptides and proteins, with attention focused on device and algorithm development and the delineation of molecular mechanisms with the aid of simulations. Specifically, the review aims to offer recommendations for the advancement of nanopore-based protein sequencing from an engineering perspective, highlighting the need for collaborative efforts across multiple disciplines. These efforts should include chemical conjugation, protein engineering, molecular simulation, machine-learning-assisted identification, and electronic device fabrication to enable practical implementation in real-world scenarios.

Keywords: amino acid; engineering; instrumentation; machine learning; molecular simulation; nanopore; peptide; protein sequencing.

Figures

Figure 1. Engineering methods surrounding biological nanopore technologies for protein sequencing.
The four most widely used biological nanopores (inner ring) and six peripheral engineering methods for nanopore sensing (outer ring) with great potential to contribute to protein sequencing are shown. (Created with Biorender.com.)§
§ Certain commercial materials, equipment, and instruments may be identified in this work to describe the experiments as completely as possible. In no case does such an identification imply a recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials, equipment, or instrument identified are necessarily the best available for the purpose. The authors declare no other competing interest.
Figure 2. Discrimination of amino acids with a biological nanopore.
Schematic illustration of the identification strategies and the corresponding representative results in the dashed box. (a) Nine AAs were derivatized with NITC at the N-terminus and then translocated through the α-HL nanopore. Box: mean relative current blockade produced by each NITC derivative versus its spatial volume. (b) Single AAs (C, N, and Q) traversing across an AeL nanopore. Box: raw current traces. The red star denotes a typical current blockade event of C in blue shadow. (c) Detection of Arg (R) peptides with different lengths in an equimolar mixture using the AeL nanopore. Box: typical current event of six distinct populations (top) and the corresponding histogram (bottom). (d) Recognition of 20 AAs (X) in a cationic carrier of seven Args (R7) by the AeL nanopore. Box: mean relative residual current produced by the XR7 probes versus volume of AA (top); experimentally determined mean I/I0 value for all 20 XR7 peptides (bottom). (e) Recognition of an AA (X) with a bipolar D4XR5 peptide carrier, in which five Arg residues and four Asp (D) residues were chemically linked to the target AA. Box: identification of D, M, E, R, L, W, and Y based on the I/I0 values (top); relationship of current blockade against volume (bottom). (f) Recognition of am7βCD-CuII complex-functionalized α-HL nanopore for AA enantiomers. Box: current traces (left) and the corresponding scatter plots (right) showing the interaction of pores with am7βCD, CuII, and either L-Phe (top) or D-Phe (bottom). (g) Depiction of an OmpF trimeric protein sensing a single peptide. The enlarged part represents the zig-zag alignment of each AA sidechain of N-Arg-Arg-Gly-Arg-Asp in bulk. Box: typical nanopore-based readouts for Mol-1 and Mol-5. Red star points denote enlarged events in the original trace. 
Data in the boxes are extracted with permission from these references: (a): 94, (b): 107, (c): 109,§ (d): 110, (e): 117, (f): 122, (g): 123 and all corresponding schematic diagrams are created with Biorender.com. § https://creativecommons.org/licenses/by/4.0/
Figure 3. Identification of peptides with biological nanopores.
Schematic diagrams of the sensing strategies and the corresponding representative results in the dashed box. (a) Three types of FraC nanopores with different diameters. Box: pH dependence of the Iex% for four peptides using FraC-T2 (top) and the relationship between the Iex% and the mass of peptides (bottom). (b) Peptide attachment methodology with nanopore-based cluster analysis. Box: low-frequency fluctuation (top-left) and high-frequency fluctuation (top-right) resulting from peptides and cluster-peptides, respectively; the resulting ligand dynamics exhibit two-state fluctuations that can be analyzed to identify the target peptide (middle); linear dependence between the mean current step size and the ligand mass (bottom). (c) Peptides are pre-hydrolyzed by protease and measured as they translocate the FraC nanopore. Box: lysozyme fingerprinting using a nanopore. (d) Unfolding and cleaving the protein into multiple polypeptide fragment types, analyzed with an AeL nanopore. Box: typical current blockade events of polypeptide fragments (top); scatter plot of tD versus I/I0 (middle); and histogram of the I/I0 values (bottom). (e) Analyzing different polypeptides that have identical length, but different net charges and different charge distributions, with an AeL nanopore. Box: typical current traces of polypeptides and discrimination of polypeptides through event scatter plots. (f) Three electrostatic constricted regions of N226Q/S228K AeL for heterogeneously charged peptide sensing. Box: typical current traces of heterogeneously charged peptides obtained with an N226Q/S228K AeL nanopore (top); scatter plots between current blockade and duration (bottom). (g) Label-free detection of both phosphorylation and O-glycosylation and their discrimination from unmodified peptides using a FraC nanopore. Box: typical current trace obtained for a measurement on a mixture of three peptides (top); scatter plot and blockade histogram of the mixture (bottom). (h) Discrimination of acetylation-derived positional isomers with the R220S variant AeL nanopore. The fragment (H4f.) of the full-length human H4 protein was modified at three different positions on lysine residues. Box: current trace of an experiment recorded using the R220S pore in the simultaneous presence of eight different H4f. variants (top); together with the scatter plot (bottom). The error bars in (a) and (b) represent the standard deviations calculated from at least three independent repeats.
Data in the boxes are extracted with permission from the references: (a): 132,§ (b): 140, (c): 143,§ (d): 144, (e): 146, (f): 147, (g): 156, (h): 160, and all corresponding schematic diagrams are created with Biorender.com. § https://creativecommons.org/licenses/by/4.0/
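The quantities plotted in these panels (dwell time tD, relative residual current I/I0, and excluded current Iex%) are per-event summaries of an ionic current trace. A minimal sketch of how such values might be extracted is given below. It assumes a simple fixed-threshold event detector and the common definition Iex% = (1 - I/I0) x 100; the function name `detect_events`, the threshold value, and the synthetic trace are illustrative assumptions and are not taken from the review or the cited works.

```python
from statistics import mean


def detect_events(trace, dt, i0, threshold=0.8):
    """Return (dwell_time_s, i_over_i0, iex_percent) for each blockade event.

    trace     -- list of current samples (pA)
    dt        -- sampling interval (s)
    i0        -- open-pore current (pA)
    threshold -- a sample counts as "blocked" when it is below threshold * i0
                 (hypothetical value, chosen only for this illustration)
    """
    events = []
    start = None
    # A trailing open-pore sample acts as a sentinel so that an event
    # still in progress at the end of the trace is closed.
    for k, current in enumerate(list(trace) + [i0]):
        blocked = current < threshold * i0
        if blocked and start is None:
            start = k                          # event begins
        elif not blocked and start is not None:
            level = mean(trace[start:k])       # mean blocked current I
            ratio = level / i0                 # relative residual current I/I0
            events.append(((k - start) * dt, ratio, (1.0 - ratio) * 100.0))
            start = None                       # event ends
    return events


# Synthetic trace: open pore at 100 pA with two blockades (40 pA and 70 pA).
trace = [100.0] * 5 + [40.0] * 4 + [100.0] * 5 + [70.0] * 2 + [100.0] * 3
events = detect_events(trace, dt=1e-5, i0=100.0)
for dwell, ratio, iex in events:
    print(f"tD = {dwell * 1e6:.0f} us, I/I0 = {ratio:.2f}, Iex% = {iex:.0f}")
```

Each returned tuple corresponds to one point in a tD versus I/I0 scatter plot; real recordings would additionally need baseline tracking and filtering, which this sketch omits.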
Figure 4. Controlling peptide translocation through a biological nanopore.
Schematic diagrams of the sensing strategies and the corresponding representative results in the dashed boxes. (a) Schematic view of peptide sequencing achieved by helicase-driven translocation of DNA-peptide conjugates through the MspA-M2 nanopore. Box: the sequencing signal of the DNA-peptide conjugate with the changes in the ionic current profile corresponding to the stages of the translocation marked in the schematic diagram (left); percentage of translocation events in which the polyT signal was detected for peptides of each length (right). (b) Nanopore-induced phase-shift sequencing strategy to observe the ratcheting motion of a peptide-oligonucleotide conjugate (POC) with an MspA nanopore. The abasic spacer (X) serves as a signal marker separating the oligonucleotide and the peptide. The linker (L) conjugates the two parts. Box: representative traces of the N-terminus-conjugated POC (top) and the C-terminus-conjugated POC (bottom). (c) Rereading is facilitated by helicase queueing with an MspA nanopore. Box: highly repetitive ion current signal corresponding to numerous rereads of the same section of an individual peptide. The expanded plot below shows a region that contains four rewinding events, where the trace jumps back to the level of the consensus displayed in shadow. The data in the boxes are extracted with permission from the references: (a): 177, (b): 179, (c): 180, and all corresponding schematic diagrams are created with Biorender.com.
Figure 5. (a) AI-aided signal processing and recognition.
Left: Typical flow diagram of the training process. Different classes of events (A, B, C, D, E, and F) were used as the input dataset. The time, I/I0, or standard deviation of each event was extracted to form a feature matrix. Rows of the matrix were then randomly split into a training subset for model training and a validation subset for model validation. Right: The confusion matrix of classification generated using the best-performing model. (b-d) Understanding molecular dynamics by simulation. b: All-atom model of a helicase-assisted protein sequencing platform recapitulates the dependence of the ionic current blockade on peptide sequence and reveals its molecular origin. Images courtesy of Jingqian Liu (UIUC). c: Multi-scale model of a cut-and-drop experimental system. d: Combining steered MD simulation of peptide transport with a steric exclusion model enables precise characterization of the effect of atom-scale modifications on nanopore current. (e-f) Instrumentation of nanopores. e: Schematic representation of nanopore sensing instruments. TIA – trans-impedance amplifier, Diff – differential amplifier, LPF – low-pass filter, ADC – analogue-to-digital converter, DAC – digital-to-analogue converter. f: Noise power spectral density (PSD) as a function of frequency for a typical nanopore, with the dominant noise sources indicated for different frequency ranges. The data in the boxes are extracted and reproduced with permission from the references: (b): 180, (c): 197, (d): 110, (f): 229. The schematic diagrams of (a) and (e) are created with Biorender.com.
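The training flow described for panel (a), in which per-event features form a feature matrix that is randomly split into training and validation subsets and then scored with a confusion matrix, can be sketched as follows. This is only an illustration under stated assumptions: the events are synthetic, only three classes are used, and a nearest-centroid classifier stands in for whatever models the review actually compares. All function names and parameter values are hypothetical.

```python
import random
from statistics import mean

CLASSES = ["A", "B", "C"]


def make_events(rng, n_per_class=60):
    """Synthetic events: ([dwell time (ms), I/I0, current std], label)."""
    # Class centres are invented for illustration only.
    centers = {"A": (0.5, 0.30, 0.02), "B": (1.0, 0.50, 0.04), "C": (1.5, 0.70, 0.06)}
    rows = []
    for label, (t, r, s) in centers.items():
        for _ in range(n_per_class):
            features = [rng.gauss(t, 0.1), rng.gauss(r, 0.03), rng.gauss(s, 0.005)]
            rows.append((features, label))
    return rows


def split(rows, rng, train_fraction=0.8):
    """Randomly split the feature matrix into training and validation subsets."""
    rows = rows[:]
    rng.shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    centroids = {}
    for label in CLASSES:
        feats = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*feats)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def confusion_matrix(centroids, validation):
    """matrix[true][predicted] = number of validation events."""
    matrix = {t: {p: 0 for p in CLASSES} for t in CLASSES}
    for x, y in validation:
        matrix[y][predict(centroids, x)] += 1
    return matrix


rng = random.Random(0)  # fixed seed so the run is reproducible
train, validation = split(make_events(rng), rng)
centroids = fit_centroids(train)
matrix = confusion_matrix(centroids, validation)
accuracy = sum(matrix[c][c] for c in CLASSES) / len(validation)

for true_label in CLASSES:
    print(true_label, [matrix[true_label][p] for p in CLASSES])
print(f"validation accuracy: {accuracy:.2f}")
```

In practice the features would usually be standardized before a distance-based classifier is applied, and several models would be compared on the validation subset before the best one is reported.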

