Review
Q Rev Biophys. 2020 Feb 11;53:e3.
doi: 10.1017/S0033583519000131.

De novo protein design, a retrospective

Ivan V Korendovych et al. Q Rev Biophys. 2020.

Abstract

Proteins are molecular machines whose function depends on their ability to achieve complex folds with precisely defined structural and dynamic properties. The rational design of proteins from first principles, or de novo, was once considered impossible, but today proteins with a variety of folds and functions have been realized. We review the evolution of the field from its earliest days, placing particular emphasis on how this endeavor has illuminated our understanding of the principles underlying the folding and function of natural proteins, and how it is informing the design of macromolecules with unprecedented structures and properties. An initial set of milestones in de novo protein design focused on the construction of sequences that adopt folded conformations in water and in membranes. The first proteins were designed from first principles using very simple physical models. As computers became more powerful, the use of the rotamer approximation allowed one to discover amino acid sequences that stabilize the desired fold. As the crystallographic database of protein structures expanded in subsequent years, it became possible to construct proteins by assembling short backbone fragments that frequently recur in Nature. The second set of milestones in de novo design involves the discovery of complex functions. Proteins have been designed to bind a variety of metals, porphyrins, and other cofactors. The design of proteins that catalyze hydrolysis and oxygen-dependent reactions has progressed significantly. However, the de novo design of catalysts for energetically demanding reactions, or even of proteins that bind with high affinity and specificity to highly functionalized, complex polar molecules, remains an important challenge that is only now being met. Finally, protein design has contributed significantly to our understanding of membrane protein folding and the transport of ions across membranes. The design of membrane proteins, or more generally of biomimetic polymers that function in mixed or non-aqueous environments, is now becoming increasingly feasible.

Keywords: Protein design.

Figures

Fig. 1.
(Left) Proposed secondary structure of a DDT-binding peptide (reproduced with permission from Moser et al. (1983)). (Right) Molecular model of a short segment of the amyloid fibril formed by betabellin (reproduced with permission from Richardson and Richardson (1989)).
Fig. 2.
Design of a four-helix bundle. (a) A peptide was designed that self-associated to form an antiparallel helical bundle in solution. (b) A loop sequence was next inserted between two helices to create a dimeric four-helix bundle, and (c) three loops were then inserted between four helices to create the full-length helical bundle. At each stage, the free energy of assembly or folding was determined and used to evaluate possible sequences. In this way, the complex problem of protein design was cut into smaller, separable pieces. For simplicity, the monomeric species in panels (a) and (b) are shown as helices, but they were actually only partially helical, as shown by CD. Panel (d) shows the sequences of the peptides and proteins discussed in the text. Panel (e) shows an early energy-minimized model of α4 (left) compared with two larger natural four-helix bundle proteins, myohemerythrin (middle) and cytochrome c′ (right). Panels (a–c) are reproduced with permission from Ho and DeGrado (1987). Copyright (2007) American Chemical Society, while panel (e) is reproduced with permission from DeGrado et al. (1989).
Fig. 3.
(a) Crystal structure of a peptide that was designed to solubilize membrane proteins, but was serendipitously found to crystallize as a four-helix coiled-coil bundle, DHP1 (PDB: 4HB1). (b) The NMR structure of α3D (PDB: 2A3D) is stabilized by a set of apolar sidechains that pack in a geometrically complementary manner, shown in ball-and-stick format. (c) Model of 3-His α3D based on EXAFS data and the NMR structure of α3D (PDB: 2A3D).
Fig. 4.
(a) Crystal structure of the dimeric natural coiled-coil GCN4 (PDB: 2ZTA) and the corresponding helical wheel. (b) Side-on and end-on views of the hydrophobic interior of a trimeric coiled-coil GCN4 derivative (PDB: 1GCM) along with the corresponding helical wheel. (c) Side-on and end-on views of the hydrophobic interior of a tetrameric GCN4 derivative (PDB: 1GCL) along with the corresponding helical wheel. (d) End-on views of de novo designed penta-, hexa-, hepta-, and octameric bundles (PDB: 4PND, 4H8O, 5EZ8, 6G67).
Fig. 5.
The desired geometry of the metal ion-binding site dictates the overall 3D structure during de novo protein design. In panel (a), a trigonal 3-Cys site dictates the backbone of a three-helix bundle in the TRI series of peptides (Dieckmann et al., 1997, 1998; Mocny and Pecoraro, 2015) (PDB: 2JGO). The structure is stabilized in the desired conformation by favorable vdW packing and hydrophobic interactions between buried apolar residues (far right). In panel (b), a more complex C2-symmetrical site is formed from four Glu and two His residues, which bind two transition metal ions in a four-helix bundle in the DF series of proteins (Lombardi et al., 2019). The two-fold axis is denoted by an oval. A large number of second-shell hydrogen bonds were positioned to stabilize the ligands in the desired conformation, and the remaining interior residues (not shown) were chosen to be apolar sidechains that pack efficiently in the interior of the bundle.
Fig. 6.
Design of DF family proteins. Panels (a–c) show experimentally determined structures of extended metal-ligand and second-shell hydrogen-bonded networks in DF1 and related proteins. Two projections of DF1's metal-binding site are shown in (a) and (b) (PDB: 1EC5). Panel (c) shows an axial view of 4DH1 (PDB: 5WLL), a DF analog that binds four Zn(II) ions. An Asp residue forms a second-shell hydrogen bond to a His ligand, and an Arg residue forms a third-shell hydrogen bond. Overall, the network includes four Zn, two waters, eight Asp, four His, and four Arg, all converging at the center of the bundle. Panels (d–f) illustrate how the backbone of DF (d) was elaborated to create a single chain (DFsc, PDB: 2HZ8) or a self-assembling tetramer (DFtet).
Fig. 7.
Structural plasticity of MID1. Panels (a) and (b) show two views of the crystal structures of di-zinc MID1 (PDB: 3V1C, blue ribbon), di-cobalt MID1 (PDB: 3V1D, magenta), di-zinc MID1-H12E (PDB: 3V1E, yellow), and di-zinc MID1-H35E (PDB: 3V1F, green), with one of the two helix–loop–helix motifs superimposed. The overlay shows the variability in metal ion positions and ligand geometry, as well as variations in inter-subunit interactions. Panels (c) and (d) illustrate a similar superposition of di-zinc MID1 (PDB: 3V1C, blue ribbon, orange carbon atoms as sticks) with di-zinc MID1sc10 (PDB: 5OD1, gray ribbon, magenta carbon atoms as sticks), showing a large rigid-body rotation of the helical hairpins, a shift in the primary ligand from His39 to His35, and a 7 Å shift of the metal ion. Panel (e) shows the substrates used to characterize the catalytic activity of MID1sc10.
Fig. 8.
Cofactor-binding helical bundles. Panel (a) shows a model of a two-porphyrin maquette. High-resolution structures have not been published for cofactor-bound maquettes, likely due to their dynamic properties (Koder et al., 2009; Lichtenstein et al., 2012; Kodali et al., 2017; Watkins et al., 2017). However, recent work on other de novo proteins, including PS1, indicates that it is possible to design uniquely structured porphyrin-binding proteins (Polizzi et al., 2017). Panels (b) and (c) illustrate PS1, a porphyrin-binding protein that, in contrast to the maquettes, was computationally designed to optimize the packing of both the core and the cofactor (Polizzi et al., 2017). The high-resolution solution structure of the apo-state has two conformations that appear to facilitate binding of the porphyrin. Both conformers have a well-packed hydrophobic core, but differ in the orientation of the helices in the binding site. Binding of the porphyrin results in ordering of the entire protein.
Fig. 9.
(Left) RM1 design cycle: (a) three-stranded sheet topology of natural rubredoxin, (b) C2 symmetry, (c) active-site geometry, (d) miniRM dimer, and (e) RM1 with Trpzip linker shown in red. Reproduced with permission from Nanda et al. (2005). Copyright (2005) American Chemical Society. (Right) Computational model of ambidoxin.
Fig. 10.
(a, b) Top and side views of computational models of de novo designed ion pores LS2 and PRIME, respectively. In panel (a), the Ser sidechains of LS2 are shown in ball-and-stick models. Leu residues that are important for packing interactions that stabilize the tetramer of LS2 are shown in green sticks. In panel (b), the carbon atoms of the porphyrin cofactor are shown in purple. (c) Rocker, a de novo designed zinc transporter, showing configurations that were used for positive (+) and negative (−) design.
Fig. 11.
Representative examples of de novo designed protein scaffolds. (a) TOP7, a de novo designed fold with no natural analogs (PDB: 1QYS). (b) A computationally designed TIM barrel (PDB: 5BVL). (c) A de novo designed mini protein (PDB: 5TX8). (d) Pizza6, a de novo designed fold with no natural analogs (PDB: 6F0Q). (e) A de novo designed β-barrel (PDB: 6D0T).
Fig. 12.
Overview of the computational design and high-throughput screening of mini-protein binders. Reproduced with permission from Makhlynets and Korendovych (2017). Copyright (2017) American Chemical Society.
Fig. 13.
Structures of amyloid fibrils. (a) Strands align perpendicular to the main fibril axis (indicated by a black line) in a structure of MAX1, a strand-turn-strand peptide designed by Schneider and coworkers (PDB: 2N1E). (b) Structure of MAX1, with polar Lys residues (blue sticks) on the solvent-exposed surface and apolar Val residues (green ball-and-stick models) forming a water-free interface. (c) Structure of a catalytic Zn2+-binding amyloid (PDB: 5UGK), showing a network of 3-His Zn2+ ion coordination and an H-bonded zipper of Gln sidechains. (d) Structure of an α-amyloid assembly, αAmS (Zhang et al., 2018b) (PDB: 6C4Z); the N- and C-termini of the individual helices are shown in blue and red, respectively.
Fig. 14.
Structural assemblies of designed proteins. Proteins that assemble in one dimension to form fibers and tubes are shown in panels (a–d). Panel (a) shows the structure of a designed hexameric bundle (PDB: 4H8M) that has been engineered to assemble into stacked bundles (structure inferred by EM). Panel (b) illustrates a dimeric three-helix bundle assembled from helix–loop–helix motifs (PDB: 1G6U) consisting of one short and one long helix. The sequence was designed to cause the units to assemble with the loops on opposite sides of the bundle in an ‘up-down’ orientation to give a domain-swapped dimer. In a second design, the sequence was designed to cause the loops to align in an ‘up-up’ orientation that induced fibril formation. Panel (d) illustrates larger-diameter nano-pores composed of helix–loop–helix motifs (PDB: 6MK1), and panel (e) shows the assembly scheme for TET12SN family peptides that spontaneously assemble into tetrahedral cages (reproduced from Lapenta (2018) 351 – Published by The Royal Society of Chemistry). Panel (f) illustrates a tetrahedral protein cage created by computationally designing protein–protein interfaces (PDB: 4NWR), and panel (g) illustrates a computationally designed protein crystal (PDB: 4H8M).

References

    1. Adamian L and Liang J (2001) Helix–helix packing and interfacial pairwise interactions of residues in membrane proteins. Journal of Molecular Biology 311, 891–907.
    2. Adamian L and Liang J (2002) Interhelical hydrogen bonds and spatial motifs in membrane proteins: polar clamps and serine zippers. Proteins 47, 209–218.
    3. Adhikari AN, Freed KF and Sosnick TR (2012) De novo prediction of protein folding pathways and structure using the principle of sequential stabilization. Proceedings of the National Academy of Sciences of the United States of America 109, 17442–17447.
    4. Aida T, Meijer EW and Stupp SI (2012) Functional supramolecular polymers. Science 335, 813–817.
    5. Åkerfeldt K, Kim RM, Camac D, Groves JT, Lear JD and DeGrado WF (1992) Tetraphilin: a four-helix proton channel built on a tetraphenylporphyrin framework. Journal of the American Chemical Society 114, 9656–9657.