Comparative Study
Protein Sci. 2009 Jun;18(6):1293-305.
doi: 10.1002/pro.142.

Motif-directed flexible backbone design of functional interactions

James J Havranek et al. Protein Sci. 2009 Jun.

Abstract

Computational protein design relies on a number of approximations to efficiently search the huge sequence space available to proteins. The fixed backbone and rotamer approximations in particular are important for formulating protein design as a discrete combinatorial optimization problem. However, the resulting coarse-grained sampling of possible side-chain terminal positions is problematic for the design of protein function, which depends on precise positioning of side-chain atoms. Although backbone flexibility can greatly increase the conformational freedom of side-chain functional groups, it is not obvious which backbone movements will generate the critical constellation of atoms responsible for protein function. Here, we report an automated method for identifying protein backbone movements that can give rise to any specified set of desired side-chain atomic placements and interactions, using protein-DNA interfaces as a model system. We use a library of previously observed protein-DNA interactions (motifs) and a rotamer-based description of side-chain conformational freedom to identify placements for the protein backbone that can give rise to a favorable side-chain interaction with DNA. We describe a tree-search algorithm for identifying those combinations of interactions from the library that can be realized with minimal perturbation of the protein backbone. We compare the efficiency of this method with the alternative approach of building and screening alternate backbone conformations.

Figures

Figure 1
Examples of interaction motifs. Each panel depicts a single interaction motif. The three coordinate system-defining atoms in both the base (or bases) and the amino acid used to describe the motif are rendered as larger spheres. The sticks representing the amino acid backbone atoms are rendered with a decreased radius. A: A bidentate hydrogen-bond interaction between an arginine and a guanine. B: An interaction between an asparagine and two adjacent stacked bases. The noninteracting bases paired with the stacked bases are colored in solid grey. In this motif, the coordinate system on the DNA side of the interface is defined using atoms from both of the interacting bases. C: An interaction between a lysine and two bases that are diagonally related—the bases are on opposite strands but in adjacent base pairs. The noninteracting bases paired with the stacked bases are colored in solid grey. D: A hydrophobic interaction between a phenylalanine ring and a thymine methyl group. This and all other figures were generated using PyMOL.
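The caption above says each motif is described through three coordinate system-defining atoms on each side of the interface. A minimal sketch of how three atoms can define a local frame follows; the axis convention (origin at the first atom, x toward the second, z normal to the plane of all three) is an assumption for illustration, not the convention stated in the paper, and the function names are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(_dot(v, v))
    if n == 0.0:
        raise ValueError("degenerate atom placement (coincident or collinear atoms)")
    return (v[0] / n, v[1] / n, v[2] / n)

def frame_from_atoms(p1, p2, p3):
    """Build a right-handed orthonormal frame from three atom positions.

    Assumed convention: origin at p1, x axis along p1->p2,
    z axis perpendicular to the plane of the three atoms.
    """
    x = _unit(_sub(p2, p1))
    z = _unit(_cross(x, _sub(p3, p1)))  # raises if the atoms are collinear
    y = _cross(z, x)
    return p1, (x, y, z)

def to_local(frame, point):
    """Express a Cartesian point in the coordinates of the given frame."""
    origin, axes = frame
    d = _sub(point, origin)
    return tuple(_dot(d, axis) for axis in axes)
```

Expressing the amino acid atoms in the frame of the base (or the reverse) gives a description of the interaction that does not depend on where the complex sits in space, which is what lets a motif observed in one structure be replayed on another.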
Figure 2
Backbone positions for interaction motifs. For an example A-T base pair, inverse rotamers are shown for two interaction motifs: a commonly observed bidentate hydrogen bond interaction between a glutamine and an adenine base, and a packing interaction between a phenylalanine side chain and a thymine methyl group. Although the coordinates for the terminal atoms contacting the DNA are specified by the motif geometry, the locations of possible backbone atoms are determined by building the amino acid backward from side chain to main chain using torsional values taken from a rotamer library. A full set of inverse rotamers for this base pair would include more motifs, each with multiple amino acid conformations capable of realizing the interaction encoded in the motif.
Figure 3
Test case for motif-directed relaxation. A: The wild-type recognition sequence for the homing endonuclease I-AniI is similar to a site found in exon six of the IL-2Rγ gene of a mouse model for SCID (severe combined immunodeficiency disease). To evaluate our algorithm, we used motif-directed relaxation to generate altered backbone conformations that could make interactions from our library with a target site incorporating three of the mutations required to change the I-AniI target site to the mouse SCID target site. B: The N-terminal domain of I-AniI and its DNA target half-site are shown with the three base pair changes mutated in silico. Inverse rotamers were built for the mutated base pairs and the intervening wild-type base pair (rendered in space-fill). The protein backbone region that was allowed to move to incorporate interaction motifs is colored magenta, with fixed regions of the protein colored cyan. C: The result of a three-motif loop relaxation (shown in green, with the incorporated motif side chains Arg20, Ser24, and Arg29 rendered as sticks) is overlaid on the native backbone conformation (in magenta).
Figure 4
Single motif incorporation. A: An inverse rotamer (shown in yellow) is selected for incorporation if a subset of its backbone atoms (see Methods section) is approximately superimposable with any residue in the protein (starting conformation shown in magenta). B: Backbone conformational relaxation under a potential augmented with constraints to force the coincidence of inverse rotamer and protein backbone atoms yields an altered protein backbone (shown in green). The incorporation attempt is terminated if the final rmsd between the two sets of backbone atoms is above a threshold value. C: The inverse rotamer is superimposed onto the protein backbone (shown in green). Small differences in backbone atom positions (amplified by a lever arm effect along the side chain) result in the displacement of the side chain functional atoms from the original interacting positions. D: A second round of relaxation is performed with constraints between the functional atoms of the original and superimposed inverse rotamers to restore the desired interaction (final conformation shown in cyan).
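The rmsd test in panel B can be illustrated with a short sketch. It assumes the two sets of backbone atoms are already paired one-to-one and expressed in the same coordinate frame (no superposition is performed here). The threshold is left as a parameter because the caption does not give its value, and the function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))

def accept_incorporation(rotamer_backbone, protein_backbone, threshold):
    """Keep the relaxed backbone only if the rmsd is not above the threshold."""
    return rmsd(rotamer_backbone, protein_backbone) <= threshold
```

For instance, two atoms that are each displaced by 1 Å along z give an rmsd of 1.0, so the attempt would be rejected under an illustrative threshold of 0.5 and accepted under 1.0.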
Figure 5
Incorporation of multiple motifs. The starting (wild-type) backbone conformation is shown throughout in magenta, and altered backbones and incorporated motifs are shown in green. Unincorporated motifs that are identified as close to the current backbone are shown with yellow carbon atoms. The number of motifs incorporated at each level is indicated on the left side of the figure. Each round of incorporation begins by identifying those inverse rotamers whose backbone atoms may be made to coincide with corresponding atoms in the flexible protein region with small perturbations of the protein backbone. For each of these inverse rotamers, the backbone relaxation protocol is applied in an attempt to “thread” the protein backbone through the main chain atoms of the inverse rotamer. If successful, the rotamer is transplanted onto the protein backbone (and that position disallowed from downstream incorporation), and the altered conformation serves as a starting point for the next round of incorporation. As the algorithm proceeds, inverse rotamers considered too far from the initial conformation may be identified as close enough to altered backbones to attempt incorporation in later rounds. The procedure takes the form of a tree, in which the starting conformation is the root, and each successful incorporation of an inverse rotamer begins a new branch. Along each branch of the tree, the procedure terminates when no inverse rotamers are found to attempt another round (first right-hand branch), or when a specified number of motifs have been incorporated (final branch with three motifs).
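The tree-shaped procedure in this caption can be sketched as a depth-first search. This is a toy outline under stated assumptions, not the authors' implementation: `is_close` and `relax` are hypothetical callables standing in for the proximity filter and the constrained backbone relaxation, and each inverse rotamer is modelled as a tuple whose first element is the residue position it would occupy.

```python
def motif_tree_search(start, rotamers, is_close, relax, max_motifs):
    """Enumerate the branch ends of the incorporation tree.

    start      -- starting backbone conformation (the root of the tree)
    rotamers   -- inverse rotamers; rotamer[0] is the residue position
    is_close   -- is_close(backbone, rotamer) -> bool (hypothetical filter)
    relax      -- relax(backbone, rotamer) -> new backbone, or None on failure
    max_motifs -- stop extending a branch once this many motifs are in place
    Returns a list of (backbone, incorporated_rotamers) pairs, one per leaf.
    """
    leaves = []

    def expand(backbone, path):
        if len(path) >= max_motifs:
            leaves.append((backbone, path))
            return
        # Positions that already hold a motif are excluded downstream.
        used = {rotamer[0] for rotamer in path}
        extended = False
        for rotamer in rotamers:
            if rotamer[0] in used or not is_close(backbone, rotamer):
                continue
            new_backbone = relax(backbone, rotamer)
            if new_backbone is None:
                continue  # relaxation failed; no new branch
            extended = True
            expand(new_backbone, path + [rotamer])
        if not extended:
            leaves.append((backbone, path))  # nothing left to attempt

    expand(start, [])
    return leaves

# Toy 1D usage: the backbone is a single number and each rotamer is
# (position, target); relaxation simply moves the backbone onto the target.
toy_rotamers = [(20, 1.0), (24, 2.0), (29, 10.0)]
toy_close = lambda backbone, rotamer: abs(rotamer[1] - backbone) <= 1.5
toy_relax = lambda backbone, rotamer: rotamer[1]
toy_leaves = motif_tree_search(0.0, toy_rotamers, toy_close, toy_relax, 3)
```

In the toy run, the rotamer at position 24 is too far from the starting backbone but becomes reachable once the rotamer at position 20 has been incorporated, mirroring the caption's point that later rounds can pick up motifs that were initially out of range. Because the sketch branches on every successful incorporation, the same set of motifs may be reached along more than one ordering.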
Figure 6
Transition between homologous loop conformations. A: The N-terminal half-site of the I-CeuI crystal structure was used as the starting point for a test of loop homology modeling. The flexible protein backbone region is shown in magenta. The sequence of DNA was computationally altered to match the recognition sequence for the I-CreI homing endonuclease (rendered in grey spheres). When the I-CeuI and I-CreI structures are structurally aligned using only the atoms of the phosphate backbone and deoxyribose rings, it is seen that the strand-turn-strand regions for the two enzymes adopt different orientations (Cα rmsd of 2.6 Å over the 21 residue loop region) with respect to the major groove (I-CreI loop shown in blue). Three direct contacts between the I-CreI homing endonuclease and the DNA are rendered as sticks (Lys 28A, Asn 30A, and Gln 38A). B: After motif-driven backbone relaxation was performed using the I-CeuI backbone conformation as the starting point and the I-CreI recognition sequence as the target, a number of altered backbone loops incorporating motifs consistent with the I-CreI protein sequence were generated. The loop with the smallest Cα rmsd to the I-CreI backbone (1.5 Å) is shown in green. Three inverse rotamers were incorporated during the process, shown in blue sticks for comparison with the experimentally determined I-CreI side chains. C: Close-up view with the native I-CreI and incorporated motif interactions shown in blue and green, respectively. D: Backbone traces for the I-CeuI (magenta), I-CreI (blue), and the altered loop regions (green) are shown for comparison.
