Cell. 1988 Sep 23;54(7):1081-90.

doi: 10.1016/0092-8674(88)90123-7.

The sequence specificity of homeodomain-DNA interaction

C Desplan¹, J Theis, P H O'Farrell

PMID: 3046753
PMCID: PMC2753412
DOI: 10.1016/0092-8674(88)90123-7

C Desplan et al. Cell. 1988.

Affiliation

¹ Department of Biochemistry and Biophysics, University of California, San Francisco 94143-0448.

Abstract

The Drosophila developmental gene, engrailed, encodes a sequence-specific DNA binding activity. Using deletion constructs expressed as fusion proteins in E. coli, we localized this activity to the conserved homeodomain (HD). The binding site consensus, TCAATTAAAT, is found in clusters in the engrailed regulatory region. Weak binding of the En HD to one copy of a synthetic consensus is enhanced by adjacent copies. The distantly related HD encoded by fushi tarazu binds to the same sites as the En HD, but differs in its preference for related sites. Both HDs bind a second type of sequence, a repeat of TAA. The similarity in sequence specificity of En and Ftz HDs suggests that, within families of DNA binding proteins, close relatives will exhibit similar specificities. Competition among related regulatory proteins might govern which protein occupies a given binding site and consequently determine the ultimate effect of cis-acting regulatory sites.

Figures

**Figure 1. The En and Ftz Fusion Protein Constructs Tested for DNA Binding Activity**
The 60 amino acid homeodomain (HD) is indicated by the thick bold line, other En sequences by a thin line, and other Ftz sequences by a triple line. The numbers above constructs A and G indicate positions with respect to the intact En or Ftz proteins, and the numbers below the various constructs indicate positions with respect to the first amino acid of the HD. The position of the helix-turn-helix motif within the HD is also shown. The capacity of these constructs to bind DNA is indicated (+/−). (A) Represents the En fusion protein used for experiments shown in Figures 2 and 5 (also see Desplan et al., 1985). The fusion contains 44 residues N-terminal to the HD and 39 residues extending from the end of the HD to the natural stop codon. (B) Deletion of a C-terminal segment of en coding sequences replaces the last amino acid of the En HD (thr to ser, hatched line). Translation terminates at a TGA codon immediately after the altered amino acid. (C) and (D) In these two constructs (Theis et al., unpublished data), smaller C-terminal parts of the En protein are fused to part of the preprocalcitonin rather than β-galactosidase. Fusion C contains 11 residues N-terminal to the HD, while D contains 41. (E) Cleavage at the BglII site, within the homeobox, and fusion to a different ORF (hatched line) results in an HD truncated beyond position 47 and lacking half of the putative recognition helix. (F) This is an 11 residue deletion that removes amino acids 48 to 58, inclusively. This deletion removes half of the putative recognition helix. (G) The Ftz construct includes 110 residues N-terminal to the HD and extends to the natural termination codon 97 residues C-terminal to the HD. The only homology between the En and Ftz proteins is within the HD (thick line). The Ftz protein expressed is derived from the Oregon R cDNA, which is proposed to encode a 410 amino acid protein (Laughon and Scott, 1984). The cloned *ftz* cDNA was generously provided by A. Laughon and M. P. Scott.

**Figure 2. DNAase I Protection Patterns Produced by HD Fusions**
(A) DNAase I protection of regions 1 and 2 in en DNA. A ClaI-NaeI fragment from plasmid p615, 5′ end–labeled at the ClaI site, was incubated for 30 min at 0°C without protein (0) or with 0.9 (1), 4.4 (2), or 22 μg (3) of bacterial extract containing the En HD fusion protein (construct A in Figure 1), partially digested with DNAase I as described in Experimental Procedures and electrophoresed on a 6% sequencing gel. The Ftz lanes represent protection obtained by 1.5 (1), 7.4 (2), or 37 (3) μg of bacterial extract containing the Ftz fusion protein (construct G in Figure 1). Protected (−) and enhanced (+) sites of DNAase I cleavage in and around sites 1 and 2 are indicated. The arrowhead indicates an enhanced band present only when the Ftz protein is used. No protection was observed in bacterial extracts producing truncated, inactive fusion proteins. A third protected region contained in this fragment is not visible in this separation. Protection by the En fusion spans 29, 18, and 20 bp for regions 1, 2, and 3, respectively. (B) DNAase I protection of a fragment containing six copies of the NP sequence (NP₆; see Figure 4A). Increasing amounts of the En fusion protein extract, no protein (0), 0.35 (1), 1.75 (2), 9 (3), or 44 μg (4) in 25 μl were incubated with the DNA fragment (end-labeled at its HindIII site). Digestion with 1 μg/ml of DNAase I is as in Experimental Procedures. The arrows indicate the positions and orientations of the six NP consensus sequences. The positions of the bands resulting from DNAase I cuts are indicated for the first copy of the NP sequence. These positions of cleavage are repeated in each subsequent copy of the NP sequence having the same orientation (copies 1 to 4). A characteristic pattern of protection/enhancement due to the En protein extract is observed in each of these copies. This pattern changes for copies 5 and 6, which are in the opposite polarity. Calcitonin fusions C and D (see Figure 1) exhibit a similar pattern of protection.
(C) Concentration of En HD fusion protein extract required to protect one or three copies of the LP sequence (see Figure 4B). Fragments LP₁ and LP₃ were footprinted with increasing amounts of the extract: no extract (0); 1.3 (1); 2.7 (2); 5.5 (3); 11 (4); 22 (5); or 44 μg (6) of total protein. The positions of the DNAase I cuts within the various LP sequences are indicated by comparison with the G/A Maxam–Gilbert sequencing lane (note that the positions of the bands in this ladder are shifted compared with the DNAase I lane because of the difference in the position of cleavage [Maxam and Gilbert, 1980]). Each palindrome and its center is indicated. The dashed lines represent the limits of each copy.

**Figure 3. Consensus Sites near the *engrailed* Gene Are Clustered**
(A) The positions of regions protected from DNAase I by the En fusion (arrowheads) are clustered, and each protected region contains one or more sequences related to a consensus. Each footprinted region is designated with a number. The sequences of five footprinted regions are given, and positions matching the consensus (see B) are in upper case. Where a footprinted region (e.g., 1) includes more than one consensus site, these are distinguished with a prime (e.g., 1, 1′, and 1″). ClaI sites to the left of the illustrated sequences are the positions at which label was incorporated for analysis of DNAase I protection. (B) Alignment of the sequences exhibiting footprints with the En fusion protein. Sequences of the footprinted regions are aligned based on their homology. Regions 1 to 6 are from the en gene of D. melanogaster (A). The sequences marked “en vir” are the corresponding regions in the en gene of D. virilis (Kassis et al., unpublished data). The *ftz* footprinted regions are located 3′ to the *ftz* gene (see Desplan et al., 1985). Each distinct footprint is designated by a number. Most of these footprinted regions contain several sequences that can be aligned, and each distinct alignment is indicated with the number of the region (e.g., sites 1, 1′, and 1″). All these aligned sites are present within regions protected from DNAase I digestion by the En fusion protein (e.g., Figure 2A). The consensus is defined as the average between all these aligned sequences. The number of sites matching the consensus at each particular position as well as the score of each individual sequence matching the consensus are indicated. Three of the consensus sequences, en vir 2′, ftz 1′, and lambda 2′, are aligned on the strand opposite to the other represented sequences. The sites in lambda DNA are sequences present in fragments bound by the fusion protein. The sequences within these fragments that exhibit homology to the consensus are aligned. They have not been footprinted.

**Figure 4. Synthetic Version of the Consensus Sequence**
(A) Different arrangements of a synthetic consensus sequence. A 12 bp nearly palindromic sequence (NP), TCAATTAAATGA, was synthesized. Positions 1 through 10 of this sequence represent the consensus sequence. The G and A (positions 11 and 12) were added in order to create a BclI site at the junction between two consensus sequences. The arrows indicate the orientations of the consensus sequences in cloned repeats. Note that the addition of the G and A at positions 11 and 12 creates a sequence in the opposite polarity that matches the consensus at 9 out of 10 positions. One or several copies of the NP sequence were cloned in various orientations within the BamHI site of the M13mp18 polylinker. (B) The NP sequence is nearly palindromic, imperfect at positions 4 and 9, which are both A. Palindromic sequences were synthesized by duplicating in the opposite polarity the left six bases (left palindrome, LP) or duplicating the right six bases (right palindrome, RP) of the NP sequence. Various numbers of copies of the LP and RP sequences were cloned into the BamHI (LP₁, LP₃, and RP sequences) or SmaI (LP₂) sites of M13mp18.
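The construction described in (B) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the NP sequence is taken from the caption, but the LP and RP sequences are derived here by reading "duplicating in the opposite polarity" as appending (or prepending) the reverse complement of a six-base half. The caption text does not spell out the resulting sequences, and the helper names are hypothetical.

```python
# Sketch: derive the LP and RP palindromes from the NP sequence of Figure 4.
# Assumption: "duplicating in the opposite polarity" means joining a 6-base
# half with its own reverse complement. Helper names are illustrative only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """A DNA palindrome reads the same on both strands."""
    return seq == reverse_complement(seq)


NP = "TCAATTAAATGA"          # 12 bp nearly palindromic sequence from the caption
left, right = NP[:6], NP[6:]

LP = left + reverse_complement(left)     # left palindrome
RP = reverse_complement(right) + right   # right palindrome

# 1-based positions at which NP differs from its own reverse complement
imperfect = [i + 1 for i, (a, b) in enumerate(zip(NP, reverse_complement(NP)))
             if a != b]

print("LP:", LP, is_palindrome(LP))
print("RP:", RP, is_palindrome(RP))
print("NP is a palindrome:", is_palindrome(NP), "imperfect at", imperfect)
```

Under this reading, NP differs from its reverse complement only at positions 4 and 9, which agrees with the caption, and the derived LP begins with T and ends with A, which agrees with the description of the LP* fragments in Figure 5.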

**Figure 5. Binding of En and Ftz Fusion to DNA Fragments with Different Numbers of Copies of Synthetic Sites**
In each binding reaction, the En fusion protein or Ftz fusion protein is offered a mixture of fragments carrying different numbers of copies of NP, LP, RP, or TAA as indicated. A separation of each total mixture is shown (T), and subsequent lanes show the fragments immunoprecipitated with the En fusion protein in the absence (0) and presence of increasing amounts (5, 10, 20, 40, 80, 160, and 320 ng) of competitor DNA, a mixture of oligomerized synthetic double-stranded fragments, [(TAA)₅]ₙ. Note that the TAA competitor DNA competed out the binding of both TAA and LP* sequences. Competition with oligomerized NP sequences gives similar results. That is, although the different competitor DNAs differed slightly in the concentration required to compete for binding, they gave the same order of competition for each of the labeled fragments. Fragments LP₂* and LP₄* contain 2 and 4 copies of the LP sequence, but modified (by a blunt-ending procedure) to remove two terminal nucleotides at each end. Cloning of the blunt-ended oligonucleotides in the SmaI site of M13mp18 regenerates one of the two missing nucleotides from each end of the LP* fragments. Consequently, the LP* fragments differ in the position at which they are cloned (SmaI versus BamHI), are 4 bp shorter than the corresponding NP or RP fragments, and differ from the LP sequence given in Figure 4B by lacking the leftmost T and rightmost A. Experiments using a perfect LP₂ sequence cloned in the SmaI site showed that the sequence difference was of little or no consequence to binding by En or Ftz fusions (data not shown). LP₂* is included as an internal reference in all but the NP panel. The fragments named TAA₅ and TAA₈ contain five and eight tandem copies of the trinucleotide TAA, respectively. The other fragments, TAA₂₀ and TAA₄₀, contain four or eight copies of TAA₅ ligated in various orientations. The dashed line is to aid alignment in making comparisons between experiments.
Results similar to those shown for the En fusion are obtained using calcitonin constructs C and D.

Grants and funding

R01 GM037193/GM/NIGMS NIH HHS/United States
