Distinctive sequence patterns in metazoan and yeast nucleosomes: implications for linker histone binding to AT-rich and methylated DNA

Feng Cui¹, Victor B Zhurkin

PMID: 19282449
PMCID: PMC2685081
DOI: 10.1093/nar/gkp113

Nucleic Acids Res. 2009 May;37(9):2818-29. doi: 10.1093/nar/gkp113. Epub 2009 Mar 12.

Affiliation

¹ Laboratory of Cell Biology, National Cancer Institute, NIH, Bethesda, MD 20892, USA.

Abstract

Linker histones (LHs) bind to the DNA entry/exit points of nucleosomes and demonstrate preference for AT-rich DNA, although the recognized sequence patterns remain unknown. These patterns are expected to be more pronounced in metazoan nucleosomes with abundant LHs, compared to yeast nucleosomes with few LHs. To test this hypothesis, we compared the nucleosome core particle (NCP) sequences from chicken, Drosophila and yeast, extending them by the flanking sequences extracted from the genomes. We found that the known approximately 10-bp periodic oscillation of AT-rich elements goes beyond the ends of yeast nucleosomes, but is distorted in metazoan sequences where the 'out-of-phase' AT-peaks appear at the NCP ends. The observed difference is likely to be associated with sequence-specific LH binding. We therefore propose a new structural model for LH binding to metazoan nucleosomes, postulating that the highly conserved nonpolar 'wing' region of the LH globular domain (tetrapeptide GVGA) recognizes AT-rich fragments through hydrophobic interactions with the thymine methyl groups. These interactions lead to DNA bending at the NCP ends and formation of a 'stem-like' structure. The same mechanism accounts for the high affinity of LH to methylated DNA, a feature critical for stabilization of the higher-order structure of chromatin and for repression of transcription.

Figures

**Figure 1.**
Extension of chicken and yeast nucleosomes with genomic flanking sequences. (a) Scheme for generating ‘selected extended’ sequences from the ‘original’ NCPs. The sequences are represented by rectangles: gray for the ‘original’ NCPs and white for the flanking fragments in genomes. Note that ‘selected’ sequences are less numerous than the ‘original’ NCPs (see ‘Methods’ section). The sequences are center-aligned to the 147-bp core DNA template, such that the dyad is located at position 74. The base-pair positions in the ‘extended’ sequences are numbered −19, −18 … −1, 0, 1, … 147, 148, … 167. (b and c) Frequencies of occurrence of AA:TT dimers versus base-pair step position in the original (b) and selected (c) sets of ‘extended’ nucleosomal sequences. Running 3-point averages of the frequencies are shown in blue (for chicken) and red (for yeast). In (b and c), the ‘dimeric’ numbering scheme is used. That is, the dimer (x, x + 1) is assigned to position x. So, dimeric step positions in the 147-bp core are numbered from 1 to 146 (with the dyad corresponding to base-pair step 73.5). Accordingly, the resulting frequencies are ‘symmetrized’ with respect to the dyad (dashed line).
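The profile described in this caption (dimer counting with the 'dimeric' numbering, symmetrization about the dyad, 3-point running average) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, positions are 0-based here rather than the caption's numbering, all sequences are assumed to have equal length and to be center-aligned, and the treatment of the two end points in the running average is an assumption.

```python
def dimer_profile(seqs, dimers=("AA", "TT")):
    """Fraction of sequences carrying one of `dimers` at each base-pair step.

    The dimer (x, x + 1) is assigned to position x (0-based in this sketch).
    """
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("sequences must be aligned and of equal length")
    counts = [0] * (n - 1)
    for s in seqs:
        for x in range(n - 1):
            if s[x:x + 2] in dimers:
                counts[x] += 1
    return [c / len(seqs) for c in counts]


def symmetrize(profile):
    """Average the profile with its mirror image about the central step.

    Valid for dimer sets closed under reverse complement (AA:TT, AT),
    because reading the opposite strand maps step x to step (last - x).
    """
    return [(a + b) / 2 for a, b in zip(profile, reversed(profile))]


def running_average(profile, window=3):
    """Centered running average; end points use the shorter available window
    (an assumption, the caption does not say how ends are handled)."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out
```

For a set of aligned sequences `seqs`, `running_average(symmetrize(dimer_profile(seqs)))` gives a smoothed, dyad-symmetric curve of the kind plotted in panels (b) and (c).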

**Figure 2.**
Combined frequency of occurrence of AT2 dimers (AA:TT + AT) versus base-pair step position in ‘extended’ nucleosomal sequences: chicken (a), yeast (b), *Drosophila* (c) and yeast-H2A.Z (d) (thin black lines). The 3-point averages are shown in thick blue (for chicken and *Drosophila*) and red lines (for yeast). The resulting frequencies are ‘symmetrized’ with respect to the dyad at base-pair step 73.5 (vertical dashed lines). In Figures 2c and 4c, the data are presented for 147-bp long NCPs from *Drosophila* (see ‘Methods’ section).

**Figure 3.**
Length distribution of the chicken (a) and yeast (b) NCP sequences. Occurrences versus length are shown for the ‘original’ and ‘selected’ sets of NCP sequences (white and gray bars, respectively). The ‘original’ chicken (12) and yeast (9) sets contained 177 and 199 NCPs respectively. The ‘selected’ sets were chosen as described in ‘Methods’ section; these sets contain 169 chicken and 168 yeast NCP sequences. Note that the distributions are essentially the same for the ‘original’ NCP sets and for the nucleosomal sequences found in the genomes, indicating that the ‘selected’ sequences faithfully represent the ‘original’ sets.

**Figure 4.**
Autocorrelation between AT2 dimers located in terminal regions of nucleosomes. The distance autocorrelation function, *P(n)*, represents the frequency of occurrence of two AT2 dimers with distance n between them (13,35). The intervals from −8 to +12 in both strands were used to calculate the autocorrelation for the chicken (a), yeast (b), *Drosophila* (c) and yeast-H2A.Z (d) sequences. Dashed lines represent the *P(n)* averages (n = 1 to 15) over 1000 implementations of the same number of ‘random’ sequences as in the original NCP sets. The tri-nucleotide composition was the same as in the corresponding genomic fragments in the intervals (–8, +12). None of the standard deviations of randomly generated *P(n)* values exceeded 0.15. Therefore, all the peaks observed at n = 6/7 and n = 11/12 have Z-score values 4.5 and higher (the significance level is P < 10^–5).
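A rough sketch of the distance autocorrelation and Z-score calculation described in this caption is given below. It is illustrative only and rests on several assumptions: the function names are hypothetical, *P(n)* is normalized here as the number of AT2 dimer pairs at distance n per sequence (the caption does not state the normalization), and the random background is produced by shuffling each sequence, which preserves only mononucleotide composition rather than the tri-nucleotide composition used in the study.

```python
import random
from statistics import mean, pstdev

AT2 = ("AA", "TT", "AT")  # the AT2 dimer set named in the captions


def at2_positions(seq):
    """0-based step positions where an AT2 dimer starts."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in AT2]


def autocorrelation(seqs, max_n=15):
    """P(n) for n = 1..max_n; index 0 of the returned list is unused.

    Normalization (pairs per sequence) is an assumption of this sketch.
    """
    counts = [0] * (max_n + 1)
    for s in seqs:
        pos = at2_positions(s)
        present = set(pos)
        for p in pos:
            for n in range(1, max_n + 1):
                if p + n in present:
                    counts[n] += 1
    return [c / len(seqs) for c in counts]


def z_scores(seqs, max_n=15, trials=1000, seed=0):
    """Z-score of the observed P(n) against shuffled sequences.

    Shuffling is a simplified stand-in for the tri-nucleotide-matched
    random sequences described in the caption.
    """
    rng = random.Random(seed)
    observed = autocorrelation(seqs, max_n)
    samples = [[] for _ in range(max_n + 1)]
    for _ in range(trials):
        shuffled = ["".join(rng.sample(s, len(s))) for s in seqs]
        for n, value in enumerate(autocorrelation(shuffled, max_n)):
            samples[n].append(value)
    z = [0.0] * (max_n + 1)
    for n in range(1, max_n + 1):
        sd = pstdev(samples[n])
        z[n] = (observed[n] - mean(samples[n])) / sd if sd > 0 else 0.0
    return z
```

In the study, the input would be the terminal intervals (−8 to +12) taken from both strands; extracting those windows is left out of this sketch.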

**Figure 5.**
Structural model for LH globular domain binding to nucleosomal DNA. (a) DNA-binding sites in GH5 in various models for LH globular domain binding to nucleosomal DNA. The GH5 X-ray structure is shown (monomer A in ref. 20: Helix 1 and Helix 2, ochre; Helix 3, blue). The DNA-binding residues corroborated both *in vivo* (42) and *in vitro* (40,41) are traditionally divided into two groups: site I (K69 and R73 shown in blue) and site II (R42, R94 and K97 shown in green). In the model proposed by Zhou *et al*. (43), only R42 in site II is close to DNA. Site III (identified in this study) is highlighted in magenta. Note that K85 is traditionally viewed as part of site I. We propose that it may belong to site III (see ‘Discussion’ section); therefore, it is shown in magenta. The LH residues are numbered based on the GH5 sequence (20). (b) Molecular model of GH5 location within the nucleosome. The 147-bp core DNA (19), indexed from 1 to 147, is extended at each end by ‘ideal’ B-DNA fragments representing linkers. At one end (at the base-pair step 0/1), DNA is bent by 20° into the major groove; the other end of the nucleosomal DNA is ‘straight’. The 10-bp linker DNA fragments are numbered −9, −8, … 0 at one end, and 148, 149, … 157 at the other end. The positions 0/1 and 147/148 are colored in blue, corresponding to the ‘terminal’ AT2 peak in chicken (Figure 2a), while positions −2/−1 and 149/150 are colored in red, corresponding to the ‘terminal’ peak in yeast (Figure 2b). GH5 is represented with the same color code as in (a). The arrows on the right indicate accessibility of the DNA minor groove for MNase cleavage: red arrow, at positions −2/−1; blue arrow, at positions 0/1. The figure was prepared with Chimera (72); the H5 carboxyl end is shown by a dashed line.

**Figure 6.**
Different positioning of LH globular domain in the three models for linker histone binding to nucleosomal DNA. (a) The three DNA-binding sites in the globular domain GH1⁰/GH5 are shown schematically, using the same color code as in Figure 5: site I is in blue, site II in green and site III (wing) in magenta. Three cartoons illustrate different positioning of the LH globular domain with regard to the nucleosomal dyad proposed by Zhou *et al*. (43) (b), Brown *et al*. (42) (c) and in our model (d). Note that in our model, the position of GH1⁰/GH5 relative to the nucleosomal dyad is close to that proposed by Zhou *et al*. (43), but orientation of the globular domain is similar to that in the Brown *et al*. model (42). In addition, we postulate that the ‘wing’ domain interacts with DNA in the major groove, thereby facilitating bending of the linker toward the dyad. According to our model, LH–DNA interactions in the major groove are hydrophobic and sequence specific, involving, on the histone side, four nonpolar residues, GVGA, and on the DNA side, thymines or methylated cytosines (see Figure 5b).

References

1. Thoma F, Koller T, Klug A. Involvement of histone H1 in the organization of the nucleosome and of the salt-dependent superstructures of chromatin. J. Cell Biol. 1979;83:403–427.
2. Robinson PJ, Rhodes D. Structure of the ‘30 nm’ chromatin fibre: a key role for the linker histone. Curr. Opin. Struct. Biol. 2006;16:336–343.
3. Woodcock CL, Skoultchi AI, Fan Y. Role of linker histone in chromatin structure and function: H1 stoichiometry and nucleosome repeat length. Chromosome Res. 2006;14:17–25.
4. Simpson RT. Structure of the chromatosome, a chromatin particle containing 160 base pairs of DNA and all the histones. Biochemistry. 1978;17:5524–5531.
5. Sponar J, Sormova Z. Complexes of histone F1 with DNA in 0.15M NaCl. Eur. J. Biochem. 1972;29:99–103.

Grants and funding

Intramural NIH HHS/United States
