Genome Res. 2011 Nov;21(11):1851-62.
doi: 10.1101/gr.122267.111. Epub 2011 Sep 13.

Evolutionary divergence of intrinsic and trans-regulated nucleosome positioning sequences reveals plastic rules for chromatin organization


Alex Tsankov et al. Genome Res. 2011 Nov.

Abstract

The packaging of eukaryotic genomes into nucleosomes plays critical roles in chromatin organization and gene regulation. Studies in Saccharomyces cerevisiae indicate that nucleosome occupancy is partially encoded by intrinsic antinucleosomal DNA sequences, such as poly(A) sequences, as well as by binding sites for trans-acting factors that can evict nucleosomes, such as Reb1 and the Rsc3/30 complex. Here, we use genome-wide nucleosome occupancy maps in 13 Ascomycota fungi to discover large-scale evolutionary reprogramming of both intrinsic and trans determinants of chromatin structure. We find that poly(G)s act as intrinsic antinucleosomal sequences, comparable to the known function of poly(A)s, but that the abundance of poly(G)s has diverged greatly between species, obscuring their antinucleosomal effect in low-poly(G) species such as S. cerevisiae. We also develop a computational method that uses nucleosome occupancy maps for discovering trans-acting general regulatory factor (GRF) binding sites. Our approach reveals that the specific sequences bound by GRFs have diverged substantially across evolution, corresponding to a number of major evolutionary transitions in the repertoire of GRFs. We experimentally validate a proposed evolutionary transition from Cbf1 as a major GRF in pre-whole-genome duplication (WGD) yeasts to Reb1 in post-WGD yeasts. We further show that the mating type switch-activating protein Sap1 is a GRF in S. pombe, demonstrating the general applicability of our approach. Our results reveal that the underlying mechanisms that determine in vivo chromatin organization have diverged and that comparative genomics can help discover new determinants of chromatin organization.


Figures

Figure 1.
PolyG is an intrinsic antinucleosomal sequence element. (A) Phylogeny of species included in this study (adapted from Wapinski et al. 2007). Whole-genome duplication (WGD) event is marked by the yellow star. Species names are colored to denote major phylogenetic groups. (B) Nucleosome occupancy over 7-mers. All 7-mer sequences (rows) with mean log2 occupancy <−0.75 (there are no 7-mers with occupancy >0.75) in at least one species (columns) are shown across all species for in vivo data (right-hand 14 columns) (Tsankov et al. 2010; Weiner et al. 2010) and for in vitro reconstitution experiments (left-hand two columns) (Kaplan et al. 2008; Field et al. 2009). Sequences are clustered by their nucleosome occupancy profiles and specific clusters are marked on the right. (Pink) depleted; (violet) occupied. (C) Data for 7-mers AAAAAAA and GGGGGGG, as in B. (D) poly(G) sequences affect nucleosome depletion in vitro in a similar manner to poly(A) sequences. Shown is the average log2 nucleosome occupancy (y-axis) from in vitro reconstitution of C. albicans genomic DNA (Field et al. 2009) for poly(A) and poly(G) sequences of various lengths (x-axis). Ai (Gi) refers to poly(A) [poly(G)] sequences with i mismatches (e.g., A0, no mismatches; A4, four mismatches). (E) Abundance and locations of poly(G) sequences in each species. (Top) Shown are values for sequences of strength 4 or greater (see Methods). (Gray) Positioned within intergenic region; (black) positioned within coding sequence (CDS). (Bottom) A phylogenetic reconstruction of evolutionary losses (lightning bolt) of abundance in poly(G) sequences along the phylogeny. (F) poly(G) elements are also nucleosome depleted in vivo in several yeast species. Shown are the mean in vivo log2 nucleosome occupancies (y-axis) for poly(G) sequences (no mismatches) of different lengths (x-axis) in several species. (G) Median NFR width in each species correlates with abundance of antinucleosomal tracts in its genome. Shown is the median NFR width (Tsankov et al. 2010) (x-axis) for each species (♦) vs. the total number of poly(A) and poly(G) sequences of strength 2 or greater in that species (y-axis). Line represents the best linear fit. Species names are colored as in A.
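The per-7-mer occupancy values in panel B can be illustrated with a minimal Python sketch. It assumes a genome sequence string and a per-base log2 occupancy track of the same length; the function names, the window-averaging rule, and the single-strand treatment are illustrative assumptions, not the procedure from the paper's Methods. Only the −0.75 cutoff is taken from the caption.

```python
from collections import defaultdict


def mean_kmer_occupancy(seq, log2_occ, k=7):
    """Return {k-mer: mean log2 occupancy over all of its occurrences}.

    Each occurrence contributes the average of the occupancy track over
    the k bases it covers (an assumed convention for this sketch).
    """
    if len(seq) != len(log2_occ):
        raise ValueError("sequence and occupancy track must have equal length")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        window = log2_occ[start:start + k]
        totals[kmer] += sum(window) / k
        counts[kmer] += 1
    return {kmer: totals[kmer] / counts[kmer] for kmer in totals}


def depleted_kmers(means, threshold=-0.75):
    """List k-mers whose mean log2 occupancy falls below the threshold."""
    return sorted(kmer for kmer, value in means.items() if value < threshold)


# Toy input, not data from the study: a poly(G) tract with low occupancy.
toy_means = mean_kmer_occupancy("GGGGGGGG", [-1.0] * 8)
toy_depleted = depleted_kmers(toy_means)
```

A real analysis would probably also merge each k-mer with its reverse complement and run the same calculation per species, giving one column per occupancy map as in the heat map.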
Figure 2.
Identification of novel general regulatory factor (GRF) motifs. (A) Overview of GRF motif discovery approach. Nucleosome-depleted sequences are first classified as intrinsic (left) if they evict nucleosomes in vitro and in vivo, or as trans-regulated (right) if they evict nucleosomes more strongly in vivo. Trans-regulated nucleosome positioning sequences in each genome are then clustered based on similarity, aligned, and combined into a position-specific scoring matrix (PSSM). Shown are the results for S. cerevisiae, where the algorithm outputs the known PSSMs of chromatin regulators Reb1 and Rsc3/30. (B–E) Predicted binding sites for GRFs in C. albicans (B,C), and Y. lipolytica (D,E). Shown are the sequence logos of the PSSMs (insets) for the sites learned by our approach from 7-mers depleted in each species. For Y. lipolytica, the names of S. cerevisiae proteins with similar sequence specificity are displayed. Each graph shows the average normalized nucleosome occupancy (left y-axis) in promoters with significant matches to the PSSM (black curve, aligned by a gene's +1 nucleosome), as well as the location (gray bars) and number (right y-axis) of binding-site locations. As observed with GRFs in other species (Tsankov et al. 2010), these binding sites are almost entirely (>89%) NFR-localized. (F) Known Sap1 motif (Ghazvini et al. 1995) is identified as a GRF site in S. pombe. As in B–E, but for S. pombe data.
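The workflow in panel A can be sketched roughly in Python as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the thresholds `depleted` and `margin`, the function names, and the rule order are hypothetical, the clustering and alignment steps are skipped (sites are assumed to be already aligned and of equal length), and the PSSM is a plain log-odds matrix against a uniform background.

```python
import math


def classify_kmer(in_vivo, in_vitro, depleted=-0.75, margin=0.5):
    """Label a k-mer from its mean log2 occupancy in vivo and in vitro.

    Thresholds are hypothetical; the source only states the qualitative rule.
    """
    if in_vivo >= depleted:
        return "not depleted"
    if in_vitro < depleted:
        return "intrinsic"        # depleted both in vitro and in vivo
    if in_vitro - in_vivo > margin:
        return "trans-regulated"  # clearly more depleted in vivo
    return "unclassified"


def build_pssm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PSSM from pre-aligned, equal-length sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in aligned_sites]
        total = len(column) + 4 * pseudocount
        pssm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pssm


def score_site(pssm, site):
    """Sum the per-position log-odds scores of a candidate site."""
    return sum(column[base] for column, base in zip(pssm, site))


# Toy aligned sites for illustration only, not data from the study.
toy_pssm = build_pssm(["CACGTGA", "CACGTGA", "CACGTGT"])
```

Scoring promoter windows with `score_site` and keeping high scorers would then give candidate binding sites; what counts as a significant match is left open here.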
Figure 3.
Cbf1 acts as a GRF in C. albicans. (A) Nucleosome depletion at the Cbf1 binding site in C. albicans is not a result of respiratory growth or genomic organization. Mean nucleosome occupancy at Cbf1 sites (y-axis) is shown for the indicated conditions (Kaplan et al. 2008; Field et al. 2009) and species. (B) Cbf1 deletion in C. albicans increases nucleosome occupancy at Cbf1 motifs. (Left) Genes with significant matches to the Cbf1-binding site (black). (Right) Difference in nucleosome abundance at each gene in C. albicans (rows) between cbf1Δ and wild-type strains. Genes are aligned by the +1 nucleosome/NFR boundary (0, red arrow) and ranked from gain (top, yellow) to loss (bottom, blue) in nucleosome occupancy over their NFR. (C) Cbf1-binding sites are the only 7-mers with increased mean nucleosome occupancy in C. albicans cbf1Δ strains compared with wild type. Shown is the mean nucleosome occupancy (log2) for each 7-mer in the wild-type (x-axis) and the cbf1Δ strain (y-axis). Cbf1 binding sites are indicated as blue squares, poly(G) sequences as purple diamonds. (D) As in C, but for S. cerevisiae. (E) Increased nucleosome occupancy at CACGTGA Cbf1 sites in C. albicans in the cbf1Δ strain. Shown is the average log2 nucleosome occupancy (y-axis) at all genes with a CACGTGA Cbf1 motif match in their promoter in wild-type (blue) and cbf1Δ (red) strains. Genes are aligned by the location of the CACGTGA Cbf1 motif (position 0 on the x-axis). (F) Poly(G) sequences are more nucleosome depleted in a cbf1Δ strain. Shown are average nucleosome occupancy values (y-axis) centered on all poly(G) elements of strength 2 or greater (0 on the x-axis) for cbf1Δ (red) and wild-type (blue) strains in C. albicans. (G–I) Cbf1 deletion affects nucleosome occupancy in vivo at intrinsic poly(G) and poly(A) sequences in C. albicans, but not in S. cerevisiae. Shown are mean nucleosome occupancy levels (log2, y-axis) for poly(G) (G) and poly(A) (H,I) sequences of different length (x-axis) in C. albicans (G,H) and S. cerevisiae (I) for wild-type (blue), in vitro (black), and cbf1Δ (red) experiments.
Figure 4.
GRF deletion affects occupancy at intrinsic poly(A) sequences in S. cerevisiae and S. pombe. Shown are mean nucleosome occupancy levels (log2, y-axis) for poly(A) sequences of different length (x-axis) in wild-type cells (blue) and in the following mutant strains (red): (A) abf1-101 strain in S. cerevisiae; (B) reb1-212 strain in S. cerevisiae; (C) rsc3-1 strain in S. cerevisiae; and (D) sap1ts strain at restrictive temperature (35°C) in S. pombe.
Figure 5.
Sap1 acts as a GRF in S. pombe. (A) Sap1 deletion in S. pombe results in increased nucleosome occupancy at Sap1 motifs. (Left) Genes with significant matches to the Sap1 binding half-site (black). (Right) Difference in nucleosome abundance at each gene in S. pombe (rows) between sap1ts and wild-type strains (both at restrictive temperature). Genes are aligned by the +1 nucleosome/NFR boundary (0, red arrow) and ranked from gain (top, yellow) to loss (bottom, blue) in nucleosome occupancy over their NFR. (B) Increased nucleosome occupancy at Sap1 half-sites in a sap1ts strain in S. pombe. Shown are log2 nucleosome occupancy averages at all genes with a significant Sap1 motif match in their upstream promoter for a sap1ts strain (red) and a wild-type strain (blue) grown at restrictive temperature (35°C). Genes are aligned by the location of the Sap1 motif. An increase in nucleosome occupancy over Sap1 sites is characteristic of GRF activity. (C) Increased nucleosome occupancy at 5-mers reflecting the Sap1 half-sites in a sap1ts strain compared with wild type (both at restrictive temperature, 35°C). Shown is the mean nucleosome occupancy (log2) for each 5-mer in the wild-type (x-axis) and the sap1ts strain (y-axis). Sap1 half-sites are labeled. The only additional site with increased occupancy is the intrinsic sequence GGGGG.

References

    1. Albert I, Mavrich TN, Tomsho LP, Qi J, Zanton SJ, Schuster SC, Pugh BF 2007. Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome. Nature 446: 572–576
    2. Badis G, Chan ET, van Bakel H, Pena-Castillo L, Tillo D, Tsui K, Carlson CD, Gossett AJ, Hasinoff MJ, Warren CL, et al. 2008. A library of yeast transcription factor motifs reveals a widespread function for Rsc3 in targeting nucleosome exclusion at promoters. Mol Cell 32: 878–887
    3. Barash Y, Elidan G, Kaplan T, Friedman N 2005. CIS: compound importance sampling method for protein-DNA binding site p-value estimation. Bioinformatics 21: 596–600
    4. Biswas K, Rieger KJ, Morschhauser J 2003. Functional characterization of CaCBF1, the Candida albicans homolog of centromere binding factor 1. Gene 323: 43–55
    5. Clapier CR, Cairns BR 2009. The biology of chromatin remodeling complexes. Annu Rev Biochem 78: 273–304
