Nucleic Acids Res. 2017 Jan 25;45(2):832-845.
doi: 10.1093/nar/gkw1198. Epub 2016 Dec 2.

Quantitative profiling of selective Sox/POU pairing on hundreds of sequences in parallel by Coop-seq

Yiming K Chang et al. Nucleic Acids Res.

Abstract

Cooperative binding of transcription factors is known to be important in the regulation of gene expression programs conferring cellular identities. However, current methods to measure cooperativity parameters have been laborious and therefore limited to studying only a few sequence variants at a time. We developed Coop-seq (cooperativity by sequencing), which efficiently and accurately determines the cooperativity parameters for hundreds of different DNA sequences in a single experiment. We apply Coop-seq to 12 dimer pairs from the Sox and POU families of transcription factors using 324 unique sequences with changed half-site orientation, altered spacing and discrete randomization within the binding elements. The study reveals specific dimerization profiles of different Sox factors with Oct4. By contrast, Oct4 and the three neural class III POU factors Brn2, Brn4 and Oct6 assemble with Sox2 in a surprisingly indistinguishable manner. Two novel half-site configurations can support functional Sox/Oct dimerization in addition to known composite motifs. Moreover, Coop-seq uncovers a nucleotide switch within the POU half-site when spacing is altered, which is mirrored in genomic loci bound by Sox2/Oct4 complexes.

Figures

Figure 1.
Coop-seq workflow to decipher the Sox/Oct partner code. (A) Workflow of Coop-seq. After a binding reaction containing the library of DNA sequences and the DNA-binding proteins attains equilibrium, it is run on an EMSA gel to separate the heterodimer-bound, monomer-bound and unbound fractions. An example EMSA gel lane stained with ethidium bromide and visualized using a ChemiDoc XRS+ (Bio-Rad) is shown. Each of the fractions is PCR amplified, barcoded and sequenced by Illumina deep sequencing. (B) A general reaction diagram for measuring the cooperativity of any two proteins, X and Y, on a single sequence Si with binding sites for each protein, xi and yi. Several equivalent equations for cooperativity (ωi) are shown, including the measurement from the concentrations of the sequence in each band (26,45). KX denotes the association constant for the binding of protein X alone to sequence Si. KX|Y denotes the association constant for protein X when protein Y is already bound. (C) The binding sites for all sequences used in these Coop-seq experiments. The first library contains sequences with −1 to 6 nucleotide spacers for the Sox and POU proteins, both in the forward direction. Each sequence contains one randomized nucleotide in both the Sox2 (blue) and Oct4 (red) binding sites at the position immediately adjacent to the spacer (black). The second library contains all four possible combinations of binding site orientations. Each orientation has 0–4 spacer bases, the outermost of which are randomized. Data from both libraries were combined for the subsequent analysis.
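The relation described in panel B can be sketched numerically. With KX = [X·S]/([X][S]) and KX|Y = [X·Y·S]/([X][Y·S]), the ratio ω = KX|Y / KX reduces to [X·Y·S][S] / ([X·S][Y·S]), so the free protein concentrations cancel. The Python sketch below assumes that all four DNA species can be quantified for a sequence on a common scale; the function name and the input values are hypothetical, and the normalization the authors apply to sequencing reads is not given in this caption.

```python
def cooperativity(free, bound_x, bound_y, bound_xy):
    """Cooperativity factor omega for one DNA sequence.

    omega = K_X|Y / K_X = ([XYS] * [S]) / ([XS] * [YS])

    Arguments are the amounts of unbound, X-bound, Y-bound and
    heterodimer-bound DNA, all expressed on the same scale.
    """
    if min(free, bound_x, bound_y, bound_xy) <= 0:
        raise ValueError("all four quantities must be positive")
    return (bound_xy * free) / (bound_x * bound_y)


# Hypothetical amounts, for illustration only (not data from the paper).
omega = cooperativity(free=0.5, bound_x=0.1, bound_y=0.1, bound_xy=0.3)
print(omega)  # a value above 1 indicates positive cooperativity
```

Because ω is a ratio of products, any common scaling factor applied to all four quantities cancels, which is why relative amounts are sufficient in this sketch.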
Figure 2.
Coop-seq captures a broad range of TF cooperativities, leading to the identification of specific heterodimer clusters and underlying sequence determinants. (A) Violin plots with overlaid box-and-whisker plots showing that the cooperativity for the 12 studied Sox/Oct pairs varies over 3–4 orders of magnitude. (B) Correlation heatmap of the cooperativity factors for the 12 Sox/Oct pairs (rows and columns). Pearson correlation coefficients were hierarchically clustered. Fields are color-coded by correlation coefficient. The main clusters are indicated with column/row spacing. (C) PCA variable loadings PC1 and PC2 are shown for the 12 Sox/Oct pairs. (D) PCA scores of individual data points projected along PC1 and PC2. Data points are color-coded by element spacing and the four orientations are mapped to different symbols. Selected sequences that explain most of the dataset's variance are shown. For each spacer/orientation combination several sequences were analysed due to the randomization of selected positions (Figure 1C), leading to multiple data points with identical color/shape coding (e.g. black, orange and green circles for FF−1, FF0 and FF+1 in (D)).
Figure 3.
Half-site spacing and orientation dependence of the Sox/Oct cooperativity profiles. (A) Violin and box-and-whisker plots illustrating the distribution of cooperativities for all Coop-seq data for all 12 Sox/Oct pairs as a function of half-site orientation. (B) The cooperativity for Sox2/Oct4 heterodimer formation is shown for the four orientations as a function of half-site spacing. Colored dots represent sequence variants as well as replicates, and the gray diamond denotes the median. (C) Structural models of Sox2/Oct4 dimers on four composite DNA elements with a zero base pair spacer. The Sox2 HMG box is shown in light blue and the POU domain of Oct4 in red (POU-specific subdomain, POUS) or brown (POU homeodomain, POUHD). The POU linker is colored magenta. Proteins are depicted as van-der-Waals and residues predicted to clash after energy minimization are shown with space-filling spheres and shaded according to the severity of the clash. Residues involved in clashes are marked with blue labels (Sox2) or black labels (Oct4). The DNA backbone is shown as a ribbon. Sox residues E46, K57 and R75, mutated to change the cooperativity profile, are shown for the FF configuration.
Figure 4.
Sox factors associate with Oct4 in a DNA-element dependent fashion. (A) Hierarchically clustered heatmap of cooperativity factors for 23 element types (spacer/orientation combinations) and 12 Sox/Oct pairs. The natural log transformed mean cooperativity over all sequence variants and replicates per element type was used. The heatmap was prepared using the pheatmap R package with hierarchical clustering of rows and columns. (B) Pair-wise scatter plots of mean ω values for each analyzed sequence to highlight the differential binding of Sox17/Oct4, Sox2/Oct4 and Sox5/Oct4 to sequences with FF−1 and FF0 configuration (left panel). Data points are color-coded for half-site spacers and symbol-coded for orientations. As multiple sequence variants were used per spacer/orientation combination, individual color/shape combinations occur multiple times (e.g. four black circles are present for the four FF−1 sequences studied, CATTGTNTGCTAAT). (C) ChIP-seq peaks of Sox2, Sox17 (14) and Sox6 (47) were searched for the presence of ‘compressed’ FF−1 and ‘canonical’ FF0 sequences using the IUPAC strings FF0 = HWTTGWNATGYWWWD and FF−1 = HWTTGWATGYWWWD. Log2 transformed motif count ratios per dataset are plotted as a bar chart.
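The motif search in panel C can be sketched as follows: each IUPAC string is translated into a regular expression, matches are counted across a set of peak sequences, and the two counts are compared on a log2 scale. This is a minimal illustration rather than the authors' pipeline. The function names, the pseudocount, the direction of the ratio (FF−1 over FF0) and the forward-strand-only search are all assumptions, and the example sequences are made up.

```python
import math
import re

# Standard IUPAC nucleotide codes that occur in the two search strings.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "Y": "[CT]", "H": "[ACT]", "D": "[AGT]", "N": "[ACGT]",
}

FF0 = "HWTTGWNATGYWWWD"       # 'canonical' element, from the caption
FF_MINUS_1 = "HWTTGWATGYWWWD"  # 'compressed' element, from the caption


def iupac_to_regex(motif):
    """Compile an IUPAC string; the lookahead lets overlapping hits be counted."""
    body = "".join(IUPAC[ch] for ch in motif)
    return re.compile("(?=(" + body + "))")


def count_motif(motif, sequences):
    """Count forward-strand occurrences of the motif in all sequences."""
    pattern = iupac_to_regex(motif)
    return sum(len(pattern.findall(seq.upper())) for seq in sequences)


def log2_ratio(sequences, pseudocount=1):
    """log2 of (FF-1 count / FF0 count); the pseudocount (an assumption) avoids division by zero."""
    n_ffm1 = count_motif(FF_MINUS_1, sequences)
    n_ff0 = count_motif(FF0, sequences)
    return math.log2((n_ffm1 + pseudocount) / (n_ff0 + pseudocount))


# Made-up peak sequences: one FF0 site and one FF-1 site.
peaks = ["GGCATTGTCATGCAAATGG", "GGCATTGTATGCAAATGG"]
print(count_motif(FF0, peaks), count_motif(FF_MINUS_1, peaks))
print(log2_ratio(peaks))
```

A genome-scale search would normally also scan the reverse complement of each peak, which this sketch omits for brevity.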
Figure 5.
POU factors exhibit a near-identical cooperativity pattern with Sox2 despite directing contrasting cell fate decisions. (A–C) Pair-wise scatter plots of ω values to compare the cooperativity profile of the Sox2/Oct4 dimerization, important during the maintenance and induction of pluripotency, with the dimerization of Sox2 with the neural POU factors Oct6, Brn2 and Brn4. The mean of five replicate measurements for every sequence is plotted. Structural models of Sox2/Oct4 (D) and Sox2/Oct6 (E) bound to the ‘canonical’ FF0 element. Amino acids mediating protein-protein interactions are shown as ball-and-sticks. (F) Alignment of the POU domains of mouse Oct4, Brn4, Brn2 and Oct6. Sox2 contact amino acids are colored red. Identical residues are on a black background, conservative replacements on a gray background and non-conservative replacements on a white background.
Figure 6.
Nucleotide preferences in the octamer element change with altered half-site spacing. (A) Number of sequence variants interrogated for the studied element types (orientation – spacer combinations). (B) Dependence of the normalized cooperativity (ω) on the identity of the randomized nucleotide is analyzed for the binding of Sox17 and Sox5 with Oct4 to the ‘compressed’ FF−1 sequence. (C) Genomic loci encoding FF−1 sequences co-bound by Sox17 and Oct4 in the KH2 mouse embryonic stem cell (mESC) line or the retinoic acid (RA) treated embryonic carcinoma F9 cell line (14) were analyzed for nucleotide preferences at position 1 of the octamer sequence using the IUPAC string H1W2T3T4G5W6[ACGT]1T2G3Y4W5W6W7D8. (D) Contacts between position 1 of the octamer sequence and Q44 of the POU-specific domain of Oct4. When Oct4 is bound to its cognate sequence (upper left), a favorable bidentate hydrogen bond is formed, while this arrangement is disturbed upon replacement of the adenine. (E) Box-and-whisker plots to compare the dependence of the cooperativity on the identity of octamer nucleotide 1 for canonical FF0 and the FF1 elements. Coop-seq data for wild-type Sox2 binding with Oct4, Oct6, Brn2 and Brn4 were included in the analysis. Asterisks denote statistical significance from unpaired, two-sided t-tests with P-values < 0.01. (F, G) Genomic loci co-bound by Sox2 and Oct4 (57) were dissected for the relative preferences for position 1 octamer nucleotides as a function of half-site spacing, demonstrating that an ‘A’ is preferred for the canonical FF0 configuration but degenerate sites are preferred as the spacing increases. IUPAC strings used for the search are shown on top of the plots. Panel G uses a more stringent definition of the Sox site than panel F.
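The position-1 analysis in panel C amounts to capturing the one unconstrained base in each FF−1 match and tallying it. The sketch below reads the digits in the caption's IUPAC string as position indices (1–6 for the Sox half-site, 1–8 for the octamer) and drops them, which is an interpretation of the flattened notation. The function name and example sequences are hypothetical, and only the forward strand is searched.

```python
import re
from collections import Counter

# HWTTGW (Sox half-site), then the captured octamer position 1,
# then TGYWWWD, written with explicit character classes.
FF_MINUS_1_SITE = re.compile(
    r"(?=[ACT][AT]TTG[AT]([ACGT])TG[CT][AT][AT][AT][AGT])"
)


def position1_frequencies(sequences):
    """Relative frequency of each base at octamer position 1 over all FF-1 hits."""
    counts = Counter()
    for seq in sequences:
        # findall returns the captured base for every (possibly overlapping) hit
        counts.update(FF_MINUS_1_SITE.findall(seq.upper()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {nt: counts[nt] / total for nt in "ACGT"}


# Made-up loci, for illustration only.
loci = ["CATTGTATGCAAAT", "CATTGTTTGCAAAT", "CATTGTATGCTAAT"]
print(position1_frequencies(loci))
```

Comparing such frequency tables between spacing classes is one way to express the kind of preference shift the caption describes for panels F and G.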

References

    1. Jerabek S., Merino F., Scholer H.R., Cojocaru V. OCT4: dynamic DNA binding pioneers stem cell pluripotency. Biochim. Biophys. Acta. 2014;1839:138–154.
    2. Schepers G.E., Teasdale R.D., Koopman P. Twenty pairs of sox: extent, homology, and nomenclature of the mouse and human sox transcription factor gene families. Dev. Cell. 2002;3:167–170.
    3. Gubbay J., Collignon J., Koopman P., Capel B., Economou A., Munsterberg A., Vivian N., Goodfellow P., Lovell-Badge R. A gene mapping to the sex-determining region of the mouse Y chromosome is a member of a novel family of embryonically expressed genes. Nature. 1990;346:245–250.
    4. Harley V.R., Lovell-Badge R., Goodfellow P.N. Definition of a consensus DNA binding site for SRY. Nucleic Acids Res. 1994;22:1500–1501.
    5. Werner M.H., Huth J.R., Gronenborn A.M., Clore G.M. Molecular basis of human 46X,Y sex reversal revealed from the three-dimensional solution structure of the human SRY-DNA complex. Cell. 1995;81:705–714.
