Nucleic Acids Res. 2007;35(3):e20.
doi: 10.1093/nar/gkl1062. Epub 2007 Jan 3.

Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABAA receptor subunit genes

Timothy E Reddy et al. Nucleic Acids Res. 2007.

Abstract

Understanding transcription factor (TF) mediated control of gene expression remains a major challenge at the interface of computational and experimental biology. Computational techniques predicting TF-binding site specificity are frequently unreliable. On the other hand, comprehensive experimental validation is difficult and time-consuming. We introduce a simple strategy that dramatically improves robustness and accuracy of computational binding site prediction. First, we evaluate the rate of recurrence of computational TFBS predictions by commonly used sampling procedures. We find that the vast majority of results are biologically meaningless. However, clustering results based on nucleotide position improves predictive power. Additionally, we find that positional clustering increases robustness to long or imperfectly selected input sequences. Positional clustering can also be used as a mechanism to integrate results from multiple sampling approaches, improving accuracy over each approach alone. Finally, we predict and validate regulatory sequences partially responsible for transcriptional control of the mammalian type A gamma-aminobutyric acid receptor (GABA(A)R) subunit genes. Positional clustering is useful for improving computational binding site predictions, with potential application to improving our understanding of mammalian gene expression. In particular, predicted regulatory mechanisms in the mammalian GABA(A)R subunit gene family may open new avenues of research towards understanding this pharmacologically important neurotransmitter receptor system.
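The positional clustering idea described above (repeat the sampler many times, count how often each promoter position recurs in a predicted site, keep only the most recurrent positions) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The input format (each run as a list of `(seq_id, start, width)` tuples), the function names and the cutoff expressed as a fraction of runs are all assumptions made for illustration; the abstract does not specify how the threshold is defined.

```python
from collections import Counter


def positional_clustering(runs, threshold_fraction):
    """Keep positions that recur in at least threshold_fraction of the runs.

    runs: list of sampling runs; each run is a list of hypothetical
          (seq_id, start, width) tuples describing predicted sites.
    Returns a dict mapping seq_id to a sorted list of retained positions.
    """
    counts = Counter()
    for run in runs:
        covered = set()
        for seq_id, start, width in run:
            for pos in range(start, start + width):
                covered.add((seq_id, pos))
        # A run contributes at most one count per position.
        counts.update(covered)

    # Assumed cutoff: a fixed fraction of the total number of runs.
    cutoff = threshold_fraction * len(runs)
    kept = {}
    for (seq_id, pos), n in counts.items():
        if n >= cutoff:
            kept.setdefault(seq_id, []).append(pos)
    return {seq_id: sorted(positions) for seq_id, positions in kept.items()}


def contiguous_sites(positions):
    """Merge a sorted list of positions into inclusive (start, end) intervals."""
    sites = []
    for pos in positions:
        if sites and pos == sites[-1][1] + 1:
            sites[-1][1] = pos
        else:
            sites.append([pos, pos])
    return [tuple(site) for site in sites]
```

For example, given four runs in which three predictions overlap near position 10 of one promoter and one isolated prediction falls at position 50, a cutoff of 0.5 retains only the overlapping region, and `contiguous_sites` reports it as a single candidate site.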

Figures

Figure 1
Schematic diagram of the positional clustering process. (1) Sets of putatively co-regulated genes are identified. (2) Gibbs sampling is iterated on the input set thousands of times across numerous motif widths. Results are clustered on promoter position, creating a per-nucleotide frequency of the long-term recurrence of Gibbs sampling. (3) A linear threshold is used to isolate the most frequently recurring positions, discarding all positions which fall below the threshold.
Figure 2
The robustness to decoy sequences (DSs) of Gibbs sampling with and without positional clustering. Fifty-one increasingly noisy STE12-binding site enriched datasets were analyzed using Gibbs sampling with positional clustering (red solid line) and without (black solid line). The dotted lines represent null controls, e.g. identification of STE12-like motifs by Gibbs sampling (black dotted line) and positional clustering (red dotted line) given random upstream regions. x-axis counts over the addition of DSs. Each set of DSs was chosen independently from all upstream regions in the S.cerevisiae genome. We evaluated the positive predictive value of each technique on each dataset, and found positional clustering significantly improved the PPV through addition of 45 DSs.
Figure 3
Improvement and robustness of positional clustering on promoters bound to other yeast TFs. Sets of S.cerevisiae promoters bound by the TFs YAP1, TEC1, HAP4 and YDR026C were chosen according to ChIP-chip experiments (10). For each set, the initial promoters were analyzed using Gibbs sampling with positional clustering (solid triangles) and without (open triangles). Two Gibbs sampling approaches were applied to each dataset: a Gibbs sampler procedure similar to BioProspector (8) (row A), and MotifSampler (39) (row B). Row C shows the combination of both sampling procedures, along with positional clustering of the combined results. x-axis counts over addition of DSs. We evaluated the positive predictive value of each technique on each dataset, and found positional clustering generally improved the PPV through addition of 100% random DSs.
Figure 4
Three putative transcription factor binding sites form DNA–protein complexes in neocortical nuclear extracts. Neocortical nuclear extracts from E18 rat embryos were incubated with three 32P-radiolabeled probes from human GABRB1, GABRD and GABRB3. Cold wild-type oligonucleotides were used to define specificity through competition. Cold oligonucleotides were added at 100-fold excess over probe. The conditions for each lane are as indicated. Specific binding complexes are shown using asterisks (*). The probe sequences are as follows: (A) GABRB1: AATACGGTCCCTACT, (B) GABRD: ACTTAATTTGATTCCAT and (C) GABRB3: CGTGCCGGGGCGCGGCGGA.
Figure 5
EMSA showing that three putative TF-binding sites form DNA–protein complexes in neocortical and fibroblast nuclear extracts. Neocortical (NEO) and fibroblast (FIB) nuclear extracts from E18 rat embryos were incubated with three 32P-radiolabeled probes from human A4 and D receptor subunits. Cold wild-type oligonucleotides were used to define specificity through competition. Cold oligonucleotides were added at 100-fold excess over probe. The conditions for each lane are as indicated. Specific binding complexes are shown using asterisks (*). The probe sequences are as follows: (A) GABA-A4: AGCGCGGGCGAGTGTGAGCGCGAGTGTGCGCACGCCGCGGG, (B) GABA-A4: GTGCACACACACGCCCACCGCGGCTCGGG and (C) GABA-D: TGACCGTAGTAGA.
Figure 6
Double-stranded oligonucleotide functional assay for GABRA4 regulation. Primary cultures of rat neocortical neurons were treated with DOTAP (N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium methylsulfate) alone (Mock) or with DOTAP and phosphorothioate oligonucleotides from either a cAMP response element (CRE Decoy) or a sequence from the GABA-A4 promoter predicted using positional clustering (GABA-A4 Decoy) (GTGCACACACACGCCCACCGCGGCTCGGG). mRNA was harvested after 24 h, and real-time RT–PCR was performed with GABA-A4 specific primers. Error bars refer to individual experiments, i.e. different platings of cells from different animals. Data were normalized to rRNA levels and expressed as relative mRNA levels (GABA-A4/rRNA). Results are shown as mean ± SEM, N = 3; the asterisk indicates a significant difference from control at the 95% confidence level.
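Figures 2 and 3 compare methods by positive predictive value (PPV), the fraction of predictions that are correct: PPV = TP / (TP + FP). A minimal Python sketch of that calculation follows. It assumes predictions and known sites are both given as sets of nucleotide positions; whether the paper scored matches per nucleotide or per site is not stated in the captions, and the function name is hypothetical.

```python
def positive_predictive_value(predicted, known):
    """Return TP / (TP + FP) for predicted positions against known positions.

    predicted, known: iterables of hashable position identifiers
    (an assumed per-nucleotide scoring scheme).
    """
    predicted = set(predicted)
    if not predicted:
        # Convention chosen for this sketch: no predictions gives a PPV of 0.
        return 0.0
    true_positives = len(predicted & set(known))
    return true_positives / len(predicted)
```

With four predicted positions of which two fall inside a known site, the function returns 0.5.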

References

    1. Winderickx J., de Winde J.H., Crauwels M., Hino A., Hohmann S., Van Dijck P., Thevelein J.M. Regulation of genes encoding subunits of the trehalose synthase complex in Saccharomyces cerevisiae: novel variations of STRE-mediated transcription control? Mol. Gen. Genet. 1996;252:470–482.
    2. Madhani H.D., Fink G.R. Combinatorial control required for the specificity of yeast MAPK signaling. Science. 1997;275:1314–1317.
    3. Niehrs C., Pollet N. Synexpression groups in eukaryotes. Nature. 1999;402:483–487.
    4. Lewin B. Genes VIII. Upper Saddle River, NJ: Pearson Prentice Hall; 2004.
    5. Stormo G.D. Consensus patterns in DNA. Methods Enzymol. 1990;183:211–221.