Design of a combinatorial DNA microarray for protein-DNA interaction studies

Julian Mintseris¹, Michael B Eisen


PMID: 17018151
PMCID: PMC1635571
DOI: 10.1186/1471-2105-7-429


BMC Bioinformatics. 2006 Oct 3;7:429. doi: 10.1186/1471-2105-7-429.


Affiliation

¹ Boston University, Bioinformatics Program, Boston, MA, USA. julianm@bu.edu


Abstract

Background: Determining the precise binding specificity of transcription factors is an important step toward understanding the complex mechanisms of gene regulation in eukaryotes. Recently, double-stranded protein-binding microarrays (PBMs) were developed as a potentially scalable approach to transcription factor binding site identification.

Results: Here we present an algorithmic approach to the experimental design of a microarray that tests the full specificity of a transcription factor against all possible DNA binding sites of a given length, with optimally efficient use of the array. The design is universal: it works for any factor that binds a sequence motif and is not species-specific. Furthermore, simulation results show that data produced with the designed arrays are easier to analyze and would yield more precise identification of binding sites.

Conclusion: In this study, we present a design of a double-stranded DNA microarray for protein-DNA interaction studies and show that our algorithm allows optimally efficient use of the arrays for this purpose. We believe such a design will prove useful for transcription factor binding site identification and other biological problems.


Figures

**Figure 1**
**Probe design from the shortest path on a graph**. The de Bruijn graph for all possible DNA base doublets and one possible solution for a shortest path, represented as a pseudo-Eulerian cycle (bold edges). The reverse complement solution is represented by the dashed edges in the graph and by the inner cycle sequence. "Cutting" the circular sequence while retaining one overlapping base results in two sequences of total length 12 (containing all doublets), as compared to the length of all non-overlapping concatenated doublets, 2 × 4² = 32. Cutting the circular sequence at different points allows screening multiple replicates and helps identify biases in sequence recognition preferences. Reverse complement strands for the replicates are not shown.
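The idea behind Figure 1 can be illustrated with a short Python sketch that builds a circular sequence containing every k-mer exactly once by walking an Eulerian cycle on the de Bruijn graph, then "cuts" it into a linear probe with k-1 overlapping bases. This is a generic single-strand de Bruijn construction, not the authors' algorithm: it does not exploit reverse-complement redundancy, so for doublets it gives a probe of length 17 rather than the shorter solution in the figure. The function names `de_bruijn_cycle` and `linearize` are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"

def de_bruijn_cycle(k, alphabet=ALPHABET):
    """Return a circular string in which every k-mer occurs exactly once."""
    if k == 1:
        return alphabet
    # Nodes are (k-1)-mers; each node has one unused outgoing edge per letter.
    unused = {"".join(p): list(alphabet)
              for p in product(alphabet, repeat=k - 1)}
    stack, circuit = [alphabet[0] * (k - 1)], []
    # Hierholzer's algorithm: follow unused edges, back up when stuck.
    while stack:
        node = stack[-1]
        if unused[node]:
            letter = unused[node].pop()
            stack.append((node + letter)[1:])
        else:
            circuit.append(stack.pop())
    circuit.reverse()
    # Each step of the circuit contributes the last letter of its target node.
    return "".join(node[-1] for node in circuit[1:])

def linearize(cycle, k, cut=0):
    """Cut the circular sequence at `cut`, keeping k-1 overlapping bases."""
    rotated = cycle[cut:] + cycle[:cut]
    return rotated + rotated[:k - 1]

if __name__ == "__main__":
    probe = linearize(de_bruijn_cycle(2), 2)
    # Linear probe length versus concatenating all 16 doublets end to end.
    print(len(probe), 2 * 4 ** 2)  # 17 32
```

Calling `linearize` with different `cut` values yields replicate probes that cover the same k-mers in different sequence contexts, which is the purpose of cutting at different points described in the caption.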

**Figure 2**
**Distribution of putative PBM probe hits for Rap1**. Frequency of array probe hits distributed by number of potential binding sites per probe. All sequences one or two mutations away from the consensus sequence are assumed to bind.

**Figure 3**
**Distribution of putative PBM probe hits for TBP**. Frequency of array probe hits distributed by number of potential binding sites per probe. All sequences one or two mutations away from the consensus sequence are assumed to bind.

**Figure 4**
**Distribution of putative PBM probe hits for 100 random transcription factor binding sites of length 10**. Frequency of array probe hits distributed by number of potential binding sites per probe. The data is averaged over 100 random 10-mer binding sites. For each 10-mer, all sequences one or two mutations away from the consensus sequence are assumed to bind.
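The counting behind Figures 2 to 4 can be sketched as follows, assuming a site "binds" when it lies within two mismatches (Hamming distance) of the consensus, as the captions state. The probe sequences and the 6-mer consensus in the usage lines are made up for illustration, only one strand is scanned, and the function names are hypothetical; this is not the authors' simulation code.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def count_sites(probe, consensus, max_mismatches=2):
    """Count windows of the probe within max_mismatches of the consensus."""
    width = len(consensus)
    return sum(
        hamming(probe[i:i + width], consensus) <= max_mismatches
        for i in range(len(probe) - width + 1)
    )

def hit_distribution(probes, consensus, max_mismatches=2):
    """Map 'number of potential sites per probe' to 'number of probes'."""
    return Counter(count_sites(p, consensus, max_mismatches) for p in probes)

if __name__ == "__main__":
    # Hypothetical consensus and probes, for illustration only.
    probes = ["TTACGTACTT", "GGGGGGGG", "ACGTAA"]
    print(hit_distribution(probes, "ACGTAC"))
```

A distribution concentrated at one site per probe means each probe hit can be attributed to a single candidate site, which makes the array signal easier to interpret.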

**Figure 5**
**Robustness of designed array and Gibbs Sampler to addition of noise**. Starting with a set of 10-mer Rap1 TRANSFAC binding sites, the effect of added noise is measured as the correlation of the original PWM with that derived from 100 Gibbs Sampler runs. Each level of noise is represented by a standard box-and-whisker plot. In the 0–50% noise range, the boxes are so small that they appear essentially as a single line.
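The comparison metric in Figure 5 can be illustrated with a minimal Python sketch that builds a position weight matrix (PWM) from aligned sites and computes the Pearson correlation between two PWMs. Several things here are assumptions rather than details from the paper: noise is modeled as replacing a fraction of sites with random sequences, a pseudocount of 1 is used, no Gibbs sampling is performed, and the example sites are invented rather than taken from TRANSFAC.

```python
import random

BASES = "ACGT"

def pwm(sites, pseudocount=1.0):
    """Per-position base frequencies from equal-length aligned sites."""
    matrix = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append([counts[b] / total for b in BASES])
    return matrix

def pearson(m1, m2):
    """Pearson correlation between two PWMs, flattened to vectors."""
    x = [v for row in m1 for v in row]
    y = [v for row in m2 for v in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def add_noise(sites, fraction, rng):
    """Assumed noise model: replace a fraction of sites with random ones."""
    noisy = list(sites)
    n_replace = round(fraction * len(sites))
    for i in rng.sample(range(len(sites)), n_replace):
        noisy[i] = "".join(rng.choice(BASES) for _ in range(len(sites[i])))
    return noisy

if __name__ == "__main__":
    # Invented 10-mer sites, for illustration only.
    sites = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAC", "ACCTACGTAC"]
    noisy = add_noise(sites, 0.5, random.Random(0))
    print(pearson(pwm(sites), pwm(noisy)))
```

In the figure, the second PWM would come from a motif finder run on the noisy data; here it is computed directly from the noisy sites to keep the sketch short.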


