Nucleic Acids Res. 2003 Oct 15;31(20):e124.
doi: 10.1093/nar/gng124.

Pentaprobe: a comprehensive sequence for the one-step detection of DNA-binding activities


Ann H Y Kwan et al. Nucleic Acids Res.

Abstract

The rapid increase in the number of novel proteins identified in genome projects necessitates simple and rapid methods for assigning function. We describe a strategy for determining whether novel proteins possess typical sequence-specific DNA-binding activity. Many proteins bind recognition sequences of 5 bp or less. Given that there are 4^5 possible 5 bp sites, one might expect the length of sequence required to cover all possibilities would be 4^5 × 5 or 5120 nt. But by allowing overlaps, utilising both strands and using a computer algorithm to generate the minimum sequence, we find the length required is only 516 base pairs. We generated this sequence as six overlapping double-stranded oligonucleotides, termed pentaprobe, and used it in gel retardation experiments to assess DNA binding by both known and putative DNA-binding proteins from several protein families. We have confirmed binding by the zinc finger proteins BKLF, Eos and Pegasus, the Ets domain protein PU.1 and the treble clef N- and C-terminal fingers of GATA-1. We also showed that the N-terminal zinc finger domain of FOG-1 does not behave as a typical DNA-binding domain. Our results suggest that pentaprobe, and related sequences such as hexaprobe, represent useful tools for probing protein function.
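The figures quoted in the abstract can be checked with a short counting argument. The sketch below is not taken from the paper; the function names are illustrative. It assumes that a site and its reverse complement count as the same site on a double-stranded probe, and that a linear sequence of length L offers L - n + 1 windows of width n. Under those assumptions the result is a lower bound on the probe length, which for n = 5 coincides with the 516 bp reported.

```python
from itertools import product

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site: str) -> str:
    """Return the sequence read on the opposite strand."""
    return site.translate(COMPLEMENT)[::-1]


def strand_classes(n: int) -> int:
    """Count n-mers when a site and its reverse complement are treated as one."""
    canonical = set()
    for letters in product("ACGT", repeat=n):
        site = "".join(letters)
        # Keep one representative per {site, reverse complement} pair.
        canonical.add(min(site, reverse_complement(site)))
    return len(canonical)


def minimum_length(n: int) -> int:
    """Lower bound on the length of a linear double-stranded n-complete probe."""
    # Each class needs at least one window; L - n + 1 windows fit in length L.
    return strand_classes(n) + n - 1


naive_length = 4**5 * 5        # every pentamer written out separately: 5120
classes = strand_classes(5)    # 512, since no 5-mer is its own reverse complement
bound = minimum_length(5)      # 512 + 4 = 516
print(naive_length, classes, bound)
```

For even n some sites are their own reverse complement, so the class count is slightly more than half of 4^n, and the bound may not be attainable by a single sequence.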


Figures

Figure 1
The creation of an n-complete oligonucleotide. (A) Flow chart showing the process by which an n-complete oligonucleotide might be constructed. Also shown is an example for the 2 bp case. (B) Situation where the addition of any base (A, C, T or G) to the growing oligonucleotide (in position 9) does not give rise to any permutation that has not already been ticked off.
Figure 2
Computer algorithm used for the identification of n-complete oligonucleotides. A flow chart is shown that outlines the building up of n-complete sequences using a simple process of elimination.
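The base-by-base construction described in Figures 1 and 2 can be illustrated with a simple greedy sketch. This is not the authors' algorithm, and it does not guarantee the minimum length. It is an assumed stand-in that ticks off each new site together with its reverse complement. When it reaches a dead end like the one in Figure 1B, it searches breadth-first for the shortest run of bases that leads to a site not yet ticked off. All names here are hypothetical.

```python
from collections import deque
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site):
    return site.translate(COMPLEMENT)[::-1]


def shortest_extension(suffix, remaining, n):
    """Breadth-first search for the fewest bases that create an unticked n-mer."""
    queue = deque([(suffix, "")])
    seen = {suffix}
    while queue:
        tail, added = queue.popleft()
        for base in "ACGT":
            window = tail + base
            if window in remaining:
                return added + base
            nxt = window[1:]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, added + base))
    raise RuntimeError("no unticked site is reachable")


def build_n_complete(n):
    """Grow a sequence until every n-mer occurs on one strand or the other."""
    if n < 2:
        raise ValueError("this sketch assumes n >= 2")
    remaining = {"".join(p) for p in product("ACGT", repeat=n)}

    def tick_off(site):
        remaining.discard(site)
        remaining.discard(reverse_complement(site))

    sequence = "A" * n  # arbitrary starting site (an assumption of this sketch)
    tick_off(sequence)
    while remaining:
        sequence += shortest_extension(sequence[-(n - 1):], remaining, n)
        # Only the final window is new; earlier ones were already ticked off.
        tick_off(sequence[-n:])
    return sequence


def is_n_complete(sequence, n):
    """Check that each n-mer or its reverse complement appears in the sequence."""
    windows = {sequence[i:i + n] for i in range(len(sequence) - n + 1)}
    return all(
        "".join(p) in windows or reverse_complement("".join(p)) in windows
        for p in product("ACGT", repeat=n)
    )


probe = build_n_complete(5)
print(len(probe), is_n_complete(probe, 5))
```

Comparing `len(probe)` with the 516 bp quoted in the abstract shows how far the greedy result is from a minimal solution. Closing that gap is what the elimination procedure in Figure 2 is for.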
Figure 3
Sequences of a pentaprobe and a triprobe. (A) One of the sequence solutions of a single double-stranded oligonucleotide (pentaprobe) that contains all 5 base elements exactly once. (B) The sequences of six overlapping double-stranded oligonucleotides (pentaprobe.1–pentaprobe.6) that together comprise a 5-complete solution. All possible 5 base sequences can be found exactly once in these oligonucleotides, with the exception of those sequences found in the overlap regions, which are represented twice. (C) The sequence (in bold) of a single double-stranded oligonucleotide (triprobe) that contains all 3 base elements exactly once. Flanking sequences to include BamHI and EcoRI restriction sites are shown in italic.
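Splitting one long probe into overlapping fragments, as in panel B, loses no site as long as neighbouring fragments share at least n - 1 bases. The sketch below illustrates that point only. The overlap of 10 bases, the placeholder sequence and the function name are assumptions, not values from the paper.

```python
from math import ceil


def split_with_overlap(sequence, pieces, overlap):
    """Cut a sequence into consecutive fragments sharing `overlap` bases."""
    if pieces < 1 or overlap < 0 or overlap >= len(sequence):
        raise ValueError("need pieces >= 1 and 0 <= overlap < len(sequence)")
    # Distance between fragment start positions.
    step = ceil((len(sequence) - overlap) / pieces)
    return [sequence[i * step: i * step + step + overlap] for i in range(pieces)]


def windows(sequence, n):
    """Return the set of all n-base windows in a sequence."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


# Placeholder 516-base sequence (not the real pentaprobe).
full = "ACGTTGCAAG" * 51 + "ACGTTG"
fragments = split_with_overlap(full, pieces=6, overlap=10)

# Every 5-base window of the full sequence survives in some fragment.
covered = set().union(*(windows(f, 5) for f in fragments))
print(len(fragments), covered == windows(full, 5))
```

Sites that fall entirely inside an overlap region appear in two fragments, which is consistent with the caption's note that overlap sequences are represented twice.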
Figure 4
Analysis of the DNA-binding activity of chosen GST fusion proteins. (A) Coomassie stained SDS–PAGE showing that each of the protein solutions used in subsequent EMSA experiments was similar in concentration. (B–G) EMSAs carried out using pentaprobe.1–pentaprobe.6 and the GST fusion proteins shown in (A). Lane 1, probe only; lane 2, GST; lane 3, GST fusion with GATA-N; lane 4, GST fusion with GATA-C; lane 5, GST fusion with BKLF; lane 6, GST fusion with Pegasus; lane 7, GST fusion with PU.1; lane 8, GST fusion with Eos; lane 9, GST fusion with FOG.
Figure 5
Detection of artefacts from contaminating bacterial DNA-binding proteins using anti-GST serum. EMSA showing the DNA-binding activity of GST–GATA-C (lane 1). Lane 2 additionally contains anti-GST serum, while lane 3 contains preimmune serum. The supershifting of the retarded bands seen for lane 2 indicates that the GST fusion protein, rather than an irrelevant contaminant, is responsible for the retardation.
Figure 6
Triprobe acts as an effective reagent for detecting DNA-binding activity. EMSAs carried out using triprobe and the GST fusion proteins shown in Figure 4A. Lane 1, probe only; lane 2, GST; lane 3, GST fusion with GATA-N; lane 4, GST fusion with GATA-C; lane 5, GST fusion with BKLF; lane 6, GST fusion with Pegasus; lane 7, GST fusion with PU.1; lane 8, GST fusion with Eos; lane 9, GST fusion with FOG.

