BMC Bioinformatics. 2006 Feb 17;7:76.
doi: 10.1186/1471-2105-7-76.

BIPAD: a web server for modeling bipartite sequence elements

Chengpeng Bi et al. BMC Bioinformatics. 2006.

Abstract

Background: Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites.

Results: We introduce the Bipad Server, a web interface to predict sequence elements embedded within unaligned sequences. The server produces either a bipartite model, consisting of a pair of one-block position weight matrices (PWMs) with a gap distribution, or a single PWM for contiguous single-block motifs. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information content among a set of potential models with varying half-site and gap lengths.

Conclusion: The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins.
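The quantity that the alignment optimizes can be illustrated with a short sketch. This is not the Bipad implementation: it assumes a uniform background over A, C, G, T, applies no small-sample correction, and ignores the contribution of the gap distribution. The function names `information_content` and `bipartite_information` are hypothetical. Under these assumptions, minimizing the entropy of each aligned column is the same as maximizing its information content in bits.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Per-position information (bits) for equal-length aligned sites.

    Assumes a uniform background, so each position contributes
    log2(len(alphabet)) minus the Shannon entropy of its column.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    per_position = []
    for column in zip(*sites):
        counts = Counter(column)
        entropy = 0.0
        for base in alphabet:
            freq = counts[base] / len(sites)
            if freq > 0:
                entropy -= freq * math.log2(freq)
        per_position.append(max_bits - entropy)
    return per_position


def bipartite_information(left_sites, right_sites):
    """Total information of two half-site blocks (gap term omitted)."""
    return sum(information_content(left_sites)) + sum(
        information_content(right_sites)
    )
```

A fully conserved column scores 2 bits and a column with all four bases equally represented scores 0 bits, so a search that lowers column entropy raises the total returned by `bipartite_information`.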

Figures

Figure 1
Performance of Bipad server for alignment of Scaffold/Matrix attachment region (S-MAR) sites. The graph indicates a linear relationship between cycles and the time required for convergence on the optimal model (filled squares), and shows that the relationship between cycles and total information content is asymptotic at two cycles (filled triangles).
Figure 2
A gallery of sequence logos. For bipartite logos, the companion gap histogram is shown on the right. (A) FTZ-F1α monomer binding site; (B) CAR/RXRα PBREM sites; the right-half motif starts at position 4 and positions 0–3 correspond to the central gap between the half-sites; (C) HNF4α homodimer binding sites; the right-half motif starts at position 1, with the variable length gap placed at position 0; (D) MRS bipartite binding sites; the second half-site motif begins at position 1; the variable length gap denoted by the distribution corresponds to position 0 of the logo. Corresponding Bipad text file output for these models can be viewed at [1].
Figure 3
Refinement of CAR/RXR bipartite binding motif models. The x-axis represents an index of binding site models of increasing site width, beginning with the initial input parameters for site width and gap range, 5<[0,8]>5. For example, Model 1 corresponds to the motif pattern 5<[0,8]>5 and Model 2 to 5<[0,8]>6; the final model, number 30, corresponds to the pattern 10<[0,8]>9. The unit incremental information (UII) value is computed for each motif and displayed on the y-axis. The model with the maximum UII usually has the highest information density and is indicative of the optimal model.
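The model indexing in this caption can be reproduced with a small sketch. The ordering is an inference, not something the caption states: it assumes left half-site widths 5 to 10 and right half-site widths 5 to 9, with the right width varying fastest, which matches the three models named above (1, 2, and 30). The helper names `enumerate_models` and `best_model` are hypothetical, and the UII values must be supplied by the caller because the caption does not define how UII is computed.

```python
def enumerate_models(left_widths, right_widths, gap_range):
    """List motif patterns as 'left<[min_gap,max_gap]>right' strings.

    The right half-site width varies fastest (assumed ordering).
    """
    gap_min, gap_max = gap_range
    return [
        f"{left}<[{gap_min},{gap_max}]>{right}"
        for left in left_widths
        for right in right_widths
    ]


def best_model(models, uii_values):
    """Return the (1-based index, pattern) of the model with the largest UII."""
    if len(models) != len(uii_values):
        raise ValueError("one UII value is required per model")
    best = max(range(len(models)), key=lambda i: uii_values[i])
    return best + 1, models[best]


# Assumed width ranges that yield the 30 models described in the caption.
models = enumerate_models(range(5, 11), range(5, 10), (0, 8))
```

With these assumed ranges, `models[0]` is `5<[0,8]>5`, `models[1]` is `5<[0,8]>6`, and `models[29]` is `10<[0,8]>9`.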
Figure 4
Bipad performance for various input sequence lengths. The graph shows the performance of Bipad (Y-axis) for recognition of S-MAR binding sites embedded in background sequences of varying lengths. Each S-MAR site was embedded in a background either with a uniform composition [black line], or having the average human genomic composition [blue line]. The background sequence was varied from 250 to 2000 bp in length (X-axis). The performance calculation is given in Reference [2]; each data point has been averaged over three replicates.

References

    1. Bi CP, Rogan PK. Bipad. 2004. http://bipad.cmh.edu
    2. Bi CP, Rogan PK. Bipartite pattern discovery by entropy minimization-based multiple local alignment. Nucleic Acids Research. 2004;32:4979–4991. doi: 10.1093/nar/gkh825.
    3. Claessens F, Gewirth D. DNA recognition by nuclear receptors. Essays in Biochemistry. 2004;40:59–72.
    4. Aranda A, Pascual A. Nuclear hormone receptors and gene expression. Physiological Reviews. 2001;81:1269–1304.
    5. van Helden J, Rios AF, Collado-Vides J. Discovering regulatory elements in non-coding sequences by analysis of spaced dyads. Nucleic Acids Research. 2000;28:1808–1818. doi: 10.1093/nar/28.8.1808.
