Genome Biol. 2007;8(10):R207.
doi: 10.1186/gb-2007-8-10-r207.

PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

Elodie Portales-Casamar et al. Genome Biol. 2007.

Abstract

PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business.

Figures

Figure 1
PAZAR Mall. The PAZAR database can be viewed as a mall bringing together independent boutiques. The user can visit each store separately by clicking on the corresponding boutique and search through the data using various filters. Global search engines, allowing searching of the entire PAZAR mall, are available by clicking on one of the three department stores. The user can then search PAZAR by gene (Genes), transcription factor (TFMART), or transcription factor binding profiles (TF PROFILES).
Figure 2
PAZAR central concept: analysis and input/output system. The sequences and transcription factors are stored independently in the database and are then linked together as inputs of an analysis. Other types of input can be used, such as a biological sample (for example, nuclear extract) or a condition (for example, addition of a chemical compound). The analysis is defined by various properties (the method and cell type used, the PubMed identifier, and so on) and links inputs and outputs together. An output could be the observed effect, for example expression response or interaction level. The system is very flexible, allowing various combinations of inputs and outputs.
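The input/output model described in this caption can be sketched as a small set of record types. This is only an illustrative sketch, not the PAZAR schema: every class name, field name, and example value below is hypothetical and chosen to mirror the wording of the caption.

```python
from dataclasses import dataclass, field
from typing import List, Union


# All names below are hypothetical; they mirror the caption, not the real PAZAR schema.
@dataclass(frozen=True)
class Sequence:
    sequence_id: str
    bases: str


@dataclass(frozen=True)
class TranscriptionFactor:
    name: str


@dataclass(frozen=True)
class Sample:
    description: str  # for example, a nuclear extract


@dataclass(frozen=True)
class Condition:
    description: str  # for example, addition of a chemical compound


@dataclass(frozen=True)
class Output:
    kind: str   # "expression" or "interaction"
    value: str  # observed effect, stored as free text in this sketch


AnalysisInput = Union[Sequence, TranscriptionFactor, Sample, Condition]


@dataclass
class Analysis:
    """Links independently stored inputs to the observed outputs."""
    method: str
    cell_type: str
    pubmed_id: str
    inputs: List[AnalysisInput] = field(default_factory=list)
    outputs: List[Output] = field(default_factory=list)

    def inputs_of_type(self, kind: type) -> list:
        """Return the inputs that are instances of the given record type."""
        return [item for item in self.inputs if isinstance(item, kind)]


# Placeholder values only; none of these identifiers come from PAZAR.
example = Analysis(
    method="example binding assay",
    cell_type="example cell line",
    pubmed_id="00000000",
    inputs=[
        Sequence("RS_EXAMPLE", "ACGTACGT"),
        TranscriptionFactor("TF_EXAMPLE"),
        Condition("example compound added"),
    ],
    outputs=[Output(kind="interaction", value="good")],
)
```

Keeping sequences and transcription factors as separate records, joined only through an analysis, lets the same sequence appear in several experiments with different inputs and outputs.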
Figure 3
Example query: search by gene. By clicking on the 'Genes' department store in the upper-right corner of the mall, users can perform a gene-specific query. One can view the list of all genes in PAZAR by clicking on the 'View Gene List' button. Alternatively, users can search for a specific gene within all of PAZAR using one of several gene-specific identifiers. At the top of the 'Gene View' page is a summary table of all of the genes obtained from the search. Here, the results show that the queried gene (EnsEMBL gene ID ENSG00000131095) has annotations in two different projects. Below the summary, the details and all annotated regulatory sequences are listed separately for each resulting gene because, in PAZAR, each boutique stays independent within the mall. By clicking on the regulatory sequence ID for a specific regulatory sequence, found in the far left column, users can access the PAZAR Sequence View for that sequence. In this view, data are color-coded, with gene-specific information presented in blue and sequence-specific data in orange. A gene-specific summary table is presented at the top of the page, followed by a table detailing the regulatory sequence of interest. A third table summarizes the supporting experimental data for this regulatory sequence. Clicking on the Analysis ID found in the leftmost column of this table takes users to the PAZAR Analysis View, color-coded in green and containing a more in-depth description of the supporting experimental data.
Figure 4
Example query: search by transcription factor. By clicking on the 'TFMART' department store at the left-hand side of the mall, users can perform a TF-specific query. The 'TF View', color-coded in red, is very similar to the 'Gene View' (see Figure 3), with a summary table of all of the TFs obtained from the search at the top, followed by details and binding sites for each of them individually. Here, the results show that the queried TF (HUMAN_NF1) has annotations in two different projects. The binding sites can be genomic sequences with defined coordinates or they can be artificial (for example, an oligonucleotide representing a consensus sequence). All the sites are aligned and a TF binding profile is built dynamically using the MEME pattern discovery algorithm [31]. Users can construct a custom scoring matrix and binding profile based upon a subset of the sequences for that TF by ticking the check boxes of the sequences to be included and clicking 'Generate PFM with selected sequences'. Alternatively, users can generate scoring matrices and binding profiles based upon just genomic or artificial sequences by clicking on 'Select genomic sequences' or 'Select artificial sequences', respectively.
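To make the idea of a position frequency matrix (PFM) built from a chosen subset of sites concrete, here is a minimal sketch. It assumes the sites are already aligned and of equal length and simply counts bases per column; it does not run MEME, which is what PAZAR uses to align sites and build the profile. The record layout, function names, and toy sequences are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

BASES = "ACGT"


def select_sites(records: Iterable[dict], kind: Optional[str] = None) -> List[str]:
    """Return site sequences, optionally restricted to 'genomic' or 'artificial'."""
    return [r["sequence"] for r in records if kind is None or r["kind"] == kind]


def build_pfm(sites: List[str]) -> Dict[str, List[int]]:
    """Count each base at each position of pre-aligned, equal-length sites."""
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("sites must be aligned to the same length")
    pfm = {base: [0] * width for base in BASES}
    for site in sites:
        for position, base in enumerate(site.upper()):
            if base not in pfm:
                raise ValueError(f"unexpected character {base!r}")
            pfm[base][position] += 1
    return pfm


# Toy records for illustration only; these are not PAZAR annotations.
records = [
    {"id": "site1", "kind": "genomic", "sequence": "TTGGC"},
    {"id": "site2", "kind": "artificial", "sequence": "TTGGA"},
    {"id": "site3", "kind": "genomic", "sequence": "CTGGC"},
]

all_pfm = build_pfm(select_sites(records))
genomic_pfm = build_pfm(select_sites(records, kind="genomic"))
```

Restricting the input with `kind="genomic"` plays the role of the 'Select genomic sequences' button: the same counting step is applied to a smaller set of sites.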
Figure 5
Example query: search within a specific boutique project. One might desire to limit queries to a single collection. To do so, the user must find the corresponding boutique in the mall map or directory and click on it. The 'Project View' provides a brief description of the dataset (here the ABS project) as well as some statistics on the data it contains. Below, the user can choose amongst various filters to search through the data and display it in the 'Gene View', where regulatory sequences will be grouped by the genes they regulate, or in the 'TF View', where the sequences are grouped by the TFs that bind to them.
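The difference between the two displays is only the grouping key applied to the same annotations. The sketch below illustrates that with plain dictionaries; the field names and values are hypothetical placeholders rather than PAZAR's data format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_sequences(annotations: Iterable[dict], key: str) -> Dict[str, List[str]]:
    """Group regulatory sequence IDs by the chosen key ('gene' or 'tf')."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for annotation in annotations:
        grouped[annotation[key]].append(annotation["sequence_id"])
    return dict(grouped)


# Placeholder annotations for illustration only.
annotations = [
    {"sequence_id": "RS1", "gene": "GENE_A", "tf": "TF_X"},
    {"sequence_id": "RS2", "gene": "GENE_A", "tf": "TF_Y"},
    {"sequence_id": "RS3", "gene": "GENE_B", "tf": "TF_X"},
]

gene_view = group_sequences(annotations, "gene")  # sequences grouped by regulated gene
tf_view = group_sequences(annotations, "tf")      # sequences grouped by binding TF
```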
Figure 6
Visual representation of the human gene annotations of the 'Pleiades genes' project in PAZAR. (a) Cytoscape visualization. Human genes are represented as orange squares and the transcription factors regulating them as circles (blue for human, purple for mouse and green for rat). The different species of transcription factors reflect the fact that assays on the regulation of human genes are often carried out in cell lines or with recombinant transcription factors from different organisms. The orange edges represent the annotated interactions between transcription factors and genes. The red edge visualizes an interaction between two transcription factors. The red box highlights the human transcription factor SPI1 (also called PU.1) and all the genes recorded as containing a transcription factor binding site for it. (b) PAZAR TF View detail for PU.1 annotations from the 'Pleiades genes' project. Only the first 6 binding sites (out of 60) are displayed, as well as the binding profile for the combined set, dynamically generated by the MEME software [31].

References

    1. Farhadi HF, Lepage P, Forghani R, Friedman HC, Orfali W, Jasmin L, Miller W, Hudson TJ, Peterson AC. A combinatorial network of evolutionarily conserved myelin basic protein regulatory sequences confers distinct glial-specific phenotypes. J Neurosci. 2003;23:10214–10223.
    2. Kirchhamer CV, Yuh CH, Davidson EH. Modular cis-regulatory organization of developmentally expressed genes: two genes transcribed territorially in the sea urchin embryo, and additional examples. Proc Natl Acad Sci USA. 1996;93:9322–9328. doi: 10.1073/pnas.93.18.9322.
    3. Ren B, Robert F, Wyrick JJ, Aparicio O, Jennings EG, Simon I, Zeitlinger J, Schreiber J, Hannett N, Kanin E, et al. Genome-wide location and function of DNA binding proteins. Science. 2000;290:2306–2309. doi: 10.1126/science.290.5500.2306.
    4. Kel AE, Kel-Margoulis OV, Farnham PJ, Bartley SM, Wingender E, Zhang MQ. Computer-assisted identification of cell cycle-related genes: new targets for E2F transcription factors. J Mol Biol. 2001;309:99–120. doi: 10.1006/jmbi.2001.4650.
    5. Fickett JW. Quantitative discrimination of MEF2 sites. Mol Cell Biol. 1996;16:437–441.
