BMC Dev Biol. 2007 Aug 30;7:100.
doi: 10.1186/1471-213X-7-100.

CONDOR: a database resource of developmentally associated conserved non-coding elements

Adam Woolfe et al. BMC Dev Biol.

Abstract

Background: Comparative genomics is currently one of the most popular approaches to study the regulatory architecture of vertebrate genomes. Fish-mammal genomic comparisons have proved powerful in identifying conserved non-coding elements likely to be distal cis-regulatory modules such as enhancers, silencers or insulators that control the expression of genes involved in the regulation of early development. The scientific community is showing increasing interest in characterizing the function, evolution and language of these sequences. Despite this, there remains little in the way of user-friendly access to a large dataset of such elements in conjunction with the analysis and the visualization tools needed to study them.

Description: Here we present CONDOR (COnserved Non-coDing Orthologous Regions) available at: http://condor.fugu.biology.qmul.ac.uk. In an interactive and intuitive way the website displays data on > 6800 non-coding elements associated with over 120 early developmental genes and conserved across vertebrates. The database regularly incorporates results of ongoing in vivo zebrafish enhancer assays of the CNEs carried out in-house, which currently number approximately 100. Included and highlighted within this set are elements derived from duplication events both at the origin of vertebrates and more recently in the teleost lineage, thus providing valuable data for studying the divergence of regulatory roles between paralogs. CONDOR therefore provides a number of tools and facilities to allow scientists to progress in their own studies on the function and evolution of developmental cis-regulation.

Conclusion: By providing access to data with an approachable graphics interface, the CONDOR database presents a rich resource for further studies into the regulation and evolution of genes involved in early development.

Figures

Figure 1
Methodology for alignment and identification of CNEs between Fugu and mammalian orthologous genomic sequences. Boundaries of CNE-containing regions are identified through a CNE synteny map created using a stringent whole genome comparison of the non-coding portions of the human and Fugu genomes [4]. The genomic regions to be aligned are then expanded past the map boundaries up to the next nearest known genes. Trans-dev genes in the region are determined using appropriate GO terms and/or InterPro domains. Orthologous sequence corresponding to this expanded human region is then extracted from the mouse, rat and/or dog genomes. Sequences are masked for repeats and aligned using MLAGAN (to identify sequences conserved in the same order along the sequence) and SLAGAN (to identify conserved elements that have undergone rearrangement in one or more lineages). CNEs are identified using VISTA and filtered to exclude any that overlap known coding exons or ncRNAs. As an added stringency filter, only those CNEs that are conserved in at least four divergent vertebrate genomes (including Fugu) are retained, to avoid spurious matches.
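
The two filtering steps at the end of this pipeline can be illustrated with a minimal Python sketch. It is not CONDOR's implementation: the `CNE` record, the half-open coordinate convention, the function names and the toy coordinates are all assumptions made for illustration. Only the rules themselves (no overlap with known coding exons or ncRNAs, conservation in at least four genomes including Fugu) come from the caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNE:
    """Hypothetical record for one candidate element (not CONDOR's schema)."""
    name: str
    start: int            # 0-based start on the baseline sequence
    end: int              # exclusive end
    genomes: frozenset    # genomes in which the element is conserved


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def filter_cnes(candidates, annotated, min_genomes=4, required="fugu"):
    """Keep candidates that pass both filters described in the caption.

    `annotated` is a list of (start, end) intervals for known coding
    exons and ncRNAs on the same baseline sequence.
    """
    kept = []
    for cne in candidates:
        if required not in cne.genomes:
            continue  # must be conserved in Fugu
        if len(cne.genomes) < min_genomes:
            continue  # stringency filter: at least four genomes
        if any(overlaps(cne.start, cne.end, s, e) for s, e in annotated):
            continue  # overlaps a known exon or ncRNA
        kept.append(cne)
    return kept


# Toy data, invented for this example.
candidates = [
    CNE("c1", 100, 250, frozenset({"fugu", "human", "mouse", "rat"})),
    CNE("c2", 300, 380, frozenset({"fugu", "human", "mouse"})),
    CNE("c3", 500, 620, frozenset({"fugu", "human", "mouse", "dog"})),
    CNE("c4", 700, 800, frozenset({"human", "mouse", "rat", "dog"})),
]
annotated = [(600, 650)]  # one known exon

retained = [c.name for c in filter_cnes(candidates, annotated)]
```

Here only `c1` is retained: `c2` is conserved in three genomes, `c3` overlaps the exon, and `c4` is not conserved in Fugu.
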
Figure 2
Chromosomal locations of developmental gene regions currently covered in CONDOR. Red outlined boxes represent regions across which CNEs are distributed, and are proportional to the size of the region. The reference trans-dev gene with which each region is associated is marked next to the box. In most cases this is the only trans-dev gene in the vicinity, although in a number of cases CNEs are interspersed within clusters of related trans-dev genes (e.g. the HOXD cluster) or within clusters of unrelated trans-dev genes (e.g. the PAX1 region, which contains the trans-dev genes PAX1, NKX2.2, NKX2.8 and FOXA2). CNE regions in CONDOR are found on all chromosomes except 21, 22 and Y. Figure created using Ensembl Karyoview [25].
Figure 3
Example of the "CONDOR View" graphical browser for CNEs in the vicinity of the SOX21 gene in Fugu. Letters indicate the following features. (A) The title bar shows the reference trans-dev gene and the current baseline organism. (B) Option allowing users to change the baseline organism co-ordinate system to which CNEs are mapped on the graphic. (C) Clickable graphic/image map representing all CNEs across the baseline sequence. The top bar represents the length of the sequence in kilobases (here ~66 kb). The CDS track shows gene structures (exons indicated as red boxes), with the end arrow indicating the strand direction of the gene. The four organism tracks represent the organism sequences used in the initial MLAGAN/SLAGAN alignments. CNEs are positioned according to the baseline co-ordinate system and drawn as boxes with lengths proportional to their size in bp. CNEs are 'bumped' onto lower lines if located too close to another CNE for them to be differentiated on a single line. While all CNEs are mapped to the baseline sequence, a box is drawn in another organism's track only if the CNE is conserved in that organism in the MLAGAN alignment. Moving the mouse over a specific CNE in one of the organism tracks brings up summary data on that CNE in a table below the browser (E). If the CNE has functional annotation (as shown), a composite schematic of GFP expression is displayed underneath. Clicking on the image map opens a new web page with detailed information on the CNE. (D) Users can zoom in or out of the image map to get a clearer view of specific sets of CNEs (useful in larger regions).
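
The 'bumping' behaviour described for the image map resembles a standard greedy row-assignment for intervals. The Python sketch below shows one plausible way to do it; the caption does not say how CONDOR implements bumping, so the function name, the `min_gap` parameter and the toy coordinates are assumptions.

```python
def bump(features, min_gap=0):
    """Assign each (name, start, end) feature to a display row.

    Features are processed in order of start position and placed on the
    first row where they begin at least `min_gap` units after the last
    feature already on that row; otherwise a new, lower row is opened.
    Returns a dict mapping feature name to row index (0 = top line).
    """
    row_ends = []     # end coordinate of the last feature on each row
    placement = {}
    for name, start, end in sorted(features, key=lambda f: (f[1], f[2])):
        for row, last_end in enumerate(row_ends):
            if start >= last_end + min_gap:
                row_ends[row] = end
                placement[name] = row
                break
        else:
            # No existing row has room, so open a new one.
            row_ends.append(end)
            placement[name] = len(row_ends) - 1
    return placement


# Toy coordinates, invented for this example.
features = [("a", 0, 100), ("b", 50, 150), ("c", 120, 200), ("d", 400, 450)]
rows = bump(features, min_gap=10)
```

With these inputs, `b` starts too close to `a` and is pushed to the second line, while `c` and `d` fit back on the top line.
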
Figure 4
Example of part of a "Text View" web results page for CNEs in the vicinity of the SOX21 gene in Fugu. The two top bars are keys to the data within the table relating to the genomes in which CNEs are conserved, conservation scores and functional annotation. Links from the names/pictures of the vertebrate genomes currently in CONDOR allow users to filter the CNEs by those conserved in a specific genome or to view the chromosomal locations of the CNEs in that genome. The main table shows data and features of each CNE in the region. The sequence data include the CNE identifier, location, position in relation to the reference gene, length, conservation etc. CNE features shown also include the LPC score (a composite score that takes both the length and conservation of the element into consideration), whether a CNE is duplicated elsewhere in the genome (referred to as a dCNE or duplicated CNE [29]), whether the CNE has an in vivo annotation associated with it (and the general result of that annotation), and which vertebrate genomes it is conserved in.
Figure 5
Searching for CNEs through functional annotation. As well as viewing annotation for individual CNEs, users can search for sequences with enhancer activity in one or more tissue types (in this example the eye) and at one or more developmental stages (24-30 and 48-54 hours post fertilization). Parameters can also be changed to include only strong expressers in the selected tissue type. A composite schematic of the expression domains, as well as a histogram of the proportion of embryos expressing in each tissue type, is shown for all annotations fulfilling the search criteria. Hyperlinks are provided for each CNE to view annotations in more detail. A multi-FASTA sequence file of the CNEs returned in the search is provided for use in downstream analysis, such as searches for overrepresented words or conserved transcription factor binding sites responsible for regulation in this tissue.
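
A search of this kind, followed by the multi-FASTA export, could look roughly like the Python sketch below. It does not use CONDOR's data model or query interface, which are not described here. The `Annotation` record, the stage labels, the sequences and the use of a minimum fraction of expressing embryos as a stand-in for "strong expressers" are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record (not CONDOR's schema)."""
    cne_id: str
    tissue: str
    stage: str        # developmental stage label, e.g. "24-30hpf"
    fraction: float   # proportion of embryos expressing in this tissue


def search(annotations, tissues, stages, min_fraction=0.0):
    """Return sorted IDs of CNEs with a matching annotation.

    `min_fraction` is an assumed proxy for the 'strong expressers'
    option; the real criterion is not given in the caption.
    """
    return sorted({
        a.cne_id
        for a in annotations
        if a.tissue in tissues
        and a.stage in stages
        and a.fraction >= min_fraction
    })


def to_multi_fasta(cne_ids, sequences, width=60):
    """Format the selected CNEs as a multi-FASTA string."""
    lines = []
    for cne_id in cne_ids:
        seq = sequences[cne_id]
        lines.append(f">{cne_id}")
        # Wrap each sequence at `width` characters per line.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n" if lines else ""


# Toy annotations and placeholder sequences, invented for this example.
annotations = [
    Annotation("cne_a", "eye", "24-30hpf", 0.6),
    Annotation("cne_a", "hindbrain", "24-30hpf", 0.2),
    Annotation("cne_b", "eye", "48-54hpf", 0.1),
    Annotation("cne_c", "heart", "48-54hpf", 0.8),
]
sequences = {"cne_a": "ACGT" * 20, "cne_b": "GGCATT", "cne_c": "TTTT"}

eye_hits = search(annotations, {"eye"}, {"24-30hpf", "48-54hpf"})
strong_eye_hits = search(annotations, {"eye"}, {"24-30hpf", "48-54hpf"},
                         min_fraction=0.5)
fasta = to_multi_fasta(strong_eye_hits, sequences)
```

The resulting string can be written to a file and passed to a motif-discovery tool of the user's choice.
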

References

    1. Carroll SB. Evolution at two levels: on genes and form. PLoS Biol. 2005;3:e245. doi: 10.1371/journal.pbio.0030245.
    2. Margulies EH, Green ED. Detecting highly conserved regions of the human genome by multispecies sequence comparisons. Cold Spring Harb Symp Quant Biol. 2003;68:255–263. doi: 10.1101/sqb.2003.68.255.
    3. Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D. Ultraconserved elements in the human genome. Science. 2004;304:1321–1325. doi: 10.1126/science.1098119.
    4. Woolfe A, Goodson M, Goode DK, Snell P, McEwen GK, Vavouri T, Smith SF, North P, Callaway H, Kelly K, Walter K, Abnizova I, Gilks W, Edwards YJ, Cooke JE, Elgar G. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 2005;3:e7. doi: 10.1371/journal.pbio.0030007.
    5. Venkatesh B, Kirkness EF, Loh YH, Halpern AL, Lee AP, Johnson J, Dandona N, Viswanathan LD, Tay A, Venter JC, Strausberg RL, Brenner S. Ancient noncoding elements conserved in the human genome. Science. 2006;314:1892. doi: 10.1126/science.1130708.
