Nucleic Acids Res. 2016 Jan 4;44(D1):D96-100.
doi: 10.1093/nar/gkv1163. Epub 2015 Nov 2.

CEGA--a catalog of conserved elements from genomic alignments

Aline Dousse et al. Nucleic Acids Res.

Abstract

By identifying genomic sequence regions conserved among several species, comparative genomics offers opportunities to discover putatively functional elements without any prior knowledge of what these functions might be. Comparative analyses across mammals estimated 4-5% of the human genome to be functionally constrained, a much larger fraction than the 1-2% occupied by annotated protein-coding or RNA genes. Such functionally constrained yet unannotated regions have been referred to as conserved non-coding sequences (CNCs) or ultra-conserved elements (UCEs), which remain largely uncharacterized but probably form a highly heterogeneous group of elements including enhancers, promoters, motifs, and others. To facilitate the study of such CNCs/UCEs, we present our resource of Conserved Elements from Genomic Alignments (CEGA), accessible from http://cega.ezlab.org. Harnessing the power of multiple species comparisons to detect genomic elements under purifying selection, CEGA provides a comprehensive set of CNCs identified at different radiations along the vertebrate lineage. Evolutionary constraint is identified using threshold-free phylogenetic modeling of unbiased and sensitive global alignments of genomic synteny blocks identified using protein orthology. We identified CNCs independently for five vertebrate clades, each referring to a different last common ancestor and therefore to an overlapping but varying set of CNCs with 24 488 in vertebrates, 241 575 in amniotes, 709 743 in Eutheria, 642 701 in Boreoeutheria and 612 364 in Euarchontoglires, spanning from 6 Mbp in vertebrates to 119 Mbp in Euarchontoglires. The dynamic CEGA web interface displays alignments, genomic locations, as well as biologically relevant data to help prioritize and select CNCs of interest for further functional investigations.
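The per-clade figures above can be turned into a rough average element size. The sketch below is illustrative only: the dictionary and function names are hypothetical, the counts and spans are copied from the abstract, and the spans are rounded to whole Mbp there, so the results are approximate. The abstract gives total spans for only two of the five clades, so only those two are computed.

```python
# CNC counts per clade, as reported in the abstract.
CNC_COUNTS = {
    "Vertebrates": 24_488,
    "Amniotes": 241_575,
    "Eutheria": 709_743,
    "Boreoeutheria": 642_701,
    "Euarchontoglires": 612_364,
}

# Total span in Mbp; the abstract states only these two values.
CNC_SPAN_MBP = {
    "Vertebrates": 6,
    "Euarchontoglires": 119,
}


def mean_length_bp(clade: str) -> float:
    """Rough mean CNC length in bp: total span divided by element count."""
    return CNC_SPAN_MBP[clade] * 1_000_000 / CNC_COUNTS[clade]


if __name__ == "__main__":
    for clade in CNC_SPAN_MBP:
        print(f"{clade}: ~{mean_length_bp(clade):.0f} bp per CNC")
```

Under these rounded inputs, the mean comes out at roughly 245 bp per element in vertebrates and roughly 194 bp in Euarchontoglires.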

Figures

Figure 1. Workflow of CEGA identification of conserved elements.
Figure 2. CEGA user interface.

