Comparative Study
Genome Res. 2004 May;14(5):852-9.
doi: 10.1101/gr.1934904. Epub 2004 Apr 12.

Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment

Emmanouil T Dermitzakis et al. Genome Res. 2004 May.

Abstract

The analysis of conservation between the human and mouse genomes resulted in the identification of a large number of conserved nongenic sequences (CNGs). The functional significance of this nongenic conservation remains unknown, however. The availability of the sequence of a third mammalian genome, the dog, allows for a large-scale analysis of evolutionary attributes of CNGs in mammals. We have aligned 1638 previously identified CNGs and 976 conserved exons (CODs) from human chromosome 21 (Hsa21) with their orthologous sequences in mouse and dog. Attributes of selective constraint, such as sequence conservation, clustering, and direction of substitutions were compared between CNGs and CODs, showing a clear distinction between the two classes. We subsequently performed a chromosome-wide analysis of CNGs by correlating selective constraint metrics with their position on the chromosome and relative to their distance from genes. We found that CNGs appear to be randomly arranged in intergenic regions, with no bias to be closer or farther from genes. Moreover, conservation and clustering of substitutions of CNGs appear to be completely independent of their distance from genes. These results suggest that the majority of CNGs are not typical of previously described regulatory elements in terms of their location. We propose models for a global role of CNGs in genome function and regulation, through long-distance cis or trans chromosomal interactions.

Figures

Figure 1
Distributions of human, mouse, and dog branch lengths using the K80 estimate of divergence for intergenic CNGs (A,B,C), intronic CNGs (D,E,F), and CODs (exonic; G,H,I). The y-axes indicate the number of sequences (frequency).
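As an illustration of the divergence measure named in the Figure 1 caption, the sketch below computes the standard K80 (Kimura two-parameter) distance between two aligned sequences and then splits three pairwise distances into per-lineage branch lengths on an unrooted three-taxon tree. The function names `k80_distance` and `three_taxon_branch_lengths` are hypothetical, and the branch-length step is an assumption: the abstract does not say how the authors derived branch lengths, so this is only one simple way to obtain them.

```python
import math

PURINES = {"A", "G"}


def k80_distance(seq_a, seq_b):
    """K80 distance between two aligned sequences of equal length.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K80 correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


def three_taxon_branch_lengths(d_hm, d_hd, d_md):
    """Assumed approach: exact branch lengths for an unrooted three-taxon
    tree, given pairwise distances human-mouse, human-dog, and mouse-dog."""
    human = (d_hm + d_hd - d_md) / 2
    mouse = (d_hm + d_md - d_hd) / 2
    dog = (d_hd + d_md - d_hm) / 2
    return human, mouse, dog
```

For example, `k80_distance("AAAAAAAAAA", "GAAAAAAAAA")` treats the single A-to-G difference as one transition in ten sites.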
Figure 2
(A) Numbers and direction of the different types of substitutions detected in the three lineages of human, mouse, and dog assuming an unrooted tree. (B,C,D) Relative rates of GC minus AT substitutions in CODs, intronic CNGs, and intergenic CNGs in human (B), mouse (C), and dog (D). Note the excess of GC to AT substitutions in human CODs.
Figure 3
(A) Distribution of the distances of CNGs from the nearest gene (minimum distance from gene). (B) Relative position of CNGs in intergenic regions when all of them are scaled to 1. Note the uniformity along the intergenic regions. The y-axis indicates the number of sequences (frequency).
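The two position measures in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and coordinate convention are hypothetical, the CNG is assumed to lie entirely between two flanking genes, and its midpoint is assumed to define its relative position.

```python
def min_distance_to_gene(cng_start, cng_end, left_gene_end, right_gene_start):
    """Distance in bp from a CNG to the nearer of its two flanking genes."""
    return min(cng_start - left_gene_end, right_gene_start - cng_end)


def relative_position(cng_start, cng_end, left_gene_end, right_gene_start):
    """Position of the CNG midpoint with the intergenic region scaled to 1.

    0 corresponds to the end of the left gene, 1 to the start of the right gene.
    """
    region_length = right_gene_start - left_gene_end
    if region_length <= 0:
        raise ValueError("flanking genes must not overlap")
    midpoint = (cng_start + cng_end) / 2  # assumption: midpoint represents the CNG
    return (midpoint - left_gene_end) / region_length
```

If CNGs were placed without regard to the flanking genes, the scaled positions would be expected to spread roughly evenly between 0 and 1, which is the uniformity the caption points to.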
Figure 4
Regression of human branch lengths with the distance from the nearest gene (A) and length of intergenic region (B). Regression lines are indicative. P-values and R-squared values are at the top of the graph.
Figure 5
(A) Confidence intervals of clustering P-values for CODs, intergenic CNGs, and intronic CNGs. (B,C) Regression of clustering P-values with distance of CNG from nearest gene (B) and length of intergenic region (C). Regression lines are indicative. P-values and R-squared values are at the top of the graph.
