Genome Res. 2004 May;14(5):886-92.
doi: 10.1101/gr.2246704.

Compositional gene landscapes in vertebrates

Stéphane Cruveiller et al. Genome Res. 2004 May.

Abstract

The existence of a well conserved linear relationship between GC levels of genes' second and third codon positions (GC2, GC3) prompted us to focus on the landscape, or joint distribution, spanned by these two variables. In human, well curated coding sequences now cover at least 15%-30% of the estimated total gene set. Our analysis of the landscape defined by this gene set revealed not only the well documented linear crest, but also the presence of several peaks and valleys along that crest, a property that was also indicated in two other warm-blooded vertebrates represented by large gene databases, that is, mouse and chicken. GC2 is the sum of eight amino acid frequencies, whereas GC3 is linearly related to the GC level of the chromosomal region containing the gene. The landscapes therefore portray relations between proteins and the DNA environments of the genes that encode them.
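The two variables can be illustrated with a minimal sketch, which is not the authors' code. It assumes an in-frame coding sequence over A, C, G, T with no internal stop codons and uses the standard genetic code; all function names are hypothetical. It computes GC2 and GC3 and derives the eight amino acids whose codons all carry G or C in second position, so that their summed frequency equals GC2.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def split_codons(cds):
    """Cut an in-frame coding sequence into codons, dropping a terminal stop."""
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if codons and CODON_TABLE.get(codons[-1]) == "*":
        codons.pop()
    return codons


def gc_fraction(codons, position):
    """Fraction of codons with G or C at a 0-based position (1 = GC2, 2 = GC3)."""
    return sum(c[position] in "GC" for c in codons) / len(codons)


def amino_acids_with_gc_second():
    """Amino acids for which every codon has G or C in second position."""
    by_aa = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            by_aa.setdefault(aa, []).append(codon)
    return {aa for aa, cs in by_aa.items() if all(c[1] in "GC" for c in cs)}


def summed_frequency(codons, amino_acids):
    """Summed frequency of the given amino acids among the encoded residues."""
    return sum(CODON_TABLE[c] in amino_acids for c in codons) / len(codons)


if __name__ == "__main__":
    codons = split_codons("ATGGCCCCGTGGAAATTTTAA")  # made-up toy sequence
    eight = amino_acids_with_gc_second()
    print(sorted(eight))
    print(gc_fraction(codons, 1), gc_fraction(codons, 2))
    print(summed_frequency(codons, eight))
```

Under the standard code the derived set is Ala, Arg, Cys, Gly, Pro, Ser, Thr, and Trp. No amino acid mixes G/C and A/T in second position, which is why the identity between GC2 and the summed frequency is exact for sequences without internal stops.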

Figures

Figure 1
2D representations of the landscape of GC levels in second and third positions (GC2, GC3) of 10,218 curated human genes: (A) scatterplot, (B) bivariate histogram, and (C) smoothed contour plot. Bins were chosen to partition the range of GC2 and GC3 values found (minimum to maximum values) into 37 × 37 (B) or 21 × 21 (C) equal bins. Height (i.e., frequency) ranges are indicated by colors.
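The binning described in this caption can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the function name is hypothetical, placing GC2 on rows and GC3 on columns is an assumption, and the smoothing used for panel C is not specified in the caption, so it is left out.

```python
def bivariate_histogram(gc3_values, gc2_values, n_bins):
    """Count genes in an n_bins x n_bins grid spanning min..max of each variable.

    Rows index GC2 bins and columns index GC3 bins (an assumed orientation).
    """
    def bin_index(value, lo, hi):
        if hi == lo:
            return 0  # degenerate range: everything falls in the first bin
        i = int((value - lo) / (hi - lo) * n_bins)
        return min(i, n_bins - 1)  # the maximum value goes into the last bin

    x_lo, x_hi = min(gc3_values), max(gc3_values)
    y_lo, y_hi = min(gc2_values), max(gc2_values)
    counts = [[0] * n_bins for _ in range(n_bins)]
    for x, y in zip(gc3_values, gc2_values):
        counts[bin_index(y, y_lo, y_hi)][bin_index(x, x_lo, x_hi)] += 1
    return counts


if __name__ == "__main__":
    # Toy values only; the paper uses 37 x 37 bins (B) or 21 x 21 bins (C).
    gc3 = [0.30, 0.55, 0.80, 0.62]
    gc2 = [0.35, 0.40, 0.50, 0.42]
    for row in bivariate_histogram(gc3, gc2, 3):
        print(row)
```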
Figure 2
Frequency distributions of GC2 and GC3 values of mouse (A; 16,383 sequences) and Xenopus genes (B; 1303 sequences), represented as smoothed contour plots defining 3D compositional landscapes.
Figure 3
Smoothed contour plot showing a variant of the landscape of 10,218 human genes: the vertical axis is the summed frequencies of alanine, proline, serine, and threonine instead of GC2; the horizontal axis is GC3. The four amino acids used to recreate this landscape are frequent, and all have cytosine in second position in four of their codons.
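The vertical axis of this variant landscape can be sketched in a few lines. The snippet is an illustration under the standard genetic code, with hypothetical function names and an input assumed to be an in-frame sequence over A, C, G, T. It lists, for each of the four amino acids, the codons with cytosine in second position, and computes the summed alanine, proline, serine, and threonine frequency of a coding sequence.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def codons_with_c_second(amino_acid):
    """Codons of one amino acid (one-letter code) that have C in second position."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid and c[1] == "C")


def apst_frequency(cds):
    """Summed frequency of Ala, Pro, Ser, Thr among encoded residues (stops ignored)."""
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    residues = [CODON_TABLE[c] for c in codons if CODON_TABLE[c] != "*"]
    return sum(aa in "APST" for aa in residues) / len(residues)


if __name__ == "__main__":
    for aa in "APST":
        print(aa, codons_with_c_second(aa))
    print(apst_frequency("ATGGCCCCGAGTACTTAA"))  # made-up toy sequence
```

Serine also has two codons (AGT, AGC) with guanine in second position, so the summed frequency of the four amino acids is not identical to the frequency of cytosine in second position.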
