Comparative Study
Genome Res. 2002 Jun;12(6):851-6.
doi: 10.1101/gr.189102.

Compositional gradients in Gramineae genes

Gane Ka-Shu Wong et al. Genome Res. 2002 Jun.

Abstract

In this study, we describe a property of Gramineae genes, and perhaps all monocot genes, that is not observed in eudicot genes. Along the direction of transcription, beginning at the junction of the 5'-UTR and the coding region, there are gradients in GC content, codon usage, and amino-acid usage. The magnitudes of these gradients are large enough to hinder the annotation of the rice genome and to confound the detection of protein homologies across the monocot-eudicot divide.

Figures

Figure 1
GC content as a function of position from the start of the coding region for four pairs of the best available O. sativa and A. thaliana homologs (possible orthologs). A 129-bp sliding window, equal to the median size of a rice exon, was used to filter out fluctuations in the sequence.
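
The sliding-window smoothing described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name sliding_gc and the choice to report one value per fully covered window are assumptions; only the 129-bp window size comes from the caption.

```python
def sliding_gc(seq, window=129):
    """Return the GC fraction of every full window along seq.

    Hypothetical helper: element i covers seq[i:i + window].
    """
    seq = seq.upper()
    if window <= 0 or len(seq) < window:
        return []
    # 1 for G or C, 0 for anything else (A, T, N, ...).
    is_gc = [1 if base in "GC" else 0 for base in seq]
    count = sum(is_gc[:window])
    profile = [count / window]
    # Slide one base at a time, updating the running GC count.
    for i in range(window, len(seq)):
        count += is_gc[i] - is_gc[i - window]
        profile.append(count / window)
    return profile
```

For a coding sequence of length n, the result has n - 128 values at the default window size, so sequences shorter than one window yield an empty profile.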
Figure 2
Distributions for “per-gene” GC content gradients in O. sativa, Z. mays, A. thaliana, and N. tabacum. The gradient is the slope of the trend in GC content versus position, defined only for the first kilobase of the coding region, to respect the finite extent of the gradient effect.
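
A per-gene gradient of this kind could be estimated as sketched below. The caption does not state the fitting procedure, so the ordinary least-squares fit on a per-base GC indicator is an assumption, as is the function name gc_gradient; only the restriction to the first kilobase comes from the caption.

```python
def gc_gradient(cds, max_len=1000):
    """Least-squares slope of GC content versus position (per bp).

    Hypothetical helper: fits a line to a 0/1 GC indicator over the
    first max_len bases of the coding sequence.
    """
    region = cds.upper()[:max_len]
    n = len(region)
    if n < 2:
        raise ValueError("need at least two bases to fit a slope")
    y = [1.0 if base in "GC" else 0.0 for base in region]
    mean_x = (n - 1) / 2
    mean_y = sum(y) / n
    # Standard OLS slope: covariance(x, y) / variance(x).
    sxy = sum((i - mean_x) * (yi - mean_y) for i, yi in enumerate(y))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx
```

A negative slope means GC content falls along the direction of transcription; a sequence with uniform composition gives a slope of zero.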
Figure 3
Overall GC content as a function of cDNA position, relative to the start of the coding region, and averaged over all cDNAs with a 51-bp sliding window. Shown here are O. sativa, Z. mays, A. thaliana, and N. tabacum. Negative coordinates are 5′-UTR. Positive coordinates are coding.
Figure 4
GC1, GC2, and GC3 content as a function of cDNA position, relative to the start of the coding region, and averaged over all cDNAs with a 51-bp sliding window. Shown here are O. sativa, Z. mays, A. thaliana, and N. tabacum. Phase information is extended into the 5′-UTR.
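
GC1, GC2, and GC3 are the GC fractions at the first, second, and third codon positions. A minimal sketch for a single in-frame coding sequence follows; the function name gc_by_codon_position is hypothetical, and the sliding-window averaging over many cDNAs used in the figure is not reproduced here.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) for an in-frame coding sequence.

    Hypothetical helper: trailing bases that do not complete a codon
    are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence contains no complete codon")
    fractions = []
    for phase in range(3):
        # Every third base, starting at codon position phase + 1.
        bases = cds[phase:usable:3]
        gc = sum(1 for base in bases if base in "GC")
        fractions.append(gc / len(bases))
    return tuple(fractions)
```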
Figure 5
Effective-number-of-codons as a function of cDNA position, relative to the start of the coding region, and averaged over all cDNAs with a 51-bp sliding window. Shown here are O. sativa, Z. mays, A. thaliana, and N. tabacum. Ne is a statistical measure of codon bias. It is 20 if only one codon is likely to be used for each amino acid, implying maximal bias. It is 61 if every codon is equally likely to be used, implying minimal bias.
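
The limits of 20 and 61 quoted above match Wright's effective-number-of-codons estimator, so the sketch below assumes that formulation; the caption does not say which variant the authors used. The names CODON_TABLE, FAMILIES, and effective_number_of_codons are hypothetical, amino acids with too few observations are simply skipped, and the result is capped at 61, both of which are simplifications.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons ordered TCAG at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Synonymous codon families, stop codons excluded.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)


def effective_number_of_codons(cds):
    """Estimate Nc for an in-frame coding sequence (Wright-style sketch)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    f_by_class = defaultdict(list)
    for codons in FAMILIES.values():
        k = len(codons)          # degeneracy of this amino acid
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue             # Met, Trp, or too few observations
        sum_p2 = sum((counts[c] / n) ** 2 for c in codons)
        f = (n * sum_p2 - 1) / (n - 1)   # codon homozygosity
        if f > 0:
            f_by_class[k].append(f)
    nc = 2.0  # Met and Trp each contribute exactly one codon
    for k, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_class[k]
        if not values:
            raise ValueError(f"no usable {k}-fold degenerate amino acids")
        nc += n_amino_acids / (sum(values) / len(values))
    return min(nc, 61.0)
```

A sequence that uses a single codon for every amino acid gives 20, and one that uses all synonymous codons equally gives 61, in line with the limits stated in the caption.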
Figure 6
Frequency of occurrence of the amino acids alanine, leucine, glycine, and serine, as a function of cDNA position, relative to the start of the coding region, and averaged over all cDNAs with a 51-bp sliding window. Shown here are O. sativa, Z. mays, A. thaliana, and N. tabacum. When all 20 amino acids occur with equal probability, the normalized frequencies are 1.
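
The normalization stated in this caption (equal usage of all 20 amino acids gives a value of 1) is consistent with dividing each observed frequency by 1/20. The sketch below assumes that reading; the function name normalized_aa_frequencies is hypothetical, and non-standard residue symbols are ignored.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def normalized_aa_frequencies(protein):
    """Return each amino acid's frequency divided by the uniform value 1/20."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in protein.upper():
        if residue in counts:    # skip X, *, gaps, and similar symbols
            counts[residue] += 1
            total += 1
    if total == 0:
        raise ValueError("no standard amino acids in sequence")
    return {aa: 20 * count / total for aa, count in counts.items()}
```

Values above 1 indicate an amino acid that is over-represented relative to uniform usage, and values below 1 indicate under-representation.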
Figure 7
Probability of a homologous match across the monocot–eudicot divide, as a function of cDNA position, relative to the start of the coding region. TblastN searches were conducted in both directions, rice-to-arabidopsis and arabidopsis-to-rice. The cDNA data were divided into two equally populated groups, based on the size of the coding region, to emphasize that the reduced probability at the 5′-end is a position effect, not a gene size effect.
Figure 8
GC content of intron DNA, as a function of genomic position, relative to the start codon, and averaged over a 51-bp sliding window. Every intron is from a cDNA-to-genomic alignment. Only O. sativa and A. thaliana had enough genomic data to be so analyzed. Note that, unlike the previous figures, this abscissa is in genomic instead of cDNA coordinates.
