BMC Genomics. 2011 Dec 30;12:638.
doi: 10.1186/1471-2164-12-638.

Opposite GC skews at the 5' and 3' ends of genes in unicellular fungi

Malcolm A McLean et al. BMC Genomics.

Abstract

Background: GC-skews have previously been linked to transcription in some eukaryotes. They have been associated with transcription start sites, with the coding strand G-biased in mammals and C-biased in fungi and invertebrates.

Results: We show a consistent and highly significant pattern of GC-skew within genes of almost all unicellular fungi. The pattern of GC-skew is asymmetrical: the coding strand of genes is typically C-biased at the 5' ends but G-biased at the 3' ends, with intermediate skews in the middle of genes. Thus, the initiation, elongation, and termination phases of transcription are associated with different skews. This pattern influences the encoded proteins by generating differential usage of amino acids at the 5' and 3' ends of genes. These biases also affect fourfold-degenerate positions and extend into promoters and 3' UTRs, indicating that the skews cannot be accounted for by selection for protein function or translation.

Conclusions: We propose two explanations: the mutational pressure hypothesis and the adaptive hypothesis. The mutational pressure hypothesis is that different co-factors bind to RNA pol II at different phases of transcription, producing different mutational regimes. The adaptive hypothesis is that cytidine triphosphate deficiency may lead to C-avoidance at the 3' ends of transcripts to control the flow of RNA pol II molecules and reduce their frequency of collisions.
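The 5'-to-3' asymmetry described in the Results can be illustrated with a minimal Python sketch. It assumes the conventional definition of GC skew, (G - C) / (G + C), computed on the coding strand; the abstract does not spell out the formula, so this definition, the function names `gc_skew` and `skew_by_thirds`, the split into thirds, and the toy sequence are all illustrative assumptions rather than the authors' pipeline.

```python
def gc_skew(seq: str) -> float:
    """GC skew of a coding-strand sequence, assumed here to be (G - C) / (G + C).

    Returns 0.0 when the sequence contains neither G nor C.
    """
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def skew_by_thirds(orf: str) -> tuple:
    """Split a sequence into 5', middle, and 3' thirds and return the skew of each."""
    n = len(orf)
    a, b = n // 3, 2 * n // 3
    return gc_skew(orf[:a]), gc_skew(orf[a:b]), gc_skew(orf[b:])


# Toy sequence (not a real gene): C-rich start, balanced middle, G-rich end.
toy = "CCCG" + "CGCG" + "GGGC"
print(skew_by_thirds(toy))
```

A negative first value and a positive last value correspond to the C-biased 5' end and G-biased 3' end that the abstract reports for most unicellular fungi.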


Figures

Figure 1
GC skews around the ATG and stop codon in three fungal species. All genes are superposed at the ATG (a) or stop codon (b), and the total skew at each position is calculated. GC skew at the G of the ATG is 1.0 by definition, as is GC skew at the G of the stop codon. GC skews in the coding region have been averaged over the three bases, and the data have been smoothed slightly for clarity. Candida albicans skews are substantially larger than those of the other species. (c) GC skew for a selected region of chromosome 3 in C. albicans, with a window size of 2000 nucleotides. Many but not all genes show a pattern of increasing GC skew in the direction of transcription.
Figure 2
GC (A) and AT (B) skews across genes in unicellular fungi. Open reading frames are divided into 40 bins from the ATG to the stop codon; data for each species are pooled, and the GC and AT skews are calculated for each bin. The 200 bp upstream and downstream are also included. In almost all species we see a clear pattern of C-bias at the start of the gene changing sign to a G-bias towards the end. AT skews are flat but almost always positive.
Figure 3
Nucleotide and amino acid frequency ratios between the 5'- and 3'-ends of genes. The 21 amino acids from the 5'-ends and 20 amino acids from the 3'-ends of all C. albicans genes (and the corresponding nucleotides) were compared by calculating the 5'/3' frequency ratio of each nucleotide (a) or amino acid (b). Bars in (b) are colored by the average GC-skew of the codons that encode the corresponding amino acid (see colorbar), demonstrating that negatively GC-skewed amino acids are enriched at the 5'-ends of genes while positively GC-skewed amino acids are enriched at the 3'-ends of genes; methionine is an exception because it is the first amino acid of all genes.
Figure 4
Correlation between AT skew and GC skew. (a) Pearson correlation between local GC skew and either AT skew (blue) or GC content (green) averaged over the entire coding regions. (b) Overall AT skew plotted against overall GC skew for 64 species, R = 0.52. (c) Average pattern of GC-skews over all C. albicans genes at three regions (5', middle and 3'), along with linear fits.
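The binning procedure described in the Figure 2 caption can be sketched as follows. This is a minimal Python illustration under stated assumptions: G and C counts are pooled per bin across all open reading frames before the skew is taken, the skew is (G - C) / (G + C), and the 200 bp flanking regions are omitted. The function name `binned_gc_skew` and the toy sequences are hypothetical and not taken from the paper.

```python
def binned_gc_skew(orfs, n_bins=40):
    """Pool G and C counts over ORFs in n_bins relative-position bins.

    Each ORF is read 5' to 3' on the coding strand. Position i of an ORF of
    length n is assigned to bin i * n_bins // n, so genes of different length
    are mapped onto the same relative scale. Returns one skew per bin, or
    None for a bin that received no G or C.
    """
    g = [0] * n_bins
    c = [0] * n_bins
    for orf in orfs:
        s = orf.upper()
        n = len(s)
        for i, base in enumerate(s):
            b = i * n_bins // n
            if base == "G":
                g[b] += 1
            elif base == "C":
                c[b] += 1
    return [
        (g[b] - c[b]) / (g[b] + c[b]) if (g[b] + c[b]) else None
        for b in range(n_bins)
    ]


# Toy input (not real genes), two bins for readability.
print(binned_gc_skew(["CCGG", "CCCG"], n_bins=2))
```

Pooling counts before dividing weights every nucleotide equally; averaging per-gene skews instead would weight every gene equally, and the caption does not rule out either reading beyond stating that data are pooled.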
