Proc Natl Acad Sci U S A. 2006 Nov 21;103(47):17828-33.
doi: 10.1073/pnas.0605553103. Epub 2006 Nov 8.

Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions

Guenter Albrecht-Buehler. Proc Natl Acad Sci U S A. 2006.

Abstract

Chargaff's second parity rules for mononucleotides and oligonucleotides (CIImono and CIIoligo rules) state that a sufficiently long (> 100 kb) strand of genomic DNA that contains N copies of a mono- or oligonucleotide, also contains N copies of its reverse complementary mono- or oligonucleotide on the same strand. There is very strong support in the literature for the validity of the rules in coding and noncoding regions, especially for the CIImono rule. Because the experimental support for the CIIoligo rule is much less complete, the present article, focusing on the special case of trinucleotides (triplets), examined several gigabases of genome sequences from a wide range of species and kingdoms including organelles such as mitochondria and chloroplasts. I found that all genomes, with the only exception of certain mitochondria, complied with the CIItriplet rule at a very high level of accuracy in coding and noncoding regions alike. Based on the growing evidence that genomes may contain up to millions of copies of interspersed repetitive elements, I propose in this article a quantitative formulation of the hypothesis that inversions and inverted transposition could be a major contributing if not dominant factor in the almost universal validity of the rules.
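The CIItriplet rule stated above can be checked on any sequence by counting each triplet and its reverse complement on the same strand. The following is a minimal sketch, not the article's own procedure: the overlapping sliding window, the skipping of triplets containing non-ACGT characters, and the function names `reverse_complement`, `triplet_counts`, and `parity_pairs` are all assumptions made for illustration.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]


def triplet_counts(seq):
    """Count overlapping triplets on one strand (assumed windowing scheme).

    Triplets containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if set(triplet) <= set("ACGT"):
            counts[triplet] += 1
    return counts


def parity_pairs(seq):
    """Map each observed triplet to (its count, count of its reverse complement)."""
    counts = triplet_counts(seq)
    return {t: (counts[t], counts[reverse_complement(t)]) for t in sorted(counts)}
```

For a sequence that complies with the rule, the two numbers in each pair returned by `parity_pairs` should be nearly equal. Short sequences will generally not comply, since the rule is stated for strands longer than about 100 kb.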

Conflict of interest statement

The author declares no conflict of interest.

Figures

Fig. 1.
Typical triplet profile (chimpanzee chromosome 14, position 32 Mb to 40 Mb). The abscissa shows all possible triplets (to be read vertically from top to bottom). Numbers indicate the canonical numbers (see supporting information). The ordinate shows the frequency of triplets (%).
Fig. 2.
Method of testing the validity of the CIItriplet rule by using a correlation plot between the triplet frequencies of the Watson (abscissa) and Crick (ordinate) strands of the same sequence as shown in Fig. 1. If a sequence complies completely, the points fall on a straight diagonal line and the correlation coefficient is cWC = 1.0. The sequence shown here complies very closely, with a correlation coefficient of cWC = 0.9994.
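The correlation test described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the article's code: it assumes that cWC is the Pearson correlation over all 64 triplets, that triplets are counted with an overlapping window, and that the Crick strand is read 5′ to 3′ (that is, as the reverse complement of the Watson strand). The names `triplet_frequencies`, `pearson`, and `watson_crick_correlation` are hypothetical.

```python
from itertools import product
from math import sqrt

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# All 64 triplets in a fixed order, so both strands are compared like for like.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(seq):
    """Return the percentage frequency of each of the 64 triplets in seq."""
    counts = dict.fromkeys(TRIPLETS, 0)
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in counts:  # skips triplets containing N or other symbols
            counts[triplet] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid triplets")
    return {t: 100.0 * c / total for t, c in counts.items()}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a flat triplet profile")
    return cov / sqrt(var_x * var_y)


def watson_crick_correlation(seq):
    """Correlate Watson-strand triplet frequencies with Crick-strand ones."""
    watson = seq.upper()
    crick = watson.translate(COMPLEMENT)[::-1]  # assumed 5'->3' reading
    fw = triplet_frequencies(watson)
    fc = triplet_frequencies(crick)
    return pearson([fw[t] for t in TRIPLETS], [fc[t] for t in TRIPLETS])
```

A sequence that equals its own reverse complement, such as `"AAGGCTAGCCTT"`, has identical profiles on both strands and gives a coefficient of 1.0 up to rounding, whereas a purine-only repeat such as `"AAG" * 50` gives a negative value.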
Fig. 3.
Almost universal validity of Chargaff's second parity rules as applied to triplets. The correlation coefficient cWC varies only in the third decimal place. The ordinate shows the correlation coefficient cWC, and the abscissa shows the location along the chromosome. (a) The correlation coefficients for each 8-Mb segment along the entire length of human chromosome 1. (b) Average correlation coefficients cWC for all human chromosomes, averaged over 8-Mb segments along their entire length. (c) The correlation coefficients of arbitrarily selected entire chromosomes of various species ranging from primates to bacteria.
Fig. 4.
Role of genome size in the validity of Chargaff's second parity rules. The abscissa shows correlation coefficients cWC, and the ordinate shows genome size. (a) Correlation coefficients cWC of segments of different sizes, each including the 5′ end of human chromosome 1. (b) Lack of correlation between the correlation coefficients cWC of 51 mitochondrial genomes and their genome sizes.
Fig. 5.
Simulation of the convergence of a noncompliant genome to a compliant one by a recursive series of transposition/inversions. The abscissa shows the number of rounds of transposition/inversions, the left ordinate shows the number of G's or C's on the resulting Watson strand, and the right ordinate shows the degree of compliance of the resulting genome with the CIItriplet rule expressed as correlation coefficient cWC. The thick line labeled “compliance” depicts the simulated genome's degree of compliance with the CIItriplet rule as a function of rounds of transposition/inversions. The thinner lines labeled G and C depict the convergence of the numbers of the corresponding nucleotides during the same process. The thin line labeled “theoretical” depicts the theoretical curve of convergence calculated by Eq. 2. Note that this curve is not fitted to the simulation but merely uses the same value of (segment size)/(genome size). For the sake of graphic presentation the simulation assumed a large ratio of (size of average inverted segment)/(size of whole genome) of 0.008. It appears that the theoretical description matches quite accurately the exponential convergence of a noncompliant genome to a compliant one.
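The kind of simulation described in this caption can be approximated with a short script. The sketch below is a simplified stand-in, not the article's simulation: it inverts one randomly placed segment in place per round (no transposition to a new site), starts from a purine-only genome as an arbitrary noncompliant example, and tracks only the G and C counts on the Watson strand. Eq. 2 is not reproduced here because it does not appear in this excerpt. The default sizes are chosen only so that (segment size)/(genome size) equals the 0.008 quoted above; the names `invert_segment` and `simulate` are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(genome, start, length):
    """Replace genome[start:start+length] with its reverse complement."""
    segment = genome[start:start + length]
    inverted = segment.translate(COMPLEMENT)[::-1]
    return genome[:start] + inverted + genome[start + length:]


def simulate(genome_size=20000, segment_size=160, rounds=600, seed=0):
    """Return a list of (G count, C count) on the Watson strand per round.

    Assumption: one round = one in-place inversion of a random segment.
    """
    rng = random.Random(seed)
    # Purine-only start: no C or T at all, so the strand is far from parity.
    genome = "".join(rng.choice("AG") for _ in range(genome_size))
    history = [(genome.count("G"), genome.count("C"))]
    for _ in range(rounds):
        start = rng.randrange(genome_size - segment_size + 1)
        genome = invert_segment(genome, start, segment_size)
        history.append((genome.count("G"), genome.count("C")))
    return history


if __name__ == "__main__":
    history = simulate()
    for round_number in (0, 50, 100, 200, 400, 600):
        g, c = history[round_number]
        print(f"round {round_number:4d}: G={g:6d}  C={c:6d}")
```

Because an inversion converts every G in the segment into a C on the Watson strand and vice versa, the sum of G and C stays constant while the two counts drift toward each other, which is the qualitative behavior the figure describes for the thinner G and C curves.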
Fig. 6.
Simulation of the convergence of a noncompliant triplet profile to a compliant one by a recursive series of transposition/inversions. The abscissa shows all possible triplets encoded by their canonical numbers (see supporting information), and the ordinate shows the frequency of triplets (%). The figure plots into the same graph the converging series of triplet profiles starting with an initially arbitrary, noncompliant, simulated genome (cWC = 0.01) that converges to a compliant one (cWC = 0.994) during 12 recursive rounds of transposition/inversions. The final stage is marked by a thick line. For the same reasons as in Fig. 5, the simulation assumes a relatively large ratio of (inverted segment size)/(genome size) of 0.1. It appears that a recursive series of transposition/inversions as described quantitatively in supporting information is able to turn an initially noncompliant triplet profile into a compliant one.
