2010 Jul 20;107(29):13004-9.
doi: 10.1073/pnas.0914454107. Epub 2010 Jul 6.

A time-invariant principle of genome evolution

Subhajyoti De et al. Proc Natl Acad Sci U S A. 2010.

Abstract

Uncovering general principles of genome evolution that are time-invariant and that operate in germ and somatic cells has implications for genome-wide association studies (GWAS), gene therapy, and disease genomics. Here we investigate the relationship between structural alterations (e.g., insertions and deletions) and single-nucleotide substitutions by comparing the following genomes that diverged at different times across germ- and somatic-cell lineages: (i) the reference human and chimpanzee genomes (millions of years), (ii) the reference human and personal genomes (tens of thousands of years), and (iii) structurally altered regions in cancer and genetically engineered cells (days). At the species level, genes with structural alterations in nearby regions show increased single-nucleotide changes and tend to evolve faster. In personal genomes, the single-nucleotide substitution rate is higher near sites of structural alteration and decreases with increasing distance. In human cancer cell populations and in cells genetically engineered using zinc-finger nucleases, single-nucleotide changes occur frequently near sites of structural alteration. We present evidence that structural alteration induces single-nucleotide changes in nearby regions and discuss possible molecular mechanisms that contribute to this phenomenon. We propose that the low fidelity of nonreplicative error-prone repair polymerases, which are used during insertion or deletion, results in break-repair-induced single-nucleotide mutations in the vicinity of structural alterations. Thus, in the mutational landscape, structural alterations are linked to single-nucleotide changes across different time scales in both somatic- and germ-cell lineages. We discuss implications for genome evolution, GWAS, disease genomics, and gene therapy and emphasize the need to investigate both types of mutations within a single framework.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Are structural alterations and single-nucleotide changes linked across different time scales? To answer this question, this study analyzed genomes at three different time scales (speciation, population, and cell division) within a single framework. For each time scale, the figure summarizes the specific questions addressed and the corresponding datasets used. The shaded downward arrow in the left-most vertical panel shows the different time scales, and the schematic figure next to the arrow shows the different genomes (which diverged at different time scales) that were compared. The third panel shows schematically the transmission of genetic material through the germ-cell and somatic-cell lineages. A cell containing the genetic material is shown as a solid circle (purple, zygote; red, germ cell; blue, somatic cell). Arrows represent cell division events, and a semicircle represents a gamete, containing half the genetic material. The fourth and fifth panels describe the specific questions investigated and the datasets used to address them.
Fig. 2.
Box plot of the distribution of (A) dS, (B) KI, (C) dN, (D) percentage protein sequence identity, and (E) dN/dS for orthologous genes between the reference human and chimpanzee genome sequences, grouped according to their CGN scores. The genome-wide median values are shown as horizontal lines. The box plot identifies the middle 50% of the data, the median, and the extreme points. The entire set of data points is divided into quartiles, and the interquartile range (IQR) is calculated as the difference between the third quartile (x0.75) and the first quartile (x0.25). The range of the 25% of the data points above (x0.75) and below (x0.25) the median (x0.50) is displayed as a solid box. The horizontal line and the notch represent the median and its confidence interval, respectively. Data points lying more than 1.5·IQR beyond the box are treated as outliers and are excluded only to improve visualization of the graphs. The horizontal lines connected to the solid box by dashed lines above and below it (whiskers) represent the largest and the smallest nonoutlier data points, respectively.
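The box-plot conventions described in this legend can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `box_plot_summary`, the inclusive quartile method, and the reading of the outlier rule as Tukey fences (1.5·IQR beyond the box edges) are assumptions made for the example.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Summarize values the way a conventional box plot does.

    Assumption: outliers are points more than 1.5 * IQR below the first
    quartile or above the third quartile (Tukey fences).
    """
    # Quartiles x0.25, x0.50 (median), and x0.75.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    inliers = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme nonoutlier data points.
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
        "outliers": outliers,
    }


# Hypothetical data: nine typical values and one extreme value.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this toy input the extreme value falls outside the upper fence, so the upper whisker stops at the largest remaining point rather than at the maximum.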
Fig. 3.
Distribution of the density of single-nucleotide change as a function of distance from the site of structural alteration (i.e., InDels greater than 30 bp) in the genome of (A) Venter (HuRef), (B) Yoruban (NA18507), and (C) Korean (KorRef) individuals. (E) SNP density for 100-kb genomic blocks in the Yoruban genome that have at least one structural alteration (small red rectangles) is significantly different from that of the genomic blocks that have no structural alteration, given that both sets of blocks have no InDels in the KorRef genome (P < 2.2 × 10^−16). (D) SNP density for 100-kb genomic blocks in the KorRef genome that have at least one structural alteration is significantly different from that of the genomic blocks that have no structural alteration, given that both sets of blocks have no structural alteration in the Yoruban genome (P < 2.2 × 10^−16). (F) Conditional probability values. Nonoverlapping 100-kb genomic blocks with high SNP density (i.e., greater than the median SNP density) are defined as those blocks with an above-the-median value in each of the personal genomes.
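One way to picture the distance analysis in panels A to C is to count SNPs in successive distance bins on either side of each structural alteration and divide by the number of bases covered. The sketch below is a simplified illustration under stated assumptions, not the procedure used in the paper: positions are integer coordinates on a single chromosome, each structural alteration is reduced to a single hypothetical breakpoint coordinate, overlapping windows from neighboring alterations are not corrected for, and the function name and parameters are invented for the example.

```python
from bisect import bisect_left


def snp_density_by_distance(snp_positions, sv_positions, bin_size, n_bins):
    """Return SNPs per base for each distance bin around the given sites.

    Bin k covers distances d with k * bin_size <= d < (k + 1) * bin_size,
    where d = |SNP position - structural-alteration position|.
    """
    snps = sorted(snp_positions)

    def count_in(start, end):
        # Number of SNPs with start <= position < end.
        return bisect_left(snps, end) - bisect_left(snps, start)

    densities = []
    for k in range(n_bins):
        near, far = k * bin_size, (k + 1) * bin_size
        total = 0
        for sv in sv_positions:
            # Downstream side: positions sv + near .. sv + far - 1.
            total += count_in(sv + near, sv + far)
            # Upstream side, skipping distance 0 so the site itself
            # is not counted twice in the first bin.
            total += count_in(sv - far + 1, sv - max(near, 1) + 1)
        # Bases covered per site: both sides, minus the shared site in bin 0.
        bases_per_site = 2 * bin_size - (1 if k == 0 else 0)
        densities.append(total / (bases_per_site * len(sv_positions)))
    return densities


# Hypothetical toy data: one alteration at position 100, SNPs clustered nearby.
densities = snp_density_by_distance([95, 101, 103, 125, 180], [100], 10, 3)
```

A density that is highest in the first bin and falls off in later bins is the pattern the legend describes; real data would also need chromosome boundaries and overlapping windows handled.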
Fig. 4.
Distribution of SNP density as a function of the distance from the site of structural alteration of size greater than 20 bp (A) and greater than 30 bp (B) in the cancer (melanoma) genome. (C) Conditional probability values. Nonoverlapping 100-kb genomic blocks with high SNP density (i.e., greater than the median SNP density) are defined as those blocks with above-the-median value in each of the personal genomes.
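The conditional probability values in panel C can be illustrated with a small calculation: the probability that a genomic block has high SNP density given that it does, or does not, contain a structural alteration. The sketch below is an assumption-laden simplification, not the authors' method: it uses a single genome, takes "high" to mean strictly above the median block density, and the function name and inputs are hypothetical.

```python
from statistics import median


def conditional_high_snp_probability(snp_density, has_sv):
    """Return (P(high SNP | SV), P(high SNP | no SV)) over genomic blocks.

    snp_density: SNP density of each block.
    has_sv: True where the block contains at least one structural alteration.
    Assumes at least one block with and one block without an alteration.
    """
    cutoff = median(snp_density)
    high = [d > cutoff for d in snp_density]

    n_sv = sum(has_sv)
    n_no_sv = len(has_sv) - n_sv

    high_and_sv = sum(1 for h, s in zip(high, has_sv) if h and s)
    high_and_no_sv = sum(1 for h, s in zip(high, has_sv) if h and not s)

    return high_and_sv / n_sv, high_and_no_sv / n_no_sv


# Hypothetical toy data for six blocks.
p_sv, p_no_sv = conditional_high_snp_probability(
    [1, 2, 3, 4, 5, 6],
    [False, False, True, False, True, True],
)
```

Comparing the two probabilities gives a quick sense of whether blocks carrying a structural alteration are enriched for high SNP density.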
