Genome Biol. 2011;12(3):R27.
doi: 10.1186/gb-2011-12-3-r27. Epub 2011 Mar 22.

A genome-wide view of mutation rate co-variation using multivariate analyses

Guruprasad Ananda et al. Genome Biol. 2011.

Abstract

Background: While the abundance of available sequenced genomes has led to many studies of regional heterogeneity in mutation rates, the co-variation among rates of different mutation types remains largely unexplored, hindering a deeper understanding of mutagenesis and genome dynamics. Here, utilizing primate and rodent genomic alignments, we apply two multivariate analysis techniques (principal components and canonical correlations) to investigate the structure of rate co-variation for four mutation types and simultaneously explore the associations with multiple genomic features at different genomic scales and phylogenetic distances.

Results: We observe a consistent, largely linear co-variation among rates of nucleotide substitutions, small insertions and small deletions, with some non-linear associations detected among these rates on chromosome X and near autosomal telomeres. This co-variation appears to be shaped by a common set of genomic features, some previously investigated and some novel to this study (nuclear lamina binding sites, methylated non-CpG sites and nucleosome-free regions). Strong non-linear relationships are also detected among genomic features near the centromeres of large chromosomes. Microsatellite mutability co-varies with other mutation rates at finer scales, but not at 1 Mb, and shows varying degrees of association with genomic features at different scales.

Conclusions: Our results allow us to speculate about the role of different molecular mechanisms, such as replication, recombination, repair and local chromatin environment, in mutagenesis. The software tools developed for our analyses are available through Galaxy, an open-source genomics portal, to facilitate the use of multivariate techniques in future large-scale genomics studies.

Figures

Figure 1
Biplots of the first two PCA components for our four mutation rates, as obtained from the AR and NCNR subgenomes along the human-orangutan branch for 1-Mb windows. Black dots represent projected observations (that is, projected windows). The vectors labeled INS, DEL, SUB, and MS depict loadings for insertion rate, deletion rate, substitution rate, and mononucleotide microsatellite mutability, respectively. See Tables S1 and S2 in Additional file 1 for summary statistics.
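The two ingredients of such a biplot (projected windows and variable loadings) can be illustrated with a minimal sketch. It assumes PCA on standardized rates computed through a singular value decomposition; the function name `pca_biplot_coordinates` and the synthetic data are hypothetical and are not taken from the authors' Galaxy tools.

```python
import numpy as np

def pca_biplot_coordinates(rates):
    """Return (scores, loadings, explained) for a windows-by-variables matrix.

    scores    : projected observations (one row per window)
    loadings  : principal axes (one column per component)
    explained : fraction of total variance carried by each component
    """
    rates = np.asarray(rates, dtype=float)
    # Standardize each column so that rates on different scales are comparable.
    z = (rates - rates.mean(axis=0)) / rates.std(axis=0, ddof=1)
    # SVD of the standardized data: z = U * diag(s) * Vt.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u * s
    loadings = vt.T
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained

# Synthetic stand-in data (an assumption, not the paper's data):
# three rates share a common factor, the fourth is independent of it.
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(size=n)
ins = shared + 0.3 * rng.normal(size=n)
dele = shared + 0.3 * rng.normal(size=n)
sub = shared + 0.3 * rng.normal(size=n)
ms = rng.normal(size=n)
rates = np.column_stack([ins, dele, sub, ms])

scores, loadings, explained = pca_biplot_coordinates(rates)
```

In a biplot, the first two columns of `scores` would give the dots and the first two entries of each row of `loadings` would give the vector for the corresponding variable.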
Figure 2
Genome-wide locations of windows driving non-linear signals in the data. (a-c) Black circles denote windows without marked non-linearity. Green and blue circles denote windows displaying mutation rate non-linearity in PCA (a) and in CCA in the response space (b), respectively. Red circles denote windows displaying genomic feature non-linearity in CCA in the predictor space (c). Yellow triangles mark the location of the centromere on each chromosome.
Figure 3
Helioplots for CCA performed on the AR and NCNR sub-genomes along the human-orangutan branch for 1-Mb windows. The labels on the plots are as follows: CV, canonical variate; GC, GC content; CpG, number of CpG islands; nCGm, number of non-CpG methyl-cytosines; LINE, number of LINE elements; SINE, number of SINE elements; NLp, number of nuclear lamina associated regions; telo, distance to the telomere; fRec and mRec, female and male recombination rates; SNPd, SNP density; RepT, replication time; nucFree, density of nucleosome-free regions; cExon, coverage by coding exons; mostCons, coverage by most conserved elements. Red bars indicate positive loadings, and blue bars negative loadings. See Table S6 in Additional file 1 for summary statistics.
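As a rough illustration of the linear CCA behind such plots, the sketch below computes canonical correlations between a predictor block (genomic features) and a response block (mutation rates) using the standard QR-plus-SVD formulation. The function name `canonical_correlations` and the synthetic data are hypothetical, and the sketch returns canonical weights rather than the loadings a helioplot displays; it is not the authors' implementation.

```python
import numpy as np

def canonical_correlations(x, y):
    """Linear CCA between predictor block x and response block y.

    Returns (corrs, x_weights, y_weights), where corrs are the canonical
    correlations in decreasing order and the weight columns map centered
    data onto the corresponding canonical variates.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered blocks.
    qx, rx = np.linalg.qr(xc)
    qy, ry = np.linalg.qr(yc)
    # Singular values of qx^T qy are the canonical correlations.
    u, s, vt = np.linalg.svd(qx.T @ qy, full_matrices=False)
    x_weights = np.linalg.solve(rx, u)
    y_weights = np.linalg.solve(ry, vt.T)
    return np.clip(s, 0.0, 1.0), x_weights, y_weights

# Synthetic stand-in data (an assumption, not the paper's data):
# three features and two rates that depend linearly on them plus noise.
rng = np.random.default_rng(1)
n = 400
features = rng.normal(size=(n, 3))
noise = rng.normal(size=(n, 2))
rates = np.column_stack([
    features[:, 0] + 0.5 * noise[:, 0],
    features[:, 0] - features[:, 1] + 0.5 * noise[:, 1],
])

corrs, feat_w, rate_w = canonical_correlations(features, rates)

# First pair of canonical variates (one value per window).
cv_x = (features - features.mean(axis=0)) @ feat_w[:, 0]
cv_y = (rates - rates.mean(axis=0)) @ rate_w[:, 0]
```

Correlating each original variable with `cv_x` or `cv_y` would give loadings of the kind a helioplot draws as bars.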
Figure 4
Galaxy workflow developed for estimating mutation rates and computing principal components. A similar workflow (not shown) was implemented to compute canonical correlation component pairs. MAF, multiple alignment format.
