PLoS Genet. 2007 Apr 6;3(4):e51.
doi: 10.1371/journal.pgen.0030051. Epub 2007 Feb 22.

Generalized analysis of molecular variance

Caroline M Nievergelt et al. PLoS Genet. 2007.

Abstract

Many studies in the fields of genetic epidemiology and applied population genetics are predicated on, or require, an assessment of the genetic background diversity of the individuals chosen for study. A number of strategies have been developed for assessing genetic background diversity. These strategies typically focus on genotype data collected on the individuals in the study, based on a panel of DNA markers. However, many of these strategies are either rooted in cluster analysis techniques, and hence suffer from problems inherent to the assignment of biological and statistical meaning to the resulting clusters, or have formulations that do not permit easy and intuitive extensions. We describe a very general approach to the problem of assessing genetic background diversity that extends the analysis of molecular variance (AMOVA) strategy introduced by Excoffier and colleagues some time ago. As in the original AMOVA strategy, the proposed approach, termed generalized AMOVA (GAMOVA), requires a genetic similarity matrix constructed from the allelic profiles of individuals under study and/or allele frequency summaries of the populations from which the individuals have been sampled. The proposed strategy can be used either to estimate the fraction of genetic variation explained by grouping factors such as country of origin, race, or ethnicity, or to quantify the strength of the relationship of the observed genetic background variation to quantitative measures collected on the subjects, such as blood pressure levels or anthropometric measures. Since the formulation of our test statistic is rooted in multivariate linear models, sets of variables can be related to genetic background in multiple regression-like contexts. GAMOVA can also be used to complement graphical representations of genetic diversity such as tree diagrams (dendrograms) or heatmaps.
We examine features, advantages, and power of the proposed procedure and showcase its flexibility by using it to analyze a wide variety of published data sets, including data from the Human Genome Diversity Project, classical anthropometry data collected by Howells, and the International HapMap Project.
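
The abstract does not spell out the test statistic, so the following is only a minimal sketch of the kind of analysis it describes. It assumes the standard distance-based pseudo-F statistic from multivariate linear models (a Gower-centered matrix projected onto the predictors) with a permutation test. The function names, the Euclidean distance on genotype counts, and the simulated toy data are all illustrative assumptions, not the authors' implementation or data.

```python
import numpy as np

def pseudo_f(dist, X):
    """Pseudo-F and fraction of variation explained for a distance matrix.

    dist: (n, n) symmetric matrix of pairwise distances between individuals.
    X:    (n,) or (n, k) predictors (group dummies or quantitative traits).
    """
    n = dist.shape[0]
    A = -0.5 * dist ** 2
    C = np.eye(n) - np.ones((n, n)) / n
    G = C @ A @ C                              # Gower-centered matrix
    X1 = np.column_stack([np.ones(n), X])      # predictors plus intercept
    H = X1 @ np.linalg.pinv(X1)                # projection ("hat") matrix
    R = np.eye(n) - H
    m = np.linalg.matrix_rank(X1)
    explained = np.trace(H @ G @ H)
    residual = np.trace(R @ G @ R)
    f = (explained / (m - 1)) / (residual / (n - m))
    r2 = explained / np.trace(G)               # fraction of variation explained
    return f, r2

def permutation_p(dist, X, n_perm=999, seed=0):
    """Permutation p-value: shuffle predictor rows, keep distances fixed."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    f_obs, r2 = pseudo_f(dist, X)
    hits = 0
    for _ in range(n_perm):
        f_perm, _ = pseudo_f(dist, X[rng.permutation(len(X))])
        hits += int(f_perm >= f_obs)
    return f_obs, r2, (hits + 1) / (n_perm + 1)

# Toy data (simulated, hypothetical): 40 individuals in two groups, 50 markers
# whose allele frequency is 0.3 in one group and 0.7 in the other.
rng = np.random.default_rng(1)
group = np.repeat([0, 1], 20)
freqs = np.where(group[:, None] == 0, 0.3, 0.7)
geno = rng.binomial(2, freqs, size=(40, 50))
dist = np.linalg.norm(geno[:, None, :] - geno[None, :, :], axis=2)

f_obs, r2, p_value = permutation_p(dist, group)
print(f"pseudo-F = {f_obs:.2f}, R^2 = {r2:.3f}, p = {p_value:.3f}")
```

Passing a matrix with several columns as `X` gives the multiple regression-like setting the abstract mentions; a quantitative trait such as blood pressure would simply be supplied as a numeric column instead of a group label.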

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Neighbor-Joining Trees Depicting the Genetic Relationships of 1,040 Individuals from 51 World Populations Collected by the CEPH-HGDP
(A) Individuals are color coded according to the five major geographic regions of the globe from which they were collected. (B) Individuals are color coded according to which of the 51 populations they are associated with (1: Biaka Pygmy, 2: San, 3: Mbuti Pygmy, 4: Druze, 5: Bedouin, 6: Mozabite, 7: Palestinian, 8: Kalash, 9: Pima, 10: Colombian, 11: Karitiana, 12: Surui, 13: New Guinea, 14: Yakut).
Figure 2. Relationship between the Genetic Differentiation between Two Populations as Measured by Wright's F_ST and the Average (±S.E.M.) Power of the GAMOVA Procedure to Detect That Differentiation
Results are based on 1,000 simulation studies involving four sets of two equally sized populations, each generated according to varying genetic differentiation. Known group membership was used as the predictor in the GAMOVA analysis. For a constant data size (number of markers × number of subjects), genetic differentiation can be detected at lower F_ST values in larger populations with fewer markers than in smaller populations with more markers (squares: 32 individuals, 32,768 markers; triangles: 64 individuals, 16,384 markers; circles: 128 individuals, 8,192 markers; stars: 256 individuals, 4,096 markers).
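
As a rough illustration of the quantities in this caption, the sketch below computes Wright's F_ST for a single biallelic marker using the textbook heterozygosity definition, F_ST = (H_T - H_S) / H_T, and checks that the four simulation designs share one data size. The function name is hypothetical, and the caption does not say which F_ST estimator the simulations used, so the formula is an assumption.

```python
def fst_two_pops(p1, p2):
    """Wright's F_ST for one biallelic marker in two equally sized populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    p_bar = (p1 + p2) / 2                               # pooled allele frequency
    h_t = 2 * p_bar * (1 - p_bar)                       # total expected heterozygosity
    if h_t == 0:
        raise ValueError("marker is monomorphic in the pooled sample; F_ST is undefined")
    return (h_t - h_s) / h_t

# (individuals, markers) for the four designs listed in the caption.
designs = [(32, 32768), (64, 16384), (128, 8192), (256, 4096)]
data_sizes = {n * m for n, m in designs}

print(fst_two_pops(0.4, 0.6))  # modest differentiation
print(data_sizes)              # a single value means the data size is constant
```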

