BMC Proc. 2011 Nov 29;5 Suppl 9(Suppl 9):S8.
doi: 10.1186/1753-6561-5-S9-S8.

Population structure analysis using rare and common functional variants

Tesfaye M Baye et al. BMC Proc.

Abstract

Next-generation sequencing technologies now make it possible to genotype and measure hundreds of thousands of rare genetic variations in individuals across the genome. Characterization of high-density genetic variation facilitates control of population genetic structure on a finer scale before large-scale genotyping in disease genetics studies. Population structure is a well-known, prevalent, and important factor in common variant genetic studies, but its relevance in rare variants is unclear. We perform an extensive population structure analysis using common and rare functional variants from the Genetic Analysis Workshop 17 mini-exome sequence. The analysis based on common functional variants required 388 principal components to account for 90% of the variation in population structure. However, an analysis based on rare variants required 532 significant principal components to account for similar levels of variation. Using rare variants, we detected fine-scale substructure beyond the population structure identified using common functional variants. Our results show that the level of population structure embedded in rare variant data is different from the level embedded in common variant data and that correcting for population structure is only as good as the level one wishes to correct.
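The comparison of how many principal components are needed to reach 90% of the variation can be illustrated with a minimal sketch. It is not the authors' pipeline: the 0/1/2 genotype coding, the minor allele frequency cutoff of 0.05 separating rare from common variants, the function names, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def split_by_maf(genotypes, maf_threshold=0.05):
    """Split an (individuals x variants) 0/1/2 genotype matrix by minor allele frequency.

    The 0.05 default is an assumed cutoff, not one taken from the paper.
    Monomorphic variants (MAF == 0) are dropped from both sets.
    """
    allele_freq = genotypes.mean(axis=0) / 2.0
    maf = np.minimum(allele_freq, 1.0 - allele_freq)
    polymorphic = maf > 0
    common = genotypes[:, polymorphic & (maf >= maf_threshold)]
    rare = genotypes[:, polymorphic & (maf < maf_threshold)]
    return common, rare


def n_components_for_variance(genotypes, target=0.90):
    """Return the smallest number of principal components whose cumulative
    explained variance reaches `target` (a fraction between 0 and 1)."""
    centered = genotypes - genotypes.mean(axis=0)
    # Squared singular values of the centered matrix are proportional
    # to the variance explained by each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2
    cumulative = np.cumsum(explained) / explained.sum()
    count = int(np.searchsorted(cumulative, target)) + 1
    return min(count, len(cumulative))


if __name__ == "__main__":
    # Synthetic genotypes (hypothetical data, unrelated to the GAW17 mini-exome).
    rng = np.random.default_rng(0)
    freqs = rng.uniform(0.001, 0.5, size=300)
    genotypes = rng.binomial(2, freqs, size=(100, 300)).astype(float)

    common, rare = split_by_maf(genotypes)
    print("common variants:", common.shape[1],
          "PCs for 90%:", n_components_for_variance(common))
    print("rare variants:", rare.shape[1],
          "PCs for 90%:", n_components_for_variance(rare))
```

On real data, the two counts would be compared the way the abstract compares 388 and 532; the synthetic example has no population structure and only shows the mechanics.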

Figures

Figure 1. Scatterplot of principal component axis one (PC1) and axis two (PC2) based on (a) common functional variants and (b) rare functional variants. CEPH, European-descended population (U.S. Caucasians).
Figure 2. Inferred genetic ancestry with seven clusters from seven populations based on common (upper panel) and rare (lower panel) variants. Each individual is represented by a thin vertical line, which is partitioned into seven colored segments that represent the individual’s estimated ancestry coefficients in the seven clusters. Individuals (separated by solid lines) are represented by bars on the x-axis, and ancestry proportion is given on the y-axis. The proportion of ancestry is illustrated by the amount of each color in each individual. CEPH, European-descended population (U.S. Caucasians).
Figure 3. Predictive accuracy of common versus rare functional variants based on PC1 or PC2.
Figure 4. Scatterplot of the 697 individuals using allele frequency for the common and rare variants. CEPH, European-descended population (U.S. Caucasians).
