Comparative Study

Am J Hum Genet. 2003 Mar;72(3):578-89.

doi: 10.1086/368061. Epub 2003 Jan 28.

Human population genetic structure and inference of group membership

Michael J Bamshad¹, Stephen Wooding, W Scott Watkins, Christopher T Ostler, Mark A Batzer, Lynn B Jorde

Affiliation

¹ Department of Pediatrics, University of Utah, Salt Lake City, Utah 84112, USA. mike@genetics.utah.edu

PMID: 12557124
PMCID: PMC1180234
DOI: 10.1086/368061

Abstract

A major goal of biomedical research is to develop the capability to provide highly personalized health care. To do so, it is necessary to understand the distribution of interindividual genetic variation at loci underlying physical characteristics, disease susceptibility, and response to treatment. Variation at these loci commonly exhibits geographic structuring and may contribute to phenotypic differences between groups. Thus, in some situations, it may be important to consider these groups separately. Membership in these groups is commonly inferred by use of a proxy such as place-of-origin or ethnic affiliation. These inferences are frequently weakened, however, by use of surrogates, such as skin color, for these proxies, the distribution of which bears little resemblance to the distribution of neutral genetic variation. Consequently, it has become increasingly controversial whether proxies are sufficient and accurate representations of groups inferred from neutral genetic variation. This raises three questions: how many data are required to identify population structure at a meaningful level of resolution, to what level can population structure be resolved, and do some proxies represent population structure accurately? We assayed 100 Alu insertion polymorphisms in a heterogeneous collection of approximately 565 individuals, approximately 200 of whom were also typed for 60 microsatellites. Stripped of identifying information, correct assignment to the continent of origin (Africa, Asia, or Europe) with a mean accuracy of at least 90% required a minimum of 60 Alu markers or microsatellites and reached 99%-100% when ≥100 loci were used. Less accurate assignment (87%) to the appropriate genetic cluster was possible for a historically admixed sample from southern India.
These results set a minimum for the number of markers that must be tested to make strong inferences about detecting population structure among Old World populations under ideal experimental conditions. We note that, whereas some proxies correspond crudely, if at all, to population structure, the heuristic value of others is much higher. This suggests that a more flexible framework is needed for making inferences about population structure and the utility of proxies.
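To make the idea of marker-based assignment concrete, the following is a minimal sketch of a likelihood-based assignment for biallelic insertion markers. It is not the method used in the study, which relied on the Structure software; it assumes known reference insertion frequencies per population, Hardy-Weinberg proportions, and independent loci. The function names, population labels, and frequencies are hypothetical.

```python
import math


def genotype_log_likelihood(genotype, freqs):
    """Log-likelihood of one individual's genotype under one population.

    genotype: insertion-allele counts per locus (0, 1, or 2).
    freqs: that population's insertion-allele frequency per locus.
    Assumes Hardy-Weinberg proportions and independent loci.
    """
    total = 0.0
    for count, p in zip(genotype, freqs):
        # Clamp frequencies so a fixed allele never yields log(0).
        p = min(max(p, 0.01), 0.99)
        # Binomial probability of `count` insertion alleles out of 2.
        prob = math.comb(2, count) * p ** count * (1 - p) ** (2 - count)
        total += math.log(prob)
    return total


def assign(genotype, pop_freqs):
    """Return the population label with the highest log-likelihood."""
    return max(
        pop_freqs,
        key=lambda name: genotype_log_likelihood(genotype, pop_freqs[name]),
    )


# Hypothetical reference frequencies for three loci in two populations.
example_freqs = {
    "pop_A": [0.9, 0.8, 0.1],
    "pop_B": [0.2, 0.3, 0.7],
}
print(assign([2, 2, 0], example_freqs))
```

With only a few loci, individuals whose genotypes are common in several populations are assigned with little confidence; each added independent locus contributes another term to the log-likelihood, which is one intuition for why assignment accuracy rises with marker count.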

Figures

**Figure 1**
Predicted origin versus known origin for Africans, East Asians, and Europeans, estimated from 1–100 *Alu* insertion polymorphisms, bounded by 95% CIs.

**Figure 2**
Predicted origin versus known origin for Africans, East Asians, and Europeans, estimated from 1–60 microsatellite loci, bounded by 95% CIs.

**Figure 3**
Predicted origin versus known origin for Africans, East Asians, and Europeans, estimated from 1–160 loci including both 100 *Alu* and 60 microsatellite loci, bounded by 95% CIs.

**Figure 4**
Assignment of samples from 23 ethnic groups from Africa, Asia, and Europe, to genetic clusters inferred from the analysis of 100 *Alu* insertion polymorphisms for K=2, 3, and 4. Sample sizes for each population are in parentheses.

**Figure 5**
Proportion of ancestry for individual samples from Africa, Asia, and Europe for K=3, using 20, 60, and 100 *Alu* insertion polymorphisms; 20 and 60 microsatellite loci; and all 160 loci. The proportion of ancestry increases toward each apex.

**Figure 6**
Proportion of ancestry for individual samples from Asia, Europe, and India, for K=3, using 100 *Alu* loci (*left*), compared with the proportion of ancestry for individual samples from Africa, Asia, Europe, and India, for K=3, using 100 *Alu* loci (*right*). The proportion of ancestry increases toward each apex.

**Figure A**
Distribution of F_ST estimates among sub-Saharan Africans, East Asians, and Europeans, for each *Alu* marker (*blue*) and each microsatellite locus (*yellow*).
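As a rough illustration of the quantity plotted in Figure A, F_ST for a single biallelic marker can be sketched from population allele frequencies using the classical heterozygosity form, F_ST = (H_T - H_S) / H_T. This is a simplified sketch that weights populations equally and ignores sample-size corrections; it is not the estimator implemented in GDA, and the function name and input frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus.

    freqs: insertion-allele frequency in each population,
    with every population given equal weight.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity of the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / n
    if h_t == 0:
        # The locus is monomorphic overall, so there is no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical frequencies for one marker in three populations.
print(fst_biallelic([0.2, 0.5, 0.8]))
```

Identical frequencies in every population give a value of 0, and alternative fixed alleles in two populations give a value of 1.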

References

Electronic-Database Information

1. Lewis Lab Software, http://lewis.eeb.uconn.edu/lewishome/software.html (for Genetic Data Analysis [GDA], the software used to estimate F_ST statistics)
2. M.A.B.'s Web site, http://batzerlab.lsu.edu/ (for the specific identities of each Alu marker, the PCR conditions used to amplify each system, and the expected amplicon sizes)
3. Pritchard Lab, http://pritch.bsd.uchicago.edu/ (for Structure, the software used to detect population structure and make inferences about population assignment)
4. W.S.W.'s Web site, http://www.genetics.utah.edu/~swatkins/pub/Alu_data.htm (for the specific identities of each Alu marker, the PCR conditions used to amplify each system, and the expected amplicon sizes)
