Genetic ancestry inference using support vector machines, and the active emergence of a unique American population
- PMID: 23211701
- PMCID: PMC3641388
- DOI: 10.1038/ejhg.2012.258
Erratum in
- Eur J Hum Genet. 2013 May;21(5):578
Abstract
We use genotype data from the Marshfield Clinical Research Foundation Personalized Medicine Research Project to investigate genetic similarity and divergence between Europeans and the sampled population of European Americans in Central Wisconsin, USA. To infer recent genetic ancestry of the sampled Wisconsinites, we train support vector machines (SVMs) on the positions of Europeans along top principal components (PCs). Our SVM models partition continent-wide European genetic variance into eight regional classes, which is an improvement over the geographically broader categories of recent ancestry reported by personal genomics companies. After correcting for misclassification error associated with the SVMs (<10%, in all cases), we observe a >14% discrepancy between insular ancestries reported by Wisconsinites and those inferred by SVM. Values of FST as well as Mantel tests for correlation between genetic and European geographic distances indicate minimal divergence between Europe and the local Wisconsin population. However, we find that individuals from the Wisconsin sample show greater dispersion along higher-order PCs than individuals from Europe. Hypothesizing that this pattern is characteristic of nascent divergence, we run computer simulations that mimic the recent peopling of Wisconsin. Simulations corroborate the pattern in higher-order PCs, demonstrate its transient nature, and show that admixture accelerates the rate of divergence between the admixed population and its parental sources relative to drift alone. Together, empirical and simulation results suggest that genetic divergence between European source populations and European Americans in Central Wisconsin is subtle but already under way.
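The abstract mentions Mantel tests for correlation between genetic and geographic distances. As a rough illustration only, a permutation-based Mantel test can be sketched in plain Python as below. The function names (`upper_triangle`, `pearson`, `mantel_test`), the one-sided alternative, the permutation count, and the toy distance matrices are all assumptions made for this sketch; the abstract does not state which software, statistic, or settings the authors used.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test (hypothetical helper, not the authors' code).

    Rows and columns of the genetic matrix are permuted together, and the
    p-value is the share of permutations (plus the observed arrangement)
    whose correlation is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n = len(genetic)
    geo_flat = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_flat)
    at_least_as_large = 1  # the observed arrangement counts once
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_flat) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)


if __name__ == "__main__":
    # Toy data invented for illustration: eight populations on a line,
    # with genetic distance exactly proportional to geographic distance.
    coords = [0.0, 1.0, 2.5, 4.0, 7.0, 11.0, 12.0, 20.0]
    geo = [[abs(a - b) for b in coords] for a in coords]
    gen = [[0.001 * d for d in row] for row in geo]
    r, p = mantel_test(gen, geo, permutations=199, seed=1)
    print(f"Mantel r = {r:.3f}, one-sided p = {p:.3f}")
```

Permuting whole rows and columns together, rather than shuffling individual matrix entries, keeps the dependence among distances that share a population, which is the reason a Mantel test is used in place of an ordinary correlation test.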
Similar articles
- Assessment of genetic ancestry and population substructure in Costa Rica by analysis of individuals with a familial history of mental disorder. Ann Hum Genet. 2010 Nov;74(6):516-24. doi: 10.1111/j.1469-1809.2010.00612.x. Epub 2010 Oct 6. PMID: 20946256 Free PMC article.
- Ancestry variation and footprints of natural selection along the genome in Latin American populations. Sci Rep. 2016 Feb 18;6:21766. doi: 10.1038/srep21766. PMID: 26887503 Free PMC article.
- Genetic ancestry, self-reported race and ethnicity in African Americans and European Americans in the PCaP cohort. PLoS One. 2012;7(3):e30950. doi: 10.1371/journal.pone.0030950. Epub 2012 Mar 27. PMID: 22479307 Free PMC article.
- Investigating European genetic history through computer simulations. Hum Hered. 2013;76(3-4):142-53. doi: 10.1159/000360162. Epub 2014 May 21. PMID: 24861859 Review.
- The Genetic Diversity of the Americas. Annu Rev Genomics Hum Genet. 2017 Aug 31;18:277-296. doi: 10.1146/annurev-genom-083115-022331. PMID: 28859572 Review.
Cited by
- Systematic Review on Local Ancestor Inference From a Mathematical and Algorithmic Perspective. Front Genet. 2021 May 24;12:639877. doi: 10.3389/fgene.2021.639877. eCollection 2021. PMID: 34108987 Free PMC article. Review.
- New neural network classification method for individuals ancestry prediction from SNPs data. BioData Min. 2021 Jun 28;14(1):30. doi: 10.1186/s13040-021-00258-7. PMID: 34183066 Free PMC article.
- Ancestry-Specific Analyses Reveal Differential Demographic Histories and Opposite Selective Pressures in Modern South Asian Populations. Mol Biol Evol. 2019 Aug 1;36(8):1628-1642. doi: 10.1093/molbev/msz037. PMID: 30952160 Free PMC article.
- Estimation of Genomic Breed Composition for Purebred and Crossbred Animals Using Sparsely Regularized Admixture Models. Front Genet. 2020 Jun 11;11:576. doi: 10.3389/fgene.2020.00576. eCollection 2020. PMID: 32595700 Free PMC article.
- Hybrid autoencoder with orthogonal latent space for robust population structure inference. Sci Rep. 2023 Feb 14;13(1):2612. doi: 10.1038/s41598-023-28759-x. PMID: 36788253 Free PMC article.