Am J Hum Genet. 2009 Dec;85(6):762-74.
doi: 10.1016/j.ajhg.2009.10.015.

Genomic dissection of population substructure of Han Chinese and its implication in association studies

Shuhua Xu et al. Am J Hum Genet. 2009 Dec.

Abstract

To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at approximately 160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (F_ST = 0.0002 to 0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (F_ST > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10⁻¹⁰¹). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.
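To give a sense of scale for the F_ST values quoted in the abstract, the following is a minimal sketch of Wright's textbook F_ST for a single biallelic SNP in two equally weighted populations. The abstract does not say which estimator the authors used, so the function names and the equal-weighting choice here are illustrative assumptions, not the paper's method.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) for allele frequency p."""
    return 2.0 * p * (1.0 - p)


def wright_fst(p1: float, p2: float) -> float:
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    Assumption (not from the paper): two populations, weighted equally.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = heterozygosity(p_bar)  # heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-population value
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies chosen for illustration only.
small_gap = wright_fst(0.485, 0.515)  # frequencies 3 percentage points apart
large_gap = wright_fst(0.40, 0.60)    # frequencies 20 percentage points apart
```

Under these assumptions, frequencies symmetric around 0.5 and separated by 2d give F_ST = 4d², so a gap of about 3 percentage points already corresponds to the upper end of the 0.0002 to 0.0009 range reported above.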

Figures

Figure 1
Analysis of the First Two Principal Components. The % Eigenvalue is the percentage of the total variance in the first ten PCs. (A) 1708 Han Chinese and 767 non-Han Chinese individuals representing all samples studied. (B) 1708 Han Chinese and 89 Japanese (JPT) individuals (excluding non-East Asian samples). (C) Han Chinese individuals (excluding all non-Han Chinese samples). (D) Han Chinese individuals (excluding CHB, CHD, three metropolitan populations [Beijing, Shanghai, Guangzhou], and two regional populations close to Shanghai [Anhui and Jiangsu]).
Figure 2
Principal Components Analysis of Han Chinese Individuals. (A) Analysis of PC1 and PC2 of Han Chinese. (B) Analysis of PC1 and PC3 of Han Chinese. (C) Geographical locations of three C-Han populations. (D) Distribution of PC1 for Han Chinese individuals as classified into three subgroups.
Figure 3
Classification of Han Chinese Individuals into the Three Clusters. For each population or group (x axis), individuals are assigned to three clusters. Percentages (y axis) are depicted by three colors representing three clusters.
Figure 4
Relationship of PC1 and Latitude. (A) Average PC1 and latitude of populations. The standard deviation of PC1 is also shown for each population. (B) Correlation of PC1 and latitude. The line in the plot shows the regression line (y = 0.0043x − 0.147). R² for the linear regression of genetic distance on geographic distance is 0.69 (p = 1.27 × 10⁻⁷).
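As a rough illustration of the Figure 4 regression line, the sketch below evaluates y = 0.0043x − 0.147. It assumes that x is a population's latitude in degrees north and y is its average PC1 score, which the caption implies but does not state outright; the function names and the example latitudes are hypothetical.

```python
SLOPE = 0.0043      # coefficient of x in the Figure 4 caption
INTERCEPT = -0.147  # constant term in the Figure 4 caption


def predicted_pc1(latitude_deg: float) -> float:
    """Predicted average PC1 at a given latitude (assumed: degrees north)."""
    return SLOPE * latitude_deg + INTERCEPT


def latitude_at_pc1(pc1: float) -> float:
    """Invert the fitted line: the latitude at which it reaches a given PC1."""
    return (pc1 - INTERCEPT) / SLOPE


# Hypothetical latitudes, chosen only to show the direction of the trend.
southern = predicted_pc1(23.0)
northern = predicted_pc1(40.0)
zero_crossing = latitude_at_pc1(0.0)  # where the line changes sign
```

With a positive slope, the fitted line assigns higher PC1 values to more northern populations, and under the stated assumptions it crosses zero at a latitude of roughly 34 degrees.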
