BMC Genomics. 2014 May 2;15(1):332.
doi: 10.1186/1471-2164-15-332.

Evaluating the possibility of detecting evidence of positive selection across Asia with sparse genotype data from the HUGO Pan-Asian SNP Consortium

Xuanyao Liu et al. BMC Genomics. 2014.

Abstract

Background: The HUGO Pan-Asian SNP Consortium (PASNP) has generated a genetic resource of almost 55,000 autosomal single nucleotide polymorphisms (SNPs) across more than 1,800 individuals from 73 urban and indigenous populations in Asia. This has offered valuable insights into how the genetic ancestry of these populations correlates with major linguistic systems and geography. Here, we attempt to understand whether adaptation to local climate, diet and environment partly explains the genetic variation present in these populations by investigating the genomic signatures of positive selection.

Results: The SNP density of PASNP is considerably lower than that of other population genetics resources such as the International HapMap Project (HapMap) or the Singapore Genome Variation Project. To evaluate the impact of this lower density on the selection analyses, we assessed the extent of haplotype phasing switch errors and the consistency of selection signals from three haplotype-based approaches (iHS, XP-EHH, haploPS) when the HapMap data are thinned to a density similar to that of PASNP. We subsequently applied haploPS to detect and characterize positive selection in the PASNP populations, identifying 59 genomic regions that were selected in at least one PASNP population. A cluster analysis on the basis of these 59 signals showed that indigenous populations such as the Negrito from Malaysia and Philippines, the China Hmong, and the Taiwan Ami and Atayal shared more of these signals. We also report evidence of a positive selection signal encompassing the beta globin gene in the Taiwan Ami and Atayal that was distinct from the signal in the HapMap Africans, suggesting the possibility of convergent evolution at this locus due to malarial selection.

Conclusions: We established that the lower SNP content of the PASNP data conferred weaker ability to detect signatures of positive selection, but that the new approach haploPS retained modest power. Across all the populations in PASNP, we identified only 59 signals, suggesting a strong need for high-density population-level genotyping data or sequencing data in order to achieve a comprehensive survey of positive selection in Asian populations.
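
The Results mention haplotype phasing switch errors measured after thinning the HapMap data. As a rough illustration of that metric, the sketch below counts how often the inferred phase flips relative to the true phase between consecutive heterozygous sites. The function names, the "keep every k-th marker" thinning scheme and the toy haplotypes are assumptions made for this example; the abstract does not state how the authors thinned or scored the data.

```python
def thin(markers, keep_every):
    """Keep every `keep_every`-th marker (an assumed, simplistic thinning scheme).

    In practice the thinned genotypes would be re-phased with an external
    phasing tool before scoring; that step is not shown here.
    """
    return markers[::keep_every]


def switch_error_rate(true_pair, inferred_pair):
    """Fraction of consecutive heterozygous sites where the phase flips.

    Both arguments are (haplotype_1, haplotype_2) tuples of equal-length
    strings of '0'/'1'. The inferred pair is assumed to carry the same
    genotypes as the true pair, so only the phase can differ.
    """
    t1, t2 = true_pair
    i1, _ = inferred_pair
    het_sites = [k for k in range(len(t1)) if t1[k] != t2[k]]
    if len(het_sites) < 2:
        return 0.0  # no pair of heterozygous sites, so no switch is possible
    # True where inferred haplotype 1 currently tracks true haplotype 1.
    orientation = [i1[k] == t1[k] for k in het_sites]
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(het_sites) - 1)


# Toy haplotypes invented for illustration: one phase switch after site 3.
true_haps = ("010101", "101010")
inferred_haps = ("010010", "101101")
rate = switch_error_rate(true_haps, inferred_haps)
print(f"switch error rate: {rate:.2f}")
```

A lower marker density leaves fewer neighbouring sites to anchor the phase, which is one plausible reason switch errors would be expected to rise after thinning.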

Figures

Figure 1
Distribution and principal component analyses of the populations in the HUGO Pan-Asian SNP Consortium (PASNP). (A) Geographical map indicating the locations of the populations in PASNP, with the colors of the population labels determined by genetic, linguistic and geographic similarities of the populations. Three sets of principal component analyses were performed on: (B) the PASNP populations with HapMap Europeans (CEU) and Africans (YRI); (C) the PASNP populations and CEU; and (D) the PASNP populations only. Each circle represents an individual from the population grouping indicated by the assigned color, and the population labels correspond to the original labels used by PASNP.
Figure 2
Clustering of the PASNP and HapMap populations. A phylogenetic tree clustering the PASNP and HapMap populations, obtained by applying a maximum likelihood procedure in the PHYLIP package to the genotype data for SNPs on the autosomal chromosomes. Cross-referencing the populations found within the same major branches indicated that genetic similarities concurred with linguistic and geographic similarities.
Figure 3
Statistical power of haploPS. The statistical power of haploPS to identify a genomic region simulated to possess an advantageous derived allele at different allele frequencies was evaluated in two settings, using simulated data that are publicly available from the haploPS website: (i) with data of the original SNP density (blue dotted line and circles); and (ii) with the SNP density reduced to 1/20th of the original, which is meant to reflect the density of SNPs in PASNP (red solid line and circles). Power was calculated from 2,000 simulated regions at a false discovery rate of 1%, defined against the empirical null distribution of the haploPS score obtained from a separate set of 2,000 simulated regions without positive selection.
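
The power calculation in this caption can be sketched as follows: a cut-off is taken from the scores of the null (no selection) regions so that at most 1% of them exceed it, and power is the fraction of selected regions scoring above that cut-off. This is a minimal sketch under two assumptions that the caption does not confirm: that larger scores mean stronger evidence of selection (negate the scores if the opposite holds), and that the cut-off is a simple empirical quantile. The function names and toy scores are invented for illustration.

```python
def null_threshold(null_scores, alpha=0.01):
    """Cut-off exceeded by at most a fraction `alpha` of the null scores.

    Assumes larger scores indicate stronger evidence of selection.
    """
    ranked = sorted(null_scores, reverse=True)
    # Number of null regions that may score strictly above the cut-off.
    allowed = min(int(alpha * len(ranked)), len(ranked) - 1)
    return ranked[allowed]


def power(selected_scores, threshold):
    """Fraction of truly selected regions scoring strictly above the cut-off."""
    hits = sum(score > threshold for score in selected_scores)
    return hits / len(selected_scores)


# Made-up scores for illustration; the caption uses 2,000 regions per set.
toy_null = list(range(100))             # scores 0..99 from regions without selection
toy_selected = [50, 99, 120, 98.5, 10]  # scores from regions simulated under selection

cutoff = null_threshold(toy_null, alpha=0.01)
print("cut-off:", cutoff)
print("power:", power(toy_selected, cutoff))
```

Repeating the calculation once on full-density scores and once on thinned-density scores would give the two curves being compared.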
Figure 4
Frequency spectrum of positive selection regions in PASNP. Summary of the number of positive selection signals in each of the 31 PASNP population groupings, classified according to the inferred frequencies of the advantageous alleles in three categories: (i) low, where derived allele frequency (DAF) ≤ 30%; (ii) medium, 30% < DAF < 80%; and (iii) high, DAF ≥ 80%. The vertical dashed line separates the urban and cosmopolitan populations (left of line) from the ethnic minorities and indigenous populations (right of line).
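
The three frequency bins in this caption translate directly into a small classifier. The sketch below reads the first category (DAF ≤ 30%) as the low-frequency bin, which is an interpretation of the caption, and tallies signals per population grouping. The group names and DAF values are invented for illustration.

```python
from collections import Counter


def daf_category(daf):
    """Map a derived allele frequency in [0, 1] to one of the three bins."""
    if not 0.0 <= daf <= 1.0:
        raise ValueError("DAF must lie between 0 and 1")
    if daf <= 0.30:
        return "low"      # DAF <= 30%
    if daf < 0.80:
        return "medium"   # 30% < DAF < 80%
    return "high"         # DAF >= 80%


def tally_by_group(signals):
    """Count signals per (population grouping, DAF category) pair."""
    return Counter((group, daf_category(daf)) for group, daf in signals)


# Hypothetical (grouping, DAF) pairs, not taken from the paper.
toy_signals = [
    ("Group A", 0.10),
    ("Group A", 0.85),
    ("Group B", 0.30),
    ("Group B", 0.55),
]
for key, count in sorted(tally_by_group(toy_signals).items()):
    print(key, count)
```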
Figure 5
Clustering of PASNP population groups by selection signals. Hierarchical clustering of the 31 PASNP population groups according to the absence or presence of the 59 positive selection signals that have been identified by haploPS. Each of the 59 signals is present in at least one of the 31 population groups. The hierarchical clustering was performed using Ward's minimum variance method with the hclust command in R. Populations found in one of the major branches correspond to indigenous populations from Malaysia, Philippines, Thailand and China (upper purple box), while populations found in one of the sub-branches correspond to those in northern East Asia (lower red box).
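
To illustrate the idea behind this clustering, the sketch below applies Ward's minimum variance criterion to a small presence/absence matrix: at each step it merges the two clusters whose union increases the within-cluster sum of squares the least. It is a plain Python stand-in written for this example, not the R hclust call used for the figure, and its merge heights need not match those of any particular hclust variant. The group labels and 0/1 profiles are hypothetical.

```python
from itertools import combinations


def centroid(rows):
    """Column-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_cluster(profiles):
    """Agglomerate labelled 0/1 profiles; return the merges in order.

    `profiles` maps a group label to its presence/absence vector.
    Each merge is reported as (cluster_a, cluster_b, cost).
    """
    clusters = {(name,): [row] for name, row in profiles.items()}
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        cost = ward_cost(clusters[a], clusters[b])
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, cost))
    return merges


# Hypothetical groups and signals: 1 means the signal is present in the group.
toy_profiles = {
    "G1": [1, 1, 0, 0, 0],
    "G2": [1, 1, 1, 0, 0],
    "G3": [0, 0, 0, 1, 1],
    "G4": [0, 0, 1, 1, 1],
}
merges = ward_cluster(toy_profiles)
for a, b, cost in merges:
    print(a, "+", b, f"cost={cost:.3f}")
```

Groups that share most of their signals merge early at low cost, which is how shared selection signals end up grouping populations into the same branch.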
Figure 6
Selection signals containing at least one height-associated gene. Distribution of the 30 positive selection signals that spanned at least one height-associated gene across the 31 PASNP population groupings. Each yellow block indicates that the specific selection signal (column header) is present in a particular group of PASNP populations (row header). The groups are ranked according to the number of selection signals in descending order (last column). Population groupings that correspond to indigenous populations are shaded in grey.
Figure 7
Selected haplotype forms at HBB in YRI and Taiwan aborigines. HaploPS identified the extended haplotypes that presented evidence of positive selection at HBB in HapMap Nigerians (YRI) and in two Taiwan indigenous populations, the Ami and Atayal; each haplotype was found at a frequency of 10% in its respective population. The selection signals most likely stem from different mutation events in light of the low haplotype similarity index (HSI) of 0.63 between the haplotype forms from YRI and Taiwan.
Figure 8
Haplotype characteristics surrounding HBB. (A) Scatterplot of the number of SNPs against the genetic distance spanned by the longest haplotypes found at 10% in five collections of indigenous populations in Asia, including two from Thailand and China (Indigenous 1: China Wa and Thailand H’Tin, Mlabri, Plang, Karen and Lawa ethnicities; Indigenous 2: Thailand Tai Lue, Tai Yong, Tai Kern and Tai Yuan ethnicities). The corresponding signals that span HBB in the five collections have been represented by triangles. (B) A stacked haplotype plot for Taiwan Ami and Atayal, where the longest haplotype around HBB at each core haplotype frequency from 5% to 95% in increments of 5% is illustrated. (C) A similar stacked haplotype plot around HBB for Malaysia Negrito. (D) A comparison of the longest haplotype forms around HBB present at 10% in Malaysia Negrito and Taiwan Ami and Atayal samples, where the discordant positions between the two haplotype forms are illustrated in the top panel.
