Genome Res. 2006 Mar;16(3):323-30.
doi: 10.1101/gr.4138406. Epub 2006 Feb 8.

The portability of tagSNPs across populations: a worldwide survey


Anna González-Neira et al. Genome Res. 2006 Mar.

Abstract

In the search for common genetic variants that contribute to prevalent human diseases, patterns of linkage disequilibrium (LD) among linked markers should be considered when selecting SNPs. Genotyping efficiency can be increased by choosing tagging SNPs (tagSNPs) in LD with other SNPs. However, it remains to be seen whether tagSNPs defined in one population efficiently capture LD in other populations; that is, how portable tagSNPs are. Indeed, tagSNP portability is a challenge for the applicability of HapMap results. We analyzed 144 SNPs in a 1-Mb region of chromosome 22 in 1055 individuals from 38 worldwide populations, classified into seven continental groups. We measured tagSNP portability by choosing three reference populations (to approximate the three HapMap populations), defining tagSNPs in them, and applying those tagSNPs to other populations independently of the availability of information on the tagSNPs in the compared population. We found that tagSNPs are highly informative in other populations within each continental group. Moreover, tagSNPs defined in Europeans are often efficient for Middle Eastern and Central/South Asian populations. TagSNPs defined in the three reference populations are also efficient for more distant and differentiated populations (Oceania, Americas), in which the impact of their special demographic history on the genetic structure does not interfere with successfully detecting the most common haplotype variation. This high degree of portability lends promise to the search for disease association in different populations, once tagSNPs are defined in a few reference populations like those analyzed in the HapMap initiative.


Figures

Figure 1.
Plot of the probability of SNPs being tagSNPs (bar graph, upper middle), added together for the six Asian populations studied; each bar is the sum of the probabilities in the six populations, and thus its maximum value is six. According to the ldSelect algorithm used for tagSNP selection, one or more SNPs within a bin can be specified as a tagSNP, and only one tagSNP need be genotyped per bin. Probability values range from 0 (no new information given by the SNP within the bin) to 1 (unique tagSNP selected in a bin). These values are compared with LD values (D′ parameter), shown in the bottom part of the figure as computed with the Haploview software package. In the D′ plot, each diagonal represents a different SNP, with each square representing a pairwise comparison between two SNPs. (Red squares) Statistically significant LD between the pair of SNPs; (dark red) the higher values of D′, up to a maximum of 1. (White squares) Pairwise D′ values <1 with no statistically significant evidence of LD. (Blue squares) Pairwise D′ values of 1 but without statistical significance. (Top) Physical map of the region. Population abbreviations are as in Table 1.
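The binning idea mentioned in this legend can be illustrated with a minimal sketch. It is a simplified greedy procedure in the spirit of ldSelect, not the program itself: the function name `greedy_bins`, the matrix input format, and the tie-breaking rule (lowest index wins) are assumptions made for illustration, and allele-frequency filtering is omitted.

```python
def greedy_bins(r2, threshold=0.64):
    """Group SNPs into bins from a symmetric pairwise r² matrix.

    r2 is a list of lists indexed by SNP position (hypothetical input format).
    Returns a list of (bin_members, tag_snps) tuples.
    """
    remaining = set(range(len(r2)))
    bins = []
    while remaining:
        def covered(i):
            # Unbinned SNPs that SNP i captures at or above the threshold.
            return {j for j in remaining if j == i or r2[i][j] >= threshold}

        # Seed the next bin with the SNP that captures the most unbinned SNPs;
        # ties go to the lowest index (an arbitrary choice for this sketch).
        seed = max(sorted(remaining), key=lambda i: len(covered(i)))
        members = covered(seed)

        # Any member above the threshold with every other member can tag the bin,
        # so only one of these needs to be genotyped.
        tags = sorted(
            i for i in members
            if all(i == j or r2[i][j] >= threshold for j in members)
        )
        bins.append((sorted(members), tags))
        remaining -= members
    return bins


# Toy matrix (invented values): SNP 0 is in strong LD with SNPs 1 and 2,
# SNP 3 is independent of the rest.
toy_r2 = [
    [1.0, 0.9, 0.7, 0.1],
    [0.9, 1.0, 0.5, 0.1],
    [0.7, 0.5, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
result = greedy_bins(toy_r2)
print(result)
```

In the toy matrix, SNPs 0, 1, and 2 fall into one bin tagged by SNP 0 alone, because SNPs 1 and 2 are not above the threshold with each other; SNP 3 forms a singleton bin.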
Figure 2.
Average maximum r² values of non-tag SNPs in a population with tagSNPs selected in HapMap proxy populations from the same geographic region. Population abbreviations are as in Table 1. An r² threshold of 0.64 is used for tagSNP selection and evaluation. (A) TagSNPs defined in Yorubas from Africa applied to the other African populations; (B) tagSNPs defined in French applied to the other European populations; (C) tagSNPs defined in Han Chinese applied to the other East Asian populations. For each case, results for the “blind” test (opaque bars) and “ideal” test (dashed bars) are shown. Detailed information on the number of SNPs and the distribution of r² values can be found in Supplemental Table 1. The 95th percentile values are shown as central bars from each mean value.
Figure 3.
Average maximum r² obtained for non-tagSNPs when tagSNPs selected in the three reference populations are applied to populations of other geographic regions: Middle East, Central Asia, Oceania, and America. Population abbreviations are as in Table 1. For each case, results for the “blind” and “ideal” analysis are shown. Detailed information on the distribution can be found in Supplemental Table 1. The 95th percentile values are shown as central bars from each mean value.
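The "average maximum r²" statistic reported in these legends can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes phased haplotypes coded 0/1 (the study's own procedure for estimating r² is not reproduced here), and the names `r_squared` and `mean_max_r2` and the toy data are hypothetical.

```python
def r_squared(a, b):
    """r² between two biallelic SNPs, each a list of 0/1 alleles per haplotype."""
    n = len(a)
    pa = sum(a) / n                                   # allele frequency at SNP a
    pb = sum(b) / n                                   # allele frequency at SNP b
    pab = sum(x & y for x, y in zip(a, b)) / n        # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # monomorphic SNP: r² is undefined, treated as 0 in this sketch
    d = pab - pa * pb                                 # disequilibrium coefficient D
    return d * d / denom


def mean_max_r2(haplotypes, tag_ids):
    """Average, over non-tag SNPs, of the best r² with any tagSNP.

    haplotypes: dict mapping SNP id -> list of 0/1 alleles in the target population.
    tag_ids: tagSNP ids chosen elsewhere (for example, in a reference population).
    """
    tags = [t for t in tag_ids if t in haplotypes]
    non_tags = [s for s in haplotypes if s not in tag_ids]
    if not tags or not non_tags:
        raise ValueError("need at least one tagSNP and one non-tag SNP")
    best = [
        max(r_squared(haplotypes[s], haplotypes[t]) for t in tags)
        for s in non_tags
    ]
    return sum(best) / len(best)


# Toy target population with eight haplotypes (invented data).
target = {
    "snpA": [0, 0, 1, 1, 0, 1, 0, 1],  # tagSNP defined in a reference population
    "snpB": [0, 0, 1, 1, 0, 1, 0, 1],  # perfectly correlated with snpA
    "snpC": [0, 1, 1, 1, 0, 1, 0, 0],  # only partly correlated with snpA
}
score = mean_max_r2(target, ["snpA"])
print(score)
```

A value near 1 would indicate that the transferred tagSNPs capture most of the untyped variation in the target population; comparing it with the selection threshold (0.64 in the legends above) gives a rough sense of portability.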

