Hum Mol Genet. 2010 Mar 1;19(5):761-73.
doi: 10.1093/hmg/ddp541. Epub 2009 Dec 5.

Population-genetic nature of copy number variations in the human genome

Mamoru Kato et al. Hum Mol Genet.

Abstract

Copy number variations (CNVs) are universal genetic variations, and their association with disease has been increasingly recognized. We designed high-density microarrays for CNVs, and detected 3000-4000 CNVs (4-6% of the genomic sequence) per population, including CNVs previously missed because of their smaller size or their location in segmental duplications. The patterns of CNVs across individuals were surprisingly simple at the kilo-base scale, suggesting the applicability of a simple genetic analysis for these genetic loci. We used probabilistic theory to determine integer copy numbers of CNVs and employed a recently developed phasing tool to estimate the population frequencies of integer copy number alleles and CNV-SNP haplotypes. The results showed a tendency toward a lower frequency of CNV alleles and that most of our CNVs were explained only by zero-, one- and two-copy alleles. Using the estimated population frequencies, we found several CNV regions with exceptionally high population differentiation. Investigation of CNV-SNP linkage disequilibrium (LD) for 500-900 bi- and multi-allelic CNVs per population revealed that previous conflicting reports on bi-allelic LD were unexpectedly consistent and explained by an LD increase correlated with deletion-allele frequencies. Typically, the bi-allelic LD was lower than SNP-SNP LD, whereas the multi-allelic LD was somewhat stronger than the bi-allelic LD. After further investigation of tag SNPs for CNVs, we conclude that the customary tagging strategy for disease association studies is applicable to common deletion CNVs, but direct interrogation is needed for other types of CNVs.

Figures

Figure 1.
Definitions of CNVs and the typical observed pattern. (A) Definitions of CNVs. CNV segments are the chromosomal segments with CNVs for each individual (blue). CNV regions are the union of overlapping CNV segments (red). CNV events are the union of CNV segments that have the same start and end positions (black). CNV fragments are the parts of CNV segments that are divided with the start and end positions of any CNV segments (red circles). CNV fragment-sites are the union of CNV fragments (green). A fragment-segment rate is the proportion of the number of individuals with CNV fragments to the number of individuals with any CNV segments at a CNV fragment-site. (B) The typical segment pattern in a CNV region (chr 18: 45 938 595 to 45 956 033 for CEU). The red and four blue lines indicate a region (17 kb) and segments, respectively. Most CNV regions (89–90%) had a simple segment pattern characterized by two features: no individual with multiple segments and only one ‘core’ fragment-site, which was a fragment-site with a 100% fragment-segment rate.
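The definitions in Figure 1A can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `(individual, start, end)` tuple layout and the function names are hypothetical, touching segments are treated as overlapping, and the denominator of the fragment-segment rate is assumed to be the number of individuals with any segment in the region.

```python
from typing import List, Tuple

# Hypothetical record layout: (individual_id, start, end) for one chromosome.
Segment = Tuple[str, int, int]


def cnv_regions(segments: List[Segment]) -> List[Tuple[int, int]]:
    """Merge overlapping CNV segments into CNV regions (their union)."""
    regions: List[List[int]] = []
    for start, end in sorted((s, e) for _, s, e in segments):
        if regions and start <= regions[-1][1]:
            # Overlaps (or touches) the current region: extend it.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [(s, e) for s, e in regions]


def fragment_sites(segments: List[Segment]) -> List[Tuple[int, int, float]]:
    """Split one region at every segment boundary.

    Returns (start, end, fragment_segment_rate) for each fragment-site.
    The rate is carriers of the fragment divided by individuals with any
    segment in the region (an assumed reading of the caption).
    """
    points = sorted({p for _, s, e in segments for p in (s, e)})
    n_any = len({ind for ind, _, _ in segments})
    sites = []
    for a, b in zip(points, points[1:]):
        carriers = {ind for ind, s, e in segments if s <= a and b <= e}
        if carriers:
            sites.append((a, b, len(carriers) / n_any))
    return sites


if __name__ == "__main__":
    segs = [("a", 100, 200), ("b", 150, 250), ("c", 150, 200)]
    print(cnv_regions(segs))
    for site in fragment_sites(segs):
        print(site)
```

Under these assumptions, a 'core' fragment-site is one whose rate equals 1.0, that is, a fragment shared by every individual carrying a segment in the region.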
Figure 2.
Comparison of CNV regions in Nsp1.3M with those in 500KEA. The length of CNV regions detected with one platform versus the number of regions. The number of regions that did and did not overlap with those from the other platform is shown in red and olive, respectively. (A) CNV regions detected with Nsp1.3M. (B) CNV regions previously detected with 500KEA.
Figure 3.
Frequency spectrums. These counts are based on allelic copy numbers and the derived diploid copy numbers that are classified by their population frequency. The width of each bin is 2%. Alleles with a very small or large frequency of <0.1% or >99.9% are excluded from the counts. (A) The allele frequency spectrum. (B) The frequency spectrum of diploid copy numbers.
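The binning described in this caption can be sketched as follows. This is only an illustration under stated assumptions: the function name and the plain list of frequencies are hypothetical, and the handling of values that fall exactly on a bin edge is a choice made here, not something the caption specifies.

```python
from typing import List


def frequency_spectrum(
    freqs: List[float],
    bin_width: float = 0.02,   # 2% bins, as in the caption
    lowest: float = 0.001,     # frequencies below 0.1% are excluded
    highest: float = 0.999,    # frequencies above 99.9% are excluded
) -> List[int]:
    """Count alleles (or diploid copy numbers) per population-frequency bin."""
    n_bins = round(1 / bin_width)
    counts = [0] * n_bins
    for f in freqs:
        if f < lowest or f > highest:
            continue
        # Clamp so that a frequency at the top of the range lands in the last bin.
        idx = min(int(f / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    spectrum = frequency_spectrum([0.0005, 0.01, 0.03, 0.51, 0.995, 0.9995])
    print(spectrum)
```

The same function applies to both panels: pass allele frequencies for panel (A) and diploid copy-number frequencies for panel (B).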
Figure 4.
CNV–SNP LD. LD versus distance for (A) bi-allelic CNVs and (B) two-way tri-allelic CNVs. The numbers of bi-allelic CNVs were 503 for CEU and 875 for YRI, and the numbers of tri-allelic CNVs were 32 for CEU and 22 for YRI. The distance between a CNV and a SNP was measured from either boundary of a CNV region to a SNP position. The distances were binned in a 10-kb width, and the median of (≥10) LD values was plotted against the middle distance of the bin. ‘SNP (permutated)’ indicates that the SNP genotype data were permutated across individuals, and the error bars indicate the standard deviation. ‘SNP (adjusted)’ indicates that the minor allele frequencies of one half of the SNP pairs in SNP–SNP LD were adjusted to those of CNVs. The larger and relatively larger frequencies indicate ≥10% and 1–10% frequencies of the deletion/duplication alleles, respectively.
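The distance binning used for these plots can be sketched in Python as below. The function name and the `(distance_bp, ld_value)` input layout are hypothetical, and how the LD values themselves were computed is outside the scope of this sketch.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def binned_median_ld(
    pairs: Iterable[Tuple[int, float]],
    bin_width: int = 10_000,  # 10-kb bins, as in the caption
    min_count: int = 10,      # bins with fewer than 10 LD values are dropped
) -> Dict[float, float]:
    """Map the middle distance of each bin to the median LD value in it."""
    bins = defaultdict(list)
    for distance_bp, ld_value in pairs:
        bins[distance_bp // bin_width].append(ld_value)
    return {
        idx * bin_width + bin_width / 2: median(values)
        for idx, values in sorted(bins.items())
        if len(values) >= min_count
    }


if __name__ == "__main__":
    # Ten CNV-SNP pairs within 10 kb, three pairs between 10 and 20 kb.
    near = [(500 * i, i / 10) for i in range(1, 11)]
    far = [(12_000, 0.2), (15_000, 0.1), (18_000, 0.3)]
    print(binned_median_ld(near + far))
```

In this toy input only the first bin has at least ten values, so the second bin is omitted from the result.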
Figure 5.
Number of CNVs tagged by SNPs. Number of tagged CNVs versus the cutoff association values (R² and conditional probability). We searched for tag SNPs up to 200 kb from the boundaries of each CNV region. ‘c.p.’ indicates conditional probability. (A) For bi-allelic CNVs. (B) For two-way tri-allelic CNVs.
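For the bi-allelic case, counting tagged CNVs at a given R² cutoff can be sketched as follows. This is a rough illustration, not the authors' procedure: it uses the standard r² between two bi-allelic loci on phased haplotypes coded 0/1, it assumes the candidate SNPs have already been restricted to the 200-kb window, the default cutoff of 0.8 is an assumption, and the conditional-probability measure is not covered. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def r_squared(cnv_hap: Sequence[int], snp_hap: Sequence[int]) -> float:
    """r^2 between two bi-allelic loci, each coded 0/1 per phased haplotype."""
    n = len(cnv_hap)
    p_a = sum(cnv_hap) / n
    p_b = sum(snp_hap) / n
    p_ab = sum(a * b for a, b in zip(cnv_hap, snp_hap)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic locus carries no LD information
    d = p_ab - p_a * p_b
    return d * d / denom


def count_tagged(
    cnvs: Dict[str, Tuple[Sequence[int], List[Sequence[int]]]],
    cutoff: float = 0.8,  # assumed cutoff, not taken from the source
) -> int:
    """Count CNVs that have at least one nearby SNP with r^2 >= cutoff.

    cnvs maps a CNV id to (cnv_haplotypes, [snp_haplotypes, ...]).
    """
    return sum(
        1
        for cnv_hap, snps in cnvs.values()
        if any(r_squared(cnv_hap, snp) >= cutoff for snp in snps)
    )


if __name__ == "__main__":
    deletion = [1, 1, 0, 0]
    perfect_tag = [1, 1, 0, 0]
    unlinked = [1, 0, 1, 0]
    data = {"cnv1": (deletion, [unlinked, perfect_tag]), "cnv2": (deletion, [unlinked])}
    print(count_tagged(data))
```

Sweeping `cutoff` over a range of values and plotting the resulting counts would give a curve of the kind described in the caption.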
