PLoS One. 2010 Aug 20;5(8):e12185.

doi: 10.1371/journal.pone.0012185.

Novel association strategy with copy number variation for identifying new risk Loci of human diseases

Xianfeng Chen¹, Xinlei Li, Ping Wang, Yang Liu, Zhenguo Zhang, Guoping Zhao, Haiming Xu, Jun Zhu, Xueying Qin, Suchao Chen, Landian Hu, Xiangyin Kong

Affiliation

¹ The Key Laboratory of Stem Cell Biology, Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China.

PMID: 20808825
PMCID: PMC2924882
DOI: 10.1371/journal.pone.0012185

Xianfeng Chen et al. PLoS One. 2010.

Abstract

Background: Copy number variations (CNVs) are important causal genetic variations for human disease; however, the lack of a statistical model has impeded systematic testing of CNVs for disease association in large-scale cohorts.

Methodology/principal findings: Here, we developed a novel integrated strategy to test CNV-association in genome-wide case-control studies. We converted the single-nucleotide polymorphism (SNP) signal to copy number states using a well-trained hidden Markov model. We mapped the susceptible CNV-loci through SNP site-specific testing to cope with the physiological complexity of CNVs. We also ensured the credibility of the associated CNVs through further window-based CNV-pattern clustering. Genome-wide data for seven diseases were used to test our strategy and, in total, we identified 36 new susceptibility loci that are associated with CNVs for the seven diseases: 5 with bipolar disorder, 4 with coronary artery disease, 1 with Crohn's disease, 7 with hypertension, 9 with rheumatoid arthritis, 7 with type 1 diabetes and 3 with type 2 diabetes. Fifteen of these identified loci were validated through genotype-association and physiological function from previous studies, which provides further confidence in our results. Notably, the genes associated with bipolar disorder converged on phosphoinositide/calcium signaling, a well-known affected pathway in bipolar disorder, which further supports the view that CNVs have an impact on bipolar disorder.

Conclusions/significance: Our results demonstrate the effectiveness and robustness of our CNV-association analysis and provide an alternative avenue for discovering new loci associated with human diseases.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. CNV-association strategy transforms raw signal into copy number and detects association through site-specific testing and CNV-pattern clustering.**
(A) Relative intensity is the log2-transformed value of the normalized intensity sum of the SNP alleles. (B) The relative allele ratio is a normalized arctangent value of the intensity ratio of the SNP alleles. These two measurements were arranged along the chromosomal sequence as a hidden Markov model. (C) In this model (with well-trained parameters), the copy number at each SNP site could be calculated from the measurements at that site and the neighboring copy numbers. (D) The copy numbers at a designated site for cases and controls were classified before performing the SNP site-based testing, a Chi-squared test with three null hypotheses in which deletion (labeled **Loss**), amplification (labeled **Gain**) or both (labeled **Abnm**) were viewed as abnormal. (E) Copy numbers in a window centered on the significant SNP site (denoted by the orange box) were subjected to complete linkage clustering. A statistical test on the CNV pattern in this clustering heat map (named window-based testing) was used to reconfirm the significance of the association. (See details in the **Materials and Methods**.)
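
The measurements in panels A and B and the site-based test in panel D can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `reference_sum` normalizer, the scaling of the arctangent by π/2, the diploid baseline of 2, and the 2×2 "abnormal versus all other states" table with one degree of freedom are all assumptions that the caption does not specify.

```python
import math


def relative_intensity(a, b, reference_sum):
    """log2 of the allele intensity sum divided by a reference sum.

    The normalization by `reference_sum` is an assumption; intensities
    are assumed to be positive.
    """
    return math.log2((a + b) / reference_sum)


def relative_allele_ratio(a, b):
    """Arctangent of the B/A intensity ratio, scaled to [0, 1].

    Dividing by pi/2 is an assumed normalization.
    """
    return math.atan2(b, a) / (math.pi / 2)


def chi2_2x2(case_abn, case_rest, ctrl_abn, ctrl_rest):
    """Pearson Chi-squared statistic and P value (1 df) for a 2x2 table."""
    n = case_abn + case_rest + ctrl_abn + ctrl_rest
    row1, row2 = case_abn + case_rest, ctrl_abn + ctrl_rest
    col1, col2 = case_abn + ctrl_abn, case_rest + ctrl_rest
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence either way
    stat = n * (case_abn * ctrl_rest - case_rest * ctrl_abn) ** 2 / (
        row1 * row2 * col1 * col2
    )
    # Survival function of the Chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def site_tests(case_cn, ctrl_cn, normal=2):
    """Run the Loss, Gain and Abnm tests at one SNP site.

    case_cn / ctrl_cn: per-individual copy numbers at the site.
    Each hypothesis compares "abnormal" against all other states (assumed).
    """
    is_abnormal = {
        "Loss": lambda cn: cn < normal,
        "Gain": lambda cn: cn > normal,
        "Abnm": lambda cn: cn != normal,
    }
    results = {}
    for name, pred in is_abnormal.items():
        case_abn = sum(1 for cn in case_cn if pred(cn))
        ctrl_abn = sum(1 for cn in ctrl_cn if pred(cn))
        results[name] = chi2_2x2(
            case_abn, len(case_cn) - case_abn,
            ctrl_abn, len(ctrl_cn) - ctrl_abn,
        )
    return results
```

For example, with a deletion in 30 of 100 cases and 10 of 100 controls (and no gains), `site_tests` returns a Chi-squared statistic of 12.5 for both **Loss** and **Abnm**, and a P value of 1 for **Gain** because no individual carries a gain.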

**Figure 2. Thresholds for the significance of CNV-association and genome-wide distribution of the results in bipolar disorder.**
(A) In the SNP site-based testing, 1000 permutations were performed and the boundary P values (*Psnp*) were plotted against the false discovery rate (FDR) values, with different colors indicating the different hypotheses (blue for **Abnm**, green for **Loss** and red for **Gain**). FDR<0.05 (labeled with the vertical dashed line) for each hypothesis was used to select 2488 SNPs as candidates for the window-based testing. (B) In the window-based testing, 25000 permutations were performed and the resulting P values (*Pwin*) were plotted against the FDR values. 401 SNP sites were selected as the final results, with an FDR of 2.35×10⁻³ (indicated by the vertical dashed line), so that the expected number of false positives among all the results was less than 1. (C) The −log10 values of the SNP site-based P values were plotted against the position on each chromosome. The three hypotheses are plotted in different panels, and the P values of the chromosomes are shown in alternating colors for clarity. The P values that passed the SNP site-based testing are highlighted in green, and the P values that passed the window-based testing are highlighted in yellow. The genome-wide distribution results for the seven diseases are in **Figure S1**.
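
The relation between a P-value threshold and the FDR in panels A and B can be illustrated with a common permutation-based estimator. The sketch below is an assumption, not the procedure confirmed by the caption: it estimates the FDR at a threshold as the average number of permuted-label P values passing the threshold divided by the number of observed P values passing it. The function name and argument layout are hypothetical.

```python
def permutation_fdr(observed_p, permuted_p, threshold):
    """Estimate the FDR at a P-value threshold from permutation results.

    observed_p: P values computed with the real case/control labels.
    permuted_p: list of lists, one list of P values per label permutation.
    """
    called = sum(1 for p in observed_p if p <= threshold)
    if called == 0:
        return 0.0  # nothing is called significant at this threshold
    # Average number of null P values passing the threshold per permutation.
    expected_false = sum(
        sum(1 for p in perm if p <= threshold) for perm in permuted_p
    ) / len(permuted_p)
    return min(1.0, expected_false / called)
```

The caption's final cutoff follows the same logic in reverse: 401 selected sites at an FDR of 2.35×10⁻³ correspond to about 401 × 0.00235 ≈ 0.94 expected false positives, which is below 1.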

**Figure 3. Comparison with the traditional genotype-association analysis demonstrates the priority of our method in CNV-regions.**
“**Gen**” labels the genotypic testing (a Chi-squared test with 2 degrees of freedom) results obtained from the WTCCC paper. The −log10 values of the SNP site-based P values in our study with the three null hypotheses, in which deletion (A, labeled **Loss**), amplification (C, labeled **Gain**) and both (B, labeled **Abnm**) were evaluated separately, are plotted against the −log10 of the P value from the genotype-association test of WTCCC. For clarity, the genotype-association P values<10⁻⁵ are highlighted in green, the CNV-association P values that passed the single SNP site-based testing are in blue, and the CNV-association P values that passed the window-based testing are in red. The SNP sites that are absent from the genotype-association testing are plotted by default as zero (highlighted in brown), and the absent sites that passed the SNP site-based testing are shown in black. The genotypic testing (**Gen**) and trend testing (**Add**, another test for the genotype tendency of disease in WTCCC [9]) for the seven diseases are compared with our CNV-association results in **Figure S2**. (D) Evidence that CNVs can lead to chaotic genotyping clusters in copy number variable regions. All 17000 individuals are shown in grey, individuals with CNVs in the disease group are in red, and individuals with CNVs in controls are in green. More evidence of chaotic sample-wide intensity maps affected by CNVs can be found in **Figure S3**.

Cited by

Genomic structural variations for cardiovascular and metabolic comorbidity.
Nazarenko MS, Sleptcov AA, Lebedev IN, Skryabin NA, Markov AV, Golubenko MV, Koroleva IA, Kazancev AN, Barbarash OL, Puzyrev VP. Sci Rep. 2017 Jan 25;7:41268. doi: 10.1038/srep41268. PMID: 28120895.

From the Eukaryotic Molybdenum Cofactor Biosynthesis to the Moonlighting Enzyme mARC.
Tejada-Jimenez M, Chamizo-Ampudia A, Calatrava V, Galvan A, Fernandez E, Llamas A. Molecules. 2018 Dec 11;23(12):3287. doi: 10.3390/molecules23123287. PMID: 30545001. Review.

Rare genomic structural variants in complex disease: lessons from the replication of associations with obesity.
Walters RG, Coin LJ, Ruokonen A, de Smith AJ, El-Sayed Moustafa JS, Jacquemont S, Elliott P, Esko T, Hartikainen AL, Laitinen J, Männik K, Martinet D, Meyre D, Nauck M, Schurmann C, Sladek R, Thorleifsson G, Thorsteinsdóttir U, Valsesia A, Waeber G, Zufferey F, Balkau B, Pattou F, Metspalu A, Völzke H, Vollenweider P, Stefansson K, Järvelin MR, Beckmann JS, Froguel P, Blakemore AI. PLoS One. 2013;8(3):e58048. doi: 10.1371/journal.pone.0058048. Epub 2013 Mar 12. PMID: 23554873. Clinical Trial.

Genome-wide association study of copy number variation with lung function identifies a novel signal of association near BANP for forced vital capacity.
Shrine N, Tobin MD, Schurmann C, Soler Artigas M, Hui J, Lehtimäki T, Raitakari OT, Pennell CE, Ang QW, Strachan DP, Homuth G, Gläser S, Felix SB, Evans DM, Henderson J, Granell R, Palmer LJ, Huffman J, Hayward C, Scotland G, Malarstig A, Musk B, James AL; UK BiLEVE; Wain LV. BMC Genet. 2016 Aug 11;17(1):116. doi: 10.1186/s12863-016-0423-0. PMID: 27514831.

The impact of genomics on pediatric research and medicine.
Connolly JJ, Hakonarson H. Pediatrics. 2012 Jun;129(6):1150-60. doi: 10.1542/peds.2011-3636. Epub 2012 May 7. PMID: 22566424. Review.

References

1. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, et al. Global variation in copy number in the human genome. Nature. 2006;444:444–454.
2. Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007;315:848–853.
3. Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, et al. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science. 2005;307:1434–1440.
4. Diskin SJ, Hou C, Glessner JT, Attiyeh EF, Laudenslager M, et al. Copy number variation at 1q21.1 associated with neuroblastoma. Nature. 2009;459:987–991.
5. Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, et al. Strong association of de novo copy number mutations with autism. Science. 2007;316:445–449.

Grants and funding

WT_/Wellcome Trust/United Kingdom
