Nucleic Acids Res. 2016 Oct 14;44(18):8641-8654.

doi: 10.1093/nar/gkw519. Epub 2016 Jun 8.

Explaining the disease phenotype of intergenic SNP through predicted long range regulation

Jingqi Chen¹, Weidong Tian²

Affiliations

¹ State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200436, P.R. China; Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai 200436, P.R. China.
² State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200436, P.R. China; Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai 200436, P.R. China. Email: weidong.tian@fudan.edu.cn

PMID: 27280978
PMCID: PMC5062962
DOI: 10.1093/nar/gkw519

Jingqi Chen et al. Nucleic Acids Res. 2016.

Abstract

Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGRs), making it difficult to understand their association with disease phenotypes. Recent analyses found that non-coding daSNPs are frequently located in or close to regulatory elements, which inspired us to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. After locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. Of all IGR daSNP-disease phenotype associations investigated, 36.8% were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion rose to 60.5% when the SNPs in linkage disequilibrium (LD) with the daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, IGR daSNPs likely contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes.
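The first step described above, locating the nearest DRE to a given IGR daSNP, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `nearest_dre` and the input format (sorted, non-overlapping `(start, end)` intervals on the SNP's chromosome) are assumptions made for illustration only.

```python
from bisect import bisect_left


def nearest_dre(snp_pos, dre_intervals):
    """Return (distance, (start, end)) for the DRE closest to snp_pos.

    Assumption (not from the paper): dre_intervals is a list of
    (start, end) tuples on the same chromosome as the SNP, sorted by
    start and non-overlapping. Returns None if the list is empty.
    """
    if not dre_intervals:
        return None
    starts = [start for start, _ in dre_intervals]
    i = bisect_left(starts, snp_pos)
    best = None
    # With sorted, non-overlapping intervals, only the interval that
    # starts at or after the SNP and the one just before it can be nearest.
    for j in (i - 1, i):
        if 0 <= j < len(dre_intervals):
            start, end = dre_intervals[j]
            if start <= snp_pos <= end:
                dist = 0  # the SNP lies inside the DRE
            else:
                dist = min(abs(snp_pos - start), abs(snp_pos - end))
            if best is None or dist < best[0]:
                best = (dist, (start, end))
    return best


# Toy coordinates, invented for the example.
dres = [(100, 200), (500, 600), (1000, 1100)]
print(nearest_dre(450, dres))
```

A distance of 0 means the SNP falls inside a DRE. On ties the upstream interval is returned, which is an arbitrary choice in this sketch.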

Figures

**Figure 1.**
(A) The proportions of daSNPs by their location relative to protein-coding genes in the genome. IGR refers to intergenic region. (B) The numbers of IGR daSNPs located within different distance cutoffs of their nearest DHSs. (C) The workflow for explaining the disease phenotype of IGR daSNPs.

**Figure 2.**
(A) The numbers of IGR daSNP-disease associations found positive by different methods (the occurrence, the ORA and the relevance analysis) without or with considering LD SNPs. Random refers to the results using the same number of randomly selected genes as the predicted target genes (LD SNPs were not considered for Random). (B) Similar to (A) except that the numbers of positive IGR daSNPs were reported. (C and D) show the proportions of different categories of explainable IGR daSNP-disease associations without or with considering LD SNPs, respectively.
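The ORA (over-representation analysis) named in this caption asks whether known disease genes appear among the predicted target genes more often than expected by chance. The page does not state which statistic the authors used, so the sketch below uses a one-sided hypergeometric tail test, a common choice for ORA, purely as an assumed stand-in. The function name `ora_p_value` and its parameters are hypothetical.

```python
from math import comb


def ora_p_value(n_background, n_disease, n_targets, n_overlap):
    """One-sided hypergeometric P(X >= n_overlap).

    Assumed setup (not taken from the paper):
      n_background: number of genes in the background set
      n_disease:    number of known disease genes in the background
      n_targets:    number of predicted target genes
      n_overlap:    number of target genes that are disease genes
    """
    upper = min(n_targets, n_disease)
    # Count the ways to draw n_targets genes with at least n_overlap disease genes.
    favourable = sum(
        comb(n_disease, i) * comb(n_background - n_disease, n_targets - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_background, n_targets)


# Toy numbers, invented for the example.
print(ora_p_value(n_background=10, n_disease=3, n_targets=3, n_overlap=3))
```

In practice such p-values would also need multiple-testing correction across diseases; that step is omitted here.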

**Figure 3.**
Examples of explainable IGR daSNP-disease pairs. (A–C) Three ‘highly likely’ explainable IGR daSNP-disease associations. (D and E) Two ‘mechanistically likely’ explainable IGR daSNP-disease associations. (F) An IGR daSNP-disease association explained through the LD SNPs. In (A–F), predicted target genes are presented schematically according to their relative distances to the corresponding daSNP. Blue indicates known disease genes, orange indicates eQTL/mQTL-validated genes, and grey indicates other target genes. The arrow above a rectangle shows the transcription direction of the gene. In (D and E), rectangles in different shades of purple represent genes with different ranks of functional relevance scores with the corresponding disease.

**Figure 4.**
(A) The distributions of the numbers of IGR daSNPs in terms of the number of relevant diseases (e.g. > 10 relevant diseases) determined by the ORA or the relevance approaches. (B) The distribution of the numbers of disease genes in terms of the number of associated diseases. (C) Boxplots for the relative similarity among the relevant diseases found by the ORA or the relevance analysis for each IGR daSNP. The relative similarity was defined as the average Jaccard similarity among the relevant diseases divided by the average Jaccard similarity between all pairs of diseases. (D) Boxplots for the relative similarity between the relevant diseases and the annotated disease of an IGR daSNP. Here, only those IGR daSNPs whose relevant diseases did not include the annotated disease were considered, and the relative similarity was defined as the average Jaccard similarity between the relevant diseases and the annotated disease divided by the average Jaccard similarity between the annotated disease and all other diseases.
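The relative similarity defined for panel (C) can be written out as a short sketch. It assumes each disease is represented by a set of associated genes, which the caption does not state; the function names and toy data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_jaccard(names, gene_sets):
    """Average Jaccard similarity over all unordered pairs of the named diseases."""
    pairs = list(combinations(names, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(gene_sets[x], gene_sets[y]) for x, y in pairs) / len(pairs)


def relative_similarity(relevant, gene_sets):
    """Mean similarity among the relevant diseases divided by the
    mean similarity over all pairs of diseases (panel C definition)."""
    background = mean_pairwise_jaccard(sorted(gene_sets), gene_sets)
    if background == 0:
        return float("nan")  # undefined when no two diseases share a gene
    return mean_pairwise_jaccard(relevant, gene_sets) / background


# Toy disease-to-gene sets, invented for the example.
gene_sets = {"d1": {"A", "B"}, "d2": {"A", "B"}, "d3": {"C"}, "d4": {"D"}}
print(relative_similarity(["d1", "d2"], gene_sets))
```

A value above 1 means the relevant diseases resemble one another more than a random pair of diseases does.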

**Figure 5.**
(A) The proportions of the predicted IGR daSNP-target gene pairs by each of the five component methods among the predicted pairs by INTREPID. (B) The relative proportions of IGR daSNP-target gene pairs that were predicted by only the investigated method (unique predictions) or by the investigated method and also the other methods (cross-confirmed predictions) among the predicted pairs by the investigated method. (C) The numbers of IGR daSNP-disease associations identified by the occurrence, the ORA, or the relevance analysis using the predicted target genes from only HIC, or by successively adding the predicted target genes from each of PPC, ENCODE, PreSTIGE and IM-PET. (D) Similar to (C) except that the numbers of different categories of ‘explainable’ IGR daSNP-disease associations were reported.
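The split in panel (B) between unique and cross-confirmed predictions amounts to set operations on the SNP-target gene pairs from each component method. The sketch below uses the five method names from the caption, but the function name, data layout and the pairs themselves are invented for illustration.

```python
def split_unique_and_cross_confirmed(predictions, method):
    """Split one method's predicted (SNP, gene) pairs into those predicted
    by that method only and those also predicted by at least one other method.

    predictions: dict mapping method name -> set of (snp_id, gene) tuples
    (a hypothetical layout, not the authors' data format).
    """
    own = predictions[method]
    others = set().union(
        *(pairs for name, pairs in predictions.items() if name != method)
    )
    unique = own - others
    cross_confirmed = own & others
    return unique, cross_confirmed


# Toy predictions, invented for the example.
predictions = {
    "HIC": {("rs1", "G1"), ("rs1", "G2"), ("rs2", "G3")},
    "PPC": {("rs1", "G2")},
    "ENCODE": {("rs2", "G3"), ("rs3", "G4")},
    "PreSTIGE": set(),
    "IM-PET": {("rs3", "G4")},
}
unique, cross = split_unique_and_cross_confirmed(predictions, "HIC")
print(len(unique) / len(predictions["HIC"]), len(cross) / len(predictions["HIC"]))
```

The two printed fractions sum to 1 for any method with at least one prediction, matching the relative proportions described for panel (B).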
