Bioessays. 2014 Feb;36(2):191-9.
doi: 10.1002/bies.201300126. Epub 2013 Dec 5.

Functional interpretation of non-coding sequence variation: concepts and challenges

Free PMC article

Dirk S Paul et al. Bioessays. 2014 Feb.

Abstract

Understanding the functional mechanisms underlying genetic signals associated with complex traits and common diseases, such as cancer, diabetes and Alzheimer's disease, is a formidable challenge. Many genetic signals discovered through genome-wide association studies map to non-protein coding sequences, where their molecular consequences are difficult to evaluate. This article summarizes concepts for the systematic interpretation of non-coding genetic signals using genome annotation data sets in different cellular systems. We outline strategies for the global analysis of multiple association intervals and the in-depth molecular investigation of individual intervals. We highlight experimental techniques to validate candidate (potential causal) regulatory variants, with a focus on novel genome-editing techniques including CRISPR/Cas9. These approaches are also applicable to low-frequency and rare variants, which have become increasingly important in genomic studies of complex traits and diseases. There is a pressing need to translate genetic signals into biological mechanisms, leading to prognostic, diagnostic and therapeutic advances.

Keywords: GWAS; chromatin; complex traits; gene regulation; genome editing; regulatory variants.

Figures

Figure 1
Genome annotation data in different cellular systems guide the functional interpretation of genetic variation. The pros and cons of annotation data sets obtained in different cellular systems are indicated on the left panel. Currently, publicly available resources mainly consist of data sets derived from primary cell cultures and transformed cell lines, while iPSCs may become more prominent in the future. Indeed, iPSC-derived cells may represent the key technological advance for assaying inaccessible cell types for which primary cultures cannot be obtained. Functional annotation of genetic signals at GWAS intervals can be restricted to selected cell types that are most relevant to the trait of interest (A), or unrestricted using all available cell types in an annotation resource (B). The latter approach may be valuable if target cell types of the trait are not yet established. To gain biological insights into the genetic architecture of complex traits, all GWAS signals may be collectively analyzed and associated with gene pathways and networks using bioinformatic tools (C). In parallel, individual GWAS intervals may be studied in depth using a range of in vitro and in vivo experimental assays (D). The use of emerging genome-editing techniques is illustrated in detail in Fig. 2. Abbreviations: LD, linkage disequilibrium; siRNA, small interfering RNA; TALEN, transcription activator-like effector nuclease; CRISPR, clustered regularly interspaced short palindromic repeats; 3C, chromosome conformation capture.
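The difference between the restricted (A) and unrestricted (B) annotation strategies can be illustrated with a minimal sketch. It assumes annotations are stored as half-open genomic intervals labeled with a cell type and a feature name; the `Variant` and `Annotation` records, the `annotate` function, and all identifiers and coordinates are hypothetical toy values, not data from the article or from any annotation resource.

```python
from collections import namedtuple

# Hypothetical record types for illustration only.
Variant = namedtuple("Variant", "rsid chrom pos")
Annotation = namedtuple("Annotation", "chrom start end cell_type feature")


def annotate(variants, annotations, cell_types=None):
    """Map each variant ID to the annotations that overlap its position.

    cell_types=None   -> unrestricted analysis over all cell types (B)
    cell_types={...}  -> analysis restricted to trait-relevant cell types (A)
    Intervals are treated as half-open: start <= pos < end.
    """
    hits = {}
    for v in variants:
        matched = []
        for a in annotations:
            if cell_types is not None and a.cell_type not in cell_types:
                continue  # skip cell types outside the restricted set
            if a.chrom == v.chrom and a.start <= v.pos < a.end:
                matched.append((a.cell_type, a.feature))
        hits[v.rsid] = matched
    return hits


# Toy inputs (made-up identifiers and coordinates).
variants = [
    Variant("rs_example_1", "chr1", 150),
    Variant("rs_example_2", "chr2", 500),
]
annotations = [
    Annotation("chr1", 100, 200, "megakaryocyte", "open_chromatin"),
    Annotation("chr1", 140, 160, "hepatocyte", "enhancer"),
    Annotation("chr2", 450, 600, "hepatocyte", "promoter"),
]

restricted = annotate(variants, annotations, cell_types={"megakaryocyte"})
unrestricted = annotate(variants, annotations)
```

In this toy case the restricted call leaves `rs_example_2` without any annotation, whereas the unrestricted call recovers a hepatocyte promoter overlap. This mirrors why the unrestricted approach may be useful when the target cell types of a trait are not yet established.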
Figure 2
Investigating molecular consequences of candidate functional variants using CRISPR/Cas9. The advent of novel genome-editing techniques, such as CRISPR/Cas9, opens new opportunities for validating GWAS candidate regulatory sites and genes. CRISPR/Cas9 in conjunction with a customizable guide RNA can be used to precisely target genomic sites of interest and induce loss-of-function alterations. In addition, the catalytically inactive Cas9 protein (dCas9) can be fused to different effector domains, including VP64 (activation), KRAB (repression), LSD1 (histone demethylation of H3K4me2, with accompanying loss of H3K27ac), and TET family proteins (DNA demethylation). Upon introduction of the CRISPR/(d)Cas9 complex into a cellular system, the molecular consequences of the genome editing can be further investigated.
