Review
Trends Genet. 2023 Jun;39(6):462-490.
doi: 10.1016/j.tig.2023.02.014. Epub 2023 Mar 28.

Functional characterization of human genomic variation linked to polygenic diseases

Tania Fabo et al. Trends Genet. 2023 Jun.

Abstract

The burden of human disease lies predominantly in polygenic diseases. Since the early 2000s, genome-wide association studies (GWAS) have identified genetic variants and loci associated with complex traits. These have ranged from variants in coding sequences to mutations in regulatory regions, such as promoters and enhancers, as well as mutations affecting mediators of mRNA stability and other downstream regulators, such as 5' and 3'-untranslated regions (UTRs), long noncoding RNA (lncRNA), and miRNA. Recent research advances in genetics have utilized a combination of computational techniques, high-throughput in vitro and in vivo screening modalities, and precise genome editing to impute the function of diverse classes of genetic variants identified through GWAS. In this review, we highlight the vastness of genomic variants associated with polygenic disease risk and address recent advances in how genetic tools can be used to functionally characterize them.

Keywords: CRISPR; GWAS variants; MPRA; chromatin capture; colocalization.

Conflict of interest statement

Declaration of interests: No interests are declared.

Figures

Figure 1. Genome-wide association studies (GWAS) identify a diverse catalog of variants that affect gene expression and protein function across the central dogma.
(a) Workflow for GWA studies. (b) Distribution of GWAS-identified variants across variant classes (based on the 2022 GWAS Catalog), highlighting the large proportion of noncoding variants. “Regulatory and Other Noncoding” includes variants designated as regulatory, intronic, and other noncoding in the GWAS Catalog. (c) Variants that are found to be associated with disease can disrupt gene expression and protein function through diverse mechanisms. Yellow stars highlight common genetic elements and mechanisms impacted by trait- and disease-associated variants. The diverse mechanisms highlight the need for diverse tools to functionally annotate variants. CM = chromatin modifier; TF = transcription factor.
Figure 2. Variants in various classes of genomic elements have diverse effects on gene expression and protein translation.
(a) Promoter and enhancer variants, as well as some types of lncRNA variants, disrupt the regulation of gene expression, decreasing transcription. (b) Splice site variants can result in abnormally spliced transcripts and the production of altered protein. (c) Coding variants can disrupt protein function by various mechanisms, including protein misfolding. (d) Variants in miRNA or 3’ UTR sequences can disrupt miRNA-mediated regulation of mRNA transcript abundance, resulting in mRNA retention. (e) Variants in some types of lncRNAs can disrupt the ability of lncRNAs to function as an miRNA sponge, resulting in mRNA degradation. (f) 5’ UTR variants can alter binding of regulatory proteins, resulting in altered translation.
Figure 3. Overview of high-throughput screening tools used to test variant function.
(a) Plasmid-based reporter screening tools have been adapted for various types of variants, with readouts specifically tailored to the variant effects on function. These include MPRA (enhancer), MPRAu (3’ UTR), VAMP-Seq (coding), and ASSET-Seq (splicing). T = test; C = constant. (b) Oligo-based transcription factor (TF) binding screening tools are used to test the ability of a variant to disrupt TF binding. REEL-Seq relies on the differential migration of oligos bound vs. unbound to a TF. SNPs-Seq uses protein purification columns to enrich for oligo SNPs bound to TFs. SNP-Seq uses restriction enzyme digest to test the ability of a variant to disrupt TF binding.
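As a rough illustration of the kind of readout a plasmid-based reporter screen produces, the sketch below scores one variant by comparing reporter activity of the alternate and reference alleles. It assumes activity is summarized as the log2 ratio of RNA to plasmid DNA counts, which is one common convention and is not taken from the caption; the function names, the pseudocount, and the toy counts are hypothetical. Real analyses also normalize for sequencing depth and aggregate over barcodes and replicates.

```python
from math import log2


def mpra_activity(rna_count, dna_count, pseudocount=1.0):
    """Reporter activity as log2(RNA / DNA); the pseudocount avoids log of zero.

    Assumption: counts are already normalized for sequencing depth.
    """
    return log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_skew(ref_rna, ref_dna, alt_rna, alt_dna, pseudocount=1.0):
    """Difference in activity between the alternate and reference alleles.

    Positive values mean the alternate allele drives more reporter expression.
    """
    alt = mpra_activity(alt_rna, alt_dna, pseudocount)
    ref = mpra_activity(ref_rna, ref_dna, pseudocount)
    return alt - ref


# Toy counts (hypothetical): the alternate allele yields about 4x more RNA
# per unit of plasmid DNA than the reference allele.
skew = allelic_skew(ref_rna=99, ref_dna=99, alt_rna=399, alt_dna=99)
print(f"allelic skew (log2): {skew:.2f}")
```

A skew near zero would suggest the variant does not change reporter activity in the tested cell type; deciding whether a nonzero skew is meaningful requires replicate-level statistics that this sketch omits.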
Figure 4. Overview of tools to characterize diverse types of genomic loci.
(a) CRISPR tools, including CRISPRi and CRISPRa, have been used to characterize coding and noncoding regions. CRISPR screens like Perturb-Seq can be leveraged to study the effect of a CRISPR perturbation on gene expression in a high-throughput, single-cell manner. shRNA and siRNA screens can similarly be used to test for the effect of gene silencing, this time at the RNA level, on a phenotype. (b) Colocalization of GWAS susceptibility loci with eQTLs is a common strategy for identifying target genes for non-coding variants/loci. Traditional eQTL mapping uses hundreds of individuals to identify associations between a variant and a change in gene expression. eQTL databases include eQTLGen and GTEx. Single-cell eQTL mapping using high MOI CRISPR perturbation can bypass the need for hundreds of human samples. (c) Hi-C, HiChIP, and ChIA-PET are chromatin capture techniques used to identify physical contacts between DNA to predict enhancer-promoter interactions. HiChIP and ChIA-PET allow for capture of specific proteins to enrich for enhancer-promoter loops.
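The association test behind traditional eQTL mapping in panel (b) can be sketched as a simple linear regression of a gene's expression on allele dosage across individuals. This is a minimal sketch under stated assumptions: genotypes are coded as 0, 1, or 2 copies of the alternate allele, no covariates are modeled, and the function name and toy data are hypothetical. Production eQTL pipelines additionally adjust for ancestry, batch effects, and multiple testing.

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Regress expression on allele dosage by ordinary least squares.

    Returns (slope, t_statistic). The slope estimates the change in
    expression per additional copy of the alternate allele.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched lists with at least 3 individuals")

    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom gives the slope's standard error.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(dosages, expression))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("nan")
    return slope, t_stat


# Toy data (hypothetical): six individuals, expression rises with dosage.
dosages = [0, 0, 1, 1, 2, 2]
expression = [5.1, 4.9, 6.0, 6.2, 7.1, 6.9]
slope, t_stat = eqtl_association(dosages, expression)
print(f"effect per allele: {slope:.2f}, t = {t_stat:.1f}")
```

With only a handful of individuals the t-statistic is unstable, which is one way to see why the caption notes that traditional mapping relies on hundreds of samples.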
Figure 5. CRISPR-based precision gene editing tools.
Homology-directed repair (HDR), base editing, and prime editing are all tools for introducing a precise genetic edit of a variant (SNP, indel, or CNV) at its native genomic context. HDR involves the generation of double-stranded breaks using CRISPR-Cas9, followed by homology-directed repair using a supplied repair template with homology to the DNA region of interest. Base editors use a dCas9 fused to a cytidine deaminase, adenosine deaminase, or uracil DNA glycosylase which can introduce a single base change. Prime editing uses nCas9 and reverse transcriptase to insert a precise edit included in the pegRNA template.
Figure 6. Flowchart showing how to go from GWAS to disease/trait-relevant function.
A diverse array of computational, high-throughput screening, gene editing, and other tools can be used to annotate coding and/or noncoding variants. In all cases, the goal is to identify the causal gene driving GWAS disease heritability and characterize the disease- or trait-relevant function to provide better understanding of disease and guide future therapeutics.
