Genome Res. 2019 May;29(5):809-818.
doi: 10.1101/gr.243592.118. Epub 2019 Apr 2.

A new approach for rare variation collapsing on functional protein domains implicates specific genic regions in ALS

Sahar Gelfman et al. Genome Res. 2019 May.

Abstract

Large-scale sequencing efforts in amyotrophic lateral sclerosis (ALS) have implicated novel genes using gene-based collapsing methods. However, pathogenic mutations may be concentrated in specific genic regions. To address this, we developed two collapsing strategies: one uses homology-based protein domains as the unit for rare variant collapsing, and the other is a gene-level approach that, unlike standard methods, leverages existing evidence of purifying selection against missense variation in those domains. The application of these two collapsing methods to 3093 ALS cases and 8186 controls of European ancestry, and also to 3239 cases and 11,808 controls of diversified populations, pinpoints risk regions of ALS genes, including SOD1, NEK1, TARDBP, and FUS. While not clearly implicating novel ALS genes, the new analyses not only pinpoint risk regions in known genes but also highlight candidate genes.
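To make the idea of a collapsing test concrete, the sketch below compares the number of carriers of at least one qualifying variant in cases and controls for a single collapsing unit (a gene or a domain). The abstract does not name the statistical test, so the two-sided Fisher's exact test used here is an assumption, and the function name and carrier counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(case_carriers, case_total, control_carriers, control_total):
    """Two-sided Fisher's exact p-value for a 2x2 carrier/non-carrier table."""
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denom = comb(total, carriers)

    def prob(k):
        # Hypergeometric probability that exactly k of the carriers are cases.
        return comb(case_total, k) * comb(control_total, carriers - k) / denom

    lowest = max(0, carriers - control_total)
    highest = min(carriers, case_total)
    observed = prob(case_carriers)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )

# Cohort sizes are taken from the abstract; the carrier counts are made up.
p_value = fisher_exact_two_sided(
    case_carriers=20, case_total=3093,
    control_carriers=10, control_total=8186,
)
```

Running one such test per gene or per domain yields the set of p-values that a Q-Q plot summarizes.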

Figures

Figure 1.
Gene and regional collapsing. (A) A standard gene-based approach for collapsing analysis of nonsynonymous and canonical splice rare variants in cases (green) and controls (black) on example Gene A. (B) A domain-unit–based regional approach in which only the domains that are intolerant to functional variation are considered as units for collapsing. (C) Intolerance-informed gene collapsing: a regional approach to gene collapsing in which the unit for collapsing is the entire gene, yet missense variants only qualify for the analysis if they reside in domains that are intolerant to variation (domain 1 and 3). Loss-of-function variants (big circles) continue to qualify regardless of whether they reside in a tolerant or intolerant domain of the gene. Bright blue background marks qualifying region.
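The qualification rule in panel C can be sketched as follows: loss-of-function variants always qualify, while missense variants qualify only inside intolerant domains, and each sample is counted once per gene. This is a minimal illustration, not the authors' pipeline; the `Variant` fields, the effect labels, and the domain names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Variant:
    sample_id: str
    is_case: bool
    effect: str               # hypothetical labels: "lof" or "missense"
    domain: Optional[str]     # domain containing the variant, if any

def qualifies(variant, intolerant_domains):
    """Apply the intolerance-informed rule to a single variant."""
    if variant.effect == "lof":
        return True  # loss-of-function qualifies regardless of location
    return variant.effect == "missense" and variant.domain in intolerant_domains

def carrier_counts(variants, intolerant_domains):
    """Count distinct case and control samples with a qualifying variant."""
    case_ids, control_ids = set(), set()
    for v in variants:
        if qualifies(v, intolerant_domains):
            (case_ids if v.is_case else control_ids).add(v.sample_id)
    return len(case_ids), len(control_ids)

# Toy data for an example gene whose domains 1 and 3 are intolerant.
gene_variants = [
    Variant("s1", True, "missense", "domain1"),
    Variant("s1", True, "lof", "domain2"),       # same sample, counted once
    Variant("s2", True, "missense", "domain2"),  # tolerant domain, excluded
    Variant("s3", False, "lof", "domain2"),
    Variant("s4", False, "missense", None),      # outside any domain, excluded
]
counts = carrier_counts(gene_variants, {"domain1", "domain3"})
```

Restricting `variants` to a single domain before calling `carrier_counts` would give the domain-unit version described in panel B.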
Figure 2.
Q-Q plots of gene- and domain-level collapsing. (A) The results for a standard gene-level collapsing of 3093 cases and 8186 controls; 18,065 covered genes passed QC with more than one case or control carrier for this test. The genes with the top associations and FUS gene are labeled. The genomic inflation factor, lambda (λ), is 1.10. (B) The results for the domain-based collapsing of 3093 cases and 8186 controls; 70,603 covered domains passed QC with more than one case or control carrier for this test. The genes with the top associations are labeled and genome-wide significant genes are in bold. λ = 1.046.
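The genomic inflation factor reported with these plots is conventionally the ratio of the median observed test statistic to its expected median under the null. The caption does not say how λ was estimated, so the sketch below shows the common chi-square-based definition as an assumption; the function name is hypothetical, and p-values must be greater than 0.

```python
from statistics import NormalDist, median

def genomic_inflation(p_values):
    """Estimate lambda as median observed chi-square(1) over its null median."""
    norm = NormalDist()
    # Convert each two-sided p-value to a 1-df chi-square statistic.
    chi2 = [norm.inv_cdf(1 - p / 2) ** 2 for p in p_values]
    # Median of a chi-square distribution with 1 df, about 0.455.
    expected_median = norm.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median

# Evenly spaced p-values mimic a perfectly calibrated test.
uniform_p = [(i + 0.5) / 1001 for i in range(1001)]
lam = genomic_inflation(uniform_p)
```

Values near 1 indicate little systematic inflation. Because exact tests produce discrete p-values, an empirical (for example, permutation-based) null is sometimes used in place of the theoretical median.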
Figure 3.
Intolerance-informed gene-level collapsing with unified/diversified ancestry samples. (A) A Q-Q plot presenting the results of the gene-based intolerance-informed collapsing of 3239 cases and 11,808 controls from diversified ancestries. Missense variants are aggregated only if they reside in an intolerant domain that is lower than the 50th percentile OE-ratio score, while loss-of-function variants are aggregated independent of location; 17,795 genes passed QC with more than one case or control carrier for this test. The genes with the top associations are labeled. λ = 1.14. (B) A Q-Q plot of a gene-based intolerance-informed collapsing of 3093 cases and 8186 controls of European ancestry; 18,135 genes passed QC with more than one case or control carrier for this test. The genes with the top associations are labeled and genome-wide significant genes are in bold. λ = 1.073.
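The 50th-percentile cutoff can be illustrated by ranking domains by their OE-ratio and keeping those that fall below the cutoff. This is a sketch under assumptions: the OE-ratio values, domain names, and function names are hypothetical, and the percentile is taken here as the share of domains with a strictly lower ratio, which may differ from the authors' definition.

```python
from bisect import bisect_left

def oe_percentiles(oe_ratios):
    """Map each domain to the percentage of domains with a strictly lower OE-ratio."""
    values = sorted(oe_ratios.values())
    n = len(values)
    return {d: 100 * bisect_left(values, r) / n for d, r in oe_ratios.items()}

def intolerant_domains(oe_ratios, cutoff=50.0):
    """Return the domains whose OE-ratio percentile is below the cutoff."""
    return {d for d, pct in oe_percentiles(oe_ratios).items() if pct < cutoff}

# Made-up OE-ratios for four domains; lower values are treated as more intolerant.
example_ratios = {"dom_a": 0.2, "dom_b": 0.5, "dom_c": 0.9, "dom_d": 1.1}
selected = intolerant_domains(example_ratios)
```

Missense variants would then be aggregated only if they fall in one of the `selected` domains, while loss-of-function variants are aggregated independent of location.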
Figure 4.
Distribution of functional coding variants across LGALSL and PKP4. The distribution of LGALSL (A) and PKP4 (B) coding variants across domains (LGALSL transcript ENST00000238875 and PKP4 transcript ENST00000389757). The y-axis corresponds to the total number of variants identified at a specific location. The blue boxes highlight the (A) LGALSL carbohydrate-binding domain and (B) PKP4 armadillo repeat domain 2 (ARM2) found to be enriched for variants in cases (green) compared to controls (black). Each domain's OE-ratio percentile is marked above for both tolerant (bright blue) and intolerant (orange) domains.

