Cell. 2024 Nov 14;187(23):6725-6741.e13.
doi: 10.1016/j.cell.2024.09.003. Epub 2024 Sep 30.

Pervasive mislocalization of pathogenic coding variants underlying human disorders

Jessica Lacoste et al. Cell.

Abstract

Widespread sequencing has yielded thousands of missense variants predicted or confirmed as disease causing. This creates a new bottleneck: determining the functional impact of each variant, typically a painstaking, customized process undertaken one or a few genes and variants at a time. Here, we established a high-throughput imaging platform to assay the impact of coding variation on protein localization, evaluating 3,448 missense variants of over 1,000 genes and phenotypes. We discovered that mislocalization is a common consequence of coding variation, affecting about one-sixth of all pathogenic missense variants, all cellular compartments, and recessive and dominant disorders alike. Mislocalization is primarily driven by effects on protein stability and membrane insertion rather than disruptions of trafficking signals or specific interactions. Furthermore, mislocalization patterns help explain pleiotropy and disease severity and provide insights on variants of uncertain significance. Our publicly available resource extends our understanding of coding variation in human diseases.

Conflict of interest statement

Declaration of interests: A.E.C. serves as a scientific advisor for Recursion, Quiver, and SyzOnc, which use image-based profiling for drug discovery, and receives honoraria for occasional talks at pharmaceutical and biotechnology companies. F.P.R. is a scientific advisor and investor in Constantiam Biosciences, which provides tools for clinical variant annotation.

Figures

Figure 1. Systematic profiling of subcellular localization of missense variants.
A) Sources of variants used in this study. B) Available ClinVar annotations for the variants used in this study. C) Reported inheritance pattern of variants used in this study. D) Pipeline for high-content screen for protein localization. E) Computational pipeline for analyzing localization patterns and comparison of reference alleles and variants. F) Examples of variants with high, medium and low similarity to the reference allele. G) Reference protein localization in this study compared to other large-scale datasets. The dashed line shows the percentage of constructs for which at least one localization annotation matches the annotation in the indicated dataset. The density map represents the overlap of 10,000 random permutations of localization patterns with the same dataset. H) Impact score of localization patterns between reference alleles and missense variants for visually identified hits and non-hits. Statistical significance was calculated with a Mann-Whitney test. I) Impact score of localization patterns between reference alleles and missense variants for non-hits, low penetrance hits and high penetrance hits. Statistical significance was calculated with ANOVA with Tukey’s correction for multiple testing. See also Figures S1 and S2.
Figure 2. Mislocalization map of missense variants.
The overall pattern of mislocalization is shown in the center. Each line represents a mislocalized variant. The color of the line indicates destination compartment (i.e. mislocalization compartment). Examples of mislocalized variants for different categories are shown with manual annotation and the associated disease phenotype. Scale bar, 20 µm. Some cellular structures were created with Biorender.com. See also Figures S2 and S3.
Figure 3. Mislocalization and cellular compartments.
A) Fraction of genes with at least one mislocalized variant as a function of the number of variants tested. B) Mislocalization affects some genes more than others. Fraction of mislocalized variants for all genes compared to genes that already have one mislocalized variant. Statistical significance was calculated with Fisher’s exact test. C) Mislocalization affects some compartments more than others. Relative enrichment of mislocalized variants by localization of the reference protein. Red and blue circles represent compartments from which significantly more (red) or fewer (blue) variants are mislocalized. D) Mislocalized variants are enriched in proteins normally localized to the secretory compartment. Statistical significance was calculated with Fisher’s exact test. E) Examples of mislocalized variants of cytoskeletal proteins. Top, keratin proteins forming distinct punctae. Bottom, mislocalized variants of tubulin and doublecortin, a microtubule-associated protein. F) Examples of missense variants forming distinct foci. Missense variants of SPOP associated with prostate cancer form more foci than the reference protein, whereas coding variants associated with endometrial cancer do not form foci. G) Inheritance pattern of mislocalized variants forming distinct foci and those with other localization patterns. Statistical significance was calculated with Fisher’s exact test. H) Comparison of mislocalization results from this study and from Banani et al. for variants predicted to dysregulate biomolecular condensates. Statistical significance was calculated with Fisher’s exact test with Bonferroni correction for multiple hypotheses. See also Figure S4.
Figure 4. Features associated with mislocalization.
A) Mislocalized variants are enriched in pathogenic and likely pathogenic variants. B) Mislocalized variants are predicted to be more damaging by AlphaMissense. C) Pathogenic and likely pathogenic variants are mislocalized more often than benign or likely benign variants. D) The subcellular localization of 95 additional benign and likely benign variants for 37 genes was assessed and their mislocalization rate was compared to all pathogenic variants in the same gene set. E) Variants causing mislocalization are not enriched in post-translational modification sites. Total number of variants in PTM sites is indicated inside each bar; mislocalized n = 250, normally localized n = 2,030. F) Variants causing mislocalization do not disrupt protein-protein interactions more often than variants leading to normal localization, as assessed by yeast two-hybrid assay. Mislocalized n = 41, normally localized n = 254. G) Variants causing mislocalization are not enriched in signal peptides. Mislocalized n = 250, normally localized n = 2,030. H) Variants causing mislocalization are highly enriched in transmembrane domains. Mislocalized n = 250, normally localized n = 2,030. I) Mislocalized variants interact more with chaperones and quality-control factors than normally localized variants, as determined by quantitative high-throughput protein/protein interaction assay LUMIER. Mislocalized n = 190, normally localized n = 1,416. J-K) Comparison of chaperone and quality control factor interactions of reference proteins and mislocalized or normally localized proteins for Hsp70/HSPA8 (J) and Grp78/HSPA5 (K). Statistical significance was calculated with a chi-square test (panels A and B), Fisher’s exact test (panels D-H), and Mann-Whitney test (panels I-K). See also Figure S4.
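Several panels above report enrichment with Fisher's exact test on a 2x2 table (for example, mislocalized versus normally localized variants, inside versus outside a transmembrane domain). The following is a minimal sketch of how such a two-sided test can be computed, not the authors' analysis code. The group totals (250 and 2,030) come from the caption; the per-domain counts `tm_mislocalized` and `tm_normal` are hypothetical placeholders chosen only for illustration, and the function name is likewise an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Group totals taken from the caption.
n_mislocalized = 250
n_normal = 2030

# HYPOTHETICAL counts of variants inside transmembrane domains (not from the paper).
tm_mislocalized = 60
tm_normal = 150

p_value = fisher_exact_two_sided(
    tm_mislocalized, n_mislocalized - tm_mislocalized,
    tm_normal, n_normal - tm_normal,
)
odds_ratio = (tm_mislocalized * (n_normal - tm_normal)) / (
    (n_mislocalized - tm_mislocalized) * tm_normal
)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3g}")
```

The sketch uses exact integer binomial coefficients from the standard library, so it needs no external dependencies; in practice a statistics package would normally be used instead.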
Figure 5. Mislocalization and disease phenotypes.
A) Variants of the same gene that have a distinct localization pattern are more often associated with distinct disease phenotypes than variants that are similarly localized. Statistical significance was calculated with Fisher’s exact test. B) Top, loss of membrane localization of PLP1 variants is concordant with disease manifestation. Bottom, loss of intermediate filament staining and appearance of distinct punctae (arrowheads) with GFAP variants correlates with age of onset. C-D) The subcellular localization of 49 3xFLAG-V5 tagged GFAP variants with a known disease severity (age of onset) was assessed in HeLa cells and compared to wild-type GFAP (shown as a dashed line). The fraction of cells with diffuse cytoplasmic GFAP not associated with intermediate filaments (C) or with cytoplasmic GFAP aggregates (D) was measured manually. Statistical significance was calculated with a two-sided Student’s t-test. ***, p < 0.001; **, p < 0.01; *, p < 0.05. See also Figure S5.
Figure 6. Functional characterization of ACTB and SMAD2 variants.
A) Distinct localization of beta-actin variants underlying different diseases. Filamentous actin staining with phalloidin (magenta) shows distinct patterns with wild-type actin and each mutant. B) Proximity interactomes of wild-type actin and R183W and E364K variants were determined by BioID in HEK293 cells. The graph shows selected interactions; the full dataset is available in Table S1. C) Mislocalization of SMAD2 D304G variant from the nucleus to the cytoplasm. Wild-type and mutant SMAD2–3xFLAG-V5 constructs were transfected into HeLa and U2OS cells and stained with anti-FLAG antibody (green). D) SMAD2 D304G interacts less with the transcriptional co-regulator SKI and more with the TGFβ receptor TGFBR1. Indicated 3xFLAG-V5 tagged constructs were co-transfected into HEK293T cells with NanoLuc-tagged wild-type SMAD2 or the D304G variant, and interaction was assayed with LUMIER assay. Statistical significance was calculated with ANOVA with Tukey’s correction for multiple hypotheses. ***, p < 0.001, *, p < 0.05. E) SMAD2 D304G is a weaker transactivator than wild-type SMAD2. Indicated constructs were co-transfected with 3TP-lux reporter and NanoLuc control into MDA-231 cells and the cells were treated with vehicle control or TGFβ. Transactivation activity was measured with luciferase assay. The ratio between Firefly and NanoLuc luminescence was normalized to EGFP control with vehicle treatment. Statistical significance was calculated with ANOVA with Tukey’s correction for multiple hypotheses. ***, p < 0.001, *, p < 0.05. See also Figure S5.
