Genome Med. 2022 Jul 22;14(1):77.
doi: 10.1186/s13073-022-01082-2.

Assessing the clinical utility of protein structural analysis in genomic variant classification: experiences from a diagnostic laboratory

Richard C Caswell et al. Genome Med. 2022.

Abstract

Background: The widespread clinical application of genome-wide sequencing has resulted in many new diagnoses for rare genetic conditions, but testing regularly identifies variants of uncertain significance (VUS). The remarkable rise in the amount of genomic data has been paralleled by a rise in the number of publicly available protein structures, which may have clinical utility for the interpretation of missense variants and in-frame insertions or deletions.

Methods: Within a UK National Health Service genomic medicine diagnostic laboratory, we investigated how many VUS were evaluated using protein structural analysis over a 5-year period, and how often this analysis aided variant classification.

Results: We found 99 novel missense and in-frame variants across 67 genes that were initially classified as VUS by our diagnostic laboratory using standard variant classification guidelines and for which further analysis of protein structure was requested. Evidence from protein structural analysis was used in the re-assessment of 64 variants, of which 47 were subsequently reclassified as pathogenic or likely pathogenic and 17 remained as VUS. We identified several case studies where protein structural analysis aided variant interpretation by predicting disease mechanisms that were consistent with the observed phenotypes, including loss-of-function through thermodynamic destabilisation or disruption of ligand binding, and gain-of-function through de-repression or escape from proteasomal degradation.
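The counts reported in the Results can be turned into proportions with a few lines of arithmetic. The sketch below is only a reader's check on the published figures, not part of the authors' analysis; the variable names and the `pct` helper are assumptions introduced for illustration.

```python
# Counts taken directly from the Results paragraph above.
requested = 99       # VUS for which structural analysis was requested
used = 64            # variants where structural evidence entered re-assessment
reclassified = 47    # reclassified as pathogenic or likely pathogenic
remained_vus = 17    # still VUS after re-assessment

# The two outcome groups should account for every re-assessed variant.
assert reclassified + remained_vus == used


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "used_of_requested": pct(used, requested),
    "reclassified_of_used": pct(reclassified, used),
    "reclassified_of_requested": pct(reclassified, requested),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

On these numbers, structural evidence was used for roughly two thirds of the referred variants, and about three quarters of those were reclassified.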

Conclusions: We have shown that using in silico protein structural analysis can aid classification of VUS and give insights into the mechanisms of pathogenicity. Based on our experience, we propose a generic evidence-based workflow for incorporating protein structural information into diagnostic practice to facilitate variant classification.

Keywords: Genomic medicine; Missense variant; Modelling; Pathogenicity; Prediction; Protein structure; Variant classification; Variant interpretation.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Evidence-based workflow for structural and sequence analysis of missense variants. The generic workflow used for analysis of missense variants proceeds through a short series of questions. Following the initial question (“Is there an experimental structure for the human protein or domain?”), the analysis pathway is then determined by the level of evidence available for each variant, and may differ on a case-by-case basis
Fig. 2
CASR NM_000388.3:c.488C>G, p.(Pro163Arg). (A) Structure of the inactive form of CaSR (PDB 5k5t) around Pro163. Protein ribbon is coloured from N-terminal, blue, to C-terminal, red, except for Pro163 (carbon atoms coloured magenta); sidechain atoms of Pro163 and near neighbours are shown in stick format; the grey sphere shows a bound calcium ion. (B) As A, but showing the predicted structure of the p.(Pro163Arg) variant
Fig. 3
GNAO1 NM_020988.3:c.980C>A, p.(Thr327Lys). (A) Predicted structure of GNAO1 residues 3–347, modelled on template 6crk chain A; the protein is coloured grey by default, with residues of the five nucleotide-binding G boxes blue; the view shows both the protein ribbon and surface, sliced through to demonstrate the interior of the binding pocket; the Thr327 sidechain and guanosine diphosphate (GDP) ligand are shown as space-filling spheres, with carbon atoms of Thr327 coloured orange. (B) As A, but showing the predicted structure of the p.(Thr327Lys) variant; note that the novel lysine sidechain is predicted to occlude the binding pocket, with ligand absent from the predicted structure. Models obtained using PDB 3c7k as template were essentially identical to those shown here for 6crk-based modelling
Fig. 4
MAP2K1 NM_002755.4:c.149T>C, p.(Leu50Pro). (A) Structure of MEK1 residues 39–381 in complex with an adenosine triphosphate (ATP) analogue and an inhibitor compound (PDB 3eqc); default colouring is grey, with residues of the negative regulatory region (NRR; 44–58) and kinase domain (68–361) coloured light green and cyan, respectively; additionally, Leu50 is coloured magenta, with the sidechain shown in stick format, while positions of missense variants reported as pathogenic in HGMD (class DM) are coloured red. (B) As A, but magnified to show detail around Leu50, for which all atoms are shown as space-filling spheres; spheres are also shown for sidechain atoms of Asn122 (carbon atoms cyan) and Pro124 (carbon atoms red), which lie in van der Waals contact with Leu50. (C) As B, but showing the predicted structure of the p.(Leu50Pro) variant. (D) The upper part shows the schematic organisation of MEK1; the grey bar indicates a region of predicted disorder (residues 1–27), while green and cyan bars show the NRR and protein kinase domains, respectively; triangles below show the location of variants reported in HGMD (red, pathogenic/class DM; orange, possibly pathogenic/class DM?), while the site of the p.(Leu50Pro) variant is shown by a magenta triangle. The lower part shows the predicted thermodynamic effect of the VUS p.(Leu50Pro) (magenta fill) and all HGMD missense variants (red fill, class DM; orange fill, class DM?) on MEK1 stability calculated in PDB 3eqc, and is aligned to the upper schematic; the light green-shaded region in the graph shows the extent of the NRR, while cyan-shaded regions show residues of the kinase domain which lie in contact with the NRR (NRRI: NRR-interacting); note that the most destabilising variants, including p.(Leu50Pro), all occur in the NRR or NRRI regions; these include three variants at Pro124, which interacts directly with Leu50 (vertical broken lines)
Fig. 5
WNK1 NM_018979.3:c.1903G>A, p.(Asp635Asn). The upper part of the figure shows results of ELM analysis (http://elm.eu.org) for residues 601–700 of native WNK1 (left) and the p.(Asp635Asn) variant (right); upper tracks show predicted sites of phosphorylation (PhosphoELM), conserved domains (SMART/Pfam) and underlying structure or disorder (GlobPlot, IUPRED, Secondary Structure tracks); below these tracks are lists of matches to short linear motifs, ranked by score with red shading indicating high confidence; the top hit (and only high-confidence motif) in the native sequence was for the Kelch-binding degron motif, DEG_Kelch_KLHL3_1 (boxed in red); this motif was not identified in the variant sequence (upper right panel). The table below shows a detailed description from the ELM server of the DEG_Kelch_KLHL3_1 motif; the right column shows the search pattern for this motif, which is shown below in the context of the WNK1 sequence (Asp635 shown in red font)
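The logic behind the Fig. 5 comparison (a short linear motif that matches the native sequence but not the variant) can be sketched with a regular-expression search. This is a minimal illustration under loud assumptions: the pattern `TOY_DEGRON` and the peptide `NATIVE` are hypothetical stand-ins invented for the example, not the ELM search pattern or the WNK1 sequence, and the helper names are likewise made up here. The real pattern should be taken from the ELM entry for DEG_Kelch_KLHL3_1.

```python
import re

# ASSUMPTION: illustrative stand-in for a degron search pattern. It is not
# copied from the ELM entry described in the figure legend.
TOY_DEGRON = re.compile(r"E.EE.E[AV]DQH")

# ASSUMPTION: hypothetical 14-residue toy peptide, not the real WNK1 sequence.
NATIVE = "AAEPEEPEADQHAA"


def apply_missense(seq, pos, ref, alt):
    """Return `seq` with the residue at 1-based `pos` changed from ref to alt."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def has_motif(seq):
    """True if the toy degron pattern occurs anywhere in `seq`."""
    return TOY_DEGRON.search(seq) is not None


# Asp-to-Asn substitution at position 10 of the toy peptide.
VARIANT = apply_missense(NATIVE, 10, "D", "N")

native_hit = has_motif(NATIVE)
variant_hit = has_motif(VARIANT)
print(f"native matches motif: {native_hit}; variant matches motif: {variant_hit}")
```

In this toy case a single substitution inside the motif abolishes the match, which mirrors the kind of motif loss the legend describes for the variant sequence.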

