Front Genet. 2023 Feb 21:14:1052383.
doi: 10.3389/fgene.2023.1052383. eCollection 2023.

Evaluation of AlphaFold structure-based protein stability prediction on missense variations in cancer

Hilal Keskin Karakoyun et al. Front Genet. 2023.

Abstract

Identifying pathogenic missense variants in hereditary cancer is critical to patient surveillance and risk-reduction strategies. Many gene panels, differing in the number and/or set of genes they include, are available for this purpose; we are particularly interested in a panel of 26 genes with varying degrees of hereditary cancer risk: ABRAXAS1, ATM, BARD1, BLM, BRCA1, BRCA2, BRIP1, CDH1, CHEK2, EPCAM, MEN1, MLH1, MRE11, MSH2, MSH6, MUTYH, NBN, PALB2, PMS2, PTEN, RAD50, RAD51C, RAD51D, STK11, TP53, and XRCC2. In this study, we compiled the missense variations reported in any of these 26 genes. More than a thousand missense variants were collected from ClinVar and from the targeted screen of a breast cancer cohort of 355 patients, which contributed 160 novel missense variations to this set. We analyzed the impact of the missense variations on protein stability with five predictors, both sequence-based (SAAF2EC and MUpro) and structure-based (Maestro, mCSM, and CUPSAT). For the structure-based tools, we used AlphaFold (AF2) protein structures, which makes this the first structural analysis of these hereditary cancer proteins. Our results agreed with recent benchmarks that computed the power of stability predictors in discriminating pathogenic variants. Overall, we report low-to-medium performance for the stability predictors in discriminating pathogenic variants, except for MUpro, which had an AUROC of 0.534 (95% CI [0.499-0.570]). The AUROC values ranged between 0.614 and 0.719 for the total set and between 0.596 and 0.682 for the set with high AF2 confidence regions. Furthermore, our findings revealed that the confidence score for a given variant in the AF2 structure could alone predict pathogenicity more robustly than any of the tested stability predictors, with an AUROC of 0.852. Altogether, this study represents the first structural analysis of the 26 hereditary cancer genes, underscoring 1) the thermodynamic stability predicted from AF2 structures as a moderate descriptor and 2) the confidence score of AF2 as a strong descriptor of variant pathogenicity.
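
The abstract summarizes each predictor by its AUROC with a 95% confidence interval. As an illustration only, the sketch below computes an AUROC as the probability that a randomly chosen pathogenic variant scores higher than a randomly chosen benign one, and estimates an interval with a percentile bootstrap. The bootstrap is an assumption made for this sketch; the abstract does not state how the authors obtained their interval. The function names and the toy scores and labels are hypothetical and are not data from the study.

```python
import random


def auroc(scores, labels):
    """AUROC as P(score of a pathogenic variant > score of a benign one).

    labels: 1 = pathogenic, 0 = benign. Tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one variant of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that lost one of the two classes.
        if 0 < sum(ys) < len(ys):
            stats.append(auroc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: a per-variant score and a pathogenicity label.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_auc = auroc(toy_scores, toy_labels)
toy_ci = bootstrap_ci(toy_scores, toy_labels, n_boot=200)
print(f"AUROC = {toy_auc:.3f}, 95% CI = [{toy_ci[0]:.3f}, {toy_ci[1]:.3f}]")
```

An AUROC near 0.5, such as the 0.534 reported for MUpro, means the score barely separates the two classes, whereas a value such as 0.852 means a pathogenic variant outranks a benign one in most pairs.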

Keywords: AlphaFold; cancer; missense variants; pLDDT score; protein stability.

Conflict of interest statement

Authors AY, CY were employed by the company Acibadem Health Group. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Distribution of variant characteristics collected from a cohort of 355 breast cancer patients. (A) shows the distributions of two groups: any type of variation detected and not detected in any of 26 genes; (B) shows distribution of the number of variants for each patient; (C–F) show the distribution of variants according to novelty, type, clinical significance and gene, respectively.
FIGURE 2
Missense variations from 26 genes in ClinVar. (A) shows the distributions based on annotation scores. (B) shows the distribution of clinical significance labels of the variants with at least 2 review scores. (C) shows the gene-distribution of benign and pathogenic variants from (B); the first tick marks the frequency value of 5. (D) shows the distribution of VUS labels across genes.
FIGURE 3
AF2 predictions of the full-length structures. (A) shows the pairwise superimposition of the overlapping AF2 predictions. RMSD changes in the Cα trace are shown, with the number of paired atoms given in parentheses. (B) shows the superimposition of the crystal structure of the C-terminus of ATM (PDB ID: 7ni6). (C) shows the full-length structure of ATM colored according to the confidence score pLDDT. (D) shows the full-length structure of BRCA2 colored based on the confidence of the prediction. (E) and (F) show the AF2 structures of RAD51C and XRCC2, respectively, colored according to per-residue confidence scores (pLDDT) (Jumper et al., 2021). Heatmap insets show the predicted aligned error (PAE) of the predictions, which gives the positional error of each residue pair (Mariani et al., 2013; Jumper et al., 2021).
FIGURE 4
Performance of protein stability tools and two characteristics of AF2 structures. (A) ROC curve and (B) correlation analyses of ΔΔG predictors (pathogenic: 445, benign: 647).
FIGURE 5
Cross-correlation of stability predictors for the total set and for the regions with high confidence (pLDDT high).
FIGURE 6
Performance of pathogenicity prediction of two characteristics of AF2 structures. (A) ROC curve and (B) correlation analyses of pLDDT and rASA values of the variants in the AF2 structures and (C) pLDDT vs rASA scatter plot and (D) pLDDT distributions (benign: green, pathogenic: red).
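
Several of the captions above refer to the per-residue confidence score pLDDT. AlphaFold model files in PDB format store pLDDT in the B-factor column, so a per-residue score can be read with a few lines of fixed-column parsing. The following is a minimal sketch under that assumption; the function names, the helper that builds example records, and the two example residues are hypothetical. The band cut-offs (90, 70, 50) follow the AlphaFold Protein Structure Database color legend and are not necessarily the threshold the authors used to define high-confidence regions.

```python
def ca_plddt(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor field
    (columns 61-66) of C-alpha ATOM records in PDB-format text."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Bands from the AlphaFold DB legend (assumed cut-offs)."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def fake_ca_record(serial, res_name, res_seq, plddt):
    """Hypothetical helper: build one fixed-width C-alpha ATOM record."""
    return (
        f"ATOM  {serial:>5d}  CA  {res_name:>3s} A{res_seq:>4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}"
        f"           C"
    )


# Two made-up residues, one confidently and one poorly predicted.
example_text = "\n".join([
    fake_ca_record(1, "MET", 1, 95.12),
    fake_ca_record(2, "GLY", 2, 42.50),
])

example_scores = ca_plddt(example_text)
for res_seq, score in example_scores.items():
    print(res_seq, score, confidence_band(score))
```

For a missense variant at a given residue position, the value looked up this way is the kind of per-variant confidence score that the abstract evaluates as a pathogenicity descriptor.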

References

Adzhubei I., Jordan D. M., Sunyaev S. R. (2013). Predicting functional effect of human missense mutations using PolyPhen-2. Curr. Protoc. Hum. Genet. Chapter 7, Unit 7.20.
Akbari M. R., Ghadirian P., Robidoux A., Foumani M., Sun Y., Royer R., et al. (2009). Germline RAP80 mutations and susceptibility to breast cancer. Breast Cancer Res. Treat. 113, 377–381. 10.1007/s10549-008-9938-z
Akdel M., Pires D. E. V., Pardo E. P., Janes J., Zalevsky A. O., Meszaros B., et al. (2022). A structural biology community assessment of AlphaFold2 applications. Nat. Struct. Mol. Biol. 29, 1056–1067. 10.1038/s41594-022-00849-w
Andreotti G., Guarracino M. R., Cammisa M., Correra A., Cubellis M. V. (2010). Prediction of the responsiveness to pharmacological chaperones: Lysosomal human alpha-galactosidase, a case of study. Orphanet J. Rare Dis. 5, 36. 10.1186/1750-1172-5-36
Angeli D., Salvi S., Tedaldi G. (2020). Genetic predisposition to breast and ovarian cancers: How many and which genes to test? Int. J. Mol. Sci. 21, 1128. 10.3390/ijms21031128