Sci Adv. 2022 May 6;8(18):eabj1624.
doi: 10.1126/sciadv.abj1624. Epub 2022 May 6.

CancerVar: An artificial intelligence-empowered platform for clinical interpretation of somatic mutations in cancer

Quan Li et al. Sci Adv. 2022.

Abstract

Several knowledgebases are manually curated to support clinical interpretations of thousands of hotspot somatic mutations in cancer. However, discrepancies or even conflicting interpretations are observed among these databases. Furthermore, many previously undocumented mutations may have clinical or functional impacts on cancer but are not systematically interpreted by existing knowledgebases. To address these challenges, we developed CancerVar to facilitate automated and standardized interpretations for 13 million somatic mutations based on the AMP/ASCO/CAP 2017 guidelines. We further introduced a deep learning framework to predict oncogenicity for these variants using both functional and clinical features. CancerVar achieved satisfactory performance when compared to several independent knowledgebases and, using clinically curated datasets, demonstrated practical utility in classifying somatic variants. In summary, by integrating clinical guidelines with a deep learning framework, CancerVar facilitates clinical interpretation of somatic variants, reduces manual work, improves consistency in variant classification, and promotes implementation of the guidelines.

Figures

Fig. 1. Summary of the functionality of CancerVar and descriptions of 12 types of evidence.
AWS, Amazon Web Services; LOF, loss of function; MAF, minor allele frequency; HGMD, Human Gene Mutation Database.
Fig. 2. Workflow and architecture of the generator and discriminator/classifier used in OPAI.
The generator contains three linear layers, each with batch normalization, LeakyReLU activation, and a 60% dropout rate. Its final layer is a linear layer with batch normalization and tanh activation. The discriminator consists of three convolutional neural network (CNN) layers with tanh activation.
Fig. 3. Comparison of the interpretation of 43 variants between 20 pathologists and CancerVar.
The heatmap shows the proportion of the 20 pathologists voting for each of the four tiers: tier I, strong clinical significance (SCS); tier II, potential clinical significance (PCS); tier III, variant of uncertain clinical significance (VUS); and tier IV, benign/likely benign (B/LB). The last two columns are CancerVar-predicted scores and classifications. CancerVar showed an 81% (17 of 21) agreement rate with the pathologists' majority vote for tier I/II and a 60.5% (26 of 43) agreement rate across all tiers. This is comparable to the 58% agreement rate among the 20 pathologists themselves, with the added benefit that CancerVar automates the interpretation process. P, pathogenic/strong clinical significance; LP, likely pathogenic/potential clinical significance; B, (likely) benign.
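The agreement rates in this caption are plain proportions of variants where CancerVar's tier matches the pathologists' majority vote. The following is a minimal sketch of that calculation, not the authors' code: the function names and the four-variant toy vote lists are invented for illustration, and only the counts 17/21 and 26/43 come from the caption.

```python
from collections import Counter

def majority_tier(votes):
    """Return the tier with the most votes (ties go to the tier seen first)."""
    return Counter(votes).most_common(1)[0][0]

def agreement_rate(pathologist_votes, predicted_tiers):
    """Fraction of variants where the prediction equals the majority vote."""
    matches = sum(
        majority_tier(votes) == predicted
        for votes, predicted in zip(pathologist_votes, predicted_tiers)
    )
    return matches / len(predicted_tiers)

# Toy data (hypothetical, NOT from the study): one vote list per variant.
toy_votes = [
    ["I", "I", "II"],
    ["III", "III", "IV"],
    ["II", "II", "II"],
    ["IV", "III", "IV"],
]
toy_predictions = ["I", "III", "III", "IV"]
toy_rate = agreement_rate(toy_votes, toy_predictions)  # 3 of 4 match

# Counts reported in the caption, formatted as percentages.
tier_i_ii_agreement = f"{17 / 21:.0%}"
all_tier_agreement = f"{26 / 43:.1%}"
print(toy_rate, tier_i_ii_agreement, all_tier_agreement)
```

Breaking ties by first-seen order is an arbitrary choice made for the sketch; the caption does not say how tied votes were handled.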
Fig. 4. UpSet plot highlighting the intersection of multiple methods with oncogenic prediction from different datasets.
(A) Mutations were taken from the OncoKB dataset. (B) Mutations were taken from CIViC. (C) Mutations were taken from the IARC TP53 transactivation dataset. (D) Mutations were taken from the in vitro cell viability dataset of Ng et al. (42).
Fig. 5. Performance comparisons.
(A and B) Receiver operating characteristic (ROC) curves for performance comparison between OPAI and five other machine learning algorithms, including gradient boosting tree (GBT), support vector machine (SVM), AdaBoost (ADA), random forest (RF), and XGBoost (XGB), and five other in silico predictive tools using 6226 somatic mutations as the testing set. (C and D) Area under the precision-recall curve (AUPRC) comparison between OPAI and five other machine learning tools and in silico predictive tools. OPAI outperformed any individual tool in the prediction of somatic driver mutations in cancer. TPR, true-positive rate; FPR, false-positive rate.
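For readers unfamiliar with the metric, an ROC curve plots TPR against FPR as the decision threshold is swept over the predicted scores, and the area under it summarizes ranking quality. The sketch below is a generic standard-library illustration, not the evaluation code behind the figure: the labels and scores are toy values, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Toy data (hypothetical): 1 = driver mutation, 0 = passenger mutation.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auroc = area_under_curve(roc_points(toy_labels, toy_scores))
print(round(toy_auroc, 4))
```

In the toy data, 8 of the 9 positive/negative pairs are ranked correctly, so the area is 8/9. The AUPRC in panels C and D is computed analogously from precision and recall instead of TPR and FPR.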
Fig. 6. A use case of rule-based and deep learning-based models in CancerVar for interpretation of FOXA1 variants.
We queried the FOXA1 mutation R219C in prostate cancer. The rule-based prediction for this variant was tier III (uncertain significance), with a score of 7, which is very close to tier II. However, the OPAI model predicted this variant to be oncogenic, with a score of 0.99. On the basis of a manual review of the results, we suggest that this variant has clinical significance.
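The FOXA1 case shows how disagreement between the two models can be used to prioritize variants for manual review. The sketch below illustrates that idea only. Both cutoffs are assumptions made for the example: the caption implies that a score of 7 sits just below tier II but does not state the tier boundaries or any OPAI threshold, and the function name is hypothetical.

```python
# Hypothetical cutoffs, NOT taken from the text. The caption only implies that
# a rule-based score of 7 falls just below tier II.
TIER_II_MIN_SCORE = 8      # assumed lower bound of tier II
OPAI_ONCOGENIC_MIN = 0.9   # assumed cutoff for a confident oncogenic call

def needs_manual_review(rule_score, opai_score):
    """Flag variants scored below tier II by the rules but oncogenic by OPAI."""
    below_tier_ii = rule_score < TIER_II_MIN_SCORE
    opai_oncogenic = opai_score >= OPAI_ONCOGENIC_MIN
    return below_tier_ii and opai_oncogenic

# Scores reported in the caption for FOXA1 R219C in prostate cancer.
foxa1_r219c_flagged = needs_manual_review(rule_score=7, opai_score=0.99)
print(foxa1_r219c_flagged)
```

A flag of this kind does not reclassify the variant; it only marks discordant cases for a reviewer, which matches the manual review described in the caption.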

References

    1. Chakravarty D., Gao J., Phillips S. M., Kundra R., Zhang H., Wang J., Rudolph J. E., Yaeger R., Soumerai T., Nissan M. H., Chang M. T., Chandarlapaty S., Traina T. A., Paik P. K., Ho A. L., Hantash F. M., Grupe A., Baxi S. S., Callahan M. K., Snyder A., Chi P., Danila D., Gounder M., Harding J. J., Hellmann M. D., Iyer G., Janjigian Y., Kaley T., Levine D. A., Lowery M., Omuro A., Postow M. A., Rathkopf D., Shoushtari A. N., Shukla N., Voss M., Paraiso E., Zehir A., Berger M. F., Taylor B. S., Saltz L. B., Riely G. J., Ladanyi M., Hyman D. M., Baselga J., Sabbatini P., Solit D. B., Schultz N., OncoKB: A precision oncology knowledge base. JCO Precis. Oncol. 2017, (2017).
    2. Bailey M. H., Tokheim C., Porta-Pardo E., Sengupta S., Bertrand D., Weerasinghe A., Colaprico A., Wendl M. C., Kim J., Reardon B., Ng P. K.-S., Jeong K. J., Cao S., Wang Z., Gao J., Gao Q., Wang F., Liu E. M., Mularoni L., Rubio-Perez C., Nagarajan N., Cortes-Ciriano I., Zhou D. C., Liang W. W., Hess J. M., Yellapantula V. D., Tamborero D., Gonzalez-Perez A., Suphavilai C., Ko J. Y., Khurana E., Park P. J., Van Allen E. M., Liang H.; MC3 Working Group; Cancer Genome Atlas Research Network, Lawrence M. S., Godzik A., Lopez-Bigas N., Stuart J., Wheeler D., Getz G., Chen K., Lazar A. J., Mills G. B., Karchin R., Ding L., Comprehensive characterization of cancer driver genes and mutations. Cell 174, 1034–1035 (2018).
    3. Micheel C. M., Sweeney S. M., LeNoue-Newton M. L., Andre F., Bedard P. L., Guinney J., Meijer G. A., Rollins B. J., Sawyers C. L., Schultz N., Shaw K. R. M., Velculescu V. E., Levy M. A.; AACR Project GENIE Consortium, American Association for Cancer Research Project Genomics Evidence Neoplasia Information Exchange: From inception to first data release and beyond-lessons learned and member institutions’ perspectives. JCO Clin. Cancer Inform. 2, 1–14 (2018).
    4. Griffith M., Spies N. C., Krysiak K., McMichael J. F., Coffman A. C., Danos A. M., Ainscough B. J., Ramirez C. A., Rieke D. T., Kujan L., Barnell E. K., Wagner A. H., Skidmore Z. L., Wollam A., Liu C. J., Jones M. R., Bilski R. L., Lesurf R., Feng Y. Y., Shah N. M., Bonakdar M., Trani L., Matlock M., Ramu A., Campbell K. M., Spies G. C., Graubert A. P., Gangavarapu K., Eldred J. M., Larson D. E., Walker J. R., Good B. M., Wu C., Su A. I., Dienstmann R., Margolin A. A., Tamborero D., Lopez-Bigas N., Jones S. J., Bose R., Spencer D. H., Wartman L. D., Wilson R. K., Mardis E. R., Griffith O. L., CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer. Nat. Genet. 49, 170–174 (2017).
    5. Huang L., Fernandes H., Zia H., Tavassoli P., Rennert H., Pisapia D., Imielinski M., Sboner A., Rubin M. A., Kluk M., Elemento O., The cancer precision medicine knowledge base for structured clinical-grade mutations and interpretations. J. Am. Med. Inform. Assoc. 24, 513–519 (2017).