Nucleic Acids Res. 2017 Jul 3;45(W1):W201-W206.
doi: 10.1093/nar/gkx390.

DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins

Daniele Raimondi et al. Nucleic Acids Res. 2017.

Abstract

High-throughput sequencing methods are generating enormous amounts of genomic data, giving unprecedented insights into human genetic variation and its relation to disease. An individual human genome contains millions of single nucleotide variants. To discriminate the deleterious variants from the benign ones, a variety of methods have been developed that predict whether a protein-coding variant is likely to affect the health of the individual carrying it. We present such a method, DEOGEN2, which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates. This extensive contextual information is non-linearly mapped into a single deleteriousness score for each variant. Because non-expert users can still find it difficult to assess what this score means, how it relates to the encoded protein and where it originates, we developed an interactive online framework (http://deogen2.mutaframe.com/) to better present the DEOGEN2 deleteriousness predictions of all possible variants in all human proteins. The prediction is visualized so that both expert and non-expert users can gain insights into the meaning, protein context and origins of each prediction.
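
To make "all possible variants" concrete, the following is a minimal Python sketch that enumerates every single amino acid variant of a protein sequence: each position can be replaced by any of the 19 other standard amino acids. The names `enumerate_savs` and `format_sav` and the toy sequence are hypothetical illustrations, not part of DEOGEN2 or its web server, and the sketch does not compute any deleteriousness score.

```python
# Hypothetical helpers for illustration only; not DEOGEN2 code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def enumerate_savs(sequence):
    """Yield each single amino acid variant as (position, wild_type, variant).

    Positions are 1-based, as is conventional for protein sequences.
    """
    for index, wild_type in enumerate(sequence):
        if wild_type not in AMINO_ACIDS:
            continue  # skip non-standard residue codes such as X
        for variant in AMINO_ACIDS:
            if variant != wild_type:
                yield index + 1, wild_type, variant


def format_sav(position, wild_type, variant):
    """Render a variant in the common short form, e.g. M1A."""
    return f"{wild_type}{position}{variant}"


toy_sequence = "MKT"  # assumed toy input, not a real human protein
savs = [format_sav(*v) for v in enumerate_savs(toy_sequence)]
print(len(savs))   # 57, i.e. 3 positions x 19 substitutions
print(savs[:3])    # ['M1A', 'M1C', 'M1D']
```

A protein of length n therefore has 19n candidate variants, which is the set over which a per-protein score distribution or heat map would be drawn.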

Figures

Figure 1.
Overview of the DEOGEN2 web server visualization. (1) The user starts to enter a UniProt ID or sequence, which activates a dropdown list from which a human protein is selected. After pressing the play button, the user can navigate the sequence (2) to create and submit a variant for this sequence. After pressing ‘Submit sequence’, the variant is visualized in the page report (3), which contains two sections. The General section displays the change between the wild-type and variant amino acid, with the chemical structures of both shown, and the difference between the amino acids expressed on (A) the dashboard as a percentage; clicking on the percentage bar shows the breakdown of these components. The DEOGEN2 section shows the DEOGEN2 score with (B) a breakdown of the contribution of each machine learning feature, which tells the user which contextual information contributed most to the final score, and an overview of the raw feature scores used as input for the machine learning. Section (C) shows the distribution of all the variant scores in this protein, including in a heat map format (not shown). Information on data points is obtained by hovering over them; the visualization can be changed by clicking on the buttons or the graph icon in the top right corner.
