Genet Med. 2016 Oct;18(10):1029-36.
doi: 10.1038/gim.2015.208. Epub 2016 Feb 18.

Establishing the precise evolutionary history of a gene improves prediction of disease-causing missense mutations

Ogun Adebali et al. Genet Med. 2016 Oct.

Abstract

Purpose: Predicting the phenotypic effects of mutations has become an important application in clinical genetic diagnostics. Computational tools evaluate the behavior of the variant over evolutionary time and assume that variations seen during the course of evolution are probably benign in humans. However, current tools do not take into account orthologous/paralogous relationships. Paralogs have dramatically different roles in Mendelian diseases. For example, whereas inactivating mutations in the NPC1 gene cause the neurodegenerative disorder Niemann-Pick C, inactivating mutations in its paralog NPC1L1 are not disease-causing and, moreover, are implicated in protection from coronary heart disease.

Methods: We identified major events in NPC1 evolution and revealed and compared orthologs and paralogs of the human NPC1 gene through phylogenetic and protein sequence analyses. We predicted whether an amino acid substitution affects protein function by reducing the organism's fitness.

Results: Removing the paralogs and distant homologs improved the overall performance of categorizing disease-causing and benign amino acid substitutions.

Conclusion: The results show that a thorough evolutionary analysis followed by identification of orthologs improves the accuracy of predicting disease-causing missense mutations. We anticipate that this approach will be used as a reference in the interpretation of variants in other genetic diseases as well. Genet Med 18(10):1029-1036.
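
The idea summarized in the abstract, that a substitution already seen in evolution is assumed benign and that paralogs can mislead this assumption, can be illustrated with a minimal sketch. This is not the authors' SAVER algorithm. The sequences, sequence names, and the `predict` helper below are hypothetical and chosen only to show how including a paralog in the alignment can flip a call from "damaging" to "benign".

```python
from typing import Dict, List, Set

# Toy alignment (made-up sequences, NOT real NPC1 data).
# All rows have equal length; "-" would denote a gap.
ALIGNMENT: Dict[str, str] = {
    "human":      "MKLAV",
    "ortholog_a": "MKLAV",
    "ortholog_b": "MKLSV",
    "paralog_a":  "MKIAV",  # the paralog tolerates I at column 2
}

ORTHOLOGS: List[str] = ["human", "ortholog_a", "ortholog_b"]
PARALOGS: List[str] = ["paralog_a"]


def observed_residues(alignment: Dict[str, str], column: int,
                      keep: List[str]) -> Set[str]:
    """Residues seen at a 0-based column among the kept sequences, gaps ignored."""
    return {alignment[name][column] for name in keep
            if alignment[name][column] != "-"}


def predict(alignment: Dict[str, str], column: int, substitution: str,
            keep: List[str]) -> str:
    """Naive rule: a substitution observed in the kept sequences is called benign."""
    seen = observed_residues(alignment, column, keep)
    return "benign" if substitution in seen else "damaging"


if __name__ == "__main__":
    # Hypothetical variant: L -> I at column 2 of the human sequence.
    print(predict(ALIGNMENT, 2, "I", ORTHOLOGS))             # damaging
    print(predict(ALIGNMENT, 2, "I", ORTHOLOGS + PARALOGS))  # benign
```

With orthologs only, isoleucine is never observed at that column, so the toy rule calls the variant damaging. Once the paralog is included, the same variant is called benign, which is the kind of false effect the abstract attributes to paralogs.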

Figures

Figure 1
Relationships between Patched domain–containing proteins. (a) The domain architectures of human Patched domain–containing proteins were retrieved using the CDvist Web server. Boxes with a white background represent PFAM domains. Cholesterol-binding domains (in blue) were retrieved using a PDB database profile. The cholesterol-binding domain was found exclusively in NPC1 and NPC1L1. (b) Some pairs such as PTCH1-PTCH2, NPC1-NPC1L1, and PTCHD3-PTCHD4 have a relatively recent common ancestor, whereas the other proteins are related to each other more distantly, as they are represented as single clades on the phylogenetic tree. According to the phylogenetic tree, the NPC1-NPC1L1 clade is clearly separated from other Patched domain–containing sequences. PDB, protein data bank.
Figure 2
Maximum likelihood phylogenetic tree of NPC1 proteins and described sets. The star is placed at the root of full-length NPC1. On the left side, the black markers represent the closest NPC1 to the root for each organism. Green markers (Sets 1 and 2) show the orthologs, whereas red markers point to paralogs. Blue markers represent sequences that are ambiguous in terms of orthology. The gray-shaded clade contains a short version of NPC1. Set 1, which contains the HsNPC1 orthologs after the most recent duplication, is a subset of Set 2.
Figure 3
SAVER algorithm workflow.
Figure 4
An alignment window illustrating false effects of paralogs in predicting damaging mutations. Blue-shaded sequences are HsNPC1 orthologs and the rest are paralogs. For each tool, the red marker represents “predicted as damaging” and the green marker represents “predicted as benign.” Residues highlighted in red are the potential causes of predicting pathogenic variants as benign.
