BMC Bioinformatics. 2014 Jan 9;15:5.
doi: 10.1186/1471-2105-15-5.

Quantitative prediction of the effect of genetic variation using hidden Markov models

Mingming Liu et al. BMC Bioinformatics.

Abstract

Background: As sequencing technologies have developed, a growing number of sequence variants have become available for investigation. Several classes of variants have been identified in the human genome, including single nucleotide substitutions, insertions and deletions, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants make up a major proportion of human genetic variation, yet little is known about their effects on humans. This gap in understanding is largely due to the lack of both biological data and computational resources.

Results: This paper presents a new indel functional prediction method, HMMvar, based on hidden Markov model (HMM) profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles performs well at distinguishing deleterious from neutral variants across different data sets, and can predict the protein functional effects of both single and multiple mutations.
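
To make the idea of profile-based conservation scoring concrete, the following is a minimal sketch and not the HMMvar scoring formula, which the abstract does not give. It assumes a simplified position-specific profile with match states only, a uniform background, and a toy alignment; the names `build_profile`, `profile_score`, and `conservation_drop` are hypothetical. A full profile HMM adds insert and delete states, which is what lets it score indels; this sketch handles substitutions only.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(alignment, pseudocount=1.0):
    """Per-column log-odds of each residue versus a uniform background.

    Assumption: match states only (no insert/delete states), so this is a
    simplified stand-in for a profile HMM, not a real one.
    """
    background = 1.0 / len(ALPHABET)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def profile_score(profile, sequence):
    """Sum of per-position log-odds for a sequence of the profile's length."""
    return sum(column[aa] for column, aa in zip(profile, sequence))


def conservation_drop(profile, wild_type, mutant):
    """How much the profile score falls when wild type is replaced by mutant."""
    return profile_score(profile, wild_type) - profile_score(profile, mutant)


# Toy alignment of hypothetical homologs (illustrative data only).
alignment = ["MKVLA", "MKVLG", "MKILA", "MRVLA"]
profile = build_profile(alignment)
wild_type = "MKVLA"

# Substitution at an invariant column versus at a variable column.
drop_conserved = conservation_drop(profile, wild_type, "WKVLA")
drop_variable = conservation_drop(profile, wild_type, "MKVLG")
print(drop_conserved > drop_variable)
```

In this toy setting, a change at the invariant first column lowers the score more than a change at the variable last column, which is the intuition behind using conservation captured in a profile to rank variants by likely impact.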

Conclusions: This paper proposes a quantitative method, HMMvar, that predicts the effect of genetic variation using hidden Markov models. The HMM-based pipeline program implementing HMMvar is freely available at https://bioinformatics.cs.vt.edu/zhanglab/hmm.

Figures

Figure 1
A pipeline of variant prediction using HMMvar.
Figure 2
HMMvar score distribution of the dbSNP dataset. (a) Histogram of HMMvar scores for disease-associated and non-disease-associated indels. (b) Distribution of sample means of HMMvar scores from the two categories (LSDB and nonLSDB).
Figure 3
Distributions of HMMvar scores for different types of variants.
Figure 4
Comparison of HMMvar predictions with SIFT Indel predictions on the dbSNP indel dataset. Distributions of HMMvar scores for indels predicted as damaging (left) and neutral (right) by SIFT Indel.
Figure 5
HMMvar and Provean score distributions and means with error bars for TP53 mutations binned into 15 classes by transactivity level. (a) HMMvar score distribution of the 15 classes (the x-axis represents the 15 classes, based on the median of transactivity levels). (b) Provean score distribution of the 15 classes. (c) Mean and error bar of HMMvar scores in each class. (d) Mean and error bar of Provean scores in each class.
Figure 6
ROC curve and standard error of the HMMvar score and the Provean score. (a) ROC curves of the Provean score and the HMMvar score for distinguishing the "nonfunctional" and "partly functional" classes from the "functional" and "supertrans" classes. (b) Standard error of the mean of Provean and HMMvar scores in the 15 transactivity level classes.
Figure 7
The HMMvar score of TP53 variants grouped by SIFT SNP prediction.
Figure 8
The relationship between the HMMvar score and the position of an artificially introduced variant.
