Bioinformatics. 2010 Jun 15;26(12):i287-93.
doi: 10.1093/bioinformatics/btq199.

Recognition of beta-structural motifs using hidden Markov models trained with simulated evolution

Anoop Kumar et al. Bioinformatics. 2010.

Abstract

Motivation: Profile hidden Markov models (HMMs) have been among the most successful methods to date for recognizing evolutionarily related protein sequences. However, these models do not capture the pairwise statistical preferences of residues that are hydrogen bonded in beta-sheets. We thus explore methods for incorporating pairwise dependencies into these models.

Results: We consider the remote homology detection problem for beta-structural motifs. In particular, we ask whether a statistical model trained on members of only one family in a SCOP beta-structural superfamily can recognize members of other families in that superfamily. We show that HMMs trained with our pairwise model of simulated evolution achieve a median improvement of nearly 5% in AUC for beta-structural motif recognition compared with ordinary HMMs.

Availability: All datasets and HMMs are available at: http://bcb.cs.tufts.edu/pairwise/.
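
The simulated-evolution idea in the abstract, augmenting one family's training sequences with mutated copies before HMM training, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the function names, the uniform substitution choice, the `bonded_pairs` list of hydrogen-bonded positions, and the `pair_weights` table are all hypothetical stand-ins for whatever substitution and pairing statistics the paper actually uses.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pointwise_mutate(seq, rate, rng):
    """Mutate each position independently with probability `rate`."""
    out = []
    for residue in seq:
        if rng.random() < rate:
            # Assumption: uniform choice among the other 19 residues;
            # a real model would draw from a substitution matrix.
            out.append(rng.choice([a for a in AMINO_ACIDS if a != residue]))
        else:
            out.append(residue)
    return "".join(out)


def pairwise_mutate(seq, bonded_pairs, pair_weights, rate, rng):
    """Resample the partner of each bonded pair, conditioned on the first residue.

    bonded_pairs: hypothetical list of (i, j) index pairs that are
        hydrogen bonded in the beta-sheet.
    pair_weights: hypothetical table {residue_i: {residue_j: weight}}
        of pairwise preferences.
    """
    out = list(seq)
    for i, j in bonded_pairs:
        if rng.random() < rate:
            weights = pair_weights.get(out[i])
            if weights:
                residues = list(weights)
                out[j] = rng.choices(
                    residues, weights=[weights[r] for r in residues], k=1
                )[0]
    return "".join(out)


def augment(seqs, n_new, mutate, rng):
    """Return the original sequences plus `n_new` mutated copies."""
    return list(seqs) + [mutate(rng.choice(seqs)) for _ in range(n_new)]
```

For example, `augment(family, 100, lambda s: pointwise_mutate(s, 0.15, rng), rng)` with `rng = random.Random(0)` would add 100 pointwise-mutated sequences at a 15% mutation rate; the augmented set would then be passed to whatever HMM training tool is in use.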

Figures

Fig. 1. Training HMMs by (A) a pointwise mutation model, (B) a pairwise mutation model and (C) combining (A and B).
Fig. 2. Variation in SD of MEP for HMM training augmented with 10–100 sequences based on the point mutation model.
Fig. 3. Variation in SD of MEP for HMM training augmented with 10–1000 sequences based on the pairwise β-sheet mutation model.
Fig. 4. Median percent AUC improvement with mutation rate for HMMs trained with pointwise mutations. The maximum median improvement is 3.72% at a 15% mutation rate.
Fig. 5. Median percent AUC improvement with mutation rate for HMMs trained with SAM using pairwise mutations. The maximum median improvement is 4.79% at a 50% mutation rate.
Fig. 6. Median percent AUC improvement with mutation rate for HMMs trained with a dataset augmented with combined pointwise and pairwise mutations. The maximum median improvement is 4.95% at a pairwise mutation rate of 10% and a pointwise mutation rate of 15%.
Fig. 7. Distribution of families with improved performance for the pointwise mutation model.
Fig. 8. Distribution of families with improved performance for the pairwise mutation model.
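
The "median percent AUC improvement" reported in the captions above could be computed along these lines. This is a minimal sketch under assumptions: the function names are hypothetical, AUC is computed by direct pairwise comparison of scores, and the improvement is taken as relative to the baseline AUC, which the captions do not specify.

```python
from statistics import median


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def median_percent_improvement(baseline_aucs, augmented_aucs):
    """Median over families of the relative AUC gain, in percent.

    Assumption: gain is measured relative to the baseline AUC of each family.
    """
    gains = [
        100.0 * (aug - base) / base
        for base, aug in zip(baseline_aucs, augmented_aucs)
    ]
    return median(gains)
```

Each list would hold one AUC per family, with the baseline from an ordinary HMM and the other from an HMM trained on the augmented dataset.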
