Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W356-61.
doi: 10.1093/nar/gkl309.

MAGIIC-PRO: detecting functional signatures by efficient discovery of long patterns in protein sequences

Chen-Ming Hsu et al. Nucleic Acids Res.

Erratum in

  • Nucleic Acids Res. 2008 Mar;36(4):1400-6

Corrected and republished in

Abstract

This paper presents a web service named MAGIIC-PRO, which aims to discover functional signatures of a query protein by sequential pattern mining. Automatic discovery of patterns from unaligned biological sequences is an important problem in molecular biology. MAGIIC-PRO differs from previously established methods for similar tasks in two major ways. The first is its efficiency in delivering long patterns. By incorporating a new type of gap constraint and some state-of-the-art data mining techniques, MAGIIC-PRO usually identifies satisfactory patterns within an acceptable response time. This efficiency lets users quickly discover functional signatures whose residues do not all come from a single region of the protein sequences, or that are conserved in only a few members of a protein family. The second is its effort in refining the mining results. Considering large flexible gaps improves the completeness of the derived functional signatures, and users are guided directly to the patterns with the largest number of simultaneously conserved blocks. In this paper, we show by experiments that MAGIIC-PRO is efficient and effective in identifying ligand-binding sites and hot regions in protein-protein interactions directly from sequences. The web service is available at http://biominer.bime.ntu.edu.tw/magiicpro, with a mirror site at http://biominer.cse.yzu.edu.tw/magiicpro.
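To make the ideas of multi-block patterns, flexible gaps, and support concrete, the following is a minimal sketch of how the support of one given pattern could be checked against a set of unaligned sequences. It is not the MAGIIC-PRO mining algorithm, which the abstract does not describe in detail. The function names, the `max_gap` parameter, and the toy sequences are all assumptions made for illustration, and blocks are written as plain regular expressions with `.` as the wildcard position.

```python
import re

def matches_pattern(sequence, blocks, max_gap):
    """Return True if all blocks occur in order in the sequence, with at
    most max_gap arbitrary residues between consecutive blocks."""
    # A flexible gap of 0..max_gap residues joins neighbouring blocks.
    gap = ".{0,%d}" % max_gap
    regex = gap.join("(?:%s)" % block for block in blocks)
    return re.search(regex, sequence) is not None

def support(sequences, blocks, max_gap):
    """Fraction of sequences that contain the multi-block pattern."""
    hits = sum(matches_pattern(seq, blocks, max_gap) for seq in sequences)
    return hits / len(sequences)

# Hypothetical toy data, not real protein sequences.
toy_sequences = [
    "MKGTGIAPLLLLASSRQ",       # both blocks, gap of 4 residues
    "GTGVAPWWWWWWWWWWAKSRT",   # both blocks, gap of 10 residues
    "MMMMAQSRGTGLAP",          # both blocks, but in the wrong order
    "GTGAAPACSR",              # both blocks, adjacent (gap of 0)
]
toy_blocks = ["GTG.AP", "A.SR"]  # '.' marks a wildcard position

print(support(toy_sequences, toy_blocks, max_gap=5))
print(support(toy_sequences, toy_blocks, max_gap=10))
```

In this toy setting, widening the allowed gap lets the second sequence match as well, which illustrates why permitting larger flexible gaps can recover blocks that are conserved together but lie far apart in the sequence.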


Figures

Figure 1
Examples of pattern snapshots and the conservation plot provided by MAGIIC-PRO. (a) The complete conservation plot derived from all the patterns. (b) Top 10 high-support patterns with three or more blocks. (c) Top 10 large-size patterns with three or more blocks.
Figure 2
A pattern plotted with an available structure of the Oxidoreductase FAD/NAD(P)-binding protein. The blocks are numbered in the same way as in Figure 1a. Structures are shown with the conserved pattern blocks plotted as sticks in different colors: block ‘R-x-Y-S-x(2)-S’ in green, block ‘G-T-G-x-A-P’ in yellow, block ‘G-x(3)-L-x(2)-G’ in pink, block ‘A-x-S-R’ in orange, block ‘K-x-Y-x-Q’ in deep pink, and block ‘Y-x-C-G’ in purple. The ligand FAD is plotted in ball-and-stick representation in blue, and the ligand NAP in ball-and-stick representation in brown. (PDB code 1QFY:A, query protein: P10933, FENR1_PEA).
Figure 3
The patterns discovered for P00730. Patterns are shown as sticks with different blocks plotted in distinct colors, the LCI protein as ribbons, and zinc ions as crimson spheres. (a) The pattern with a high support is plotted with the structure of the bovine pancreatic carboxypeptidase A complexed with the ligand INF, which is plotted in ball-and-stick representation and colored in CPK (1hdq.pdb, P00730). (b) A longer pattern with a lower support provides the contact regions when interacting with the protein LCI, shown with the structure of another protein, P48052, in complex with LCI, where the ligand GLU is plotted in ball-and-stick representation and colored in CPK (1DTD.pdb, P48052).
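The block notation used in the Figure 2 caption, such as ‘R-x-Y-S-x(2)-S’, can be read as a sequence of fixed residues and wildcard positions. The sketch below converts one such block into a regular expression, assuming that `x` stands for any single residue and `x(n)` for exactly n arbitrary residues, as in PROSITE-style patterns. The function name `block_to_regex` and the test string are hypothetical, and only the elements that appear in the caption are handled.

```python
import re

def block_to_regex(block):
    """Convert a block such as 'R-x-Y-S-x(2)-S' into a regular expression.

    Assumes 'x' is a single wildcard residue and 'x(n)' is a run of
    exactly n wildcard residues; any other element must be one
    upper-case residue letter.
    """
    parts = []
    for element in block.split("-"):
        wildcard = re.fullmatch(r"x(?:\((\d+)\))?", element)
        if wildcard:
            count = wildcard.group(1)
            parts.append("." if count is None else ".{%s}" % count)
        elif re.fullmatch(r"[A-Z]", element):
            parts.append(element)
        else:
            raise ValueError("unsupported pattern element: %r" % element)
    return "".join(parts)

regex = block_to_regex("R-x-Y-S-x(2)-S")
print(regex)

# Hypothetical toy string, not a real protein sequence.
match = re.search(regex, "AARLYSGGSAA")
print(match.group(0) if match else None)
```

The resulting expression can then be searched against any sequence string with the standard `re` module, which is convenient for checking by hand where a reported block falls in a query protein.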


