Mol Cell Proteomics. 2009 Jul;8(7):1516-26.
doi: 10.1074/mcp.M900025-MCP200. Epub 2009 Apr 7.

Multiple Motif Scanning to identify methyltransferases from the yeast proteome

Tanya C Petrossian et al. Mol Cell Proteomics. 2009 Jul.

Abstract

A new program (Multiple Motif Scanning) was developed to scan the Saccharomyces cerevisiae proteome for Class I S-adenosylmethionine-dependent methyltransferases. Conserved Motifs I, Post I, II, and III were identified and expanded in known methyltransferases by primary sequence and secondary structural analysis through hidden Markov model profiling of both a yeast reference database and a reference database of methyltransferases with solved three-dimensional structures. The roles of the conserved amino acids in the four motifs of the methyltransferase structure and function were then analyzed to expand the previously defined motifs. Fisher-based negative log statistical matrix sets were developed from the prevalence of amino acids in the motifs. Multiple Motif Scanning is able to scan the proteome and score different combinations of the top fitting sequences for each motif. In addition, the program takes into account the conserved number of amino acids between the motifs. The output of the program is a ranked list of proteins that can be used to identify new methyltransferases and to reevaluate the assignment of previously identified putative methyltransferases. The Multiple Motif Scanning program can be used to develop a putative list of enzymes for any type of protein that has one or more motifs conserved at variable spacings and is freely available (www.chem.ucla.edu/files/MotifSetup.Zip). Finally, hidden Markov model profile clustering analysis was used to subgroup Class I methyltransferases into groups that reflect their methyl-accepting substrate specificity.

Figures

Fig. 1.
The four signature motifs (underlined) of the Class I methyltransferases shown in their primary, predicted secondary, and actual secondary structure in three known methyltransferase proteins in S. cerevisiae. The motifs were identified using HHpred with HHsearch 1.5, which utilizes HMM profile versus profile searches to align the proteins. HHpred also generates secondary structural predictions (seen here) that aid in the alignment and identification of the motifs, denoted C for random coil, E for β sheet, and H for helical structures. Because the crystal structures have been solved for the Dot1, Hmt1, and Ppm1 proteins, the secondary structures of the crystals have been used for comparison with those predicted by HHpred. Although a crystal structure is known for Mtf1/YMR228w, the reaction that it catalyzes is unknown. As seen here, predicted secondary structures are very similar to the actual secondary structures, especially in the motif regions.
Fig. 2.
Crystal structure of Dot1 (top) and schematic (bottom) that depicts the secondary structures present in Class I methyltransferases. Class I methyltransferases are distinguished by a common three-dimensional structural core, which includes a seven-strand twisted β sheet that provides the major binding interactions for AdoMet. β strands are in yellow, helices are in red, and non-strand/non-helices are shown in green. The S-adenosylhomocysteine molecule in the crystal structure is depicted in the stick model (Protein Data Bank code 1U2Z, Ref. 29).
Fig. 3.
Development of a Fisher-based sequence scoring matrix used for Multiple Motif Scanning analysis. The scoring matrix for each amino acid was compiled using statistical analysis from the two known databases of methyltransferases. The number of each amino acid residue at each position of each motif was tallied. p values from χ² tests were obtained by comparing the actual count of each amino acid with the “expected” count, which was calculated from frequencies found in the Pseudogene.org database. Values were then converted to the absolute value of the scores by taking the negative log, and scores were designated with a negative value if the actual amino acid count was less than the expected.
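The scoring step described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the two-cell χ² test (the residue versus all other residues, 1 degree of freedom), the base-10 logarithm, the p-value floor, and the function names `chi2_p_value_1df` and `signed_score` are all assumptions made for illustration, and the background frequency is assumed to lie strictly between 0 and 1.

```python
import math


def chi2_p_value_1df(observed, expected, total):
    """p value of a two-cell chi-square test (this residue vs. all others).

    Assumption: a goodness-of-fit test with 1 degree of freedom, for which
    the survival function is erfc(sqrt(statistic / 2)).
    """
    other_observed = total - observed
    other_expected = total - expected
    statistic = ((observed - expected) ** 2 / expected
                 + (other_observed - other_expected) ** 2 / other_expected)
    return math.erfc(math.sqrt(statistic / 2.0))


def signed_score(observed, total, background_freq, floor=1e-300):
    """Negative-log p value, signed by over- or under-representation.

    observed        -- count of one amino acid at one motif position
    total           -- number of sequences tallied at that position
    background_freq -- expected frequency of the residue (0 < f < 1)
    floor           -- hypothetical lower bound that keeps log10 finite
    """
    expected = total * background_freq
    p_value = max(chi2_p_value_1df(observed, expected, total), floor)
    magnitude = -math.log10(p_value)  # log base 10 is an assumption
    # Negative score when the residue is seen less often than expected.
    return magnitude if observed >= expected else -magnitude
```

Under these assumptions, a residue seen in 30 of 40 aligned sequences against a background frequency of 0.07 receives a large positive score, while a residue that never appears at that position receives a negative one.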
Fig. 4.
Conservation of spacing between motifs. The numbers of amino acids between the extended Motifs I and Post I and between Motifs II and III are depicted for known methyltransferases. Scores for spacing were based on the frequencies in the yeast and structural databases combined with penalties for excessive gap distance, as described under “Experimental Procedures.”
Fig. 5.
Conserved amino acid residues in and adjacent to the four methyltransferase motifs. The motifs are arranged as they occur in the β sheet. Conserved amino acids along with the χ² p values from the yeast data set are shown in boxes. The residues that composed the originally described motifs are highlighted in bold print with a gray background. The secondary structure as determined from both the yeast and crystal data sets is denoted by an arrow (β strand), purple box (helix), or line (non-helix or strand). Dotted representations indicate structures that are not present in all methyltransferases. Residues involved in turns are also shown. The chemical interactions of key amino acid residues within the protein (blue lines) and/or with cofactor AdoMet (*) are shown. The thick black lines between each of the β strands indicate amino acids contributing to two hydrophobic pockets created by side chains of residues coming into (solid) and out of (dashed) the plane of the β sheet. Side chains from helical regions contributing to these hydrophobic pockets are indicated similarly by solid and dashed circles and ovals.
Fig. 6.
Extended Motifs I, Post I, II, and III used for Multiple Motif Scanning analysis. The amino acid frequency among known yeast methyltransferases is depicted using WebLogo (30) for the expanded Motifs I, Post I, II, and III determined in this work. A larger number of bits for each letter designates the importance of each amino acid in the motif.
Fig. 7.
Multiple Motif Scanning software. The Multiple Motif Scanning program was developed to scan the proteome for novel methyltransferases to resolve the problems encountered by Katz et al. (6). The yeast matrix and the crystal structure matrix were independently used to scan the S. cerevisiae proteome. Multiple Motif Scanning recognizes the top five best matches for the first motif entered and subsequently identifies the top five plausible second motifs for each of the matches of the first motif. The program continues to find the top five sequences that fit the motif for every previous combination to produce 5^n matches for n motifs. All combinations are scored, and the top 10 combinations are saved. Proteins are ranked among each other based on the top score. The extended Motifs I, Post I, II, and III were used for analysis with gap considerations between motifs.
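The search described in this caption can be sketched as follows. This is an illustrative reconstruction, not the distributed program: the matrix representation (one dictionary of residue scores per motif position), the rule that each motif must start after the previous one ends, and the names `window_score`, `top_hits`, and `scan_protein` are assumptions. Gap scoring between motifs is deliberately left out, because its details are given in the paper's “Experimental Procedures” rather than here.

```python
import heapq


def window_score(matrix, window):
    """Sum the per-position scores of a sequence window.

    matrix is a list of dicts, one per motif position, mapping residue
    letters to scores; residues missing from a dict score 0.0.
    """
    return sum(column.get(residue, 0.0)
               for column, residue in zip(matrix, window))


def top_hits(sequence, matrix, start, k=5):
    """Return the k best (score, position) placements at or after start."""
    width = len(matrix)
    hits = [(window_score(matrix, sequence[i:i + width]), i)
            for i in range(start, len(sequence) - width + 1)]
    return heapq.nlargest(k, hits)


def scan_protein(sequence, matrices, k=5, keep=10):
    """Enumerate up to k**n motif combinations and keep the best ones.

    Each combination is (total score, motif start positions). Motifs are
    placed in order, each starting after the previous motif ends.
    """
    combos = [(0.0, (), 0)]  # (score so far, positions, next start)
    for matrix in matrices:
        extended = []
        for score, positions, next_start in combos:
            for hit_score, pos in top_hits(sequence, matrix, next_start, k):
                extended.append((score + hit_score,
                                 positions + (pos,),
                                 pos + len(matrix)))
        combos = extended
    return heapq.nlargest(keep, ((s, p) for s, p, _ in combos))


# Toy example with two hypothetical two-residue motifs ("AA" then "CC").
motif_a = [{"A": 1.0}, {"A": 1.0}]
motif_c = [{"C": 1.0}, {"C": 1.0}]
results = scan_protein("GGAAGGGCCGG", [motif_a, motif_c])
```

Here `results[0]` holds the best-scoring combination for the toy sequence, and ranking proteins by that top score mirrors the ranking step described in the caption. Keeping only the top five placements per motif bounds the search at 5^n combinations instead of scoring every possible placement.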
Fig. 8.
Accuracy of Multiple Motif Scanning. Motifs found by the Multiple Motif Scanning program using the yeast matrix were compared with the HMM profile alignments to calculate the percentage of inaccuracy in identifying the motifs. The program outputs the overall top score, calculated from the combination of sequence-fitting motifs and the spacing between them, along with the next nine top-scoring combinations.
Fig. 9.
HMM profile clustering to determine putative substrates of potential methyltransferases. HMM profile versus profile clustering was utilized to find the calculated homology between proteins. Proteins clustered in several groups (A–K) that display similarity in type of substrate as well as the atomic nucleophile of the substrate in the methyltransferase reaction (carbon, nitrogen, oxygen, etc.). Known methyltransferases are depicted in black, known non-methyltransferases are depicted in magenta, and unknown ORFs are depicted in green.
