Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames

T Dandekar¹, M Huynen, J T Regula, B Ueberle, C U Zimmermann, M A Andrade, T Doerks, L Sánchez-Pulido, B Snel, M Suyama, Y P Yuan, R Herrmann, P Bork

PMID: 10954595
PMCID: PMC110705
DOI: 10.1093/nar/28.17.3278

Nucleic Acids Res. 2000 Sep 1;28(17):3278-88.

Affiliation

¹ EMBL, Postfach 102209, D-69012 Heidelberg, Germany, Max Delbrück Centre for Molecular Medicine, Robert-Rössle-Straße 10, 13092 Berlin-Buch, Germany.

Abstract

Four years after the original sequence submission, we have re-annotated the genome of Mycoplasma pneumoniae to incorporate novel data. The total number of ORFs has been increased from 677 to 688 (10 new proteins were predicted in intergenic regions, two further were newly identified by mass spectrometry and one protein ORF was dismissed) and the number of RNAs from 39 to 42 genes. For 19 of the now 35 tRNAs and for six other functional RNAs the exact genome positions were re-annotated and two new tRNA(Leu) and a small 200 nt RNA were identified. Sixteen protein reading frames were extended and eight shortened. For each ORF a consistent annotation vocabulary has been introduced. Annotation reasoning, annotation categories and comparisons to other published data on M.pneumoniae functional assignments are given. Experimental evidence includes 2-dimensional gel electrophoresis in combination with mass spectrometry as well as gene expression data from this study. Compared to the original annotation, we increased the number of proteins with predicted functional features from 349 to 458. The increase includes 36 new predictions and 73 protein assignments confirmed by the published literature. Furthermore, there are 23 reductions and 30 additions with respect to the previous annotation. mRNA expression data support transcription of 184 of the functionally unassigned reading frames.
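
The counts quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch and not part of the original paper: the variable names are illustrative, and it assumes that the three newly identified RNAs and the two groups of functional assignments are simply additive.

```python
# Counts quoted from the abstract; variable names are illustrative only.

# Protein ORFs: 10 intergenic predictions, 2 mass spectrometry finds, 1 dismissal.
orfs_before = 677
new_intergenic = 10
new_mass_spec = 2
dismissed = 1
orfs_after = orfs_before + new_intergenic + new_mass_spec - dismissed

# RNA genes: two new tRNA(Leu) plus one small 200 nt RNA (assumed additive).
rnas_before = 39
new_trna_leu = 2
new_small_rna = 1
rnas_after = rnas_before + new_trna_leu + new_small_rna

# Proteins with predicted functional features.
functional_before = 349
new_predictions = 36
literature_confirmed = 73
functional_after = functional_before + new_predictions + literature_confirmed

print(orfs_after, rnas_after, functional_after)  # prints: 688 42 458
```

Under these assumptions the totals agree with the reported figures of 688 ORFs, 42 RNA genes and 458 functionally annotated proteins.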

Figures

**Figure 1**
(A) Peptides identified by mass spectrometry of the protein MPN033_(MP121) (see Materials and Methods). Peptides matching the genome-derived sequence are shown in bold; the part of the protein reading frame not covered by these peptides is shown in plain text. The extension of the MPN033_(MP121) sequence relative to its original annotation could be confirmed. The methionine at the start is shown in italic, and the start position given in the original annotation is underlined. The exact start is predicted (as shown) at the methionine directly preceding the most N-terminal peptide determined. (B) Identification of three new short proteins by mass spectrometry. These proteins are shorter than 100 amino acids. The methionine at the start is shown in italic. The first protein shows high similarity to a pentitol phosphotransferase IIB subunit; it was also predicted from screening intergenic regions and by Reizer *et al*. (30). The other two are hypothetical (they show no similarities). The short proteins are given here as maximal extensions between two stop codons according to the peptides sequenced. A detailed analysis of the peptide data derived for these proteins as well as the proteome of *M.pneumoniae* will be published elsewhere (J.T.Regula, B.Ueberle, G.Boguth, A.Görg, M.Schnölzer, R.Herrmann and R.Frank, submitted for publication). (C) Sequence of the reading frame MPN254_(MP579.1) (hypothetical) predicted between MPN255_(MP579) and MPN253_(MP580) according to the mass spectrometric data. Peptides matching the genome-derived sequence identified by mass spectrometry are shown in bold. Protein sequence not covered by these peptides is shown in plain text (protein coverage 40.8% by amino acid count, 40.1% by mass). (D) *Mycoplasma pneumoniae* proteins were separated by 2-dimensional gel electrophoresis in a pH gradient from 3 to 10, in a vertical 12.5% slab gel and stained with silver. Part of the 2-dimensional gel is shown, with the product of gene MPN254_(MP579.1) labeled A (sequence as shown in C). Previously known MP proteins surrounding MPN254_(MP579.1) in the gel are labeled in red, with the MPN number given at the top and the number according to Himmelreich *et al*. (1) at the bottom.

**Figure 2**
(A) Sequence alignment of MPN280_(MP555) with related secD sequences. Only the central part (140 amino acid positions) of the alignment is given. After the *M.pneumoniae* sequence the *M.genitalium* homolog is shown (MG277), aligned with secD proteins from various species (top to bottom) (SwissProt identifier/accession no.): *Mycobacterium leprae* (SECD_MYCLE); *Mycobacterium tuberculosis* (SECD_MYCTU); *Bacillus subtilis* (accession no. AAC31122; the secD domain from the fusion protein secDF only); *Treponema pallidum* (SECD_TREPA); *Thermotoga maritima* (accession no. Q9WZW4); *Borrelia burgdorferi* (accession no. AAC66993); *Helicobacter pylori* (SECD_HELPY); *Campylobacter jejuni* (accession no. CAB73348); *Escherichia coli* (SECD_ECOLI); *Haemophilus influenzae* (SECD_HAEIN); *Rickettsia prowazekii* (SECD_RICPR); *Streptomyces coelicolor* (SECD_STRCO); *Synechocystis* PCC6803 (SECD_SYNY3). (B) Phylogenetic tree with bootstrap values (1000 trials) comparing certified secD and secF domains (T.mar, *Thermotoga maritima*; S.sp., *Synechocystis* PCC6803; R.pro, *Rickettsia prowazekii*; H.pyl, *Helicobacter pylori*; E.col, *E.coli*; B.sub, *Bacillus subtilis*) with MPN280_(MP555) and its homolog MG277, secA from *H.influenzae* and MPN210_(MP622) from *M.pneumoniae* and polymerase III subunits (*Aquifex aeolicus*).

References

1. Himmelreich R., Hilbert,H., Plagens,H., Pirkl,E., Li,B.-C. and Herrmann,R. (1996) Nucleic Acids Res., 24, 4420–4449.
2. Himmelreich R., Plagens,H., Hilbert,H., Reiner,B. and Herrmann,R. (1997) Nucleic Acids Res., 25, 701–712.
3. Brenner S.E. (1999) Trends Genet., 15, 132–133.
4. Koonin E.V., Mushegian,A.R. and Rudd,K.E. (1996) Curr. Biol., 6, 404–416.
5. Ouzounis C., Casari,G., Valencia,A. and Sander,C. (1996) Mol. Microbiol., 20, 898–900.
