Nucleic Acids Res. 2000 Sep 1;28(17):3278-88.
doi: 10.1093/nar/28.17.3278.

Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames

T Dandekar et al. Nucleic Acids Res. 2000.

Abstract

Four years after the original sequence submission, we have re-annotated the genome of Mycoplasma pneumoniae to incorporate novel data. The total number of ORFs has been increased from 677 to 688 (10 new proteins were predicted in intergenic regions, two further were newly identified by mass spectrometry and one protein ORF was dismissed) and the number of RNAs from 39 to 42 genes. For 19 of the now 35 tRNAs and for six other functional RNAs the exact genome positions were re-annotated and two new tRNA(Leu) and a small 200 nt RNA were identified. Sixteen protein reading frames were extended and eight shortened. For each ORF a consistent annotation vocabulary has been introduced. Annotation reasoning, annotation categories and comparisons to other published data on M.pneumoniae functional assignments are given. Experimental evidence includes 2-dimensional gel electrophoresis in combination with mass spectrometry as well as gene expression data from this study. Compared to the original annotation, we increased the number of proteins with predicted functional features from 349 to 458. The increase includes 36 new predictions and 73 protein assignments confirmed by the published literature. Furthermore, there are 23 reductions and 30 additions with respect to the previous annotation. mRNA expression data support transcription of 184 of the functionally unassigned reading frames.
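
The ORF and functional-assignment totals quoted in the abstract can be checked with simple bookkeeping. The following is a minimal sketch: the numbers are taken from the abstract, while the variable names are illustrative and do not come from the paper.

```python
# Bookkeeping check for the counts quoted in the abstract.
# All variable names are illustrative assumptions; only the numbers are from the text.

original_orfs = 677
new_intergenic = 10   # new proteins predicted in intergenic regions
new_mass_spec = 2     # further proteins newly identified by mass spectrometry
dismissed = 1         # one protein ORF dismissed

revised_orfs = original_orfs + new_intergenic + new_mass_spec - dismissed

original_assigned = 349    # proteins with predicted functional features (original annotation)
new_predictions = 36       # new functional predictions
literature_confirmed = 73  # assignments confirmed by the published literature

revised_assigned = original_assigned + new_predictions + literature_confirmed

print(revised_orfs, revised_assigned)  # prints: 688 458
```

Both totals agree with the figures stated in the abstract (688 ORFs and 458 proteins with predicted functional features).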

Figures

Figure 1
(A) Peptides identified by mass spectrometry of the protein MPN033(MP121) (see Materials and Methods). Those peptides matching the genome-derived sequence are shown in bold. The protein reading frame sequence not covered by these peptides is shown in plain text. Extension of the MPN033(MP121) sequence relative to its original annotation could be confirmed. The methionine at the start is shown in italics. The start position given in the original annotation is underlined. The exact start sequence is predicted (as shown) at the methionine directly before the furthest N-terminal peptide determined. (B) Identification of three new short proteins by mass spectrometry. These proteins are shorter than 100 amino acids. The methionine at the start is shown in italics. The first protein shows high similarity to a pentitol phosphotransferase IIB subunit. This peptide was also predicted from screening intergenic regions and by Reizer et al. (30). The other two are hypothetical (show no similarities). The short proteins are given here as maximal extensions between two stop codons according to the peptides sequenced. A detailed analysis of the peptide data derived for these proteins as well as the proteome of M.pneumoniae will be published elsewhere (J.T.Regula, B.Ueberle, G.Boguth, A.Görg, M.Schnölzer, R.Herrmann and R.Frank, submitted for publication). (C) Sequence of the reading frame MPN254(MP579.1) (hypothetical) predicted between MPN255(MP579) and MPN253(MP580) according to the mass spectrometric data. Peptides matching the genome-derived sequence identified by mass spectrometry are shown in bold. Protein sequence not covered by these peptides is shown in plain text (protein coverage 40.8% by amino acid count, 40.1% by mass). (D) Mycoplasma pneumoniae proteins were separated by 2-dimensional gel electrophoresis in a pH gradient from 3 to 10, in a vertical 12.5% slab gel and stained with silver. A part of the 2-dimensional gel is shown, indicating the presence of the product of gene MPN254(MP579.1) (labeled A, sequence as shown in C). Previously known MP proteins surrounding MPN254(MP579.1) in the 2-dimensional gel are labeled in red, with the MPN number given at the top and the number according to Himmelreich et al. (1) at the bottom.
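
The caption for panel C reports protein coverage by amino acid count. One way such a figure could be computed is sketched below. This is a minimal illustration, not the authors' method: the function name `coverage_by_residue` is hypothetical, and the toy sequence and peptides are invented for the example and are not the MPN254(MP579.1) data. Coverage by mass would additionally need per-residue masses and is not attempted here.

```python
def coverage_by_residue(protein, peptides):
    """Return the percentage of residues in `protein` covered by at least one
    exact occurrence of any peptide in `peptides` (hypothetical helper)."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # skip empty strings, which would match everywhere
        start = protein.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence as covered.
            for i in range(start, start + len(pep)):
                covered[i] = True
            # Look for further (possibly overlapping) occurrences.
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy data, invented for illustration only.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"  # 22 residues
toy_peptides = ["TAYIAK", "SHFSR"]      # 6 + 5 = 11 residues, no overlap

coverage = coverage_by_residue(toy_protein, toy_peptides)
print(f"{coverage:.1f}%")  # prints: 50.0%
```

Marking residues in a boolean mask, rather than summing peptide lengths, avoids double-counting when identified peptides overlap.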
Figure 2
(A) Sequence alignment of MPN280(MP555) with related secD sequences. Only the central part (140 amino acid positions) of the alignment is given. After the M.pneumoniae sequence the M.genitalium homolog is shown (MG277), aligned with secD proteins from various species (top to bottom) (SwissProt identifier/accession no.): Mycobacterium leprae (SECD_MYCLE); Mycobacterium tuberculosis (SECD_MYCTU); Bacillus subtilis (accession no. AAC31122; the secD domain from the fusion protein secDF only); Treponema pallidum (SECD_TREPA); Thermotoga maritima (accession no. Q9WZW4); Borrelia burgdorferi (accession no. AAC66993); Helicobacter pylori (SECD_HELPY); Campylobacter jejuni (accession no. CAB73348); Escherichia coli (SECD_ECOLI); Haemophilus influenzae (SECD_HAEIN); Rickettsia prowazekii (SECD_RICPR); Streptomyces coelicolor (SECD_STRCO); Synechocystis PCC6803 (SECD_SYNY3). (B) Phylogenetic tree with bootstrap values (1000 trials) comparing certified secD and secF domains (T.mar, Thermotoga maritima; S.sp., Synechocystis PCC6803; R.pro, Rickettsia prowazekii; H.pyl, Helicobacter pylori; E.col, E.coli; B.sub, Bacillus subtilis) with MPN280(MP555) and its homolog MG277, secA from H.influenzae and MPN210(MP622) from M.pneumoniae and polymerase III subunits (Aquifex aeolicus).

References

    1. Himmelreich,R., Hilbert,H., Plagens,H., Pirkl,E., Li,B.-C. and Herrmann,R. (1996) Nucleic Acids Res., 24, 4420–4449.
    2. Himmelreich,R., Plagens,H., Hilbert,H., Reiner,B. and Herrmann,R. (1997) Nucleic Acids Res., 25, 701–712.
    3. Brenner,S.E. (1999) Trends Genet., 15, 132–133.
    4. Koonin,E.V., Mushegian,A.R. and Rudd,K.E. (1996) Curr. Biol., 6, 404–416.
    5. Ouzounis,C., Casari,G., Valencia,A. and Sander,C. (1996) Mol. Microbiol., 20, 898–900.
