PROTOGENE: turning amino acid alignments into bona fide CDS nucleotide alignments

Sébastien Moretti¹, Frédéric Reinier, Olivier Poirot, Fabrice Armougom, Stéphane Audic, Vladimir Keduas, Cédric Notredame

Affiliation

¹ Information Génomique et Structurale, CNRS UPR2589, Institute for Structural Biology and Microbiology (IBSM), Parc Scientifique de Luminy, 163 Avenue de Luminy, FR 13288, Marseille cedex 09, France.

PMID: 16845080
PMCID: PMC1538918
DOI: 10.1093/nar/gkl170

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W600-3.

Abstract

We describe Protogene, a server that can turn a protein multiple sequence alignment into the equivalent alignment of the original gene coding DNA. Protogene relies on a pipeline where every initial protein sequence is BLASTed against RefSeq or NR. The annotation associated with potential matches is used to identify the gene sequence. This gene sequence is then aligned with the query protein using Exonerate in order to extract a coding nucleotide sequence matching the original protein. Protogene can handle protein fragments and will return every CDS coding for a given protein, even if they occur in different genomes. Protogene is available from http://www.tcoffee.org/.
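
The final step of such a pipeline, once a CDS matching each protein has been recovered, is to thread the protein alignment's gaps onto the nucleotide sequences. The following is a minimal sketch of that idea only, not Protogene's actual implementation: the function names `translate` and `thread_cds` are hypothetical, the standard genetic code is assumed, and each CDS is assumed to be in frame and to encode its protein exactly.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame CDS; unknown codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3))


def thread_cds(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map one aligned protein row onto its CDS, one codon per residue.

    Hypothetical helper: each gap in the protein row becomes three gap
    characters; each residue consumes the next codon of the CDS.
    """
    cds = cds.upper().replace("U", "T")
    residues = aligned_protein.replace(gap, "").upper()
    protein = translate(cds)
    # Drop a trailing stop codon when the protein row does not include it.
    if protein.endswith("*") and not residues.endswith("*"):
        protein, cds = protein[:-1], cds[:-3]
    if protein != residues:
        raise ValueError("CDS does not encode the aligned protein")
    out, pos = [], 0
    for ch in aligned_protein:
        if ch == gap:
            out.append(gap * 3)
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)


if __name__ == "__main__":
    # Two toy rows of a protein alignment and their (made-up) CDSs.
    print(thread_cds("M-SK", "ATGTCAAAATAA"))
    print(thread_cds("MS-K", "ATGAGCAAG"))
```

The strict equality check reflects the requirement that the extracted CDS match the original protein; a real pipeline would need to decide how to handle fragments, ambiguous residues and alternative genetic codes, none of which this sketch covers.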

Figures

**Figure 1**
Protogene flow chart. Sequences are first BLASTed against RefSeq. If no match is found, they are then BLASTed against NR. Nucleotide sequences are fetched from NCBI and processed with Exonerate to yield CDSs that perfectly match the original protein.
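
The fallback order in this flow chart (RefSeq first, NR only when RefSeq yields nothing) can be sketched as a simple loop over search functions. This is an illustration under stated assumptions, not the server's code: `find_source_record` is a hypothetical name, and the search callables are stand-ins for real BLAST queries, which are not shown here.

```python
from typing import Callable, Optional, Sequence, Tuple

# A search function takes a protein sequence and returns a hit identifier,
# or None when the database has no match. Real BLAST calls are NOT shown;
# these callables are placeholders supplied by the caller.
SearchFn = Callable[[str], Optional[str]]


def find_source_record(
    protein: str, databases: Sequence[Tuple[str, SearchFn]]
) -> Optional[Tuple[str, str]]:
    """Try each database in order and return (database name, hit) for the first match."""
    for name, search in databases:
        hit = search(protein)
        if hit is not None:
            return name, hit
    return None


if __name__ == "__main__":
    # Toy in-memory "databases" with made-up identifiers, for demonstration only.
    refseq = {"MSK": "refseq_hit_1"}
    nr = {"MSK": "nr_hit_1", "MAK": "nr_hit_2"}
    order = [("RefSeq", refseq.get), ("NR", nr.get)]
    print(find_source_record("MSK", order))  # found in RefSeq, NR is never queried
    print(find_source_record("MAK", order))  # falls back to NR
    print(find_source_record("MWW", order))  # no match anywhere
```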

**Figure 2**
Protogene output on the CLP Serine Protease family. The Seed MSA of the PFAM profile entry (PFAM PF00574) was processed by Protogene. In the portion of the alignment containing the serine active site, the two serine codon classes are indicated in yellow (UCN) and green (AGY).
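
The two serine codon classes named in this caption are invisible in a protein alignment, where every row simply shows S, but are easy to tell apart once the CDS alignment is available. A minimal sketch of the classification, with the hypothetical helper name `serine_codon_class` and the standard genetic code assumed:

```python
from typing import List, Optional


def serine_codon_class(codon: str) -> Optional[str]:
    """Return 'UCN' or 'AGY' for a serine codon, or None for anything else.

    Accepts DNA (T) or RNA (U) spelling. Under the standard genetic code,
    serine is encoded by UCU, UCC, UCA, UCG (class UCN) and AGU, AGC (class AGY).
    """
    codon = codon.upper().replace("T", "U")
    if len(codon) != 3:
        raise ValueError("a codon must have exactly three characters")
    if codon[:2] == "UC" and codon[2] in "UCAG":
        return "UCN"
    if codon[:2] == "AG" and codon[2] in "UC":
        return "AGY"
    return None


def classify_column(codons: List[str]) -> List[Optional[str]]:
    """Classify every codon in one column of a codon alignment."""
    return [serine_codon_class(c) for c in codons]


if __name__ == "__main__":
    # A made-up alignment column: three serine codons, one gap, one arginine codon.
    print(classify_column(["TCA", "AGC", "TCG", "---", "AGA"]))
```

Switching between the two classes requires changes at both the first and second codon positions, which is why the class used at a conserved serine can be informative when comparing sequences.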

**Figure 3**
Protogene output on the Human H2A Histone protein. The original protein sequence is indicated on the top. Light coloured columns are those not entirely conserved.
