An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

Bianca Habermann¹, Anne-Gaelle Bebin, Stephan Herklotz, Michael Volkmer, Kay Eckelt, Kerstin Pehlke, Hans Henning Epperlein, Hans Konrad Schackert, Glenis Wiebe, Elly M Tanaka

PMID: 15345051
PMCID: PMC522874
DOI: 10.1186/gb-2004-5-9-r67

Comparative Study

Bianca Habermann et al. Genome Biol. 2004;5(9):R67. doi: 10.1186/gb-2004-5-9-r67. Epub 2004 Aug 13.

Affiliation

¹ Scionics Computer Innovation GmbH, Pfotenhauerstrasse 110, Dresden 01307, Germany. habermann@mpi-cbg.de

Abstract

Background: The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum.

Results: Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians.

Conclusions: Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich resource for molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online.

Figures

**Figure 1**
Distribution of sequence length. **(a)** Distribution of read lengths of the sequenced ESTs after quality control. The average read length was 569 bp, corresponding to a peak of between 500 and 600 bp. **(b)** Distribution of sequence length of assembled contigs. The average length of contigs was 597 bp. **(c)** Distribution of the number of ESTs per assembled contig. Most of the contigs had one EST. The two largest contigs contained over 400 ESTs (cytochrome c oxidase subunit I and 12S rRNA, respectively).

**Figure 2**
Homology of *A. mexicanum* contigs to protein and nucleotide sequences from other species. **(a)** Distribution of E-values from the first identified hit in the protein non-redundant database that was used to assign a putative identity to the contig. The majority of contigs identified a protein with an E-value between 1e-20 and 1e-99. In 11% of the cases, the E-value of the first hit was below 1e-100, and the hit can therefore be considered a true ortholog. **(b)** Distribution of hits in the different sequence databases that were searched sequentially.

**Figure 3**
Annotated GO terms and protein domains in the *A. mexicanum* EST libraries. **(a)** Gene Ontology electronic annotation in the category 'biological process' of contigs from *A. mexicanum*. The largest proportion of annotated contigs was assigned a 'cellular process' (87%). Of those, five large groups of cellular processes emerged, with 'cell cycle/proliferation' (13%), 'intracellular signaling' and 'intracellular transport' (8% and 15%), 'metabolism' (17%), 'protein metabolism/modification' (18%) and 'RNA metabolism' (13%). **(b)** Domains associated with cellular processes identified in the *A. mexicanum* contig sequence dataset. The largest fraction of contigs was associated with a domain function in 'intracellular transport', followed by 'RNA-binding and metabolism' and 'DNA-binding and transcriptional control'.

**Figure 4**
Phylogenetic analysis of the vertebrate cyclin-dependent kinase (CDK) inhibitors (CKIs) p21(Cip1), p27(Kip1) and p57(Kip2). **(a)** Reference phylogenetic tree of mitochondrial 12S rRNA. The Caudata and Salientia both branch out to build the amphibian group. **(b)** Unrooted phylogenetic tree of the cyclin B1 gene family. The amphibian cyclin B1 family members form a distinct group. **(c)** Unrooted phylogenetic tree of the amino-terminal CDK-inhibitory domain of vertebrate p21, p27, p28 and p57, which is conserved between the protein families. p27 of *A. mexicanum* clearly groups with the p27 proteins from other vertebrates. The amphibian-specific p28 family does not parse with any single group. Note, however, that unlike the 12S rRNA tree, the *A. mexicanum* and *A. t. tigrinum* p27 branch out with that of *D. rerio*. **(d)** Unrooted phylogenetic tree of the full-length kinase inhibitor sequences. Using the full-length protein sequences from the CKI families, the p28 family branches off between the p21 and p27 families. **(e)** Multiple sequence alignment of the amino-terminal, CDK-inhibitory region of the CKI families. The protein sequence of *A. mexicanum* p27 is clearly the ortholog of the p27 family, yet displays higher than expected divergence on the protein level. The same divergence is observed for the ambystomatid p57 proteins. The p28 family has extremely high sequence divergence compared to any other CDKN1 family member. Conserved residues between the three CDKN1 families are highlighted in green and the p28 family in light blue. Residues that differ between ambystomatid sequences and the other vertebrate species are highlighted in the ambystomatid sequences in red. Accession numbers are: NM_131513 (*D. rerio* ccnb1), NM_031966 (*H. sapiens* ccnb1), BC041302 (*X. laevis* ccnb1), NM_172301 (*M. musculus* ccnb1), NM_171991 (*R. norvegicus* ccnb1), P13351 (*X. laevis* ccnb2), XP_343420 (*R. norvegicus* ccnb2), P29332 (*G. gallus* ccnb2), NP_004692 (*H. sapiens* ccnb2), NP_031656 (*M. musculus* ccnb2), CAC24491 (*X. laevis* ccnb3), P39963 (*G. gallus* ccnb3), CAC94915 (*H. sapiens* ccnb3), NP_898836 (*M. musculus* ccnb3), AAH56746.1 (*D. rerio* p27A, Drp27A); AAK84219.1 (*D. rerio* p27, Drp27); CN056871.1 (*A. t. tigrinum* p27, Attp27); AAM22491.1 (*G. gallus* p27, Ggp27); NP_004055.1 (*H. sapiens* p27, Hsp27); P46414 (*M. musculus* p27, Mmp27); NP_113950.1 (*R. norvegicus* p27, Rnp27); NP_000067.1 (*H. sapiens* p57, Hsp57); P49919 (*M. musculus* p57, Mmp57); XP_341967.1 (*R. norvegicus* p57, Rnp57); CN039016.1 (*A. mexicanum* p57, Amp57); BM489375.1 (*G. gallus* p57, Ggp57); CK697132.1 (*D. rerio* p57, Drp57); AAH01935.1 (*H. sapiens* p21, Hsp21); NP_031695.1 (*M. musculus* p21, Mmp21); NP_542960.1 (*R. norvegicus* p21, Rnp21); AL639561.2 (*X. tropicalis* p21, Xtp21); BJ065460.1 (*X. laevis* p21, Xlp21); AAN63876.1 (*G. gallus* p21, Ggp21); I51683 (*X. laevis* Xic1, XlXic1); BX712320.1 (*X. tropicalis* p28, Xtp28); TNeu143i03.p1cSP6 (*X. tropicalis* p28A, Xtp28A); CN033557.1 (*A. mexicanum* p28, Amp28); CN035131.1 (*A. mexicanum* p28A, Amp28A); CN033708.1 (*A. mexicanum* p28B, Amp28B). The scale bar indicates substitutions per site.

**Figure 5**
The *Ambystoma mexicanum* EST database. A relational database was created as a sequence storage and annotation resource of the sequenced ESTs from *A. mexicanum*. **(a)** The main entry site of the EST resource is the contig page, where a subset of the information is available, including the identity of included ESTs, putative identity of the contig, GO annotation including cellular role, biochemical function and cellular component, a list of homologs from different model organisms, and identified conserved domains. Source data are available for all BLAST-based alignments, for external sequence or domain data, and for the complete contig sequence. **(b,c)** EST information and protein information pages, containing more detailed description of storage information, library source and read length (b). A complete list of homologs and identified conserved domains can be accessed on the protein information page (c). For a more detailed description of the database, see text.
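
The E-value classes described for Figure 2a (first hits below 1e-100, and hits between 1e-20 and 1e-99) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `evalue_bin` and `bin_first_hits`, the bin labels, and the catch-all class for weaker hits are assumptions made here for illustration only.

```python
from collections import Counter


def evalue_bin(evalue: float) -> str:
    """Assign a BLAST E-value to one of three illustrative classes.

    The two thresholds (1e-100 and 1e-20) come from the Figure 2a
    caption; the labels and the third class are assumptions.
    """
    if evalue < 1e-100:
        # Includes hits reported as 0.0 by BLAST.
        return "< 1e-100"
    if evalue <= 1e-20:
        return "1e-100 to 1e-20"
    return "> 1e-20"


def bin_first_hits(evalues):
    """Return the fraction of first hits that falls into each class."""
    counts = Counter(evalue_bin(e) for e in evalues)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {label: n / total for label, n in counts.items()}
```

Given one first-hit E-value per contig, `bin_first_hits` returns the share of contigs in each class, which is the kind of summary the caption reports (for example, the 11% of first hits below 1e-100).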
