Comparative Study
Genome Biol. 2004;5(9):R67.
doi: 10.1186/gb-2004-5-9-r67. Epub 2004 Aug 13.

An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries


Bianca Habermann et al. Genome Biol. 2004.

Abstract

Background: The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum.

Results: Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians.
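As a rough reading of the 25% estimate above, and assuming (the abstract does not say this) that each contig corresponds to one distinct expressed gene, the implied total number of expressed genes would be approximately:

$$N_{\text{expressed}} \approx \frac{6{,}377}{0.25} \approx 25{,}500$$

Because contigs from the same gene may fail to assemble together, this is an order-of-magnitude illustration rather than a figure reported by the authors.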

Conclusions: Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich resource for molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online.


Figures

Figure 1
Distribution of sequence length. (a) Distribution of read lengths of the sequenced ESTs after quality control. The average read length was 569 bp, corresponding to a peak of between 500 and 600 bp. (b) Distribution of sequence length of assembled contigs. The average length of contigs was 597 bp. (c) Distribution of the number of ESTs per assembled contig. Most of the contigs had one EST. The two largest contigs contained over 400 ESTs (cytochrome c oxidase subunit I and 12S rRNA, respectively).
Figure 2
Homology of A. mexicanum contigs to protein and nucleotide sequences from other species. (a) Distribution of E-values from the first identified hit in the protein non-redundant database that was used to assign a putative identity to the contig. The majority of contigs identified a protein with an E-value between 1e-20 and 1e-99. In 11% of the cases, the E-value of the first hit was below 1e-100 and can therefore be considered a true ortholog. (b) Distribution of hits in the different sequence databases that were searched sequentially.
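The E-value binning described in panel (a) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, bin labels and example E-values are all assumptions made here, and only the two thresholds (1e-100 and 1e-20) come from the caption.

```python
from collections import Counter

def evalue_bin(evalue: float) -> str:
    """Assign a best-hit E-value to a coarse bin (labels are hypothetical)."""
    if evalue < 1e-100:
        return "< 1e-100"          # very strong hits
    if evalue <= 1e-20:
        return "1e-100 to 1e-20"   # the range holding most contigs in panel (a)
    return "> 1e-20"               # weaker hits

def bin_fractions(evalues):
    """Return the fraction of E-values that falls into each bin."""
    if not evalues:
        return {}
    counts = Counter(evalue_bin(e) for e in evalues)
    return {label: n / len(evalues) for label, n in counts.items()}

# Made-up E-values for illustration only; these are not data from the paper.
example = [0.0, 3e-150, 1e-60, 2e-45, 5e-30, 1e-22, 4e-10, 1e-5]
fractions = bin_fractions(example)
```

With real data, the input would be the E-value of the first hit for each contig, one value per contig.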
Figure 3
Annotated GO terms and protein domains in the A. mexicanum EST libraries. (a) Gene Ontology electronic annotation in the category 'biological process' of contigs from A. mexicanum. The largest proportion of annotated contigs was assigned a 'cellular process' (87%). Of those, five large groups of cellular processes emerged, with 'cell cycle/proliferation' (13%), 'intracellular signaling' and 'intracellular transport' (8% and 15%), 'metabolism' (17%), 'protein metabolism/modification' (18%) and 'RNA metabolism' (13%). (b) Domains associated with cellular processes identified in the A. mexicanum contig sequence dataset. The largest fraction of contigs was associated with a domain function in 'intracellular transport', followed by 'RNA-binding and metabolism' and 'DNA-binding and transcriptional control'.
Figure 4
Phylogenetic analysis of the vertebrate cyclin-dependent kinase (CDK) inhibitors (CKIs) p21(Cip1), p27(Kip1) and p57(Kip2). (a) Reference phylogenetic tree of mitochondrial 12S rRNA. The Caudata and Salientia both branch out to build the amphibian group. (b) Unrooted phylogenetic tree of the cyclin B1 gene family. The amphibian cyclin B1 family members form a distinct group. (c) Unrooted phylogenetic tree of the amino-terminal CDK-inhibitory domain of vertebrate p21, p27, p28 and p57, which is conserved between the protein families. p27 of A. mexicanum clearly groups with the p27 proteins from other vertebrates. The amphibian-specific p28 family does not parse with any single group. Note, however, that unlike the 12S rRNA tree, the A. mexicanum and A. t. tigrinum p27 branch out with that of D. rerio. (d) Unrooted phylogenetic tree of the full-length kinase inhibitor sequences. Using the full-length protein sequences from the CKI families, the p28 family branches off between the p21 and p27 families. (e) Multiple sequence alignment of the amino-terminal, CDK-inhibitory region of the CKI families. The protein sequence of A. mexicanum p27 is clearly the ortholog of the p27 family, yet displays higher than expected divergence on the protein level. The same divergence is observed for the ambystomatid p57 proteins. The p28 family has extremely high sequence divergence compared to any other CDKN1 family member. Conserved residues between the three CDKN1 families are highlighted in green and the p28 family in light blue. Residues that differ between ambystomatid sequences and the other vertebrate species are highlighted in the ambystomatid sequences in red. Accession numbers are: NM_131513 (D. rerio ccnb1), NM_031966 (H. sapiens ccnb1), BC041302 (X. laevis ccnb1), NM_172301 (M. musculus ccnb1), NM_171991 (R. norvegicus ccnb1), P13351 (X. laevis ccnb2), XP_343420 (R. norvegicus ccnb2), P29332 (G. gallus ccnb2), NP_004692 (H. sapiens ccnb2), NP_031656 (M. musculus ccnb2), CAC24491 (X. laevis ccnb3), P39963 (G. gallus ccnb3), CAC94915 (H. sapiens ccnb3), NP_898836 (M. musculus ccnb3), AAH56746.1 (D. rerio p27A, Drp27A); AAK84219.1 (D. rerio p27, Drp27); CN056871.1 (A. t. tigrinum p27, Attp27); AAM22491.1 (G. gallus p27, Ggp27); NP_004055.1 (H. sapiens p27, Hsp27); P46414 (M. musculus p27, Mmp27); NP_113950.1 (R. norvegicus p27, Rnp27); NP_000067.1 (H. sapiens p57, Hsp57); P49919 (M. musculus p57, Mmp57); XP_341967.1 (R. norvegicus p57, Rnp57); CN039016.1 (A. mexicanum p57, Amp57); BM489375.1 (G. gallus p57, Ggp57); CK697132.1 (D. rerio p57, Drp57); AAH01935.1 (H. sapiens p21, Hsp21); NP_031695.1 (M. musculus p21, Mmp21); NP_542960.1 (R. norvegicus p21, Rnp21); AL639561.2 (X. tropicalis p21, Xtp21); BJ065460.1 (X. laevis p21, Xlp21); AAN63876.1 (G. gallus p21, Ggp21); I51683 (X. laevis Xic1, XlXic1); BX712320.1 (X. tropicalis p28, Xtp28); TNeu143i03.p1cSP6 (X. tropicalis p28A, Xtp28A); CN033557.1 (A. mexicanum p28, Amp28); CN035131.1 (A. mexicanum p28A, Amp28A); CN033708.1 (A. mexicanum p28B, Amp28B). The scale bar indicates substitutions per site.
Figure 5
The Ambystoma mexicanum EST database. A relational database was created as a sequence storage and annotation resource of the sequenced ESTs from A. mexicanum. (a) The main entry site of the EST resource is the contig page, where a subset of the information is available, including the identity of included ESTs, putative identity of the contig, GO annotation including cellular role, biochemical function and cellular component, a list of homologs from different model organisms, and identified conserved domains. Source data are available for all BLAST-based alignments, for external sequence or domain data, and for the complete contig sequence. (b,c) EST information and protein information pages, containing more detailed description of storage information, library source and read length (b). A complete list of homologs and identified conserved domains can be accessed on the protein information page (c). For a more detailed description of the database, see text.
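To make the contig-to-EST relationship in this caption concrete, here is a minimal sketch of how such a relational layout might look. The table names, column names and rows below are hypothetical placeholders chosen for illustration; the actual schema of the authors' database is not given in this text.

```python
import sqlite3

# In-memory database; the schema is an assumption, not the authors' design.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contig (
    contig_id         INTEGER PRIMARY KEY,
    putative_identity TEXT,
    sequence          TEXT NOT NULL
);
CREATE TABLE est (
    est_id      INTEGER PRIMARY KEY,
    contig_id   INTEGER NOT NULL REFERENCES contig(contig_id),
    library     TEXT NOT NULL,      -- e.g. 'embryo' or 'blastema'
    read_length INTEGER NOT NULL
);
""")

# Placeholder rows for illustration only; not real sequences or annotations.
conn.execute("INSERT INTO contig VALUES (1, 'placeholder identity', 'ACGT')")
conn.executemany(
    "INSERT INTO est VALUES (?, ?, ?, ?)",
    [(1, 1, "embryo", 560), (2, 1, "blastema", 580)],
)

# A contig page would list its member ESTs: one contig, many ESTs.
members = conn.execute(
    "SELECT est_id, library, read_length FROM est "
    "WHERE contig_id = ? ORDER BY est_id",
    (1,),
).fetchall()
```

Homologs, GO annotations and conserved domains would be further tables keyed on the contig in the same way.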

