Comparative Study
Genome Res. 2003 Dec;13(12):2736-46.
doi: 10.1101/gr.1674103.

Generation, annotation, evolutionary analysis, and database integration of 20,000 unique sea urchin EST clusters

Albert J Poustka et al. Genome Res. 2003 Dec.

Abstract

Together with the hemichordates, sea urchins represent basal groups of nonchordate invertebrate deuterostomes that occupy a key position in bilaterian evolution. Because sea urchin embryos are also amenable to functional studies, the sea urchin system has emerged as one of the leading models for the analysis of the function of genomic regulatory networks that control development. We have analyzed a total of 107,283 cDNA clones of libraries that span the development of the sea urchin Strongylocentrotus purpuratus. Normalization by oligonucleotide fingerprinting, EST sequencing and sequence clustering resulted in an EST catalog comprising 20,000 unique genes or gene fragments. Around 7000 of the unique EST consensus sequences were associated with molecular and developmental functions. Phylogenetic comparison of the identified genes to the genome of the urochordate Ciona intestinalis indicates that at least one quarter of the genes thought to be chordate specific were already present at the base of deuterostome evolution. Comparison of the number of gene copies in sea urchins to those in chordates and vertebrates indicates that the sea urchin genome has not undergone extensive gene or complete genome duplications. The established unique gene set represents an essential tool for the annotation and assembly of the forthcoming sea urchin genome sequence. All cDNA clones and filters of all analyzed libraries are available from the resource center of the German genome project at http://www.rzpd.de.

Figures

Figure 1
Histogram of the size distribution of the oligonucleotide fingerprinting (ONF) clusters reflecting the abundance distribution of all clones across all libraries, that is, all analyzed developmental stages. The X-axis shows the cluster size (clusters containing more than five clones are grouped). The Y-axis represents the frequency of each cluster size group, which is also given at the top of the bars representing the group, overlaid with the actual number of clones in that category (e.g., 106 clusters of size >100 with 27,482 clones). In total, 91,392 clones were analyzed by ONF clustering (see Methods), and about a third of the clones (27,482) belong to the superprevalent class, summarized in only 106 clusters. A third of the clones (30,304) exist in only one copy in any of the libraries (presumably complex class transcripts). In total, 35,238 different clusters were identified, which indicates that a 2.6-fold normalization was achieved.
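The 2.6-fold figure appears to be the ratio of clones analyzed to distinct clusters found. A minimal Python sketch of that arithmetic follows, using only the counts quoted in the caption; the variable names are illustrative and the ratio interpretation is an assumption, not a definition taken from the paper's Methods.

```python
# Counts quoted in the Figure 1 caption.
total_clones = 91_392            # clones analyzed by ONF clustering
total_clusters = 35_238          # distinct ONF clusters identified
superprevalent_clones = 27_482   # clones in the 106 clusters of size >100
singleton_clones = 30_304        # clones present in only one copy

# Assumed definition: average number of clones collapsed into one cluster.
fold_normalization = total_clones / total_clusters

# Share of all clones in the two classes highlighted in the caption.
superprevalent_share = superprevalent_clones / total_clones
singleton_share = singleton_clones / total_clones

print(f"fold normalization:   {fold_normalization:.1f}")
print(f"superprevalent share: {superprevalent_share:.1%}")
print(f"singleton share:      {singleton_share:.1%}")
```

Under this reading, both highlighted classes each account for roughly a third of the clones, which matches the wording of the caption.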
Figure 2
Example of the interface of the sea urchin database. (A) With the Soxb2 gene name as a keyword, a list of clusters matching this query is displayed. (B) When one of the listed clusters is selected, all the information relevant to this cluster is displayed. Selecting the cluster data field gives information such as the clone names, read directions, and trace identifiers for all the clones in the cluster. The number and percentage of all clones in the cluster, as well as the developmental stages from which they are derived, are displayed in the abundance field at the bottom, which allows a quick overview of when the gene of interest is expressed. (C) EST clusters can also be selected according to the developmental stage from which the respective cDNA clones were isolated. As an example, a list of clusters consisting of clones expressed at low levels in the egg and cleavage stage, medium levels in blastula stage, and high levels in gastrula but not expressed in the larva is shown. (D) A graphical representation of the overlap of the ESTs assigned to an EST cluster is provided for each of the clusters contained in the database. As an example, the 3907-bp alignment of all 46 reads derived from 40 different clones of the cluster001767.a1.2 that represents a sea urchin ortholog of the Na+ and Cl- coupled neutral and basic amino acid transporter ATB0 (SWISS-PROT entry number: SP_RO:Q91Y60) is shown. Bars represent the EST sequences. On the left the sequencer trace identifiers are given, which can be used to retrieve a specific EST sequence.
Figure 3
Distribution of the EST consensus sequences into the main Gene Ontology (GO) functional classes. Of 7146 consensus sequences, 5114 could be annotated in at least one of the three Gene Ontology (GO) defined functional classes. The numbers above the bars represent the percentage of all sea urchin clusters with a significant match to the SWISS-PROT database that are classified in the given functional class. The height of the bars and the ordinate give the number of proteins per group. (A) 4570 proteins could be associated with a molecular function, (B) 3219 proteins with a potential cellular component, and (C) 3910 proteins were associated with a specific biological process. A total of 2426 proteins were annotated in all three main categories.
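The counts in this caption are enough to derive how many sequences fall into exactly one or exactly two GO categories. The Python sketch below does this by inclusion-exclusion. It assumes that the 5114 annotated sequences are the union of the three category sets and that "proteins" and "consensus sequences" refer to the same items; the variable names and the derived values are illustrative and are not reported in the caption itself.

```python
# Counts quoted in the Figure 3 caption.
total_consensus = 7146       # consensus sequences considered
annotated_any = 5114         # annotated in at least one GO class
molecular_function = 4570    # panel A
cellular_component = 3219    # panel B
biological_process = 3910    # panel C
all_three = 2426             # annotated in all three classes

# Fraction of consensus sequences with at least one GO annotation.
annotated_fraction = annotated_any / total_consensus

# Summing the three class sizes counts each sequence once per class it is in:
#   memberships   = exactly_one + 2 * exactly_two + 3 * all_three
#   annotated_any = exactly_one +     exactly_two +     all_three
memberships = molecular_function + cellular_component + biological_process
exactly_two = memberships - annotated_any - 2 * all_three
exactly_one = annotated_any - exactly_two - all_three

print(f"annotated fraction: {annotated_fraction:.1%}")
print(f"in exactly two classes: {exactly_two}")
print(f"in exactly one class:   {exactly_one}")
```

The two derived counts plus the 2426 sequences in all three classes add back up to 5114, which is a quick consistency check on the caption's numbers under the stated assumptions.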
Figure 4
Example of a neighbor-joining phylogenetic tree generated for a CD-type gene family that includes a single sea urchin and multiple Ciona and human orthologs. This CD group represents a family of potassium channels. Most of the branches contain a Ciona ortholog, and hence most of the genes of this family were duplicated after the protostome/deuterostome split but before the vertebrate/chordate divergence, as the CD/CDY groups include orthologs of genes that are simultaneously single copy in C. elegans, D. melanogaster and yeast (S. cerevisiae). Because the sea urchin sequence does not root the tree, the expansions might have taken place before the separation of the echinoderm lineage. Additional vertebrate-specific expansions, as in the example above, were repeatedly observed in the selected CD/CDY groups. The C. elegans and D. melanogaster node was used as an outgroup for the tree. Numbers at branch points are confidence values derived from 1000 bootstrap resamplings of the alignment data. The sequence distance is indicated at the bottom as substitutions per site. Human genes are abbreviated by Hu followed by the Ensembl gene identifier (release 4.28; see Methods), Ciona genes are abbreviated by C followed by the Ciona gene model identifier number (JGI Release1; see Methods), and the sea urchin sequence is abbreviated by Su followed by the sequence cluster identifier that can be retrieved from our database. Other abbreviations are (Dro) D. melanogaster and (Cel) C. elegans.
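The bootstrap confidence values rest on resampling alignment columns with replacement. The Python sketch below illustrates only that resampling step on a toy alignment. The function name `bootstrap_columns` and the three sequences are hypothetical, and this is not the software used by the authors; in a full analysis a tree would be rebuilt from each replicate and the support for a branch would be the share of replicates that recover it.

```python
import random

def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate: alignment columns drawn with replacement."""
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    # Draw as many column indices as the alignment has columns.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    # Apply the same picks to every sequence so that columns stay intact.
    return {name: "".join(alignment[name][i] for i in picks) for name in names}

# Toy alignment with hypothetical sequences (not data from the paper).
toy_alignment = {
    "Su_example": "MKTLLV",
    "C_example": "MKSLLV",
    "Hu_example": "MRSLIV",
}

rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "replicates of length", len(replicates[0]["Su_example"]))
```

Sampling whole columns, rather than residues from each sequence independently, keeps the aligned positions together, which is what makes each replicate a valid pseudo-alignment.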
