Comparative Study
Nature. 2015 Nov 26;527(7579):459-65.
doi: 10.1038/nature16150. Epub 2015 Nov 18.

Hemichordate genomes and deuterostome origins

Oleg Simakov et al. Nature. 2015.

Abstract

Acorn worms, also known as enteropneust (literally, 'gut-breathing') hemichordates, are marine invertebrates that share features with echinoderms and chordates. Together, these three phyla comprise the deuterostomes. Here we report the draft genome sequences of two acorn worms, Saccoglossus kowalevskii and Ptychodera flava. By comparing them with diverse bilaterian genomes, we identify shared traits that were probably inherited from the last common deuterostome ancestor, and then explore evolutionary trajectories leading from this ancestor to hemichordates, echinoderms and chordates. The hemichordate genomes exhibit extensive conserved synteny with amphioxus and other bilaterians, and deeply conserved non-coding sequences that are candidates for conserved gene-regulatory elements. Notably, hemichordates possess a deuterostome-specific genomic cluster of four ordered transcription factor genes, the expression of which is associated with the development of pharyngeal 'gill' slits, the foremost morphological innovation of early deuterostomes, and is probably central to their filter-feeding lifestyle. Comparative analysis reveals numerous deuterostome-specific gene novelties, including genes found in deuterostomes and marine microbes, but not other animals. The putative functions of these genes can be linked to physiological, metabolic and developmental specializations of the filter-feeding ancestor.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Extended Data Figure 1. Summary of genome assemblies and heterozygosity distributions for Saccoglossus and Ptychodera
a, Genome statistics summary. b, c, The single nucleotide polymorphism distribution across 100-bp windows for Saccoglossus (b) and the corresponding distribution for Ptychodera (c). The distributions in b and c are fitted with a geometric (expected when high recombination rate is present) and a Poisson distribution (expected with low recombination rate). The distribution for Saccoglossus is fitted to windows with one or more SNPs only, as there is an excess of zero SNP windows (approximately 84% of total 94,324 selected windows). For methods refer to Supplementary Note 2.
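The comparison of geometric and Poisson fits described in this caption can be illustrated with a minimal sketch. It is not the authors' procedure (that is in Supplementary Note 2): it assumes maximum-likelihood fits on untruncated counts, the function names are hypothetical, and the window counts in the usage note are invented. It does not handle the restriction to windows with one or more SNPs that the caption describes for Saccoglossus.

```python
import math

def poisson_loglik(counts):
    """Log-likelihood of per-window SNP counts under a Poisson model at its MLE."""
    lam = sum(counts) / len(counts)  # the MLE of the Poisson rate is the sample mean
    if lam == 0:
        return 0.0  # all windows empty: the fitted model predicts this with probability 1
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)

def geometric_loglik(counts):
    """Log-likelihood under a geometric model on 0, 1, 2, ... at its MLE."""
    mean = sum(counts) / len(counts)
    p = 1.0 / (1.0 + mean)  # MLE of the per-step stopping probability
    if p == 1.0:
        return 0.0
    return sum(math.log(p) + k * math.log(1.0 - p) for k in counts)

def better_fit(counts):
    """Return the name of the model with the higher log-likelihood."""
    if geometric_loglik(counts) > poisson_loglik(counts):
        return "geometric"
    return "poisson"
```

Calling `better_fit` on a list of SNP counts per 100-bp window returns `"geometric"` when the counts have a long tail relative to their mean, which is the pattern the caption associates with a high recombination rate.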
Extended Data Figure 2. Ambulacrarians approximate the ancestral metazoan gene repertoire
a, Principal component analysis of Panther gene family sizes. Variances of the first two components are plotted in parentheses. Blue indicates deuterostomes; green indicates lophotrochozoans; red, ecdysozoans; yellow/orange, non-bilaterian metazoans. Note the clustering of the ambulacrarians Sko, Pfl and Spu with the non-vertebrate deuterostomes Bfl and Cin in the lower right corner, also with the lophotrochozoans Cgi, Lgi, Hro, Cte and the non-bilaterians Hma, Nve and Adi. b, Heat map of gene family counts showing significant (Fisher’s exact test P value <0.01 after Bonferroni multiple testing correction) expansion in ambulacrarians as well as in Saccoglossus/Ptychodera/amphioxus. The cases discussed in the main text are highlighted in red. See Supplementary Note 4 for details. Species abbreviations are defined in Supplementary Note 4.1.
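The expansion test named in panel b (Fisher's exact test with Bonferroni correction) can be sketched as follows. The 2x2 table layout, the one-sided alternative, and the function names are assumptions made for illustration; the authors' actual counts and settings are in Supplementary Note 4.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a = genes of the family in the focal genome, b = other genes
    in the focal genome, c and d = the same two counts in the comparison genomes.
    Returns the probability of seeing a or more family genes in the focal genome
    under the hypergeometric null.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(total, col1)

def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

A family would then be reported as expanded when its corrected P value falls below the 0.01 threshold given in the caption.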
Extended Data Figure 3. Molecular dating of deuterostome and metazoan radiations using PhyloBayes assuming a log-normal relaxed clock model
Yellow circles on particular nodes indicate the calibration dates applied from the fossil record, as indicated in Supplementary Note 6.2. Bars are 95% credibility intervals derived from posterior distributions. Note the estimated times of divergence of chordates and ambulacraria (the deuterostome ancestor) at 570 million years ago (Ma; mid-Ediacaran), hemichordates and echinoderms at 559 Ma, enteropneusts and pterobranchs at 547 Ma, and Harrimaniid and Ptychoderid enteropneusts at 373 Ma.
Extended Data Figure 4. Homeobox gene complement of the two hemichordates in comparison to that of amphioxus
The numbers of homeobox-containing gene models are 170 in Saccoglossus and 139 in Ptychodera. These homeobox domains were aligned with 128 homeobox genes of Branchiostoma floridae using ClustalW2, then gaps and unaligned regions were manually removed. Since some genes have more than one homeobox domain, we kept all domains or chose the longest one according to the state of domain conservation. In total, 448 homeobox sequences were aligned. See Supplementary Information for details. The clusters of homeobox genes on scaffolds in Saccoglossus and Ptychodera were identified and drawn at positions around the tree. Conserved clusters between the two species were aligned. In addition to the well-known Hox and ParaHox clusters, 17 clusters were found in at least one of the hemichordates, some of them in both. Sixteen genes of the Nkx class are distributed over four clusters: (i) nkx1a-vent1-vent2.1-vent2.2; (ii) nkx2.1-nkx2.2-msxlx; (iii) nkx5-msx-nkx3.2-nkx4-lbx-hex; and (iv) voxvent-nk7like-nk7like2. The second of these clusters (ii) is part of the pharyngeal cluster (Fig. 4). Another five-gene cluster consists of one Lim class homeobox gene and four PRD class homeobox genes: isl-otp-rax-arx-gsc. A cluster of six3/6-six1/2-six4/5 was found in both species, and a cluster of three unx genes was found only in P. flava. Ten more clusters were found containing two homeobox genes each. Notably, we found species-specific homeobox clusters in both species. Three remarkable clusters were found in S. kowalevskii in which 10, 12 and 5 homeobox-containing genes are tandemly duplicated in scaffold_1710, scaffold_52 and scaffold_4796, respectively. We also found such clusters in P. flava in which 7, 4, 8 and 10 genes are aligned on scaffold 19451, scaffold 1398, scaffold 12422 and scaffold 154657, respectively. All homeobox genes identified in the genomes of the two hemichordates and amphioxus are listed in the Supplementary Table for Extended Data Fig. 4. This list includes some genes not containing a homeobox (for example, pax1/9) in cases where other family members do (for example, pax2).
Extended Data Figure 5. High retention rate of micro-synteny in Saccoglossus
Circos plot showing micro-syntenic conservation in blocks of genes (nmax = 10 and nmin = 2) for six metazoan species for observed (left) and simulated (right) linkages. The width of connecting segments is proportional to the number of genes participating in the syntenic linkages (normalized by the total gene count). In this representation scaffolds are placed end-to-end, and adjacent scaffolds need not be from the same chromosome. While simulated data yields some blocks shared between pairs of species, few or no synteny blocks can be recovered among three or more species (Methods). Saccoglossus shows one of the highest retentions among the selected species (and the highest among the sequenced ambulacrarians). Xenopus (and vertebrates in general) have lost some micro-synteny due to whole-genome duplications and differential loss of paralogues. The matching between the hemichordate S. kowalevskii and the chordate amphioxus is highest, consistent with the fact that neither genome has undergone extensive gene loss (as have tunicates) or pseudo-tetraploidization with extensive loss of paralogues (as have vertebrates).
Extended Data Figure 6. Deuterostome-specific micro-syntenic linkages
a, b, Very tight linkages with no intervening genes. a, ParaHox cluster shown in S. kowalevskii, P. flava, and human. b, bmp2/4 and univin cluster in the hemichordates S. kowalevskii and P. flava, the sea urchin S. purpuratus, and the cephalochordate B. floridae. c–e, Loose micro-syntenic linkages with a maximum of five intervening genes: lefty (c), six1–six4 (d), and fgf8–fbxw (e) clusters. For c to e all species with micro-synteny are shown. Numbers above the genes indicate the copy number in the locus.
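The distinction between tight linkage (no intervening genes) and loose linkage (at most five intervening genes) can be expressed as a simple check on gene order along one scaffold. This is a minimal sketch, not the authors' pipeline; the function names and the gene identifiers in the usage note are hypothetical.

```python
def intervening_genes(gene_order, gene_a, gene_b):
    """Count the genes lying between gene_a and gene_b on one scaffold.

    gene_order is the list of gene identifiers in their order along the scaffold.
    """
    i, j = gene_order.index(gene_a), gene_order.index(gene_b)
    return abs(i - j) - 1

def is_linked(gene_order, gene_a, gene_b, max_intervening=5):
    """True if the two genes are separated by at most max_intervening genes."""
    return intervening_genes(gene_order, gene_a, gene_b) <= max_intervening
```

With a scaffold ordered as `["geneA", "x1", "x2", "geneB"]`, the pair `geneA` and `geneB` has two intervening genes, so it passes the loose criterion (`max_intervening=5`) but not the tight one (`max_intervening=0`).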
Extended Data Figure 7. Three examples showing the domain structures of some proteins encoded by genes found in deuterostomes and marine microbes but not non-deuterostome animals
Best BLASTP hits of the Saccoglossus sequence in human/mouse, as well as in non-deuterostome metazoans and in non-metazoans (such as the cyanobacterium Stanieria cyanosphaera, or the eukaryotic micro-alga Ostreococcus tauri) are shown. a, Cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMAH), an enzyme of sialic acid modification; b, peptidyl arginyl deiminase (PAD), an enzyme of post-translational modification of proteins; c, FATSO-like, also called α-ketoglutarate-dependent dioxygenase FTO, an enzyme that de-methylates N6-methyladenosine in nuclear RNA. Other analyses of these and other genes with unusual phylogenetic distributions can be found in Supplementary Note 10.
Extended Data Figure 8. In situ hybridization demonstration of the expression of von Willebrand type D (vWD) domain-encoding genes (putative glycoproteins/mucins) in Saccoglossus and Ptychodera
a, In Saccoglossus the genes are specifically expressed in different subregions of the ectoderm of the proboscis or collar at these pre-feeding stages. b, In Ptychodera, several of the genes are expressed in endoderm as well as ectoderm of the developing tornaria larva. The sequence IDs for the genes are provided in Supplementary Note S10.4.
Extended Data Figure 9. Gene innovation in deuterostomes
a, FastTree phylogenetic tree of the TGFβ family members Lefty, TGFβ 2, GDF8/11 and Nodal ligands (using GTR model). Bootstrap support is plotted as filled circles (size proportional to the support value) on each node. While Lefty shows deuterostome-unique sequence composition, TGFβ 2 has an acceleration of sequence change at the deuterostome stem branch, compared to the GDF8/11 or Nodal groups. b, Temporal co-expression of Lefty and TGFβ receptor type III in Saccoglossus at pre-gastrulation developmental stages and of TGFβ 2 and TGFβ receptor type II at post-gastrulation stages. c, In situ hybridization demonstration of the expression in S. kowalevskii of one of the putative type I novelty genes (c9orf9, also known as rsb66) and of two of the AAADC genes (aromatic amino acid decarboxylases of the microbial type) of S. kowalevskii (also present in P. flava and B. floridae), which closely resemble sequences from bacteria rather than from non-deuterostome metazoans. gs, gill slits. d, The temporal expression profile for c9orf9 during S. kowalevskii development, taken from transcriptome data.
Figure 1. Hemichordate model systems and their embryonic development
The hemichordate phylum includes the enteropneusts (acorn worms) and pterobranchs (minute, colonial, tube-dwelling; not shown). a, c, Saccoglossus kowalevskii (Harrimaniid (direct developing) enteropneust) adult (a) and juvenile (c) with gill slits. b, d, Ptychodera flava (Ptychoderid (indirect developing) enteropneust) adult (b) and the tornaria stage larva (d). Gill slits labelled with an asterisk in a and b. e, Comparison of the direct and indirect modes of development of the two hemichordates, indicating the long pelagic larval period in Ptychodera until the settlement and metamorphosis as a juvenile.
Figure 2. Phylogenetic placement of deuterostome taxa within the metazoan tree
Maximum-likelihood tree obtained with a super-matrix of 506,428 amino-acid residues gathered from 1,564 orthologous genes in 52 species (65.1% occupancy) and using an LG+Γ model partitioned for each gene. Filled circles at nodes denote maximal bootstrap support. Taxa highlighted in bold are newly sequenced genomes and transcriptomes introduced in this study. Bar indicates the number of substitutions per site.
Figure 3. High level of linkage conservation in Saccoglossus
a, Macro-synteny dot plot between Saccoglossus and amphioxus; each dot represents two orthologous genes linked in the two species, and ordered according to their macro-syntenic linkage. Amphioxus scaffolds are organized according to the 17 ancestral linkage groups (ALGs) inferred by comparison of the amphioxus and vertebrate genomes. Intersection areas of highest dot density are marked by numbers along the top of the plot, identifying each of the 17 putative ALGs. Axes represent orthologous gene group index along the genome. b, Branch-length estimation for loss and gain of synteny blocks with MrBayes, see Supplementary Note 7 for details. Short branches in hemichordates (in bold) indicate a high level of micro-syntenic retention in their genomes.
Figure 4. Conservation of a pharyngeal gene cluster across deuterostomes
a, Linkage and order of six genes including the four genes encoding transcription factors Nkx2.1, Nkx2.2, Pax1/9 and FoxA, and two genes encoding non-transcription factors Slc25A21 (solute transporter) and Mipol1 (mirror-image polydactyly 1 protein), which are putative ‘bystander’ genes containing regulatory elements of pax1/9 and foxA, respectively. The pairings of slc25A21 with pax1/9 and of mipol1 with foxA occur also in protostomes, indicating bilaterian ancestry. The cluster is not present in protostomes such as Lottia (Lophotrochozoa), Drosophila melanogaster, Caenorhabditis elegans (Ecdysozoa), or in the cnidarian, Nematostella. SLC25A6 (the slc25A21 paralogue on human chromosome 20) is a potential pseudogene. The dots marking A2 and A4 indicate two conserved non-coding sequences first recognized in vertebrates and amphioxus, also present in S. kowalevskii and, partially, in P. flava and A. planci. b, The four transcription factor genes of the cluster are expressed in the pharyngeal/foregut endoderm of the Saccoglossus juvenile: nkx2.1 is expressed in a band of endoderm at the level of the forming gill pore, especially ventral and posterior to it (arrow), and in a separate ectodermal domain in the proboscis. It is also known as thyroid transcription factor 1 due to its expression in the pharyngeal thyroid rudiment in vertebrates. The nkx2.2 gene is expressed in pharyngeal endoderm just ventral to the forming gill pore, shown in side view (arrow indicates gill pore) and ventral view; and pax1/9 is expressed in the gill pore rudiment itself. In S. kowalevskii, this is its only expression domain, whereas in vertebrates it is also expressed in axial mesoderm. The foxA gene is expressed widely in endoderm but is repressed at the site of gill pore formation (arrow). An external view of gill pores is shown; up to 100 bilateral pairs are present in adults, indicative of the large size of the pharynx.
Figure 5. Examples of deuterostome gene novelties
a, Steps of biosynthesis of sialic acid and its addition to and removal from glycoproteins. b–d, Novel genes in TGFβ signalling pathways. The encoded proteins are shown and include Lefty (b), an antagonist of Nodal signalling, which activates Smad2/3-dependent transcription when not antagonized; Univin (c), an agonist of Nodal signalling, also called Vg1, DVR1, and GDF1; and TGFβ 2 (d), a ligand that activates Smad2/3-dependent transcription by binding to a deuterostome-specific TGFβ receptor type II, which contains a novel ectodomain (not shown). Also shown in d is the novel protein thrombospondin 1 that activates TGFβ 2 by releasing it from an inactive complex, by way of its TSP1 domains. Red boxes around protein names indicate their deuterostome novelty. Green boxes around the names indicate genes with pan-metazoan/bilaterian ancestry and without accelerated sequence change in the deuterostome lineage.
