PLoS Negl Trop Dis. 2008 Jul 2;2(7):e258.
doi: 10.1371/journal.pntd.0000258.

On the extent and origins of genic novelty in the phylum Nematoda

James Wasmuth et al. PLoS Negl Trop Dis.

Abstract

Background: The phylum Nematoda is biologically diverse, including parasites of plants and animals as well as free-living taxa. Underpinning this diversity will be commensurate diversity in expressed genes, including gene sets associated specifically with evolution of parasitism.

Methods and findings: Here we analyzed the extensive expressed sequence tag data (available for 37 nematode species, most of which are parasites) and defined over 120,000 distinct putative genes, from which we derived robust protein translations. Combined with the complete proteomes of Caenorhabditis elegans and Caenorhabditis briggsae, these proteins were grouped into 65,000 protein families that in turn contain 40,000 distinct protein domains. We mapped the occurrence of domains and families across the Nematoda and compared the nematode data with those available for other phyla. Gene loss is common, and in particular we identify nearly 5,000 genes that may have been lost from the lineage leading to the model nematode C. elegans. We find a preponderance of novelty, including 56,000 nematode-restricted protein families and 26,000 nematode-restricted domains. Mapping the latest time-of-origin of these new families and domains across the nematode phylogeny revealed ongoing evolution of novelty. A number of genes from parasitic species had signatures of horizontal transfer from their host organisms, and parasitic species had a greater proportion of novel, secreted proteins than did free-living ones.

Conclusions: These classes of genes may underpin parasitic phenotypes, and thus may be targets for development of effective control measures.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Nematode species contributing to NemPep3.
EST cluster consensuses (putative genes) from 37 nematode species were obtained from NEMBASE3. This set of species includes seven not previously analyzed. The species are organized by their systematic grouping based on the SSU rRNA phylogeny. Feeding strategy is indicated by the small icons. We use "contig" to describe the consensus sequence produced for each set of clustered ESTs. For each species, the numbers of peptides derived from the BLAST-similarity and ESTScan methods of prot4EST are given: only polypeptides generated by these two high-quality components contributed to NemPep3. The complete proteomes of C. elegans and C. briggsae were obtained from WormBase.
Figure 2. Protein family discovery in the phylum Nematoda.
Nematode protein families (NemFam3) were generated using Markov flow clustering with a range of Inflation parameters. The bars show the extreme number of protein families considering different Inflation parameters. Here we analyse families defined with an Inflation parameter of 3.0. A collector's curve was derived as described in Materials and Methods. Yellow circles indicate the cumulative counts of proteins (x-axis) and unique families (y-axis) as each species was added. The upper black line follows the cumulative number of protein families identified as each new species was included. For example, the 4,368 protein sequences from A. caninum included 1,200 NemFam3 families not present in the Caenorhabditis proteomes. The middle black line tracks the cumulative number of NemFam3 protein family models that identify representatives in non-nematodes, and the bottom line shows the number of NemFam3 protein family models that were present in C. elegans and in species from other (non-nematode) phyla. Region A protein families were restricted to nematodes (given current databases), while region B families have been lost in C. elegans or gained in specific nematode lineages (loss/gain candidates) and are shared with non-nematode taxa. Region C protein families are shared between C. elegans, other nematodes and non-nematode species.
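The collector's curve described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the species order, and the toy family identifiers are all hypothetical, and the sketch tracks only the cumulative family count (the y-axis), not the cumulative protein count.

```python
from typing import Dict, List, Set, Tuple


def collectors_curve(
    families_by_species: Dict[str, Set[str]], order: List[str]
) -> List[Tuple[str, int, int]]:
    """Return (species, newly seen families, cumulative families) per step."""
    seen: Set[str] = set()
    curve: List[Tuple[str, int, int]] = []
    for species in order:
        # Families contributed by this species that no earlier species had.
        new_families = families_by_species[species] - seen
        seen |= new_families
        curve.append((species, len(new_families), len(seen)))
    return curve


# Hypothetical toy data: family identifiers are invented for illustration.
toy_families = {
    "C. elegans": {"fam1", "fam2", "fam3"},
    "C. briggsae": {"fam1", "fam2", "fam4"},
    "A. caninum": {"fam2", "fam5", "fam6"},
}
curve = collectors_curve(toy_families, ["C. elegans", "C. briggsae", "A. caninum"])
for species, new_count, total in curve:
    print(f"{species}: {new_count} new, {total} cumulative")
```

In this toy input, the third species adds two families absent from the two Caenorhabditis sets, which mirrors the kind of statement the caption makes about A. caninum.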
Figure 3. Nematode-restricted protein families.
(A) Distribution of protein family size can be described by a power law, with a large number of small families and the number of families decreasing as their size increases. Removing C. elegans-containing families reduced the total number of families, but the power law distribution persisted. (B) Many protein families had restricted taxonomic distribution within Nematoda. For all protein families with at least five non-C. elegans members, the systematic affinities of the contributing species were compared. Protein families were identified that were restricted to each of the taxonomic families represented in the analysis, and to higher-level taxonomic groups (e.g. the Spirurina, which includes Ascaridomorpha and Spiruromorpha). For example, 243 protein families were restricted to the Tylenchomorpha. One species from a taxonomic family needed to be represented in the protein family for inclusion in the figure.
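As a rough illustration of what a power-law distribution of family sizes means, the sketch below tabulates how many families have each size and estimates the exponent from the slope of a log-log least-squares line. The caption does not say how the distribution was assessed, so this fitting method is an assumption, and log-log regression is only a crude estimator. The function names and the toy sizes are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def size_frequency(family_sizes: List[int]) -> List[Tuple[int, int]]:
    """Return sorted (family size, number of families of that size) pairs."""
    return sorted(Counter(family_sizes).items())


def loglog_slope(points: List[Tuple[int, int]]) -> float:
    """Least-squares slope of log(count) against log(size)."""
    xs = [math.log(size) for size, _ in points]
    ys = [math.log(count) for _, count in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy data constructed so that count = 64 * size**-2.
toy_sizes = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
points = size_frequency(toy_sizes)
slope = loglog_slope(points)
print(f"estimated exponent: {slope:.2f}")
```

A slope near a negative constant on the log-log scale is the signature the caption refers to: many small families and progressively fewer large ones.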
Figure 4. Methionine metabolism in nematodes.
Cytosine-5′-methyltransferase (EC 2.1.1.37) is not present in C. elegans but has been detected in four phylogenetically divergent nematode species, suggesting that it may be widespread throughout the phylum and lost in the Caenorhabditis lineage. Enzymes found in C. elegans are green, those present in other nematodes but absent in C. elegans are red. Three further enzymes were identified as possible candidates for gene loss in C. elegans. Betaine-homocysteine S-methyltransferase (EC 2.1.1.5), homocysteine S-methyltransferase (EC 2.1.1.10) and 5-methyltetrahydropteroyltriglutamate–homocysteine methyltransferase (EC 2.1.1.14) were found in one, seven and four nematode species, respectively, but not in C. elegans. The latter two enzymes have not previously been reported in metazoans and their identification in plant-parasitic nematodes may be a result of horizontal gene transfer.
Figure 5. Signal peptides in nematode proteomes in NemPep3.
Signal peptides were predicted in NemPep3 using SignalP. For each species the proportion of signal peptide-containing proteins is given. There is a significant increase in the proportion of novel nematode proteins containing signal peptides relative to proteins with homologues in other phyla (p<0.0001; t = 10.53230; df = 38; paired t-test with data arcsin transformed).
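The shape of the test reported in this caption can be sketched as follows. The reported df = 38 is consistent with 39 paired observations, which matches 37 EST species plus the two Caenorhabditis proteomes. The sketch assumes the usual arcsine square-root form of the transformation, which the caption does not spell out. The function names and proportions are hypothetical, and no p-value is computed because that would need a t-distribution routine outside the standard library.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def arcsine_sqrt(proportion: float) -> float:
    """Arcsine square-root transform (assumed form of the 'arcsin' transform)."""
    return math.asin(math.sqrt(proportion))


def paired_t(novel: List[float], conserved: List[float]) -> Tuple[float, int]:
    """Paired t statistic and degrees of freedom on transformed proportions."""
    if len(novel) != len(conserved):
        raise ValueError("paired samples must have equal length")
    diffs = [arcsine_sqrt(a) - arcsine_sqrt(b) for a, b in zip(novel, conserved)]
    n = len(diffs)
    # t = mean difference divided by its standard error; df = n - 1.
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical per-species proportions of signal peptide-containing proteins.
novel_props = [0.30, 0.25, 0.40, 0.35]
conserved_props = [0.10, 0.12, 0.15, 0.11]
t_stat, df = paired_t(novel_props, conserved_props)
print(f"t = {t_stat:.3f}, df = {df}")
```

Each species contributes one pair (novel versus conserved proteins), so the test compares the two proportions within species instead of pooling across them.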
