BMC Bioinformatics. 2011 Jan 24;12:32.
doi: 10.1186/1471-2105-12-32.

Worm Phenotype Ontology: integrating phenotype data within and beyond the C. elegans community

Gary Schindelman et al. BMC Bioinformatics.

Abstract

Background: Caenorhabditis elegans gene-based phenotype information dates back to the 1970s, beginning with Sydney Brenner and the characterization of behavioral and morphological mutant alleles via classical genetics in order to understand nervous system function. Since then C. elegans has become an important genetic model system for the study of basic biological and biomedical principles, largely through the use of phenotype analysis. Because of the growth of C. elegans as a genetically tractable model organism and the development of large-scale analyses, there has been a significant increase in phenotype data that needs to be managed and made accessible to the research community. To do so, a standardized vocabulary is necessary to integrate phenotype data from diverse sources, permit integration with other data types and render the data in a computable form.

Results: We describe a hierarchically structured, controlled vocabulary of terms that can be used to standardize phenotype descriptions in C. elegans, namely the Worm Phenotype Ontology (WPO). The WPO currently comprises 1,880 phenotype terms, 74% of which have been used in the annotation of phenotypes associated with more than 18,000 C. elegans genes. The scope of the WPO is not limited to C. elegans biology; it is also designed to incorporate phenotypes observed in related nematode species. We have enriched the value of the WPO by integrating it with other ontologies, thereby increasing the accessibility of worm phenotypes to non-nematode biologists. We are actively developing the WPO to continue to fulfill the evolving needs of the scientific community and hope to engage researchers in this crucial endeavor.

Conclusions: We provide a phenotype ontology (WPO) that will facilitate data retrieval and cross-species comparisons within the nematode community. In the larger scientific community, the WPO will permit data integration and interoperability across the different Model Organism Databases (MODs) and other biological databases. This standardized phenotype ontology will therefore allow for more complex data queries and enhance bioinformatic analyses.

Figures

Figure 1
The hierarchical structure of the WPO. (a) The five children of the root term 'Variant' as viewed in OBO-Edit [63], the ontology editing tool in use at WormBase. (b) Each of these five terms (classes) has multiple descendants, as illustrated by the children and grandchildren of the 'behavior variant' term. The '+' sign in the box denotes that descendant terms are present. Clicking on the '+' sign in OBO-Edit reveals the subclasses. The lowercase 'i' icon denotes the 'is_a' parent-child relationship between terms. (c) Under 'movement variant', 'locomotion reduced' is a visible subclass. Among its descendants are 'paralyzed' and 'sluggish' (see text for details). (d) On the right is the OBO-Edit display of the 'bacterially unswollen' phenotype class, including a unique identifier (ID), primary name (name) and the definition of the term with references (Dbxrefs, i.e., database references). The references in this case are a specific WormBase curator (cab is Carol A Bastiani) and a paper reference [65]. Below the definition are synonyms for this term. In this case, 'Bus' is a three-letter synonym familiar to the C. elegans community. On the left is the placement of 'Bus' in the WPO. Note that it has two parents, 'pathogen resistance increased' and 'tail morphology variant'.
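The 'is_a' hierarchy described in this caption, including a term with two parents, can be pictured as a directed acyclic graph. The following is a minimal sketch only: it uses the term names quoted in the caption, treats every quoted link as a direct 'is_a' edge (a simplification, since the caption only says 'subclass' and 'descendants'), and the `ancestors` helper is a hypothetical name, not part of OBO-Edit or WormBase.

```python
from collections import deque

# is_a edges (child -> list of parents), using term names from the caption.
# Assumption: each link is modeled as a direct is_a edge for illustration.
IS_A = {
    "locomotion reduced": ["movement variant"],
    "paralyzed": ["locomotion reduced"],
    "sluggish": ["locomotion reduced"],
    # A term may have more than one parent, as 'bacterially unswollen' does.
    "bacterially unswollen": [
        "pathogen resistance increased",
        "tail morphology variant",
    ],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a edges upward."""
    seen = set()
    queue = deque(is_a.get(term, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(is_a.get(parent, []))
    return seen


print(sorted(ancestors("paralyzed")))
print(sorted(ancestors("bacterially unswollen")))
```

Storing parents per child, rather than children per parent, makes multiple inheritance a matter of listing more than one parent.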
Figure 2
WPO term usage. Shown is the distribution of the number of phenotypes (y-axis) with the indicated number of genes annotated per phenotype term (x-axis). Of the 1880 phenotype terms in the WPO, 486 (26%) are unused. Of the remaining terms, 684 have been used to annotate between 1 and 5 genes, 253 terms have been used to annotate between 6 and 10 genes, and so on. The most used phenotype term is 'embryonic lethal', which has been used to annotate 3304 genes (not shown; 'embryonic lethal' is one of 8 terms that have been used to annotate more than 1000 genes).
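The counts in this caption can be cross-checked against the 74% usage figure given in the abstract. The short calculation below is a sketch that uses only the numbers quoted in the caption; the variable and function names are illustrative.

```python
# Counts quoted in the Figure 2 caption.
TOTAL_TERMS = 1880
UNUSED_TERMS = 486
TERMS_1_TO_5_GENES = 684
TERMS_6_TO_10_GENES = 253


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)


used_terms = TOTAL_TERMS - UNUSED_TERMS
# Terms left over after the first two bins, i.e. annotated to more than 10 genes.
terms_over_10_genes = used_terms - TERMS_1_TO_5_GENES - TERMS_6_TO_10_GENES

print(percent(UNUSED_TERMS, TOTAL_TERMS))  # share of unused terms
print(percent(used_terms, TOTAL_TERMS))    # share of used terms
print(terms_over_10_genes)
```

The used share comes out at 74%, matching the abstract. The count of terms annotated to more than 10 genes is derived here by subtraction and is not stated in the caption.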
Figure 3
Forces driving the development of the WPO. Curation of the C. elegans literature increases the robustness of the phenotype ontology, and we create terms as needed. Shown are ontology views within OBO-Edit. (a) Blue lines point to the reference for the term. In some cases more than one term is created from a single reference [9,66-71]. (b) Expert input leads to extensive granularity in the ontology. There are 29 descendants of the 'pronuclear nuclear appearance defective early emb' branch (bracketed box), which was refined by soliciting feedback from the community.
Figure 4
C. briggsae phenotype assignments in WormBase. C. briggsae is a nematode species that is closely related to C. elegans [72]. (a) Shown are excerpts from the AF16 strain page, a wild-type form of C. briggsae, which reports the associated phenotype annotations and the corresponding references that describe the controls for each of the experiments. (b) Shown are excerpts from the v53 variation report page, listing observed phenotypes and corresponding references. v53 is a C. briggsae she-1 mutant.
Figure 5
C. elegans phenotype assignments in WormBase. Shown are excerpts from the daf-2 gene page in WormBase. Phenotypes associated with alleles, RNAi experiments or transgenes (not shown) can be viewed in the phenotype summary tables. The e1370 allele object has its own specialized 'Variation Report' page that can be accessed through links, marked with a red oval, embedded in the phenotype summary tables on the Gene Summary page. The phenotype summary tables include a list of phenotypes associated with knockdown via RNAi for daf-2 (green oval). A more detailed overview of this RNAi experiment can be found within the 'RNAi details' section. The details section also contains links to a specific experiment, called the 'RNAi Report', via the WBRNAi ID (purple oval). The phenotype summary also includes 'Not' phenotype annotations (bottom left).
Figure 6
Data mining using the ontology search tool in WormBase. (a) A user may enter a query term in the search box; in this case 'dumpy' is used as an example. Results displayed in the output include the terms that contain 'dumpy' within the term name or within its definition (highlighted in red). Clicking on the number to the right, which indicates the total number of annotations to each term, retrieves RNAi, allele (variation) and transgene objects associated with a phenotype. Displayed is a portion of the 686 annotations made to 'dumpy'. There are RNAi and variation objects associated with this term, but no transgene data. (b) Included in the ontology search output (shown here for 'dumpy') is a window that allows the user to browse the ontology. If a user clicks on a term, the children of that term are revealed as well as the number of genes associated with that term. Shown is one gene directly annotated to 'body length variant' (red arrow), but 1401 total associations are indicated, as this number includes all the annotations to the children ('dumpy', 'short', 'long' and 'small').
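The difference between direct annotations and the larger total that includes child terms can be illustrated with a small roll-up over the hierarchy. This is a sketch under stated assumptions: the parent-child links come from the caption, the gene labels are hypothetical placeholders rather than WormBase data, and the sketch merges gene sets, which may differ from how WormBase computes its displayed totals.

```python
# Parent -> children, taken from the Figure 6 caption.
CHILDREN = {"body length variant": ["dumpy", "short", "long", "small"]}

# Hypothetical direct annotations (term -> gene labels); NOT WormBase data.
DIRECT = {
    "body length variant": {"gene_a"},
    "dumpy": {"gene_b", "gene_c"},
    "short": {"gene_c"},
    "long": {"gene_d"},
    "small": set(),
}


def rolled_up(term, children=CHILDREN, direct=DIRECT):
    """Genes annotated to `term` itself or to any of its descendants."""
    genes = set(direct.get(term, set()))
    for child in children.get(term, []):
        genes |= rolled_up(child, children, direct)
    return genes


print(len(DIRECT["body length variant"]))      # direct annotations only
print(len(rolled_up("body length variant")))   # including all child terms
```

With these placeholder data the parent term has one direct gene but four after roll-up, the same pattern as the one-versus-1401 example in the caption.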
Figure 7
Integrating phenotype ontologies across evolutionarily divergent species. (a) Conceptual diagram depicting how multiple orthogonal phenotype ontologies (FlyBase Controlled Vocabulary, Mouse Phenotype Ontology, Worm Phenotype Ontology) can interact with each other via equivalence relationships (cross-products indicated by orange boxes). The example used here pertains to the 'cell death' process. XP stands for 'cross-product' and GO-BP stands for 'Gene Ontology Biological Process'. (b) The table displays some of the phenotype annotations to genes relating to cell death anomalies in fly (Drosophila melanogaster), mouse (Mus musculus) and worm (Caenorhabditis elegans). Annotations were retrieved directly from their respective model organism databases (FlyBase, MGI, WormBase). Red font indicates conserved genes among all the depicted species. Green font shows conserved genes between D. melanogaster and M. musculus. Black font shows conserved genes between C. elegans and M. musculus.
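One way to picture the cross-product idea in panel (a) is to define each species-specific phenotype term against a shared GO-BP process and then group terms by that process. The sketch below assumes a flat mapping and uses hypothetical term labels; it does not reproduce the actual cross-product definitions or any identifiers from the three ontologies.

```python
from collections import defaultdict

# Hypothetical cross-product definitions:
# (source ontology, phenotype term label) -> shared GO-BP process name.
# The term labels are placeholders, not real ontology entries.
XP = {
    ("FlyBase Controlled Vocabulary", "fly cell death phenotype"): "cell death",
    ("Mouse Phenotype Ontology", "mouse cell death phenotype"): "cell death",
    ("Worm Phenotype Ontology", "worm cell death phenotype"): "cell death",
    ("Worm Phenotype Ontology", "worm locomotion phenotype"): "locomotion",
}


def group_by_process(xp):
    """Group phenotype terms from all ontologies by their shared GO-BP process."""
    groups = defaultdict(set)
    for (ontology, term), process in xp.items():
        groups[process].add((ontology, term))
    return dict(groups)


groups = group_by_process(XP)
# Ontologies that have a term defined against the 'cell death' process.
print(sorted(ontology for ontology, _ in groups["cell death"]))
```

Terms that land in the same group are candidates for being treated as equivalent across species, which is what makes a query such as 'genes with cell death phenotypes in fly, mouse and worm' possible.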
Figure 8
Online user submission form for alleles. The form allows users to browse the Worm Phenotype Ontology, assign phenotype terms to alleles or propose changes to the existing phenotype ontology. Submissions are reviewed prior to entry into the database. This form can also be accessed from the Allele data link on the WormBase Online Data Submission forms page at http://www.wormbase.org/db/curate/online_forms.

References

    1. Brenner S. The genetics of Caenorhabditis elegans. Genetics. 1974;77(1):71–94.
    2. Lewis EB. Genetic control and regulation of developmental pathways. New York: Academic Press; 1964.
    3. Muller HJ. Further Studies on the Nature and Causes of Gene Mutations. Proceedings of the 6th International Congress of Genetics. 1932. pp. 213–255.
    4. The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282(5396):2012–2018. doi: 10.1126/science.282.5396.2012.
    5. Hillier LW, Coulson A, Murray JI, Bao Z, Sulston JE, Waterston RH. Genomics in C. elegans: so many genes, such a little worm. Genome Res. 2005;15(12):1651–1660. doi: 10.1101/gr.3729105.
