Genome Res. 2002 Dec;12(12):1982-91.
doi: 10.1101/gr.580102.

Extension and integration of the gene ontology (GO): combining GO vocabularies with external vocabularies

David P Hill et al. Genome Res. 2002 Dec.

Abstract

Structured vocabulary development enhances the management of information in biological databases. As information grows, handling the complexity of vocabularies becomes difficult. Defined methods are needed to manipulate, expand, and integrate complex vocabularies. The Gene Ontology (GO) project provides the scientific community with a set of structured vocabularies to describe domains of molecular biology. The vocabularies are used for annotation of gene products and for computational annotation of sequence data sets. The vocabularies focus on three concepts universal to living systems: biological process, molecular function, and cellular component. As the vocabularies expand to incorporate terms needed by diverse annotation communities, species-specific terms become problematic. In particular, the use of species-specific anatomical concepts remains unresolved. We present a method for expansion of GO into areas outside of the three original universal concept domains. We combine concepts from two orthogonal vocabularies to generate a larger, more specific vocabulary. The example of mammalian heart development is presented because it addresses two issues that challenge GO: inclusion of organism-specific anatomical terms, and proliferation of terms and relationships. The combination of concepts from orthogonal vocabularies provides a robust representation of relevant terms and an opportunity for evaluation of hypothetical concepts.

Figures

Figure 1
DAG cross-product example. In this example, a DAG whose nodes represent colors is crossed with a DAG whose nodes represent shapes. The result is a DAG whose nodes are colored shapes. Every combination is represented, so there are eight nodes in the result. An edge connects two nodes in the cross product whenever they have the same color and their shapes are connected in the Shape DAG (8 of these), or they have the same shape and their colors are connected in the Color DAG (4 of these). In general, the number of nodes in the cross product of DAGs A and B is the number of nodes in A times the number of nodes in B. The number of edges in the cross product is the number of edges in A times the number of nodes in B, plus the number of edges in B times the number of nodes in A.
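The counting rule in this caption can be checked with a minimal Python sketch. The function name `dag_cross_product` and the node labels below are illustrative assumptions, not taken from the paper; the toy DAGs are chosen only so that the counts match the caption (a Color DAG with 2 nodes and 1 edge, and a Shape DAG with 4 nodes and 4 edges).

```python
from itertools import product


def dag_cross_product(nodes_a, edges_a, nodes_b, edges_b):
    """Cross two DAGs given as node lists and (parent, child) edge lists.

    Nodes of the result are (a, b) pairs. An edge joins two pairs when one
    coordinate is equal and the other coordinate is connected in its own DAG.
    """
    nodes = set(product(nodes_a, nodes_b))
    edges = set()
    # Same B-node, A-nodes connected in A: |E(A)| * |V(B)| edges.
    for parent_a, child_a in edges_a:
        for b in nodes_b:
            edges.add(((parent_a, b), (child_a, b)))
    # Same A-node, B-nodes connected in B: |E(B)| * |V(A)| edges.
    for parent_b, child_b in edges_b:
        for a in nodes_a:
            edges.add(((a, parent_b), (a, child_b)))
    return nodes, edges


# Hypothetical toy DAGs (labels are assumptions, not the figure's actual nodes).
color_nodes = ["color", "red"]
color_edges = [("color", "red")]
shape_nodes = ["shape", "rectangle", "rhombus", "square"]
shape_edges = [
    ("shape", "rectangle"),
    ("shape", "rhombus"),
    ("rectangle", "square"),
    ("rhombus", "square"),
]

nodes, edges = dag_cross_product(color_nodes, color_edges, shape_nodes, shape_edges)
print(len(nodes), len(edges))  # 8 nodes; 4 * 2 + 1 * 4 = 12 edges
```

With these inputs the result has 8 nodes, 8 edges inherited from the Shape DAG, and 4 edges inherited from the Color DAG, which agrees with the counts stated in the caption.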
Figure 2
A vocabulary describing heart development constructed using literature references. The format of the vocabulary and other vocabularies in this manuscript is as follows: Indentation reflects parent–child relationships; the < symbol indicates that the child is a part of its parent; and the % symbol indicates that the child is a type of its parent. Multiple parentage is indicated by two terms on the same line, where the first term is a child of the second term. The colored portion of the graph corresponds to the similarly colored diagram in the schematic shown in Figure 6.
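A small parser can make the notation described in this caption concrete. The sketch below is a simplified reading of the format, not the GO project's actual file parser: it assumes one space of indentation per level, and it assumes that an additional parent on the same line is introduced by its own `<` or `%` symbol surrounded by spaces. The function name `parse_vocabulary` and the sample terms are hypothetical.

```python
import re

# Relationship symbols described in the caption.
RELATIONS = {"<": "part_of", "%": "is_a"}


def parse_vocabulary(text):
    """Return (child, relation, parent) triples from an indented vocabulary."""
    triples = []
    stack = []  # (indent, term) pairs for the current chain of ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        body = line.strip()
        symbol = body[0] if body[0] in RELATIONS else None
        if symbol:
            body = body[1:].strip()
        # Assumed convention: "term % other parent" adds a second parent.
        parts = re.split(r"\s+([<%])\s+", body)
        term = parts[0]
        # Drop ancestors at the same or deeper indentation.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if symbol and stack:
            triples.append((term, RELATIONS[symbol], stack[-1][1]))
        # Remaining parts alternate: symbol, parent, symbol, parent, ...
        for extra_symbol, extra_parent in zip(parts[1::2], parts[2::2]):
            triples.append((term, RELATIONS[extra_symbol], extra_parent))
        stack.append((indent, term))
    return triples


# Hypothetical toy input; the terms are illustrative, not copied from Figure 2.
SAMPLE = """\
heart
 %primitive heart tube
  <endocardium
  <myocardium % muscle tissue
"""

triples = parse_vocabulary(SAMPLE)
for child, relation, parent in triples:
    print(f"{child} {relation} {parent}")
```

Keeping a stack of ancestors keyed by indentation means each line only needs to look at the nearest shallower line to find its parent, so the vocabulary is read in a single pass.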
Figure 3
A consolidated version of the mouse anatomical dictionary. The time component has been removed, and primitive structures have been defined as “types of” the more mature structure. This vocabulary can be combined with developmental processes to describe the processes underlying the development of the structures.
Figure 4
A modified developmental process ontology. The ontology describes processes, but does not refer to anatomical structures that would be included in an anatomical dictionary. This vocabulary can be combined with anatomical concepts to describe developmental processes occurring in specific structures.
Figure 5
(A) The global development concept has been combined with the anatomical concepts from the anatomical dictionary. This figure illustrates only the first 21 lines of the complete vocabulary. The complete vocabulary would include all of the anatomical terms in Figure 3. In the combinatorial terms presented here, the concept of development taken from the process ontology is shown in boldface. This ontology provides an “anatomical” view of heart development. (B) The developmental process ontology has been combined with the anatomical concept of the heart. In this case, a simple rule was applied: the phrase “during heart development” was added to the terms of the developmental process ontology. This new ontology gives a low-resolution “embryological” picture of heart development. This figure illustrates only the first 29 lines of the vocabulary. The complete vocabulary would include all of the terms shown in Figure 4.
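The two combination rules in this caption can be sketched as simple string operations in Python. The function names and the exact wording of the generated terms are assumptions made for illustration; only the phrase “during heart development” comes from the caption, and the short term lists stand in for the full vocabularies of Figures 3 and 4.

```python
def combine_concept_with_anatomy(concept, anatomy_terms):
    """Panel A style: attach one process concept to every anatomical term."""
    return [f"{anatomy} {concept}" for anatomy in anatomy_terms]


def combine_processes_with_structure(process_terms, structure):
    """Panel B style: add 'during <structure> development' to every process."""
    return [f"{process} during {structure} development" for process in process_terms]


# Illustrative subsets only; the real vocabularies are much larger.
anatomy_terms = ["heart", "cardiogenic plate", "primitive heart tube"]
process_terms = ["morphogenesis", "cell differentiation"]

anatomical_view = combine_concept_with_anatomy("development", anatomy_terms)
embryological_view = combine_processes_with_structure(process_terms, "heart")

print(anatomical_view)
print(embryological_view)
```

Each rule produces one combined term per input term, so the "anatomical" view grows with the anatomical dictionary and the "embryological" view grows with the process ontology.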
Figure 6
A schematic representation of the processes that occur during the formation of the primitive heart tube. The schematic is modeled after the description given by Kaufman and Bard (1999). (A) The coelomic epithelial cells that are destined to form the heart tube. (B) The coelomic epithelial cells have formed the cuboidal cells of the cardiogenic plate. (C) The plate is undergoing morphogenesis to form the primitive heart tube. (D) The heart tube is complete, and cells have differentiated to give rise to the endocardium, cells forming the cardiac jelly, and the myocardium. For illustrative purposes, the colors of this figure correspond to the colors in each of the text-based graphs of Figures 2 and 8.
Figure 7
The sections of the initial combinatorial ontologies that require expanding to describe formation of the primitive heart tube. (A) This shows that we are dealing with the anatomical concepts of heart, cardiogenic plate, and primitive heart tube. (B) This shows that we need to describe the events that occur during the development of an epithelial sheet. (C) This illustrates that we need to include the processes involved in cell differentiation. The colored portion of the graph corresponds to the similarly colored diagram in the schematic shown in Figure 6.
Figure 8
The complete graph generated by successive combination of terms. In the first stage, terms describing the formation of an epithelial tube during the development of the primitive heart tube were inserted into the graph shown in Figure 7. In the second stage, terms that describe the differentiation of cells in the appropriate tissues were inserted. (A) The graph shown from an anatomical perspective. (B) The graph shown from a developmental process perspective. The portion of each term that was derived from the “anatomical” view of development generated in Figure 5A is shown in boldface.

References

    1. Abdelwahid E, Pelliniemi LJ, Niinikoski H, Simell O, Tuominen J, Rahkonen O, Jokinen E. Apoptosis in the pattern formation of the ventricular wall during mouse heart organogenesis. Anat Rec. 1999;256:208–217.
    2. Aho AV, Hopcroft JE, Ullman JD. Data structures and algorithms. Reading, MA: Addison-Wesley; 1983. Directed graphs; pp. 219–221.
    3. Ashburner M, Lewis S. In silico biology. Novartis Symposium. 2002. On ontologies for biologists: The Gene Ontology—Uncoupling the web. (in press).
    4. Bard JBL, Kaufman MH, Dubreuil C, Brune RM, Burger A, Baldock R, Davidson DR. An internet-accessible database of mouse developmental anatomy based on a systematic nomenclature. Mech Dev. 1998;74:111–120.
    5. Berman JJ, Moore GW. SNOMED-encoded surgical pathology databases: A tool for epidemiologic investigation. Mod Pathol. 1996;9:944–950.
