PLoS Biol. 2010 May 25;8(5):e1000374.
doi: 10.1371/journal.pbio.1000374.

Ontologies in quantitative biology: a basis for comparison, integration, and discovery

Lars J Jensen et al. PLoS Biol. 2010.

Abstract

As biology is becoming a data-driven discipline, ontologies become increasingly important for systematically capturing the existing knowledge. This essay discusses current trends and how ontologies can also be used for discovery.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Typical structures of ontologies.
Almost all biomedical ontologies are either simple tree structures that represent hierarchical classifications or directed acyclic graphs (DAGs). The difference is that a DAG allows a term to be related to multiple broader terms (green arrows), whereas a tree does not. Directed cyclic graphs are very rarely used for ontologies because cycles (red arrows) can only arise in ontologies that use relationships other than is-a and part-of. We illustrate each structure with simplified examples, namely an ontology of vertebrates, an ontology of cellular components, and an ontology of cell-cycle regulation that shows the mutual regulation of cyclin-dependent kinase (CDK) and anaphase-promoting complex/cyclosome (APC/C).
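
The three structures in this caption can be told apart mechanically. The following is a minimal sketch, not code from the article: the function name `classify_structure`, the edge-map representation, and the toy term names are assumptions made for illustration.

```python
from typing import Dict, List


def classify_structure(edges: Dict[str, List[str]]) -> str:
    """Return 'tree', 'dag', or 'cyclic' for a toy ontology.

    `edges` maps each term to the terms it points to (its broader terms,
    or its regulation targets in the cell-cycle example). A forest with
    several roots is also reported as 'tree' in this sketch.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {}

    def has_cycle(node: str) -> bool:
        state[node] = GRAY
        for nxt in edges.get(node, []):
            seen = state.get(nxt, WHITE)
            if seen == GRAY:  # back edge: we returned to the current path
                return True
            if seen == WHITE and has_cycle(nxt):
                return True
        state[node] = BLACK
        return False

    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    for node in sorted(nodes):
        if state.get(node, WHITE) == WHITE and has_cycle(node):
            return "cyclic"
    # Acyclic: a term with more than one broader term makes it a DAG.
    if any(len(set(targets)) > 1 for targets in edges.values()):
        return "dag"
    return "tree"


# Toy versions of the three examples (term names are illustrative only).
vertebrates = {"human": ["mammal"], "mammal": ["vertebrate"], "bird": ["vertebrate"]}
components = {"nuclear membrane": ["nucleus", "membrane"],
              "nucleus": ["cell"], "membrane": ["cell"]}
cell_cycle = {"CDK": ["APC/C"], "APC/C": ["CDK"]}
```

With these toy inputs, the vertebrate example has one broader term per node, the cellular-component example gives "nuclear membrane" two broader terms, and the cell-cycle example contains the mutual regulation that closes a cycle.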
Figure 2. The growth of ontologies in biomedicine.
To illustrate the increasing use of ontologies, we mined PubMed abstracts for occurrences of the words ontology and gene ontology (and the plural forms thereof). We normalized for the general growth of PubMed by converting the raw counts per year to “hits per million abstracts.” The plot shows a steady increase in the awareness of ontologies over the past decade, and that GO became the dominant biological ontology within just five years (note the logarithmic scale). However, ontologies appear to have reached a plateau over the past three years, at least in terms of how often they are mentioned in abstracts. In contrast, citations to GO and associated resources are rising steadily (more than 5,500 by the end of 2009), which implies that their use is still increasing.
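
The normalization described in this caption amounts to dividing each year's hit count by that year's total number of abstracts and scaling by one million. A minimal sketch follows; the function name `hits_per_million` is hypothetical, and the counts are invented placeholders, not values from the article or from PubMed.

```python
from typing import Dict


def hits_per_million(hits: Dict[int, int], totals: Dict[int, int]) -> Dict[int, float]:
    """Convert raw per-year hit counts to hits per million abstracts."""
    return {year: 1e6 * hits[year] / totals[year] for year in hits}


# Placeholder counts for illustration only (not real PubMed numbers).
example_hits = {2000: 50, 2005: 700}
example_totals = {2000: 500_000, 2005: 700_000}
normalized = hits_per_million(example_hits, example_totals)
```

Normalizing this way separates growth in the use of a word from growth in the size of the corpus, which is why the caption can compare years directly.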
Figure 3. Ontology subsumption reasoning.
This example from Washington et al. shows the relationships of the term “intestinal epithelium” to other anatomical entities within the ZFA ontology. Gray arrows with an “i” indicate an is-a relation, and blue arrows with a “p” indicate a part-of relation. The numbers indicate the information content (IC) of each node, which is the negative log of the probability of that description being used to annotate a gene, allele, or genotype (collectively called a feature). As terms get more general, reading from bottom to top, they have a lower IC score because the more general terms subsume the annotations made to more specific terms.
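
The IC calculation described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the method as implemented by Washington et al.: the function name `information_content`, the toy hierarchy, the feature names, and the choice of log base 2 are all assumptions (the caption does not state the base), and both is-a and part-of edges are treated as subsuming.

```python
import math
from typing import Dict, List, Set


def information_content(parents: Dict[str, List[str]],
                        annotations: Dict[str, Set[str]]) -> Dict[str, float]:
    """Compute IC = -log2(p) per term, where p is the fraction of all
    features annotated to the term or to any term it subsumes."""
    subsumed: Dict[str, Set[str]] = {}

    def propagate(term: str, feature: str, seen: Set[str]) -> None:
        # Push a feature up to every more general term exactly once.
        if term in seen:
            return
        seen.add(term)
        subsumed.setdefault(term, set()).add(feature)
        for parent in parents.get(term, []):
            propagate(parent, feature, seen)

    for term, features in annotations.items():
        for feature in features:
            propagate(term, feature, set())

    total = len(set().union(*annotations.values()))
    # -log2(p) written as log2(1/p) with p = len(features) / total.
    return {term: math.log2(total / len(features))
            for term, features in subsumed.items()}


# Toy hierarchy and annotations (illustrative; not the real ZFA data).
toy_parents = {
    "intestinal epithelium": ["epithelium", "intestine"],
    "epithelium": ["anatomical structure"],
    "intestine": ["anatomical structure"],
}
toy_annotations = {
    "intestinal epithelium": {"geneA"},
    "epithelium": {"geneB"},
    "intestine": {"geneC", "geneD"},
}
ic = information_content(toy_parents, toy_annotations)
```

In this toy case the most specific term carries one of four features and has the highest IC, while the root subsumes all four features and has an IC of zero, which matches the bottom-to-top decrease described in the caption.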
