Genome Biol. 2005;6(5):R44.
doi: 10.1186/gb-2005-6-5-r44. Epub 2005 Apr 29.

The Sequence Ontology: a tool for the unification of genome annotations

Karen Eilbeck et al. Genome Biol. 2005.

Abstract

The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. SO provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data. Because SO treats part-whole relationships rigorously, data described with it can become substrates for automated reasoning, and instances of sequence features described by the SO can be subjected to a group of logical operations termed extensional mereology operators.

Figures

Figure 1
A section of the Sequence Ontology showing how terms and relationships are used together to describe knowledge about sequence. The kind_of relationships are depicted using arrows labeled with 'i', the part_of relationships use arrows with 'P' and the derives_from relationships with 'd'. By tracing the arrows that connect the terms, different logical inferences can be made regarding what a term 'is' and what are its allowable parts. For example, an exon is a part_of a transcript, a tRNA is a kind_of ncRNA which is a kind_of processed_transcript.
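The kind of inference described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the SO tooling itself: the edge list holds only the three relationships named in the caption, and the function names `ancestors` and `is_kind_of` are hypothetical.

```python
# Hypothetical mini-ontology: only the edges named in the Figure 1 caption.
# Each edge is (child term, relationship, parent term).
EDGES = {
    ("tRNA", "kind_of", "ncRNA"),
    ("ncRNA", "kind_of", "processed_transcript"),
    ("exon", "part_of", "transcript"),
}


def ancestors(term, relation, edges=EDGES):
    """Return every term reachable from `term` by following `relation` edges."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child, rel, parent in edges:
            if child == current and rel == relation and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_kind_of(term, candidate):
    """True if `candidate` can be reached from `term` via kind_of arrows."""
    return candidate in ancestors(term, "kind_of")


print(is_kind_of("tRNA", "processed_transcript"))
print(ancestors("exon", "part_of"))
```

Tracing the arrows transitively is what lets a tRNA be recognized as a processed_transcript even though no single edge states it directly.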
Figure 2
Using EM operations to characterize alternatively spliced transcripts and their exons. The EM operations overlap and disjoint can be used to characterize pair-wise relationships between alternative transcripts. Binary product and difference, on the other hand, pertain to exons shared, or not-shared between two alternative transcripts.
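As a rough sketch of the four operations named in this caption, a transcript can be modeled as a set of its exons. That representation is an assumption made here for illustration: exons are hypothetical `(start, end)` coordinate pairs, and two transcripts share an exon only when the coordinates match exactly.

```python
# Assumption: a transcript is a frozenset of exons, each a (start, end) pair.
def overlap(t1, t2):
    """True if the two transcripts share at least one exon."""
    return bool(t1 & t2)


def disjoint(t1, t2):
    """True if the two transcripts share no exon."""
    return not overlap(t1, t2)


def binary_product(t1, t2):
    """Exons shared by both transcripts."""
    return t1 & t2


def difference(t1, t2):
    """Exons found in t1 but not in t2."""
    return t1 - t2


# Two hypothetical alternative transcripts of one gene.
transcript_a = frozenset({(100, 200), (300, 400), (500, 600)})
transcript_b = frozenset({(100, 200), (350, 400), (500, 600)})

print(overlap(transcript_a, transcript_b))
print(sorted(binary_product(transcript_a, transcript_b)))
print(sorted(difference(transcript_a, transcript_b)))
```

The first two functions answer a yes/no question about a pair of transcripts, while the last two return the exons themselves, which mirrors the split the caption draws between transcript-level and exon-level operations.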
Figure 3
Examples of alternatively spliced genes from Entrez Gene at the NCBI. Of the seven classes of alternatively spliced genes, some are more likely than others to indicate annotation problems, particularly genes having one or more sequence-disjoint transcripts. Parts-disjoint transcripts, on the other hand, are more suggestive of complex biology. Alternatively spliced genes having only overlapping transcripts (0:0:N) comprise the vast majority of instances.
Figure 4
A series of Venn diagrams showing the relationship between exon class and coding potential. An exon may be fully protein coding, partially protein coding, or fully UTR. An exon may be part_of a single-transcript gene, or part_of one (UNIQUE exons), all (ALWAYS_FOUND exons), or a fraction (SOMETIMES_FOUND exons) of the transcripts in an alternatively transcribed gene.
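The exon classes in this caption can be assigned by counting how many transcripts of a gene contain each exon. The sketch below assumes each transcript is given as a set of exon identifiers. The function name `classify_exons` and the label `SINGLE_TRANSCRIPT` are hypothetical; the other three labels come from the caption.

```python
from collections import Counter


def classify_exons(transcripts):
    """Label each exon of a gene by how many of its transcripts contain it.

    `transcripts` is a list of sets of exon identifiers (an assumed format).
    """
    total = len(transcripts)
    counts = Counter(exon for transcript in transcripts for exon in set(transcript))
    labels = {}
    for exon, count in counts.items():
        if total == 1:
            labels[exon] = "SINGLE_TRANSCRIPT"  # hypothetical label
        elif count == total:
            labels[exon] = "ALWAYS_FOUND"
        elif count == 1:
            labels[exon] = "UNIQUE"
        else:
            labels[exon] = "SOMETIMES_FOUND"
    return labels


# Hypothetical gene with three alternative transcripts.
gene = [
    {"e1", "e2", "e3"},
    {"e1", "e2", "e4"},
    {"e1", "e5"},
]
labels = classify_exons(gene)
for exon in sorted(labels):
    print(exon, labels[exon])
```

Coding potential (fully coding, partially coding, fully UTR) is a separate axis in the figure and is not modeled here.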
