Plant Physiol. 2008 Aug;147(4):1788-99.
doi: 10.1104/pp.108.119560. Epub 2008 Jun 6.

A community-based annotation framework for linking Solanaceae genomes with phenomes


Naama Menda et al. Plant Physiol. 2008 Aug.

Abstract

The amount of biological data available in the public domain is growing exponentially, and there is an increasing need for infrastructural and human resources to organize, store, and present the data in a proper context. Model organism databases (MODs) invest great efforts to functionally annotate genomes and phenomes by in-house curators. The SOL Genomics Network (SGN; http://www.sgn.cornell.edu) is a clade-oriented database (COD), which provides a more scalable and comparative framework for biological information. SGN has recently spearheaded a new approach by developing community annotation tools to expand its curational capacity. These tools effectively allow some curation to be delegated to qualified researchers, while, at the same time, preserving the in-house curators' full editorial control. Here we describe the background, features, implementation, results, and development road map of SGN's community annotation tools for curating genotypes and phenotypes. Since the inception of this project in late 2006, interest and participation from the Solanaceae research community have been strong and have grown continuously, to the extent that we plan to expand the framework to accommodate more plant taxa. All data, tools, and code developed at SGN are freely available to download and adapt.

Figures

Figure 1.
Graphic representation of the major data types in the SGN schema for representing loci and phenotypes and their associated data. The two central data types are locus for storing gene information and accession for storing phenotype data. Both data types are interlinked and cross-reference to images, genetic map locations, the literature, and controlled vocabulary terms (ontologies). Phenotypes are linked to populations, and loci have sequence annotations to GenBank and SGN unigenes, which link to further information such as genetic markers, bacterial artificial chromosome sequences, and metabolic pathways (SolCyc database). This schema interacts highly with the Chado schema (Mungall et al., 2007). Chado tables are not shown in the figure. [See online article for color version of this figure.]
Figure 2.
SGN locus module. A, Web user-editable locus details section. The interface grants edit privileges to locus editors and curators. A chromosome glyph with genetic mapping information is printed on the right. Clicking on the chromosome opens the comparative viewer. Clicking on a marker name opens a marker info page. B, Images are displayed on the locus page and provide links to the phenotype database. C, Metabolic pathway information. Clicking on the chemical reaction glyph opens the SolCyc reaction page. [See online article for color version of this figure.]
Figure 3.
The ontology term annotation tool, available both on locus and on accession pages. Curators and submitters can select an ontology to browse from a drop-down menu. While typing an ontology term name or ID, a list of matches is displayed in the text area. When selecting a term from the results list, the user is required to choose a relationship type and an evidence code supporting the annotation. The field of evidence description is populated based on the selected evidence code. The fields of evidence with and reference are populated with the object's associated sequences and literature references. Clicking on the associate ontology button stores the selected information in the database along with the user details and date, and the annotation is displayed on the Web page. [See online article for color version of this figure.]
Figure 4.
Locus ontology annotations by species. Number of annotations by controlled vocabulary name for each species. [See online article for color version of this figure.]
Figure 5.
SGN accession page. A, The details section contains population and submitter information, followed by underlying loci entries. B, Images. C, Quantitative phenotypes. D, Genotype data. [See online article for color version of this figure.]
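
The annotation workflow in the Figure 3 caption can be pictured as a small record type plus a validation step. The following is a minimal Python sketch under stated assumptions, not SGN's implementation: the class, function, and field names are hypothetical, the evidence codes and descriptions are illustrative placeholders, and `PO:0000000` stands in for a real ontology term ID.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Illustrative placeholder values only (assumption); the real controlled
# vocabularies are stored in the SGN database, not hard-coded like this.
EVIDENCE_DESCRIPTIONS = {
    "IMP": ["mutant phenotype observed"],
    "IDA": ["enzyme assay", "in situ hybridization"],
}


@dataclass(frozen=True)
class OntologyAnnotation:
    """Hypothetical record for one ontology annotation on a locus or accession."""
    object_type: str               # "locus" or "accession"
    object_id: int
    term_id: str                   # ontology term ID chosen from the match list
    relationship: str              # relationship type chosen by the user
    evidence_code: str             # evidence code supporting the annotation
    evidence_description: str      # choices depend on the evidence code
    submitter: str                 # user details stored with the annotation
    created: date                  # date stored with the annotation
    evidence_with: Optional[str] = None   # an associated sequence, if any
    reference: Optional[str] = None       # an associated publication, if any


def evidence_description_choices(evidence_code: str) -> List[str]:
    """Return the descriptions offered once an evidence code is selected."""
    return EVIDENCE_DESCRIPTIONS.get(evidence_code, [])


def associate_ontology(object_type, object_id, term_id, relationship,
                       evidence_code, evidence_description, submitter,
                       evidence_with=None, reference=None, created=None):
    """Validate the form input and build the annotation record."""
    if object_type not in ("locus", "accession"):
        raise ValueError(f"unsupported object type: {object_type!r}")
    if evidence_description not in evidence_description_choices(evidence_code):
        raise ValueError("evidence description does not match evidence code")
    return OntologyAnnotation(
        object_type=object_type,
        object_id=object_id,
        term_id=term_id,
        relationship=relationship,
        evidence_code=evidence_code,
        evidence_description=evidence_description,
        submitter=submitter,
        created=created or date.today(),
        evidence_with=evidence_with,
        reference=reference,
    )
```

In this sketch, the dependency between evidence code and evidence description is enforced at the moment the record is built, which mirrors the caption's statement that the description field is populated from the selected code. Persisting the record and the in-house curators' review step are left out.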

