Plant Physiol. 2005 Oct;139(2):619-31.
doi: 10.1104/pp.105.065201.

GERMINATE. A generic database for integrating genotypic and phenotypic information for plant genetic resource collections

Jennifer M Lee et al. Plant Physiol. 2005 Oct.

Abstract

The extensive germplasm resource collections that are now available for major crop plants and their wild relatives will increasingly provide valuable biological and bioinformatics resources for plant physiologists and geneticists to dissect the molecular basis of key traits and to develop highly adapted plant material to sustain future breeding programs. A key to the efficient deployment of these resources is the development of information systems that will enable the collection and storage of biological information for these plant lines to be integrated with the molecular information that is now becoming available through the use of high-throughput genomics and post-genomics technologies. The GERMINATE database has been designed to hold a diverse variety of data types, ranging from molecular to phenotypic, and to allow querying between such data for any plant species. Data are stored in GERMINATE in a technology-independent manner, such that new technologies can be accommodated in the database as they emerge, without modification of the underlying schema. Users can access data in GERMINATE databases either via a lightweight Perl-CGI Web interface or by the more complex Genomic Diversity and Phenotype Connection software. GERMINATE is released under the GNU General Public License and is available at http://germinate.scri.sari.ac.uk/germinate/.

Figures

Figure 1.
Graphic of the GERMINATE schema. Codes: PK, Primary Key; U, Unique index; I, Index; FK, Foreign Key (the numbers following indicate if multiple columns are a part of the same index or key). The four modules and 54 tables are depicted.
Figure 2.
Genetic dataset loading example. The Data table represents a sample of how molecular marker data are typically submitted: a set of markers analyzed in a set of accessions. The arrows in the figure show the flow of information as it is inserted into the database. Black arrows indicate that data are being held temporarily, green arrows indicate insertion into the database, and blue arrows indicate that data already inserted are being used to insert information into another table. In the latter case, IDs assigned by the database are used to trace back to the original data. The colors in the tables follow the dataset and metadatasets through the process of being inserted into the database. The peach color denotes the Accession metadataset, green denotes the Marker metadataset, and purple denotes the allele data. Box A represents the Accession data and metadata inserted into GERMINATE. On entry, each accession is assigned an accession_id that is unique in the database, and this ID is used to reference the appropriate accession in the accession metadataset. The order or number of accession_ids has no influence on the order of accessions in the metadataset. The ReferenceData table uses a data index to track the correct order of the accession_ids. Box B indicates where the marker information is inserted into the database, again retaining the order in the original dataset by the data index value. Box C demonstrates how the allelic state of the accession by marker is translated into an integer ID (enum_index). This ID is stored in appropriate order in the IntegerData table. The enum_index can then be used to translate back to the actual allele value or to an allele index if only the relative allele states between accessions are required in a query. The AlleleIndex table was created to speed up queries where technology is unimportant and the relative allele values will suffice to answer the question. Box D displays the metadata information recorded in the database required to recreate the dataset. This includes the number of dimensions for a dataset and relates the metadatasets to the dataset.
Figure 3.
Example of GDPC and the GERMINATE Perl-CGI interfaces returning information about Pisum sativum subspecies abyssinicum. A, The GDPC interface showing a taxa query that has retrieved accessions that are from P. sativum subspecies abyssinicum and that have a source geographical location. The properties shown are for accession number 691, Small Black Pea from Ethiopia, one of the accessions returned. B, The Perl-CGI interface showing a similar query. The passport descriptor subtaxa have been searched for abyssinicum; taxonomy information, some location information, accession name and number, and institution code have been returned. The same accession (691) highlighted in A is highlighted here for comparison.
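
The loading flow described in the Figure 2 caption can be illustrated with a small in-memory sketch. This is not the GERMINATE schema or its loader: the class `MiniStore`, its field names, and the sample accession, marker, and allele values are all hypothetical, chosen only to mirror the ideas in the caption (database-assigned IDs, a data index that preserves the submitted order, and allele values translated to an integer enum_index). The AlleleIndex table is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class MiniStore:
    """Toy stand-in for the tables named in the Figure 2 caption (hypothetical)."""
    accessions: dict = field(default_factory=dict)      # accession name -> accession_id
    markers: dict = field(default_factory=dict)         # marker name -> marker_id
    enum_values: list = field(default_factory=list)     # position = enum_index, item = allele value
    reference_data: dict = field(default_factory=dict)  # metadataset -> IDs in data-index order
    integer_data: list = field(default_factory=list)    # enum_index values, row by row
    dimensions: tuple = ()                               # (number of accessions, number of markers)

    def _get_id(self, table, name):
        # Assign the next free ID the first time a name is seen; reuse it afterwards.
        if name not in table:
            table[name] = len(table) + 1
        return table[name]

    def _enum_index(self, allele):
        # Translate an allele value into an integer ID, adding it if it is new.
        if allele not in self.enum_values:
            self.enum_values.append(allele)
        return self.enum_values.index(allele)

    def load(self, accession_names, marker_names, rows):
        # Boxes A and B: record IDs in the order the dataset was submitted.
        self.reference_data["accession"] = [
            self._get_id(self.accessions, a) for a in accession_names
        ]
        self.reference_data["marker"] = [
            self._get_id(self.markers, m) for m in marker_names
        ]
        # Box C: store allele states as integers, in dataset order.
        self.integer_data = [self._enum_index(v) for row in rows for v in row]
        # Box D: metadata needed to rebuild the matrix from the flat list.
        self.dimensions = (len(accession_names), len(marker_names))

    def allele(self, accession_name, marker_name):
        # Look up positions through the data index, then decode the enum_index.
        i = self.reference_data["accession"].index(self.accessions[accession_name])
        j = self.reference_data["marker"].index(self.markers[marker_name])
        return self.enum_values[self.integer_data[i * self.dimensions[1] + j]]


store = MiniStore()
store.accessions["acc_A"] = 1  # pretend acc_A was already in the database
store.load(
    accession_names=["acc_B", "acc_A"],       # submitted order differs from ID order
    marker_names=["m1", "m2"],
    rows=[["150", "200"], ["150", "180"]],    # made-up allele sizes
)
print(store.reference_data["accession"])  # IDs listed in submitted order
print(store.allele("acc_A", "m2"))
```

In this sketch acc_A keeps the ID it already had, so the accession IDs are not in dataset order; the list held under `reference_data` plays the role of the data index that restores the original order when the dataset is read back.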
