PlantGDB, plant genome database and analysis tools

Qunfeng Dong¹, Shannon D Schlueter, Volker Brendel

PMID: 14681433
PMCID: PMC308780
DOI: 10.1093/nar/gkh046

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D354-9.

Affiliation

¹ Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011-3260, USA.

Abstract

PlantGDB (http://www.plantgdb.org/) is a database of molecular sequence data for all plant species with significant sequencing efforts. The database organizes EST sequences into contigs that represent tentative unique genes. Contigs are annotated and, whenever possible, linked to their respective genomic DNA. Genome sequence fragments are assembled similarly. The goal of the PlantGDB web site is to establish the basis for identifying sets of genes common to all plants or specific to particular species by integrating a number of bioinformatics tools that facilitate gene prediction and cross-species comparisons. For species with large-scale genome sequencing efforts, PlantGDB provides genome browsing capabilities that integrate all available EST and cDNA evidence for current gene models (for Arabidopsis thaliana, see the AtGDB site at http://www.plantgdb.org/AtGDB/).

Figures

**Figure 1**
EST contig display at PlantGDB. The screenshot represents a typical display of an EST contig record at PlantGDB, assembled from species-specific EST collections using the PaCE (19) and CAP3 (20) programs. The central panel provides basic annotation for the sequence, including tentative functional annotation based on significant protein-level similarities. Sequence similarity search tool functions (BLAST) are linked via the icons on top (selection pastes the sequence into the query screen of the respective tool), as are download functions for the sequences. The diagram in the lower panel displays a schematic of the multiple sequence alignment derived with the CAP3 program. The black line represents the contig consensus sequence, and the red lines indicate each member EST. The actual alignments can be viewed by clicking on the link below the diagram. The table at the bottom links to the library sources of member ESTs and individual sequence records. GSS contigs are displayed similarly (not shown).
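
The schematic described in this caption (a consensus line with member ESTs drawn beneath it) can be approximated in plain text. The following is a minimal sketch, not PlantGDB's display code: the `MemberEST` type, the `render_contig` function, and all coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemberEST:
    name: str   # hypothetical EST identifier
    start: int  # 0-based start position on the contig consensus
    end: int    # exclusive end position on the contig consensus


def render_contig(consensus_len, members, width=60):
    """Return a text schematic: one consensus line, then one line per member EST."""
    label_width = 12
    lines = ["consensus".ljust(label_width) + "=" * width]
    for m in sorted(members, key=lambda m: m.start):
        # Integer scaling of consensus coordinates onto the display width.
        left = m.start * width // consensus_len
        right = max(left + 1, m.end * width // consensus_len)
        lines.append(m.name.ljust(label_width) + " " * left + "-" * (right - left))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up member positions on a 1000-base consensus.
    demo = [MemberEST("est_a", 0, 400), MemberEST("est_b", 300, 1000)]
    print(render_contig(1000, demo, width=50))
```

Each member is drawn at its scaled offset, so overlapping ESTs that support the same stretch of consensus line up vertically, much as the red lines do under the black consensus line in the figure.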

**Figure 2**
AtGDB visualization of current genome annotation and cDNA and EST spliced alignments. The figure illustrates AtGDB visualization of cDNA and EST spliced alignments compared with current GenBank gene structure annotation for a 9 kb region on chromosome 4. Exons are indicated by solid boxes, connected by lines representing intron sequences. The arrows denote 3′-ends. Dark blue, current GenBank mRNA annotation. Light blue, cognate cDNA spliced alignments. Red and pink, cognate and non-cognate EST spliced alignments. Green and blue indicators within the EST structures represent 5′- and 3′-clone end designators, respectively. Green boxes surrounding EST structures associate members of a clone pair. cDNAs and ESTs are labeled by their GenBank GI numbers. Note the example of typical difficulties of automated gene structure annotation: the 3′-ends of genes At4g38500 and At4g38510 are annotated as overlapping at GenBank (dark blue), although several full-length cDNAs clearly indicate the correct 3′-ends of both genes. The erroneous annotation is likely caused by assignment of single-exon 3′-ESTs from At4g38500 transcripts to pseudo-transcripts of both At4g38500 and At4g38510. The importance of correct gene structure annotations (and visualization of all evidence) is underscored by the links to gene expression data accessible at the Stanford Microarray Database: EST probes 2748113 and 3449530 could not be resolved to a single gene based on the current genome annotation, and probe 2757897 is seen to contain an intron.
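
The overlapping 3′-end annotation noted in this caption is the kind of problem a simple interval check can flag for manual review. The sketch below is an illustration under stated assumptions, not the method used by AtGDB or GenBank: `GeneModel` and `overlapping_pairs` are hypothetical names, the coordinates are placeholders rather than real chromosome 4 positions, and strand is ignored.

```python
from typing import NamedTuple


class GeneModel(NamedTuple):
    gene_id: str
    start: int  # 1-based inclusive start (placeholder coordinate)
    end: int    # 1-based inclusive end (placeholder coordinate)


def overlapping_pairs(models):
    """Return (gene_a, gene_b, overlap_length) for every pair of overlapping models."""
    ordered = sorted(models, key=lambda g: g.start)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                # Models are sorted by start, so no later model can overlap a.
                break
            overlap = min(a.end, b.end) - b.start + 1
            pairs.append((a.gene_id, b.gene_id, overlap))
    return pairs


if __name__ == "__main__":
    # Placeholder coordinates, NOT the real annotation of these loci.
    annotated = [
        GeneModel("At4g38500", 1000, 3200),
        GeneModel("At4g38510", 3100, 6000),
    ]
    for gene_a, gene_b, length in overlapping_pairs(annotated):
        print(f"{gene_a} and {gene_b} overlap by {length} bases")
```

A flagged pair is only a prompt to inspect the evidence. As the caption notes, it is the full-length cDNA alignments that indicate where the 3′-ends of the two genes actually lie.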

References

1. Fleischmann,R.D., Adams,M.D., White,O., Clayton,R.A., Kirkness,E.F., Kerlavage,A.R., Bult,C.J., Tomb,J.F., Dougherty,B.A., Merrick,J.M. et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 269, 496–512.
2. Jordan,I.K., Rogozin,I.B., Wolf,Y.I. and Koonin,E.V. (2002) Microevolutionary genomics of bacteria. Theor. Popul. Biol., 61, 435–447.
3. The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408, 796–815.
4. Goff,S.A., Ricke,D., Lan,T.-H., Presting,G., Wang,R., Dunn,M., Glazebrook,J., Sessions,A., Oeller,P., Varma,H. et al. (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science, 296, 92–100.
5. Yu,J., Hu,S., Wang,J., Wong,G.K.-S., Li,S., Liu,B., Deng,Y., Dai,L., Zhou,Y., Zhang,X. et al. (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science, 296, 79–92.
