BMC Bioinformatics. 2004 Jun 2;5:67.
doi: 10.1186/1471-2105-5-67.

Genome SEGE: a database for 'intronless' genes in eukaryotic genomes

Meena Kishore Sakharkar et al. BMC Bioinformatics. 2004.

Abstract

Background: Complete genome sequences for a number of eukaryotes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless', and the 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, a collection of eukaryotic single exon genes, is already available. However, SEGE is derived from GenBank, and the redundant, incomplete and heterogeneous nature of GenBank data is a bottleneck for comparative genomics and evolutionary studies. Such studies often require a representative gene set from each genome, which is possible only by deriving specific datasets from completely sequenced genomes. Genome SEGE, a database of 'intronless' genes in completely sequenced eukaryotic genomes, has therefore been constructed.

Availability: http://sege.ntu.edu.sg/wester/intronless

Description: Eukaryotic 'intronless' genes are extracted from nine completely sequenced genomes (four unicellular and five multicellular). The complete dataset is available for download, as are data subsets for 'intronless' pseudogenes. The database reports the distribution of 'intronless' genes across the genomes, together with their length distribution in each genome. A search facility is available through the web server, and it provides pre-computed PROSITE motifs for each sequence in the database with hyperlinks to InterPro.
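
The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes CDS features annotated with GenBank-style location strings, where a multi-exon CDS is written with `join(...)` or `order(...)` and a single-exon CDS is one contiguous span. The record list, the function names and the 300 bp bin size are all hypothetical.

```python
import re
from collections import Counter


def is_single_exon(location: str) -> bool:
    """Treat a CDS as 'intronless' when its location is one contiguous span.

    Assumption: multi-exon features use join(...) or order(...).
    """
    return "join(" not in location and "order(" not in location


def span_length(location: str) -> int:
    """Length in bp of a contiguous span such as '100..999' or 'complement(<5..>304)'."""
    match = re.search(r"<?(\d+)\.\.>?(\d+)", location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start, end = map(int, match.groups())
    return end - start + 1


def length_histogram(records, bin_size=300):
    """Count single-exon CDS per length bin; keys are the lower bound of each bin."""
    counts = Counter()
    for _name, location in records:
        if is_single_exon(location):
            counts[(span_length(location) // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Hypothetical (gene name, CDS location) records for one genome.
example = [
    ("geneA", "100..999"),                # single exon, 900 bp
    ("geneB", "join(10..200,400..650)"),  # two exons, skipped
    ("geneC", "complement(5000..5299)"),  # single exon, 300 bp
]

histogram = length_histogram(example)
```

Here `histogram` maps each length bin to the number of single-exon CDS falling in it, which is the kind of per-genome length distribution the database reports.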

Conclusions: The feature that distinguishes Genome SEGE from SEGE is that it provides representative 'intronless' datasets for completely sequenced genomes. The 'intronless' gene sets available in this database will be of use for subsequent bio-computational analysis in comparative genomics and evolutionary studies. Such analysis may also prompt re-examination and re-annotation of the original genome data.

Figures

Figure 1
Database construction. A flowchart describing the development of the database is shown. CDS = coding sequence.
Figure 2
Illustration of an example search. This example illustrates a search for human 'G protein' in the database. The interface, search page and results (annotation, sequence, PROSITE and InterPro links) are shown.
