Genome Res. 2000 Apr;10(4):502-10.
doi: 10.1101/gr.10.4.502.

MAGPIE/EGRET annotation of the 2.9-Mb Drosophila melanogaster Adh region

T Gaasterland et al. Genome Res. 2000 Apr.

Abstract

Our challenge in annotating the 2.91-Mb Adh region of the Drosophila melanogaster genome was to identify genetic and genomic features automatically, completely, and precisely within a 6-week period. To do so, we augmented the MAGPIE microbial genome annotation system to handle eukaryotic genomic sequence data. The new configuration required the integration of eukaryotic gene-finding tools and DNA repeat tools into the automatic data collection module. It also required us to define in MAGPIE new strategies to combine data about eukaryotic exon predictions with functional data to refine the exon predictions. At the heart of the resulting new eukaryotic genome annotation system is a reverse comparison of public protein and complementary DNA sequences against the input genome to identify missing exons and to refine exon boundaries. The software modules that add eukaryotic genome annotation capability to MAGPIE are available as EGRET (Eukaryotic Genome Rapid Evaluation Tool).
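The reverse comparison described in the abstract can be illustrated with a small interval-based sketch. This is not EGRET's implementation: it assumes that the protein and cDNA matches have already been aligned to the genome and reduced to 1-based inclusive coordinate pairs, and the function names (`merge_intervals`, `compare_hits_to_exons`) and the rule of replacing a predicted exon's boundaries with the span of its supporting evidence are hypothetical simplifications chosen for illustration.

```python
from typing import List, Tuple

# Assumption: features are 1-based, inclusive (start, end) genome coordinates.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def compare_hits_to_exons(
    predicted: List[Interval], hits: List[Interval]
) -> Tuple[List[Interval], List[Interval]]:
    """Return (missing, refined).

    missing: merged evidence regions that overlap no predicted exon
             (candidate missing exons).
    refined: predicted exons, with boundaries replaced by the span of the
             evidence overlapping them (hypothetical refinement rule);
             exons without evidence are returned unchanged.
    """
    evidence = merge_intervals(hits)
    missing = [h for h in evidence if not any(overlaps(h, e) for e in predicted)]
    refined: List[Interval] = []
    for exon in predicted:
        support = [h for h in evidence if overlaps(h, exon)]
        if support:
            refined.append((min(s for s, _ in support), max(e for _, e in support)))
        else:
            refined.append(exon)
    return missing, refined


# Illustrative coordinates only; not taken from the Adh region.
predicted_exons = [(100, 200), (400, 520)]
evidence_hits = [(105, 200), (300, 350), (400, 500), (480, 510)]
missing, refined = compare_hits_to_exons(predicted_exons, evidence_hits)
print("candidate missing exons:", missing)
print("refined exon boundaries:", refined)
```

In this toy input, the evidence region with no overlapping prediction is reported as a candidate missing exon, while the two predicted exons are trimmed to the extent of their supporting matches. A real system would also need to weigh splice-site signals, reading frame, and alignment quality before accepting either change.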

Figures

Figure 1
Eukaryotic genome analysis strategy.
Figure 2
Annotated features in contig 59 of 50-kb subsequences. The 31-exon coding region labeled dm_059_2 encodes a calcium-ion channel protein.
Figure 3
Annotated features in contig 63 of 50-kb subsequences. Of the 13 proteins encoded, 8 are functionally annotated and 4 are confirmed by ESTs. Individual exons are shown in series in the middle of the graphic and in frame at the top and bottom.
Figure 4
Evidence shows missing and mispredicted exons for the calcium-ion channel protein. The first gap in the first row indicates that exons from the next predicted gene should be merged with the calcium-ion channel gene.
Figure 5
Screenshot of the annotation form. Exon boundaries and multiple functions can be edited and saved to the annotation database for further querying.
Figure 6
Evidence summary for DNA mismatch repair protein showing matches in 11 bacterial genomes (blue) and 2 eukaryotic genomes (cyan).
Figure 7
Full evidence view for DNA mismatch repair protein.
Figure 8
Bacterial, archaeal, and eukaryotic genomes matched by each gene product whose exon boundaries are verified by cDNA or protein sequence.
