Genome Res. 2005 Jun;15(6):893-9.
doi: 10.1101/gr.3756405. Epub 2005 May 17.

AnoEST: toward A. gambiae functional genomics

Evgenia V Kriventseva et al. Genome Res. 2005 Jun.

Abstract

Here, we present an analysis of 215,634 EST and cDNA sequences from Anopheles gambiae, a major vector of human malaria, structured into the AnoEST database. The expressed sequences are grouped into clusters using the genomic sequence as a template and are associated with inferred functional annotation, including the following: corresponding Ensembl gene prediction, putative orthologous genes in other species, homology to known proteins, protein domains, associated Gene Ontology terms, and corresponding classification into broad GO-slim functional groups. AnoEST is a vital resource for interpreting expression profiles derived using recently developed A. gambiae cDNA microarrays. Using these cDNA microarrays, we have experimentally confirmed the expression of 7961 clusters during mosquito development. Of these, 3100 are not associated with currently predicted genes. Moreover, we found that clusters with confirmed expression are unbiased with respect to the current gene annotation or homology to known proteins. Consequently, we expect that many as yet unconfirmed clusters are likely to be actual A. gambiae genes. [AnoEST is publicly available at http://komar.embl.de and is also accessible as a Distributed Annotation Service (DAS).]
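
The genome-templated clustering mentioned in the abstract can be pictured as merging ESTs whose genomic alignments overlap. The following is only a minimal sketch under stated assumptions, not the AnoEST pipeline: the `Alignment` record, the `cluster_by_overlap` function, the EST identifiers, and the coordinates are all hypothetical, and a real pipeline would also need to handle splicing, alignment quality, and ambiguous strand.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one EST aligned to the genome."""
    est_id: str
    chrom: str
    strand: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def cluster_by_overlap(alignments):
    """Group ESTs whose genomic spans overlap on the same chromosome and strand.

    Assumption: any positional overlap is enough to join a cluster.
    """
    by_locus = defaultdict(list)
    for aln in alignments:
        by_locus[(aln.chrom, aln.strand)].append(aln)

    clusters = []
    for key in sorted(by_locus):
        current, current_end = [], None
        # Sweep left to right, extending the open cluster while spans overlap.
        for aln in sorted(by_locus[key], key=lambda a: (a.start, a.end)):
            if current and aln.start <= current_end:
                current.append(aln.est_id)
                current_end = max(current_end, aln.end)
            else:
                if current:
                    clusters.append(current)
                current, current_end = [aln.est_id], aln.end
        if current:
            clusters.append(current)
    return clusters


# Made-up coordinates for illustration only.
example = [
    Alignment("est1", "2L", "+", 100, 500),
    Alignment("est2", "2L", "+", 450, 900),
    Alignment("est3", "2L", "+", 2000, 2300),
    Alignment("est4", "2L", "-", 120, 480),
    Alignment("est5", "3R", "+", 100, 200),
]
clusters = cluster_by_overlap(example)
singletons = [c for c in clusters if len(c) == 1]
```

In this toy input, `est1` and `est2` merge into one cluster, while the remaining three ESTs stay as singletons, mirroring the distinction between multi-EST clusters and singletons drawn in Figure 1.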

Figures

Figure 1.
Analysis of the 21,478 T-clusters. The chart lists numbers of T-clusters, of which expression during mosquito development was confirmed by microarray experiments (pink) and numbers of clusters for which microarray-based expression was not tested or detected (blue). Numbers are provided separately for clusters with two or more ESTs (right) and singletons (clusters with one EST, left). For each category, the numbers of clusters with and without Ensembl gene predictions, as well as the numbers with and without homologs in UniProt/SWISS-PROT are indicated. The inner ring lists the total number of EST clusters with and without microarray data, and the outer two rings partition these clusters according to the associated annotation.
Figure 2.
(A) Comparison of log2 expression value distributions for T-clusters with and without overlaps with Ensembl gene predictions. The graph also depicts mean and standard deviation values for the corresponding distributions. (B) Comparison of log2 expression value distributions for T-clusters with and without homology hits in the SWISS-PROT knowledgebase; mean and standard deviation values are also shown.
Figure 3.
Interactive searches available in AnoEST. Searching with an EST clone identifier allows (A) an overview of associated EST sequences, corresponding clusters, overlapping Ensembl genes, and best hit in the SWISS-PROT database and its description. (B) The detailed view gives, in addition, coordinates of EST cluster match to the genome, links to orthologous groups identified on the basis of corresponding Ensembl gene predictions, protein domains collected in the InterPro database, and corresponding GO terms. When examining EST clusters, (C) a graphical representation of overlapping ESTs permits visualization of the underlying exon-intron structure.

WEB SITE REFERENCES

    1. http://komar.embl.de; AnoEST database.
    2. http://komar.embl.de:9000/das; AnoEST DAS server.
    3. http://www.genoscope.org/; Genoscope, Centre National de Séquençage.
    4. http://www.girinst.org/; Genetic Information Research Institute.
    5. http://www.anobase.org/; AnoBase database.