GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts
- PMID: 15608279
- PMCID: PMC540069
- DOI: 10.1093/nar/gki115
Abstract
Genomic projects depend heavily on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach, since many features obtained for one gene can be extrapolated to some or all of the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot.