Developing a biocuration workflow for AgBase, a non-model organism database
- PMID: 23160411
- PMCID: PMC3500517
- DOI: 10.1093/database/bas038
Abstract
AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not focus solely on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface, which ranks gene products for annotation. Biocurators select the top-ranked genes and mark annotation for these genes as 'in progress' or 'completed'; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface for adding or modifying AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based text-mining tool that associates genes with ranked, informative terms (iTerms) and with the articles and sentences containing them. Moreover, iTerms are linked to GO terms where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many gene names are ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to the gene in question, and filtering is applied to remove abstracts that mention a gene only in passing. The BI is linked to our Journal Database (JDB), where corresponding journal citations are stored. Just as importantly, biocurators also add to the JDB citations that have no GO annotation. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from Electronic Annotation (IEA) of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in AgBase.
Database URL: http://www.agbase.msstate.edu/.
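
The linking of iTerms to GO terms described in the abstract (a match on either a GO term name or a synonym) can be illustrated with a minimal sketch. This is not the eGIFT or AgBase implementation: the `GOTerm` class, the `build_index` and `link_iterms` functions, the case- and whitespace-insensitive normalization, and the sample ontology entries are all assumptions made for illustration, and the sample entries should be checked against the current ontology before reuse.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class GOTerm:
    """Hypothetical, minimal view of a GO term: ID, name, and synonyms."""
    go_id: str
    name: str
    synonyms: List[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace (an assumed matching rule)."""
    return " ".join(text.lower().split())


def build_index(terms: Iterable[GOTerm]) -> Dict[str, Set[str]]:
    """Map every normalized term name and synonym to the GO IDs that use it."""
    index: Dict[str, Set[str]] = {}
    for term in terms:
        for label in [term.name, *term.synonyms]:
            index.setdefault(normalize(label), set()).add(term.go_id)
    return index


def link_iterms(iterms: Iterable[str], index: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, for each iTerm, the sorted GO IDs whose name or synonym it matches."""
    return {iterm: sorted(index.get(normalize(iterm), set())) for iterm in iterms}


# Illustrative sample entries only, not a dump of the ontology.
sample_terms = [
    GOTerm("GO:0006915", "apoptotic process", ["apoptosis"]),
    GOTerm("GO:0005634", "nucleus", ["cell nucleus"]),
]
index = build_index(sample_terms)
links = link_iterms(["Apoptosis", "nucleus", "heat shock"], index)
for iterm, go_ids in links.items():
    print(iterm, "->", go_ids or "no GO match")
```

In this sketch an iTerm with no matching name or synonym maps to an empty list, so a biocurator-facing view could still show the term and its supporting sentences without a suggested GO term.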