Review
2014 Jun 30:2014:bau053.
doi: 10.1093/database/bau053. Print 2014.

BioC interoperability track overview


Donald C Comeau et al. Database (Oxford).

Abstract

BioC is a new, simple XML format for sharing biomedical text and annotations, together with libraries to read and write that format. It promotes the development of interoperable tools for natural language processing (NLP) of biomedical text. The interoperability track at the BioCreative IV workshop featured contributions using or highlighting the BioC format. These contributions included additional implementations of BioC, many new corpora in the format, biomedical NLP tools that consume and produce the format, and online services that use the format. The ease of use, broad support and rapidly growing number of tools demonstrate the need for and value of the BioC format.

Database URL: http://bioc.sourceforge.net/


Figures

Figure 1. BioC process sequence. The BioC workflow allows data in the BioC format, from a file or any other stream, to be read into the BioC data classes via the Input Connector, or written into a new stream via the Output Connector. The Data Processing module stands for any kind of NLP or text mining process that uses these data. Several processing modules may be chained together between input and output.

Figure 2. Simple example of a BioC file.

Figure 3. Key file describing the BioC file in Figure 2.
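
The process sequence described for Figure 1 can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not one of the BioC libraries: the function names (read_collection, annotate_term, write_collection), the sample collection and the "format_name" label are hypothetical, and the element names (collection, document, passage, offset, text, annotation, infon, location) follow the BioC DTD as understood here, not the file shown in Figure 2.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample collection; not the file from Figure 2.
SAMPLE = """<collection>
  <source>example</source>
  <date>20140630</date>
  <key>example.key</key>
  <document>
    <id>1</id>
    <passage>
      <offset>0</offset>
      <text>BioC is a simple format.</text>
    </passage>
  </document>
</collection>"""


def read_collection(stream):
    """Input connector: parse BioC-style XML from any file-like stream."""
    return ET.parse(stream).getroot()


def annotate_term(collection, term, label):
    """Processing module: annotate each exact occurrence of `term`."""
    count = 0
    for passage in collection.iter("passage"):
        # Annotation offsets are document-level, so add the passage offset.
        base = int(passage.findtext("offset", default="0"))
        text = passage.findtext("text", default="")
        start = text.find(term)
        while start != -1:
            ann = ET.SubElement(passage, "annotation", id=f"T{count}")
            ET.SubElement(ann, "infon", key="type").text = label
            ET.SubElement(ann, "location",
                          offset=str(base + start), length=str(len(term)))
            ET.SubElement(ann, "text").text = term
            count += 1
            start = text.find(term, start + len(term))
    return collection


def write_collection(collection, stream):
    """Output connector: serialize the collection to a text stream."""
    ET.ElementTree(collection).write(stream, encoding="unicode")


# Input -> processing -> output; further modules could be chained in between.
collection = read_collection(io.StringIO(SAMPLE))
annotate_term(collection, "BioC", "format_name")
out = io.StringIO()
write_collection(collection, out)
result_xml = out.getvalue()
```

Because each processing module takes a collection and returns it, several modules can be applied in sequence between the read and write steps, which mirrors the chaining described in the caption.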

