2016 Aug 10:2016:baw106.
doi: 10.1093/database/baw106. Print 2016.

BioC viewer: a web-based tool for displaying and merging annotations in BioC

Soo-Yong Shin et al. Database (Oxford).

Abstract

BioC is an XML-based format designed to provide interoperability for text mining tools and manual curation results. One challenge in using BioC as a standard format is aligning annotations from multiple systems. In principle, this is not a major problem when users follow the guidelines given in BioC key files. In practice, misalignment between text and annotations occurs quite often because different systems tend to use different software development environments, e.g. ASCII vs. Unicode. We first implemented the BioC Viewer to assist BioGRID curators as part of the BioCreative V BioC track (Collaborative Biocurator Assistant Task). For the BioC track, the BioC Viewer helped curate protein-protein interaction (PPI) and genetic interaction (GI) pairs appearing in full-text articles. Here, we describe the BioC Viewer itself as well as the improvements made since the BioCreative V Workshop to address the misalignment of BioC annotations. When BioC files are uploaded, a BioC merge process is offered if several files come from the same full-text article. If an annotated offset does not match the text, the BioC Viewer adjusts the offset so that it aligns correctly with the text. The BioC Viewer has a user-friendly interface in which most operations can be performed within a few mouse clicks. Feedback from BioGRID curators on the web interface has been positive, particularly regarding its usability and learnability.

Database URL: http://viewer.bioqrator.org
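The offset adjustment described above can be illustrated with a minimal sketch. The abstract does not specify how the BioC Viewer performs the adjustment, so the strategy below (search for the annotated string near the stated offset and take the closest match) is an assumption, and the `Annotation` class, the `realign` function and the `window` parameter are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical, simplified stand-in for a BioC annotation."""
    offset: int  # character offset into the passage text
    text: str    # the annotated surface string


def realign(passage: str, ann: Annotation, window: int = 20) -> Optional[Annotation]:
    """Return an annotation whose offset matches `passage`, or None.

    Assumed strategy (not taken from the paper): if the text at the stated
    offset already equals the annotated string, keep the annotation as is;
    otherwise use the occurrence closest to the stated offset within
    `window` characters on either side.
    """
    end = ann.offset + len(ann.text)
    if passage[ann.offset:end] == ann.text:
        return ann

    lo = max(0, ann.offset - window)
    hi = min(len(passage), end + window)
    candidates = []
    pos = passage.find(ann.text, lo, hi)
    while pos != -1:
        candidates.append(pos)
        pos = passage.find(ann.text, pos + 1, hi)
    if not candidates:
        return None  # nothing nearby matches; leave for manual inspection
    best = min(candidates, key=lambda p: abs(p - ann.offset))
    return Annotation(offset=best, text=ann.text)


# A system that counts UTF-8 bytes instead of characters reports "SMAD3"
# at offset 14, because "β" occupies two bytes; the character offset is 13.
passage = "TGF-β1 binds SMAD3 in vitro."
fixed = realign(passage, Annotation(offset=14, text="SMAD3"))
print(fixed.offset)  # → 13
```

Returning `None` when no nearby match exists, rather than guessing, keeps unresolved annotations visible for a curator to check; a real tool might instead widen the search or normalize the text first.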

Figures

Figure 1.
Workflow of the BioCreative V BioC track. Text mining systems first annotate PPI/GI passages as well as gene/protein/organism mentions appearing in full-text articles. After a merging process, BioC documents are imported to the visualization tool (BioC Viewer). Finally, BioGRID curators record PPI/GI pairs by using the BioC Viewer.
Figure 2.
List of projects. A project is a basic unit of the viewer, and each project consists of a set of documents. When a user creates a new project, one of two modes, ‘Normal’ or ‘BioGRID’, can be chosen. In the ‘BioGRID’ mode, the PPI/GI curation tool is visible in the document viewer.
Figure 3.
List of documents. A project is a collection of BioC documents. In this page, a user can upload, download or delete BioC documents. Sharing with other users or removing a project can be done using buttons at the bottom of the page.
Figure 4.
Document viewer in the ‘BioGRID’ mode. The document viewer consists of four main parts: the Annotation Toggle Bar (top), the outline viewer (left side), the text viewer (center) and the PPI/GI curator (right side). The PPI/GI curation tool is only available in the ‘BioGRID’ mode.

