Bioinformatics. 2012 Nov 15;28(22):2963-70.
doi: 10.1093/bioinformatics/bts542. Epub 2012 Sep 6.

Application and evaluation of automated methods to extract neuroanatomical connectivity statements from free text

Leon French et al. Bioinformatics. 2012.

Abstract

Motivation: Automated annotation of neuroanatomical connectivity statements from the neuroscience literature would enable accessible and large-scale connectivity resources. Unfortunately, the connectivity findings are not formally encoded and occur as natural language text. This hinders aggregation, indexing, searching and integration of the reports. We annotated a set of 1377 abstracts for connectivity relations to facilitate automated extraction of connectivity relationships from neuroscience literature. We tested several baseline measures based on co-occurrence and lexical rules. We then compared results from seven machine learning methods, adapted from the protein interaction extraction domain, that employ part-of-speech, dependency and syntax features.
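
To illustrate what a co-occurrence baseline does, here is a minimal Python sketch. It is not the authors' implementation: the region lexicon, the example sentence and the function names are hypothetical, and the matching is a simple whole-word, case-insensitive search. Every pair of brain regions mentioned in the same sentence is proposed as a candidate connection.

```python
import re
from itertools import combinations

def find_regions(sentence, lexicon):
    """Return lexicon entries mentioned in the sentence (whole words, case-insensitive)."""
    found = []
    for name in lexicon:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(name)
    return sorted(found)

def cooccurrence_pairs(sentence, lexicon):
    """Propose every unordered pair of co-mentioned regions as a candidate connection."""
    return list(combinations(find_regions(sentence, lexicon), 2))

# Hypothetical lexicon and sentence, for illustration only.
LEXICON = ["amygdala", "hippocampus", "thalamus"]
sentence = "The amygdala projects to the hippocampus; the hypothalamus was not labeled."

pairs = cooccurrence_pairs(sentence, LEXICON)
print(pairs)  # [('amygdala', 'hippocampus')]
```

Because such a baseline ignores how the regions are related within the sentence, it proposes a pair whenever two names appear together, which is consistent with the high recall and weak precision reported in the Results.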

Results: Co-occurrence based methods provided high recall with weak precision. The shallow linguistic kernel recalled 70.1% of the sentence-level connectivity statements at 50.3% precision. Owing to its speed and simplicity, we applied the shallow linguistic kernel to a large set of new abstracts. To evaluate the results, we compared 2688 extracted connections with the Brain Architecture Management System (an existing database of rat connectivity). The extracted connections were connected in the Brain Architecture Management System at a rate of 63.5%, compared with 51.1% for co-occurring brain region pairs. We found that precision increases with the recency and frequency of the extracted relationships.
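
For readers who want a single summary of the precision and recall figures above, the following sketch computes the balanced F-measure (harmonic mean). The F-measure value is derived here from the two reported numbers and is not stated in the abstract; the function name is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Reported sentence-level figures for the shallow linguistic kernel.
precision = 0.503
recall = 0.701

f1 = f1_score(precision, recall)
print(round(f1, 3))  # 0.586
```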

Availability and implementation: The source code, evaluations, documentation and other supplementary materials are available at http://www.chibi.ubc.ca/WhiteText.

Contact: paul@chibi.ubc.ca.

Supplementary information: Supplementary data are available at Bioinformatics Online.

Figures

Fig. 1. Flow chart depicting the processing steps for comparison with the Brain Architecture Management System
