PLoS One. 2019 Apr 24;14(4):e0215313.
doi: 10.1371/journal.pone.0215313. eCollection 2019.

A context-based ABC model for literature-based discovery

Yong Hwan Kim et al. PLoS One.

Abstract

Background: In literature-based discovery, considerable research has built on the ABC model developed by Swanson. The ABC model hypothesizes that there is a meaningful relation between entity A, extracted from document set 1, and entity C, extracted from document set 2, through B entities that appear in both document sets. The results of the ABC model are relations among entities A, B, and C, which are referred to as paths. A path allows for hypothesizing a relationship between entity A and entity C, or helps discover entity B as new evidence for that relationship. The co-occurrence-based ABC model is a well-known approach to automatic hypothesis generation that creates many such paths. However, it has a limitation: biological context is not considered. It focuses only on matching the B entity that appears in both entity relations. As a result, the paths it extracts tend to include many irrelevant ones, so expert verification is essential.

Methods: To overcome this limitation of the co-occurrence-based ABC model, we propose a context-based approach to connecting one entity relation to another, modifying the ABC model using biological contexts. In this study, we defined four biological context elements: cell, drug, disease, and organism. Based on these biological contexts, we propose two extended ABC models: a context-based ABC model and a context-assignment-based ABC model. To measure the performance of both proposed models, we examined the relevance of the B entities between the well-known relations "APOE-MAPT" and "FUS-TARDBP". Each relation represents an interaction between proteins associated with neurodegenerative disease. The interaction between APOE and MAPT is known to play a crucial role in Alzheimer's disease, as APOE affects tau-mediated neurodegeneration. Mutations in FUS and TARDBP have been shown to be associated with amyotrophic lateral sclerosis (ALS), a motor neuron disease, by leading to neuronal cell death. Using these two relations, we compared both proposed models with the co-occurrence-based ABC model.
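One way to picture the context-based connection of two entity relations is sketched below. The abstract does not say how context similarity is computed, so the mean Jaccard overlap used here, the threshold value, and all function and variable names are assumptions made purely for illustration.

```python
CONTEXT_ELEMENTS = ("cell", "drug", "disease", "organism")


def jaccard(x, y):
    """Jaccard overlap of two sets; two empty sets count as no agreement."""
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)


def context_similarity(ctx1, ctx2):
    """Mean Jaccard overlap over the four context elements (assumed measure)."""
    scores = [
        jaccard(ctx1.get(key, set()), ctx2.get(key, set()))
        for key in CONTEXT_ELEMENTS
    ]
    return sum(scores) / len(scores)


def filter_paths(ab_contexts, bc_contexts, threshold):
    """Keep B entities whose A-B and B-C relation contexts are similar enough."""
    kept = []
    for b in sorted(ab_contexts.keys() & bc_contexts.keys()):
        if context_similarity(ab_contexts[b], bc_contexts[b]) >= threshold:
            kept.append(b)
    return kept


# Hypothetical contexts attached to each A-B and B-C relation.
ab_contexts = {
    "B2": {"disease": {"Alzheimer disease"}, "organism": {"human"}},
    "B3": {"disease": {"Alzheimer disease"}, "organism": {"mouse"}},
}
bc_contexts = {
    "B2": {"disease": {"Alzheimer disease"}, "organism": {"human"}},
    "B3": {"disease": {"amyotrophic lateral sclerosis"}, "organism": {"human"}},
}
kept = filter_paths(ab_contexts, bc_contexts, threshold=0.5)
```

In this toy data B2 is kept because both of its relations share disease and organism, while B3 is dropped because its two relations were reported in different contexts.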

Results: The precision of B entities extracted by the co-occurrence-based ABC model was 27.1% for "APOE-MAPT" and 22.1% for "FUS-TARDBP". With the context-based ABC model, precision was 71.4% for "APOE-MAPT" and 77.9% for "FUS-TARDBP". The context-assignment-based ABC model achieved 89% and 97.5% precision for the two relations, respectively. Both proposed models achieved higher precision than the co-occurrence-based ABC model.
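The abstract does not define precision. Assuming the standard information-retrieval definition applied to B entities, the reported percentages would correspond to:

```latex
\[
\text{precision} =
\frac{\lvert \{\text{extracted B entities judged relevant}\} \rvert}
     {\lvert \{\text{extracted B entities}\} \rvert}
\]
```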

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Outline of methodology.
Fig 2. Example of tree parsing.
Fig 3. Example of MeSH hierarchical structure of Alzheimer's disease.
Fig 4. Example of method to connect entity relations.
Fig 5. Change of precision according to biological context similarity threshold.
Fig 6. Comparison of context-based, context-assignment-based, and co-occurrence-based ABC models (APOE-MAPT).
Fig 7. Precision, recall, and F-measure of the context-based ABC model (APOE-MAPT).

References

    1. Swanson DR. Undiscovered public knowledge. The Library Quarterly. 1986;56(2):103–118.
    2. Jenssen TK, Lægreid A, Komorowski J, Hovig E. A literature network of human genes for high-throughput analysis of gene expression. Nature Genetics. 2001;28(1):21. doi: 10.1038/88213
    3. Jelier R, Jenster G, Dorssers LC, van der Eijk CC, van Mulligen EM, Mons B, et al. Co-occurrence based meta-analysis of scientific texts: retrieving biological relationships between genes. Bioinformatics. 2005;21(9):2049–2058. doi: 10.1093/bioinformatics/bti268
    4. Leroy G, Chen H. Genescene: An ontology-enhanced integration of linguistic and co-occurrence based relations in biomedical texts. Journal of the American Society for Information Science and Technology. 2005;56(5):457–468.
    5. Li S, Wu L, Zhang Z. Constructing biological networks through combined literature mining and microarray analysis: a LMMA approach. Bioinformatics. 2006;22(17):2143–2150. doi: 10.1093/bioinformatics/btl363