Extracting drug-drug interactions from literature using a rich feature-based linear kernel approach

Sun Kim¹, Haibin Liu², Lana Yeganova³, W John Wilbur⁴

Affiliations

¹ National Center for Biotechnology Information (NCBI), Bethesda, MD, USA. Electronic address: sun.kim@nih.gov.
² National Center for Biotechnology Information (NCBI), Bethesda, MD, USA. Electronic address: haibin.liu@nih.gov.
³ National Center for Biotechnology Information (NCBI), Bethesda, MD, USA. Electronic address: lana.yeganova@nih.gov.
⁴ National Center for Biotechnology Information (NCBI), Bethesda, MD, USA. Electronic address: wilbur@ncbi.nlm.nih.gov.

PMID: 25796456
PMCID: PMC4464931
DOI: 10.1016/j.jbi.2015.03.002

Sun Kim et al. J Biomed Inform. 2015 Jun;55:23-30. doi: 10.1016/j.jbi.2015.03.002. Epub 2015 Mar 19.

Abstract

Identifying unknown drug interactions is of great benefit in the early detection of adverse drug reactions. Despite the existence of several resources for drug-drug interaction (DDI) information, most of this information is buried in a body of unstructured medical text that is growing exponentially. This calls for text mining techniques that identify DDIs. State-of-the-art DDI extraction methods use Support Vector Machines (SVMs) with non-linear composite kernels to explore diverse contexts in literature. While computationally less expensive, linear kernel-based systems have not achieved comparable performance in DDI extraction tasks. In this work, we propose an efficient and scalable system using a linear kernel to identify DDI information. The proposed approach consists of two steps: identifying DDIs and assigning one of four different DDI types to the predicted drug pairs. We demonstrate that, when equipped with a rich set of lexical and syntactic features, a linear SVM classifier is able to achieve competitive performance in detecting DDIs. In addition, the one-against-one strategy proves vital for addressing an imbalance issue in DDI type classification. Applied to the DDIExtraction 2013 corpus, our system achieves an F1 score of 0.670, as compared to 0.651 and 0.609 reported by the top two participating teams in the DDIExtraction 2013 challenge, both based on non-linear kernel methods.
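The two-step decision described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names (`linear_score`, `classify_pair`), the sparse feature dictionaries, and the tie-breaking rule are assumptions made for the example, and the four type labels are those of the DDIExtraction 2013 task rather than anything listed in the abstract itself.

```python
from collections import Counter
from itertools import combinations

# Type labels of the DDIExtraction 2013 task (assumed; not listed in the abstract).
DDI_TYPES = ["mechanism", "effect", "advice", "int"]


def linear_score(weights, bias, features):
    """Score a sparse feature dict with a linear model: w . x + b."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items()) + bias


def classify_pair(features, detector, pairwise_models):
    """Two-phase decision: detect an interaction, then vote on its type.

    detector: (weights, bias) for the binary interaction model.
    pairwise_models: {(type_a, type_b): (weights, bias)}; a positive score
    votes for type_a, anything else votes for type_b (one-against-one).
    """
    # Phase 1: DDI detection. Non-interacting pairs get no type.
    if linear_score(*detector, features) <= 0:
        return None

    # Phase 2: one-against-one voting over every pair of types.
    votes = Counter()
    for type_a, type_b in combinations(DDI_TYPES, 2):
        weights, bias = pairwise_models[(type_a, type_b)]
        winner = type_a if linear_score(weights, bias, features) > 0 else type_b
        votes[winner] += 1

    # Ties are broken by the order of DDI_TYPES (an arbitrary choice for this sketch).
    return max(DDI_TYPES, key=lambda t: votes[t])
```

Because each pairwise model is trained only on examples of its two types, a rare type competes against one other type at a time instead of against all remaining types pooled together, which is one plausible reading of why the strategy helps with the imbalance mentioned above.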

Keywords: Biomedical literature; Drug–drug interaction; Linear kernel approach.

Published by Elsevier Inc.

Figures

**Figure 1**
Two-phase DDI extraction framework. DDI detection (①) decides whether a drug pair interacts. DDI type classification (②) assigns DDI types to interacting pairs.

**Figure 2**
An example of preprocessing and feature extraction. The underlined drug pair, *clidinium* and *phenothiazines*, is the candidate DDI. ‘NPC’ means noun phrase-constrained coordination and ‘BE’ denotes *between* candidate drugs.

**Figure 3**
A solution for presenting drug pairs with significant word and word pair features. Highly weighted words are highlighted in the sentence and emphasized according to all the weights they receive in the feature list. ‘BF’, ‘BE’ and ‘AF’ mean *before*, *between* and *after*, respectively. *DRUG* indicates a target drug.
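The ‘BF’, ‘BE’ and ‘AF’ tags in the caption above suggest word features marked by their position relative to the candidate drug pair. A minimal sketch of that idea follows; the function name `position_word_features`, the `TAG:word` key format, lowercasing, and skipping the two target drugs are all assumptions for illustration, and the caption does not specify how the actual features are encoded.

```python
def position_word_features(tokens, first_drug, second_drug):
    """Tag each context word by its position relative to the candidate pair.

    tokens: list of word strings for one sentence.
    first_drug, second_drug: token indices of the two candidate drugs,
    with first_drug < second_drug.
    """
    features = {}
    for index, token in enumerate(tokens):
        if index in (first_drug, second_drug):
            continue  # assumption: the target drugs themselves are not emitted as features
        if index < first_drug:
            tag = "BF"  # before the first drug
        elif index < second_drug:
            tag = "BE"  # between the two drugs
        else:
            tag = "AF"  # after the second drug
        features[f"{tag}:{token.lower()}"] = 1.0
    return features
```

For example, with a made-up token list `["Use", "of", "clidinium", "with", "phenothiazines", "requires", "caution"]` and drug indices 2 and 4, the word "with" is emitted as `BE:with` and "caution" as `AF:caution`.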

**Figure 4**
Example sentences which start with “[drug name]:”. “[drug name]:” is a section title which is concatenated with the next sentence in the DDIExtraction set.

**Figure 5**
Example sentences, where some drug names are not annotated. “calcium” in the first sentence and the second “corticosteroids” in the second sentence are not annotated as drug names.

Grants and funding

Z99 LM999999/ImNIH/Intramural NIH HHS/United States
