Toward creation of a cancer drug toxicity knowledge base: automatically extracting cancer drug-side effect relationships from the literature
- PMID: 23686935
- PMCID: PMC3912715
- DOI: 10.1136/amiajnl-2012-001584
Abstract
Objective: A comprehensive and machine-understandable cancer drug-side effect (drug-SE) relationship knowledge base is important for in silico cancer drug target discovery, drug repurposing, and toxicity prediction, and for personalized risk-benefit decisions by cancer patients. While US Food and Drug Administration (FDA) drug labels capture well-known cancer drug SE information, much cancer drug SE knowledge remains buried in the published biomedical literature. We present a relationship extraction approach to extract cancer drug-SE pairs from the literature.
Data and methods: We used 21,354,075 MEDLINE records as the text corpus. We extracted drug-SE co-occurrence pairs using a cancer drug lexicon and a clean SE lexicon that we created. We then developed two filtering approaches to remove drug-disease treatment pairs and subsequently a ranking scheme to further prioritize filtered pairs. Finally, we analyzed relationships among SEs, gene targets, and indications.
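As an illustration only, lexicon-based co-occurrence extraction of the kind described above could be sketched as follows. The abstract does not give the lexicons, the matching rules, or the unit of co-occurrence, so the toy lexicons, the sentence-level matching, and the names `DRUG_LEXICON`, `SE_LEXICON`, `find_terms`, and `cooccurrence_pairs` are all assumptions, not the authors' implementation.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy lexicons; the paper's actual lexicons are not
# reproduced in the abstract.
DRUG_LEXICON = {"cisplatin", "doxorubicin"}
SE_LEXICON = {"nephrotoxicity", "cardiotoxicity", "nausea"}


def find_terms(sentence, lexicon):
    """Return the lexicon terms that occur as whole words in a sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return tokens & lexicon


def cooccurrence_pairs(sentences):
    """Count (drug, side effect) pairs that co-occur in the same sentence.

    Sentence-level co-occurrence is an assumption made for this sketch.
    """
    counts = Counter()
    for sentence in sentences:
        drugs = find_terms(sentence, DRUG_LEXICON)
        side_effects = find_terms(sentence, SE_LEXICON)
        for pair in product(sorted(drugs), sorted(side_effects)):
            counts[pair] += 1
    return counts
```

Raw co-occurrence counts like these would still mix true side effects with treatment relationships, which is why the approach adds filtering and ranking steps afterwards.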
Results: We extracted 56,602 cancer drug-SE pairs. The filtering algorithms improved the precision of extracted pairs from 0.252 at baseline to 0.426, representing a 69% improvement in precision with no decrease in recall. The ranking algorithm further prioritized filtered pairs and achieved a precision of 0.778 for top-ranked pairs. We showed that cancer drugs that share SEs tend to have overlapping gene targets and overlapping indications.
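The 69% figure is a relative gain over the baseline precision, not an absolute difference. A minimal check of that arithmetic, using only the two precision values reported above (the function name `relative_improvement` is hypothetical):

```python
def relative_improvement(baseline, improved):
    """Relative change of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Precision values as reported in the Results paragraph.
gain = relative_improvement(0.252, 0.426)
print(f"{gain:.0%}")  # prints "69%"
```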
Conclusions: The relationship extraction approach is effective in extracting many cancer drug-SE pairs from the literature. This unique knowledge base, when combined with existing cancer drug SE knowledge, can facilitate drug target discovery, drug repurposing, and toxicity prediction.
Keywords: Cancer Drug Toxicity; Drug Repurposing; Drug Target Discovery; Information Extraction; Natural Language Processing; Text Mining.