PeerJ. 2017 Mar 23:5:e3154.
doi: 10.7717/peerj.3154. eCollection 2017.

Systematic drug repositioning through mining adverse event data in ClinicalTrials.gov

Eric Wen Su et al. PeerJ. 2017.

Abstract

Drug repositioning (i.e., drug repurposing) is the process of discovering new uses for marketed drugs. Historically, such discoveries were serendipitous. However, the rapid growth in electronic clinical data and text mining tools makes it feasible to systematically identify drugs with the potential to be repurposed. Described here is a novel method of drug repositioning by mining ClinicalTrials.gov. The text mining tools I2E (Linguamatics) and PolyAnalyst (Megaputer) were utilized. An I2E query extracts "Serious Adverse Events" (SAE) data from randomized trials in ClinicalTrials.gov. Through a statistical algorithm, a PolyAnalyst workflow ranks the drugs whose treatment arm has fewer predefined SAEs than the control arm, suggesting that the drug may be reducing the incidence of that SAE. Hypotheses can then be generated for new uses of these drugs when the predefined SAE is itself indicative of a disease (for example, cancer).
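
The ranking idea can be illustrated with a minimal Python sketch. The abstract does not specify the statistical algorithm used in the PolyAnalyst workflow, so the one-sided Fisher's exact test below is a stand-in chosen for illustration, not the authors' method. The function names, the record layout, and the trial counts are all hypothetical.

```python
from math import comb

def fisher_one_sided(sae_treat, n_treat, sae_ctrl, n_ctrl):
    """One-sided Fisher's exact test (an assumed stand-in, not the paper's algorithm).

    Returns the probability, under the hypergeometric null of no arm effect,
    of observing sae_treat or fewer SAE cases in the treatment arm.
    """
    total_sae = sae_treat + sae_ctrl
    total_n = n_treat + n_ctrl
    lowest = max(0, total_sae - n_ctrl)  # fewest cases the treatment arm can hold
    favourable = sum(
        comb(total_sae, k) * comb(total_n - total_sae, n_treat - k)
        for k in range(lowest, sae_treat + 1)
    )
    return favourable / comb(total_n, n_treat)

def rank_candidates(trials):
    """Keep trials whose treatment arm has the lower SAE rate; sort by p-value."""
    ranked = []
    for t in trials:
        rate_treat = t["sae_treat"] / t["n_treat"]
        rate_ctrl = t["sae_ctrl"] / t["n_ctrl"]
        if rate_treat < rate_ctrl:
            p = fisher_one_sided(t["sae_treat"], t["n_treat"],
                                 t["sae_ctrl"], t["n_ctrl"])
            ranked.append((p, t["drug"]))
    return sorted(ranked)

# Hypothetical counts for illustration only; not data from ClinicalTrials.gov.
TRIALS = [
    {"drug": "drug_A", "sae_treat": 2, "n_treat": 500, "sae_ctrl": 15, "n_ctrl": 500},
    {"drug": "drug_B", "sae_treat": 10, "n_treat": 300, "sae_ctrl": 12, "n_ctrl": 300},
    {"drug": "drug_C", "sae_treat": 8, "n_treat": 200, "sae_ctrl": 5, "n_ctrl": 200},
]

for p_value, drug in rank_candidates(TRIALS):
    print(f"{drug}: one-sided p = {p_value:.4g}")
```

Drugs with the smallest p-values rank first. A real analysis over many trials would also need to account for multiple comparisons and for differences in trial design, which this sketch ignores.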

Keywords: ClinicalTrials.gov; Drug repositioning; Drug repurposing; Indication discovery; Text mining.

Conflict of interest statement

Eric Wen Su and Todd M. Sanger are employees of Eli Lilly and Company, United States of America.

Figures

Figure 1. The I2E query.
See Supplemental Information 1 to reproduce the query by copying and pasting the YAML script into the I2E Pro interface. The query was run on the I2E index that covers the data posted in ClinicalTrials.gov up to August 14, 2016.
Figure 2. An example of the data extracted from ClinicalTrials.gov (A) into Excel (B) by the I2E query described above.
The top two rows in (B) show the data extracted from the table in (A). The precision of the I2E query described above is 100%, and the recall is estimated as 99% assuming 1% of the cancer terms that the trial sponsors used are not among the cancer synonyms collected by MeSH or NCI.

