PLoS One. 2014 Feb 19;9(2):e89324.
doi: 10.1371/journal.pone.0089324. eCollection 2014.

Automated detection of off-label drug use

Kenneth Jung et al. PLoS One. 2014.

Abstract

Off-label drug use, defined as use of a drug in a manner that deviates from its approved use defined by the drug's FDA label, is problematic because such uses have not been evaluated for safety and efficacy. Studies estimate that 21% of prescriptions are off-label, and only 27% of those have evidence of safety and efficacy. We describe a data-mining approach for systematically identifying off-label usages using features derived from free text clinical notes and features extracted from two databases on known usage (Medi-Span and DrugBank). We trained a highly accurate predictive model that detects novel off-label uses among 1,602 unique drugs and 1,472 unique indications. We validated 403 predicted uses across independent data sources. Finally, we prioritize well-supported novel usages for further investigation on the basis of drug safety and cost.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Overview of methods and results.
For each of the 2,362,950 possible drug-indication pairs, we calculated 9 empirical features (e.g., co-mention count) from the free text of clinical notes in STRIDE and 16 domain knowledge features (e.g., similarity in known usage to other drugs used to treat the indication) from Medi-Span and DrugBank. These features were used by an SVM classifier trained on a gold standard dataset to recognize the used-to-treat relationship, yielding a set of predictions that were filtered for known usages, near misses in the indications, and support in two independent and complementary datasets (FAERS and MEDLINE). Predicted usages that appeared to be drug adverse events listed in SIDER 2 were removed. The resulting set of 403 well-supported novel off-label usages was binned using indices of risk and cost.
Figure 2. Training and testing a classifier to recognize used-to-treat relationships.
We created a gold standard of positive and negative examples of known drug usage. Positive examples were taken from Medi-Span. We created negative examples by randomly selecting positive examples and then, for each, randomly choosing a drug and an indication with roughly the same frequency of mentions in STRIDE as the drug and indication of the real usage. These were then checked against Medi-Span to filter out inadvertently generated known usages. The gold standard dataset contained 4 negative examples for each positive case. For each drug-indication pair in the gold standard, we calculated features summarizing the pattern of mentions of the drugs and indications in 9.5 million clinical notes from STRIDE. We used Medi-Span and DrugBank to calculate features summarizing domain knowledge about drugs and their usages. 80% of the gold standard was used to train an SVM classifier, and the resulting model was tested on the remaining 20%.
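
The gold standard construction described in this caption can be sketched roughly as follows. This is a minimal illustration rather than the authors' code: the function names, the 25% relative tolerance used for "roughly the same frequency", and the in-memory dictionaries of mention counts are all assumptions made for the example. SVM training itself is left out because it would depend on an external library.

```python
import random


def build_gold_standard(positives, drug_counts, indication_counts,
                        negatives_per_positive=4, tolerance=0.25, seed=0):
    """Return a list of ((drug, indication), label) pairs; label 1 = known usage.

    positives: iterable of known (drug, indication) usages.
    drug_counts / indication_counts: hypothetical name -> mention-count maps.
    tolerance: assumed relative window for "roughly the same frequency".
    """
    rng = random.Random(seed)
    known = set(positives)
    pos_list = sorted(known)
    drugs = sorted({d for d, _ in known})
    indications = sorted({i for _, i in known})

    def matched(candidates, counts, target):
        # Candidates whose mention count lies within the relative tolerance
        # of the target's count (the target itself always qualifies).
        limit = tolerance * counts[target]
        return [c for c in candidates if abs(counts[c] - counts[target]) <= limit]

    wanted = negatives_per_positive * len(pos_list)
    negatives = set()
    attempts = 0
    while len(negatives) < wanted and attempts < 100 * wanted:
        attempts += 1
        # Pick a real usage, then a frequency-matched drug and indication.
        drug, indication = rng.choice(pos_list)
        candidate = (rng.choice(matched(drugs, drug_counts, drug)),
                     rng.choice(matched(indications, indication_counts, indication)))
        # Discard candidates that turn out to be known usages.
        if candidate not in known:
            negatives.add(candidate)
    if len(negatives) < wanted:
        raise ValueError("not enough frequency-matched negatives available")
    return [(p, 1) for p in pos_list] + [(n, 0) for n in sorted(negatives)]


def split_gold_standard(examples, train_fraction=0.8, seed=0):
    """Shuffle and split the examples into train and test portions."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

With 20 positive pairs this yields 80 negatives, and the default split gives 80 training and 20 test examples. Feature vectors for each pair would then be computed and passed to whichever SVM implementation is in use.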
Figure 3. Distribution of indication classes in predicted novel usages.
Each indication for the 403 high confidence novel usages with support in FAERS and MEDLINE was mapped to the first level of the NDF-RT disease hierarchy. 63 usages were not mapped to NDF-RT and were left out of this chart.
Figure 4. Using prior knowledge to calculate drug-drug and indication-indication similarity.
We represent known usage as a matrix where row i represents drug i and column j represents indication j. A check in entry (i,j) indicates that drug i is used to treat indication j, while a cross indicates that it is not. We are interested in whether a given drug, lamotrigine, is used to treat migraine disorders. We thus ask: how similar is the known usage of lamotrigine to that of other drugs we know are used to treat migraine disorders? Topiramate is used to treat migraine disorders, and lamotrigine is similar to it in that both are used to treat tonic-clonic seizures and myoclonic epilepsies, but not non-Hodgkin's lymphoma. This similarity in usage profile suggests that lamotrigine is more likely to be used to treat migraine disorders than, say, rituximab. We measured this similarity using the maximum cosine and Jaccard similarity between lamotrigine and all drugs known to treat the indication. We calculate the similarity between indications based on known usage using the same data, with the roles of drugs and indications reversed.
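
The similarity features in this caption can be illustrated with a small sketch. The matrix below is a toy built only from the usages named in the caption, and the function names are hypothetical. One assumption to note: full rows are compared, including the column for the indication being queried, since the caption does not say whether that column is excluded.

```python
import math

# Toy known-usage matrix: rows are drugs, columns are indications.
# 1 = known to treat (a check), 0 = not known to treat (a cross).
INDICATIONS = ["migraine disorders", "tonic-clonic seizures",
               "myoclonic epilepsies", "non-Hodgkin's lymphoma"]
KNOWN_USAGE = {
    "lamotrigine": [0, 1, 1, 0],
    "topiramate":  [1, 1, 1, 0],
    "rituximab":   [0, 0, 0, 1],
}


def cosine(u, v):
    """Cosine similarity of two binary usage rows (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(u, v):
    """Jaccard similarity: shared indications over indications of either drug."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return shared / either if either else 0.0


def max_similarity(drug, indication, usage=KNOWN_USAGE, indications=INDICATIONS):
    """Maximum (cosine, Jaccard) between `drug` and other drugs treating `indication`."""
    j = indications.index(indication)
    treaters = [row for name, row in usage.items()
                if name != drug and row[j] == 1]
    if not treaters:
        return 0.0, 0.0
    query = usage[drug]
    return (max(cosine(query, row) for row in treaters),
            max(jaccard(query, row) for row in treaters))
```

On this toy matrix, `max_similarity("lamotrigine", "migraine disorders")` scores well above `max_similarity("rituximab", "migraine disorders")`, which is zero on both measures. Indication-indication similarity follows by applying the same functions to the columns instead of the rows.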
