J Biomed Inform. 2009 Aug;42(4):726-35.
doi: 10.1016/j.jbi.2009.03.010. Epub 2009 Apr 2.

Literature mining on pharmacokinetics numerical data: a feasibility study

Zhiping Wang et al. J Biomed Inform. 2009 Aug.

Abstract

This feasibility study applies literature mining to numerical drug pharmacokinetic (PK) parameter data using a sequential mining strategy. First, an entity template library is built to retrieve pharmacokinetics-relevant articles. Then a set of tagging and extraction rules is applied to retrieve PK data from the article abstracts. To estimate the population-average mean and between-study variance of each PK parameter, a linear mixed meta-analysis model and an E-M algorithm are developed to describe the probability distributions of PK parameters. Finally, a cross-validation procedure is developed to identify false-positive mining results. Applied to midazolam (MDZ) PK data, this approach achieves 88% precision and 92% recall, for an F-score of 90%. It greatly outperforms a conventional data mining approach (support vector machine), which has an F-score of 68.1%. Further investigation of 7 more drugs reveals comparable performance of our sequential mining approach.
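The abstract does not spell out the meta-analysis model, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes each study i reports a parameter estimate y_i with a known within-study variance s2_i, and that y_i = mu + b_i + e_i with b_i ~ N(0, tau2) and e_i ~ N(0, s2_i). The function name `em_random_effects` and all inputs are hypothetical. The E-M iterations treat the study effects b_i as missing data and estimate the population-average mean mu and the between-study variance tau2.

```python
def em_random_effects(y, s2, n_iter=200, tol=1e-10):
    """Estimate (mu, tau2) for y_i = mu + b_i + e_i by E-M.

    y  : per-study parameter estimates (hypothetical inputs)
    s2 : known within-study variances, all > 0 (an assumption of this sketch)
    """
    if not y or len(y) != len(s2):
        raise ValueError("y and s2 must be non-empty and of equal length")
    n = len(y)
    weights = [1.0 / s for s in s2]

    # Start from the unweighted mean and the spread of the raw estimates.
    mu = sum(y) / n
    tau2 = max(sum((yi - mu) ** 2 for yi in y) / n, 1e-8)

    for _ in range(n_iter):
        # E-step: posterior mean and variance of each study effect b_i.
        shrink = [tau2 / (tau2 + s) for s in s2]
        post_mean = [k * (yi - mu) for k, yi in zip(shrink, y)]
        post_var = [k * s for k, s in zip(shrink, s2)]

        # M-step: precision-weighted mean of (y_i - E[b_i]) and the
        # average of E[b_i^2] = post_mean^2 + post_var.
        new_mu = sum(w * (yi - m) for w, yi, m in zip(weights, y, post_mean)) / sum(weights)
        new_tau2 = sum(m * m + v for m, v in zip(post_mean, post_var)) / n

        converged = abs(new_mu - mu) < tol and abs(new_tau2 - tau2) < tol
        mu, tau2 = new_mu, new_tau2
        if converged:
            break
    return mu, tau2
```

With equal within-study variances the estimates have a closed form (mu is the sample mean, and tau2 is the maximum-likelihood variance of the estimates minus the within-study variance), which gives a convenient check on the iteration.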

Figures

Figure 1
The Architecture of the Literature Mining Tool
Figure 2
Precision Performance Analysis of the Machine Learning Algorithm in All MDZ-Related Abstracts
Figure 3
The Estimated Clearance Distribution. Note: The blue curve shows systemic clearance; the green curve shows oral clearance. The 95% confidence interval is marked on each curve using vertical lines.
Figure 4
MDZ Clearance Data. Note: (a) contains all mined MDZ clearance data before evaluation and outlier removal, and (b) contains the MDZ clearance data after evaluation and outlier removal. The blue dots are true clearance data from MDZ PK-relevant abstracts; the red and green dots are false MDZ clearance data. The red dots were removed by EM validation as outliers; the green dots were not.
Figure 5
Recall and Precision Performance Analysis of the Machine Learning Algorithm in an MDZ Abstracts Subset
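
The precision, recall, and F-score reported in the abstract are linked by the harmonic mean. As a quick arithmetic check, and assuming the reported F-score is the balanced F1 (the abstract does not say which variant was used), a small helper reproduces the 90% figure from 88% precision and 92% recall. The function name `f_score` is illustrative only.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported for the MDZ experiment in the abstract.
print(f"{f_score(0.88, 0.92):.1%}")  # prints 90.0%
```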
