BMC Bioinformatics. 2019 May 29;20(Suppl 10):247.
doi: 10.1186/s12859-019-2811-8.

Drug repositioning of herbal compounds via a machine-learning approach

Eunyoung Kim et al. BMC Bioinformatics.

Abstract

Background: Drug repositioning, also known as drug repurposing, defines new indications for existing drugs and can be used as an alternative to drug development. In recent years, the accumulation of large volumes of information related to drugs and diseases has led to the development of various computational approaches for drug repositioning. Although herbal medicines have had a great impact on current drug discovery, there are still a large number of herbal compounds that have no definite indications.

Results: In the present study, we constructed a computational model to predict the unknown pharmacological effects of herbal compounds using machine learning techniques. Based on the assumption that similar diseases can be treated with similar drugs, we used four categories of drug-drug similarity (chemical structure, side effects, gene ontology, and targets) and three categories of disease-disease similarity (phenotypes, human phenotype ontology, and gene ontology). Associations between drugs and diseases were then predicted from these similarity features. The prediction models were constructed using three classification algorithms: logistic regression, random forest, and support vector machine. In cross-validation, the random forest approach showed the best performance (AUC = 0.948), and it also performed well in an external validation assessment using an unseen independent dataset (AUC = 0.828). Finally, the constructed model was applied to predict potential indications for existing drugs and herbal compounds. As a result, new indications for 20 existing drugs and 31 herbal compounds were predicted and validated using clinical trial data.
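The AUC values reported above summarize how well a model ranks true drug-disease associations above non-associations. As a minimal sketch of that metric (not the authors' evaluation code), the function below computes AUC as the fraction of positive-negative pairs in which the positive receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: probability that a random positive outscores a random negative.

    labels: iterable of 0/1 ground-truth values (1 = known association).
    scores: iterable of model scores, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the study).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

In this toy case three of the four positive-negative pairs are ordered correctly. The pairwise loop is quadratic in the number of examples, so a sort-based implementation from an established library would be preferable on datasets of realistic size.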

Conclusions: The predicted results were validated manually, confirming both the performance of the model and the underlying mechanisms; for example, irinotecan was predicted as a treatment for neuroblastoma. Based on the predictions, herbal compounds were identified as drug candidates for related diseases, and these candidates merit further development. The proposed prediction model can contribute to drug discovery by suggesting drug candidates from herbal compounds that have potential but have rarely been studied.

Keywords: Data mining; Drug repositioning prediction; Machine learning.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Overview of the proposed work. The training dataset was obtained from a previous study, and the drug and disease property data were retrieved from each database. Using the property information, similarity scores were calculated and combined to represent drug-disease associations. The drug-disease associations and similarity scores were used to construct a prediction model through cross-validation and external validation. Finally, the best performing model was applied to predict the repositioning candidates from the herbal compounds.
Fig. 2
Calculation of classification features for drug-disease associations. (a) An example of drug-disease associations considered as the gold standard. (b) From known drug-disease associations, classification features were calculated using the similarity scores between the drugs and diseases of each association type. (c) The similarity scores of each association were then combined as a Cartesian product of the four drug-drug and three disease-disease similarity categories, resulting in a total of 12 features, and the maximum value was selected to represent the query association.
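The feature construction in Fig. 2 can be sketched roughly as follows. Each of the four drug-drug similarity categories is paired with each of the three disease-disease categories (4 × 3 = 12 features), and for each pair the maximum combined score over the known associations is kept. How the two similarity scores are combined is not stated here, so the geometric mean used below is an assumption, as are the category keys, the function name `association_features`, and the toy similarity function.

```python
from itertools import product
from math import sqrt

# Category names follow the abstract; the exact keys are illustrative.
DRUG_SIMS = ["chemical_structure", "side_effect", "gene_ontology", "target"]
DISEASE_SIMS = ["phenotype", "human_phenotype_ontology", "gene_ontology"]


def association_features(query, known, drug_sim, disease_sim):
    """Return 12 features for a query (drug, disease) pair.

    known: list of gold-standard (drug, disease) associations.
    drug_sim / disease_sim: dicts mapping a category name to a function
    sim(a, b) that returns a similarity in [0, 1].
    """
    q_drug, q_disease = query
    features = {}
    for d_kind, s_kind in product(DRUG_SIMS, DISEASE_SIMS):
        best = 0.0
        for k_drug, k_disease in known:
            if (k_drug, k_disease) == query:
                continue  # do not score the query against itself
            # ASSUMPTION: combine the two similarities by geometric mean.
            score = sqrt(drug_sim[d_kind](q_drug, k_drug)
                         * disease_sim[s_kind](q_disease, k_disease))
            best = max(best, score)  # keep the maximum over known associations
        features[(d_kind, s_kind)] = best
    return features


def toy_sim(a, b):
    """Hypothetical similarity: 1.0 for identical items, 0.5 otherwise."""
    return 1.0 if a == b else 0.5


drug_sim = {kind: toy_sim for kind in DRUG_SIMS}
disease_sim = {kind: toy_sim for kind in DISEASE_SIMS}
known = [("drugA", "diseaseX"), ("drugB", "diseaseY")]
feats = association_features(("drugA", "diseaseY"), known, drug_sim, disease_sim)
print(len(feats))
```

The resulting 12-value vector is what a classifier such as a random forest would take as input for each candidate drug-disease pair. In practice the similarity functions would be lookups into precomputed similarity matrices rather than the toy function shown here.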
Fig. 3
Performance of prediction models in cross-validation. Overall, the model involving the random forest algorithm performed better than those using other algorithms.
Fig. 4
Performance of prediction models in external validation. The random forest model showed the best performance in terms of accuracy and AUC.
Fig. 5
Comparison of performance with previous studies. The performance of the constructed model with the random forest algorithm was compared with related studies. The AUC and AUPR metrics were used for the comparison, as previously reported.
