J Pharm Biomed Anal. 2008 Aug 5;47(4-5):677-82.
doi: 10.1016/j.jpba.2008.03.023. Epub 2008 Mar 28.

Prediction models of human plasma protein binding rate and oral bioavailability derived by using GA-CG-SVM method

Chang-Ying Ma et al. J Pharm Biomed Anal.

Abstract

In this study, a support vector machine (SVM) method combined with a genetic algorithm (GA) for feature selection and a conjugate gradient (CG) method for parameter optimization (GA-CG-SVM) was employed to develop prediction models of human plasma protein binding rate (PPBR) and oral bioavailability (BIO). The advantage of GA-CG-SVM is that it handles feature selection and SVM parameter optimization simultaneously. Five-fold cross-validation and an independent test set were used to validate the prediction models. For PPBR, a total of 692 compounds were used to train and test the prediction model. The prediction accuracy by 5-fold cross-validation is 86%, and that for the independent test set (161 compounds) is 81%. These accuracies are markedly higher than those of the best model currently available in the literature. The number of descriptors selected is 29. For BIO, the training set is composed of 690 compounds, and 76 external compounds form an independent validation set. The prediction accuracies for the training set by 5-fold cross-validation and for the independent test set are 80% and 86%, respectively, which are better than or comparable to those of other classification models in the literature. The number of descriptors selected is 25. For both PPBR and BIO, the descriptors selected by the GA-CG method cover a wide range of molecular properties, which implies that the PPBR and BIO of a drug might be affected by many complicated factors.
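The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of GA-driven feature selection scored by 5-fold cross-validation. It rests on several assumptions: a nearest-centroid classifier stands in for the SVM, the conjugate gradient parameter optimization is omitted entirely, the data are synthetic, and all function names and GA settings (`ga_select`, `pop_size`, `p_mut`, and so on) are hypothetical rather than taken from the paper.

```python
import random

def fold_indices(n, k=5):
    """Assign sample indices 0..n-1 to k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]

def fit_centroids(rows, labels):
    """Per-class mean vectors (nearest-centroid stand-in for the SVM)."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the label whose centroid is closest in squared distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(sorted(centroids), key=dist)

def cv_accuracy(X, y, mask, k=5):
    """k-fold cross-validated accuracy using only the features kept by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    correct = 0
    for fold in fold_indices(len(Xs), k):
        held_out = set(fold)
        train = [i for i in range(len(Xs)) if i not in held_out]
        model = fit_centroids([Xs[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, Xs[i]) == y[i] for i in fold)
    return correct / len(Xs)

def ga_select(X, y, pop_size=20, generations=10, p_mut=0.1, seed=0):
    """Evolve boolean feature masks; fitness is cross-validated accuracy."""
    rng = random.Random(seed)
    n_features = len(X[0])
    fitness = lambda m: cv_accuracy(X, y, m)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with offspring.
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Synthetic two-class data: feature 0 separates the classes, the rest is noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[label * 5.0 + data_rng.random()] + [data_rng.random() for _ in range(5)]
     for label in y]

best_mask, best_acc = ga_select(X, y)
print("selected features:", [j for j, keep in enumerate(best_mask) if keep])
print("5-fold CV accuracy:", best_acc)
```

In a setup closer to the one the abstract describes, the fitness function would train an SVM on the masked descriptors, and the SVM parameters would be tuned alongside the mask rather than held fixed.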
