J Cheminform. 2009 Apr 28:1:4.
doi: 10.1186/1758-2946-1-4.

A constructive approach for discovering new drug leads: Using a kernel methodology for the inverse-QSAR problem

William WL Wong et al. J Cheminform. 2009.

Abstract

Background: The inverse-QSAR problem seeks to find a new molecular descriptor from which one can recover the structure of a molecule that possesses a desired activity or property. Surprisingly, there are very few papers providing solutions to this problem. It is a difficult problem because the molecular descriptors used by the inverse-QSAR algorithm must adequately address the forward QSAR problem for a given biological activity if the subsequent recovery phase is to be meaningful. In addition, one should be able to construct a feasible molecule from such a descriptor. The difficulty of recovering the molecule from its descriptor is the major limitation of most inverse-QSAR methods.

Results: In this paper, we describe the reversibility of our previously reported descriptor, the vector space model molecular descriptor (VSMMD), which is based on a vector space model and is suitable for kernel studies in QSAR modeling. Our inverse-QSAR approach consists of five steps: (1) generate the VSMMD for the compounds in the training set; (2) map the VSMMD in the input space to the kernel feature space using an appropriate kernel function; (3) design or generate a new point in the kernel feature space using a kernel feature space algorithm; (4) map the feature space point back to the input space of descriptors using a pre-image approximation algorithm; (5) build the molecular structure template using our VSMMD molecule recovery algorithm.
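Steps (2) to (4) can be illustrated with a minimal sketch. It assumes a Gaussian (RBF) kernel, descriptors stored as rows of a NumPy array, and a new feature-space point expressed as a weighted combination of training images with nonnegative weights `alpha`. The pre-image step uses a generic fixed-point iteration for Gaussian kernels as a stand-in; it is not the Kwok and Tsang strategy shown in Figure 8, and the function names and the `gamma` parameter are hypothetical, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def feature_space_sq_dist(X, alpha, gamma):
    """Squared feature-space distance from each phi(x_i) to sum_j alpha_j phi(x_j).

    Only kernel evaluations are needed (the kernel trick); phi is never formed.
    """
    K = rbf_kernel(X, X, gamma)
    return np.diag(K) - 2.0 * K @ alpha + alpha @ K @ alpha

def preimage_fixed_point(X, alpha, gamma, n_iter=200, tol=1e-10):
    """Approximate the input-space pre-image of sum_j alpha_j phi(x_j).

    Assumes nonnegative weights so the normalizing sum stays positive.
    """
    z = alpha @ X  # start from the weighted mean in input space
    for _ in range(n_iter):
        w = alpha * rbf_kernel(z[None, :], X, gamma)[0]
        z_new = w @ X / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

# Toy usage: two 2-D "descriptors" and their equally weighted feature-space mean.
X_toy = np.array([[0.0, 0.0], [2.0, 0.0]])
alpha_toy = np.array([0.5, 0.5])
z_toy = preimage_fixed_point(X_toy, alpha_toy, gamma=0.5)
```

The recovered vector `z_toy` lives in the descriptor input space; turning such a vector into a structure template is the job of step (5) and is not covered by this sketch.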

Conclusion: The empirical results reported in this paper show that our strategy of using kernel methodology for the inverse quantitative structure-activity relationship (inverse-QSAR) problem is sufficiently powerful to find a meaningful solution for practical problems.

Figures

Figure 1. Overall concept for the VSMMD inverse-QSAR approach.
Figure 2. Labels for atoms and bonds in a molecule.
Figure 3. Processing steps for the VSMMD.
Figure 4. The implicit function ɸ maps points in the input space over to the feature space.
Figure 5. Deriving a new image in kernel feature space.
Figure 6. Minimum enclosing and maximum excluding hyperspheres in the feature space.
Figure 7. The pre-image problem.
Figure 8. Kwok and Tsang pre-image strategy.
Figure 9. Simplified VSMMD with an aromatic ring treated as a super atom.
Figure 10. An example of a chemical structure template. Note: We use 'O' to denote O#O#O#O#O#O and 'OA' to denote O#O#O#O#A.
Figure 11. The De Bruijn graph D for the VSMMD shown in Figure 9. Note: We use 'O' to denote O#O#O#O#O#O and 'OA' to denote O#O#O#O#A.
Figure 12. The expanded graph M. Note: We use 'O' to denote O#O#O#O#O#O and 'OA' to denote O#O#O#O#A.
Figure 13. Some possible Euler circuits.
Figure 14. An example of edge traversal.
Figure 15. The six highest-probability Euler circuits for the VSMMD shown in Figure 9 and the corresponding chemical structure templates. Note: We use 'O' to denote O#O#O#O#O#O and 'OA' to denote O#O#O#O#A.
Figure 16. A case where the pre-image vector did not form a fully connected De Bruijn graph.
Figure 17. The ten most active compounds in the COX-2 training set.
Figure 18. Verification test result.
Figure 19. The pre-image VSMMD of the center of the minimum enclosing and maximum excluding hyperspheres.
Figure 20. The two highest-probability Euler circuits for the pre-image VSMMD in Figure 19 and the corresponding chemical structure templates. Note: We use 'O' to denote O#O#O#O#O#O and 'R*' to denote the cyclopentene ring.
Figure 21. Matching molecule in the test set. Note: We use 'O' to denote O#O#O#O#O#O and 'R*' to denote the cyclopentene ring.
Figure 22. The closest matching molecule in the test set for the generated chemical template across 8 data sets.
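
The figure captions indicate that structure templates are read off Euler circuits of a De Bruijn graph built from the VSMMD. As a rough illustration of that idea only, the sketch below finds one Euler circuit in a directed multigraph using Hierholzer's algorithm. The node labels are hypothetical placeholders, and the probability ranking of circuits mentioned in the captions is not reproduced here.

```python
from collections import defaultdict

def euler_circuit(edges, start):
    """Return one Euler circuit as a list of nodes, or None if none exists.

    edges: list of (u, v) pairs of a directed multigraph; repeated pairs
    count as parallel edges.
    """
    out = defaultdict(list)
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    for u, v in edges:
        out[u].append(v)
        outdeg[u] += 1
        indeg[v] += 1
    # A directed Euler circuit requires in-degree == out-degree at every node.
    nodes = set(indeg) | set(outdeg)
    if any(indeg[n] != outdeg[n] for n in nodes):
        return None
    # Hierholzer's algorithm with an explicit stack.
    stack, circuit = [start], []
    while stack:
        u = stack[-1]
        if out[u]:
            stack.append(out[u].pop())  # follow an unused edge
        else:
            circuit.append(stack.pop())  # dead end: emit node
    circuit.reverse()
    # If some edges were never reached, the graph is not connected.
    if len(circuit) != len(edges) + 1:
        return None
    return circuit

# Hypothetical fragment labels; each edge is an allowed fragment adjacency.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A")]
toy_circuit = euler_circuit(toy_edges, "A")
```

Returning `None` for a disconnected graph corresponds loosely to the situation described for Figure 16, where the pre-image vector does not yield a fully connected De Bruijn graph.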
