J Cheminform. 2011 Jul 6;3(1):23.
doi: 10.1186/1758-2946-3-23.

4D Flexible Atom-Pairs: An efficient probabilistic conformational space comparison for ligand-based virtual screening

Andreas Jahn et al. J Cheminform. 2011.

Abstract

Background: The performance of 3D-based virtual screening similarity functions depends on the conformations chosen for the compounds, so the results of 3D approaches are often less robust than those of 2D approaches. Applying 3D methods to multiple-conformer data sets normally reduces this weakness, but entails a significant computational overhead. We therefore developed a conformational space encoding based on Gaussian mixture models, together with a similarity function that operates on these models. This model-based encoding allows an efficient comparison of the conformational space of compounds.

Results: Comparisons of our 4D flexible atom-pair approach with over 15 state-of-the-art 2D- and 3D-based virtual screening similarity functions on the 40 data sets of the Directory of Useful Decoys show a robust performance of our approach. Even 3D-based approaches that operate on multiple conformers yield inferior results. The 4D flexible atom-pair method achieves an averaged AUC value of 0.78 on the filtered Directory of Useful Decoys data sets. The best 2D- and 3D-based approaches of this study yield AUC values of 0.74 and 0.72, respectively. Overall, the 4D flexible atom-pair approach achieves an average rank of 1.25 with respect to 15 other state-of-the-art similarity functions and four different evaluation metrics.

Conclusions: Our 4D method yields a robust performance on 40 pharmaceutically relevant targets. The conformational space encoding enables an efficient comparison of the conformational space. Therefore, the weakness of the 3D-based approaches on single conformations is circumvented. With over 100,000 similarity calculations on a single desktop CPU, the utilization of the 4D flexible atom-pair in real-world applications is feasible.

Figures

Figure 1
Shortest flexible atom-pair path. Exemplary visualization of an atom-pair of the marked atoms. The shortest topological path is depicted by the red and green dotted lines. A red line represents a rigid bond, whereas a green line marks a rotatable bond. The last bond of the path (bond from the heterocycle to the carbon of the carboxyl group) is treated as a rigid bond because a rotation of this bond has no influence on the geometric distance of the atom-pair.
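
The path construction in this caption can be sketched in a few lines. The sketch below is an illustration under assumptions, not the authors' implementation: the bond list format `(atom_a, atom_b, is_rotatable)`, the function names, and the toy molecule are all hypothetical. It finds one shortest topological path by breadth-first search and counts the rotatable bonds on it. Following the caption's observation about the last bond, it treats both terminal bonds of the path as rigid, since rotating about a bond that contains an end atom leaves that atom on the rotation axis; applying this to the first bond as well is an assumption of the sketch.

```python
from collections import deque

def shortest_path(bonds, start, goal):
    """Breadth-first search for one shortest topological path between two atoms."""
    neighbors = {}
    for a, b, _ in bonds:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    parent = {start: None}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == goal:
            # Walk the parent links back to the start atom.
            path = []
            while atom is not None:
                path.append(atom)
                atom = parent[atom]
            return path[::-1]
        for nxt in neighbors.get(atom, []):
            if nxt not in parent:
                parent[nxt] = atom
                queue.append(nxt)
    return None  # the two atoms are not connected

def flexible_bonds_on_path(bonds, path):
    """Count rotatable bonds on the path, ignoring the two terminal bonds."""
    rotatable = {frozenset((a, b)) for a, b, rot in bonds if rot}
    steps = list(zip(path, path[1:]))
    interior = steps[1:-1]  # assumption: first and last bond are treated as rigid
    return sum(frozenset(step) in rotatable for step in interior)

# Hypothetical toy molecule: a five-atom chain with one branch atom (5).
bonds = [(0, 1, False), (1, 2, True), (2, 3, True), (3, 4, True), (2, 5, False)]
path = shortest_path(bonds, 0, 4)
n_flexible = flexible_bonds_on_path(bonds, path)
```

In this toy case the bond between atoms 3 and 4 is flagged rotatable but does not count, because it is the last bond of the path.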
Figure 2
Atom-pair distance distribution. Histogram-based visualization of the distance distribution of the marked atom-pair of Figure 1. The line represents the corresponding GMM that models the distance behavior of the atom-pair in the conformational space.
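
To make the comparison between histogram and model concrete, the following sketch evaluates a one-dimensional Gaussian mixture density and builds a density-normalized histogram over the same distance range. The mixture parameters, the distance samples, and the function names are made up for illustration; they are not values from the paper.

```python
import math

def gmm_pdf(x, weights, means, sigmas):
    """Probability density of a 1D Gaussian mixture at distance x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, sigmas)
    )

def density_histogram(samples, lo, hi, bins):
    """Histogram scaled so that bar areas sum to 1 when all samples lie in [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for d in samples:
        if lo <= d < hi:
            index = min(int((d - lo) / width), bins - 1)
            counts[index] += 1
    return [c / (len(samples) * width) for c in counts]

# Hypothetical atom-pair distances (in Angstrom) sampled over several conformers.
distances = [3.0, 3.1, 5.2, 4.9]
bars = density_histogram(distances, 2.0, 6.0, 4)

# Hypothetical two-component model of the same distance distribution.
model = ([0.4, 0.6], [3.0, 5.0], [0.3, 0.5])
density_at_3 = gmm_pdf(3.0, *model)
```

Because both the bars and the mixture are densities, they can be drawn on the same axis, which is what the line over the histogram in the figure represents.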
Figure 3
Boltzmann weighted atom-pair distance distribution. Boltzmann weighted histogram-based visualization of the distance distribution of the atom-pair of Figure 1. The line describes the probability density of the GMM that was computed by the Boltzmann weighted EM algorithm.
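
One plausible reading of a Boltzmann weighted EM algorithm is a standard EM fit in which each conformer's distance contributes in proportion to its Boltzmann factor. The sketch below follows that reading and should not be taken as the authors' exact procedure: the energies, distances, temperature, initial parameters, and function names are assumptions chosen for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann factors relative to the lowest-energy conformer."""
    kt = R_KCAL * temperature
    e_min = min(energies)
    raw = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(raw)
    return [r / total for r in raw]

def normal_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def weighted_em(xs, ws, means, sigmas, iters=100, min_sigma=1e-3):
    """EM for a 1D Gaussian mixture where sample i carries weight ws[i]."""
    k = len(means)
    mix = [1.0 / k] * k
    means, sigmas = list(means), list(sigmas)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [mix[j] * normal_pdf(x, means[j], sigmas[j]) for j in range(k)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: every sum is scaled by the sample weight.
        for j in range(k):
            mass = sum(w * r[j] for w, r in zip(ws, resp))
            if mass <= 0.0:
                continue
            mix[j] = mass
            means[j] = sum(w * r[j] * x for w, r, x in zip(ws, resp, xs)) / mass
            var = sum(w * r[j] * (x - means[j]) ** 2 for w, r, x in zip(ws, resp, xs)) / mass
            sigmas[j] = max(math.sqrt(var), min_sigma)
        norm = sum(mix)
        mix = [m / norm for m in mix]
    return mix, means, sigmas

# Hypothetical conformer ensemble: distances in Angstrom, energies in kcal/mol.
distances = [2.9, 3.0, 3.1, 4.9, 5.0, 5.1]
energies = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
weights = boltzmann_weights(energies)
mix, means, sigmas = weighted_em(distances, weights, [3.0, 5.0], [0.5, 0.5])
```

In this toy ensemble the conformers near 5 Angstrom lie 1 kcal/mol higher in energy, so the fitted component near 5 Angstrom receives a much smaller mixing weight than an unweighted fit would give it.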
Figure 4
Flexible atom-pair tree. The left molecule represents the example molecule for the tree on the right side. The white 'R' marks the atom that serves as root atom (point of origin for the atom-pairs) for the tree. The black numbers symbolize the topological distance for the rigid atom-pairs. The red and green numbers correspond with the leaf numbers of the tree on the right side. The red or green color of these atom numbers indicates the membership of the atom to the rigid or flexible sub-tree, respectively.
Figure 5
ROC plot on ACE. ROC plot of all optimal assignment methods on the filtered ACE data set. TPR and FPR denote the true positive rate and false positive rate, respectively.
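
For readers unfamiliar with how such a curve and its AUC value are obtained from a ranked screening list, here is a minimal sketch. The scores and activity labels are invented for illustration, and the function names are not from the paper; the computation itself is the standard one: sweep the ranking from the highest score down, record the (FPR, TPR) points, and integrate with the trapezoidal rule.

```python
def roc_curve(scores, labels):
    """Return the (FPR, TPR) points of the ROC curve.

    labels are 1 for actives and 0 for decoys; compounds with equal
    scores are processed together so that ties yield a single point.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical similarity scores for three actives (1) and three decoys (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_curve(scores, labels))
```

For this toy ranking, 8 of the 9 active/decoy pairs are ordered correctly, so the area is 8/9. An AUC of 0.5 corresponds to a random ranking and 1.0 to a perfect one.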
Figure 6
Chemotype discovery on ACE. Chemotype discovery of all optimal assignment methods on the filtered ACE data set.
Figure 7
ROC plot on EGFr. ROC plot of all optimal assignment methods on the filtered EGFr data set.
Figure 8
Chemotype discovery on EGFr. Chemotype discovery of all optimal assignment methods on the filtered EGFr data set.
Figure 9
Optimal assignment with topological errors. Example mapping with several topological errors. The figure was taken from Jahn et al. [28].

References

    1. von Korff M, Freyss J, Sander T. Flexophore, a New Versatile 3D Pharmacophore Descriptor That Considers Molecular Flexibility. J Chem Inf Model. 2008;48(4):797–810. doi: 10.1021/ci700359j.
    2. Bajorath J. Integration of virtual and high-throughput screening. Nat Rev Drug Discov. 2002;1(11):882–894. doi: 10.1038/nrd941.
    3. Varnek A, Tropsha A (Eds). Chemoinformatics Approaches to Virtual Screening. Cambridge: The Royal Society of Chemistry; 2008.
    4. Geppert H, Vogt M, Bajorath J. Current Trends in Ligand-Based Virtual Screening: Molecular Representations, Data Mining Methods, New Application Areas, and Performance Evaluation. J Chem Inf Model. 2010;50(2):205–216. doi: 10.1021/ci900419k.
    5. Bender A, Jenkins JL, Scheiber J, Sukuru SC, Glick M, Davies JW. How Similar Are Similarity Searching Methods? A Principal Component Analysis of Molecular Descriptor Space. J Chem Inf Model. 2009;49:108–119. doi: 10.1021/ci800249s.