Sci Rep. 2023 May 22;13(1):8219.
doi: 10.1038/s41598-023-35132-5.

Algorithm selection for protein-ligand docking: strategies and analysis on ACE

Tianlai Chen et al. Sci Rep. 2023.

Abstract

The present study investigates the use of algorithm selection for automatically choosing an algorithm for any given protein-ligand docking task. In the drug discovery and design process, conceptualizing protein-ligand binding is a major problem. Targeting this problem through computational methods is beneficial because it can substantially reduce the resource and time requirements of the overall drug development process. One way of addressing protein-ligand docking is to model it as a search and optimization problem. A variety of algorithmic solutions have been proposed in this respect. However, there is no ultimate algorithm that can efficiently tackle this problem in terms of both docking quality and speed. This argument motivates devising new algorithms tailored to particular protein-ligand docking scenarios. To this end, this paper reports a machine learning-based approach for improved and robust docking performance. The proposed set-up is fully automated, operating without any expert opinion or involvement on either the problem or the algorithm side. As a case study, an empirical analysis was performed on a well-known protein, Human Angiotensin-Converting Enzyme (ACE), with 1428 ligands. For general applicability, AutoDock 4.2 was used as the docking platform. The candidate algorithms were also taken from AutoDock 4.2. Twenty-eight distinctly configured Lamarckian Genetic Algorithm (LGA) variants were chosen to build an algorithm set. ALORS, a recommender system-based algorithm selection system, was used to automate the selection among those LGA variants on a per-instance basis. To realize this selection automation, molecular descriptors and substructure fingerprints were employed as the features characterizing each target protein-ligand docking instance. The computational results revealed that algorithm selection outperforms all of the candidate algorithms.
Further assessment is reported on the algorithm space, discussing the contributions of the LGA parameters. With respect to protein-ligand docking, the contributions of the aforementioned features are examined, shedding light on the critical features affecting docking performance.
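
The per-instance selection idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the ligand names, the three algorithm labels (A1 to A3 rather than the 28 LGA variants), the energy values, and the helper names are hypothetical, and this is not the authors' ALORS implementation. It only shows how per-ligand labels based on the average (AVG) or the best (BEST) docking score over repeated runs, the mean ranks of algorithms, and the gap between a single fixed algorithm and an ideal per-instance choice could be computed.

```python
from statistics import mean

# Hypothetical docking energies: scores[ligand][algorithm] -> energies over runs.
# Lower (more negative) energy is treated as better.
scores = {
    "ligand_a": {"A1": [-7.1, -6.9, -7.3], "A2": [-8.0, -6.0, -6.4], "A3": [-6.5, -6.6, -6.4]},
    "ligand_b": {"A1": [-5.0, -5.2, -5.1], "A2": [-5.9, -6.1, -6.0], "A3": [-5.5, -5.4, -5.6]},
    "ligand_c": {"A1": [-9.0, -9.2, -9.1], "A2": [-8.8, -8.7, -8.9], "A3": [-9.5, -8.0, -8.1]},
}
algorithms = ["A1", "A2", "A3"]


def summarize(runs_by_algorithm, reduce):
    """Collapse the repeated runs of each algorithm into one number."""
    return {alg: reduce(runs) for alg, runs in runs_by_algorithm.items()}


def best_algorithm(summary):
    """Return the algorithm with the lowest summarized score."""
    return min(summary, key=summary.get)


def ranks(summary):
    """Rank algorithms from 1 (lowest score) upward; ties are not handled specially."""
    ordered = sorted(summary, key=summary.get)
    return {alg: position for position, alg in enumerate(ordered, start=1)}


# AVG uses the mean over runs, BEST uses the single lowest energy.
avg = {lig: summarize(runs, mean) for lig, runs in scores.items()}
best = {lig: summarize(runs, min) for lig, runs in scores.items()}

# Per-instance labels: the best algorithm for each ligand.
avg_labels = {lig: best_algorithm(s) for lig, s in avg.items()}
best_labels = {lig: best_algorithm(s) for lig, s in best.items()}

# Mean rank of each algorithm across all ligands (AVG case).
mean_rank = {alg: mean(ranks(avg[lig])[alg] for lig in scores) for alg in algorithms}

# One fixed algorithm for every ligand versus an ideal per-instance choice.
single_best = min(algorithms, key=lambda alg: mean(avg[lig][alg] for lig in scores))
single_best_score = mean(avg[lig][single_best] for lig in scores)
oracle_score = mean(avg[lig][avg_labels[lig]] for lig in scores)

print("AVG labels:", avg_labels)
print("BEST labels:", best_labels)
print("Mean ranks (AVG):", mean_rank)
print("Single best:", single_best, single_best_score, "| per-instance:", oracle_score)
```

An ideal per-instance choice can never score worse on average than any single fixed algorithm, which is the headroom an algorithm selection model tries to capture by predicting the label from ligand features.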

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Illustration of Algorithm Selection. The traditional per instance Algorithm Selection (AS) process.
Figure 2
Framework of ALORS for Protein–Ligand Docking. During data generation, all ligands are docked with ACE using 28 algorithms, each with a different parameter configuration in AutoDock4. The algorithm configuration that produces the lowest docking score, averaged over 50 runs, is selected as the best algorithm for the given instance, such as the 28th algorithm setting (A28). The ALORS model is trained using molecular descriptors and fingerprints, together with the best algorithm label for each ligand. At inference time, the model uses the features of a single new ligand to determine the best algorithm configuration.
Figure 3
Ranks of Docking Algorithms. (A) The ranks of the docking algorithms across all the instances, based on the AVG performance. (B) The ranks of the docking algorithms across all the instances, based on the BEST performance.
Figure 4
Mean Ranks of Docking Algorithms. The mean ranks of all the tested docking methods. (A) relative comparison on both AVG and BEST, (B) sorted comparison on AVG, (C) sorted comparison on BEST.
Figure 5
Clustering of Docking Algorithms. A hierarchical clustering of the constituent docking algorithms based on the latent features extracted by SVD (k = 5) on the AVG case.
Figure 6
Gini Importance of Features. The blue ones are significantly more critical than the rest in terms of their Gini values. (A) The Gini importance values of all the docking instance features, (B) The Gini importance values of the Fmd,top9 features, (C) The Gini importance values of the Fmd,top4+sf,top54 features, (D) The Gini importance values of the Fmd,top9+sf,top54 features, (E) The Gini importance values of the Fsf,top54 features.
Figure 7
Features Visualization with PCA, t-SNE and k-means. (A) Visualization of 4, 9 and 40 features with PCA and t-SNE. (B) k-means clustering results for 9 features in 2-D PCA and t-SNE space. (C) k-means clustering results for 5 latent features, extracted by SVD, for a different feature set in 2-D PCA and t-SNE space.
Figure 8
Boxplot of Features. Type 0 denotes group 0 from the PCA and t-SNE analysis, and type 1 denotes group 1. The distributions of the 9 selected features in the two clusters are given to show possible patterns for each group. Group 0 is more tightly clustered but has more outliers than group 1.
Figure 9
Interaction Plot of Ligand ZINC000000000053 and ACE. (A) under default parameter configuration, (B) under best parameter configuration in AutoDock4.
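
The clustering described in the Figure 5 caption (latent features extracted by SVD with k = 5, followed by hierarchical clustering of the algorithms) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: the rank matrix is random placeholder data, the function name `algorithm_latent_features` is hypothetical, and the choices of average linkage and four flat clusters are arbitrary. It relies on NumPy and SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def algorithm_latent_features(rank_matrix, k=5):
    """Return an (n_algorithms, k) array of latent features from a truncated SVD.

    rank_matrix has shape (n_instances, n_algorithms).
    """
    _, singular_values, vt = np.linalg.svd(rank_matrix, full_matrices=False)
    # Rows of vt are right singular vectors (one coordinate per algorithm).
    # Keep the top k and scale each by its singular value.
    return vt[:k].T * singular_values[:k]


# Hypothetical data: each row ranks 28 algorithms (1 = best) for one ligand.
rng = np.random.default_rng(0)
n_instances, n_algorithms = 40, 28
rank_matrix = np.array(
    [rng.permutation(n_algorithms) + 1 for _ in range(n_instances)],
    dtype=float,
)

latent = algorithm_latent_features(rank_matrix, k=5)

# Agglomerative clustering of the algorithms in latent space
# (average linkage is an assumption, not taken from the paper).
tree = linkage(latent, method="average")

# Cut the tree into at most 4 flat clusters for inspection.
labels = fcluster(tree, t=4, criterion="maxclust")
print(labels)
```

With real data, algorithms that land in the same cluster would be those whose rank profiles across ligands are similar, which is one way to read the relationships among the LGA configurations.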

