ACS Omega. 2021 Mar 5;6(10):6722-6735.
doi: 10.1021/acsomega.0c05645. eCollection 2021 Mar 16.

Predictive Global Models of Cruzain Inhibitors with Large Chemical Coverage

Jose Guadalupe Rosas-Jimenez et al. ACS Omega. 2021.

Abstract

Chagas disease affects 8-11 million people worldwide, most of them living in Latin America. Moreover, migratory phenomena have spread the infection beyond endemic areas. Efforts for the development of new pharmacological therapies are paramount as the pharmacological profile of the two marketed drugs currently available, nifurtimox and benznidazole, needs to be improved. Cruzain, a parasitic cysteine protease, is one of the most attractive biological targets due to its roles in parasite survival and immune evasion. In this work, we compiled and curated a database of diverse cruzain inhibitors previously reported in the literature. From this data set, quantitative structure-activity relationship (QSAR) models for the prediction of their pIC50 values were generated using k-nearest neighbors and random forest algorithms. Local and global models were calculated and compared. The statistical parameters for internal and external validation indicate a significant predictability, with q²_LOO values around 0.66 and 0.61 and external R² coefficients of 0.725 and 0.766. The applicability domain is quantitatively defined, according to QSAR good practices, using the leverage and similarity methods. The models described in this work are readily available in a Python script for the discovery of novel cruzain inhibitors.
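
The two validation statistics quoted in the abstract can be illustrated with a small sketch. This is not the authors' script: the k-nearest-neighbor regressor, the function names, and the toy descriptor values are all assumptions made for illustration, and the external R² here uses the test-set mean as its reference (other conventions use the training-set mean).

```python
import math

def knn_predict(train_X, train_y, query, k=2):
    """Average the targets of the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def q2_loo(X, y, k=2):
    """Leave-one-out q2 = 1 - PRESS / SS_total; each point is predicted without itself."""
    y_mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(rest_X, rest_y, X[i], k)) ** 2
    ss_total = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_total

def r2_external(train_X, train_y, test_X, test_y, k=2):
    """External R2 = 1 - SS_res / SS_tot on a held-out test set (test-set mean as reference)."""
    test_mean = sum(test_y) / len(test_y)
    ss_res = sum((yt - knn_predict(train_X, train_y, xt, k)) ** 2
                 for xt, yt in zip(test_X, test_y))
    ss_tot = sum((yt - test_mean) ** 2 for yt in test_y)
    return 1.0 - ss_res / ss_tot

# Toy data: one hypothetical descriptor per molecule and made-up pIC50 values.
train_X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
train_y = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
test_X = [[1.5], [3.5]]
test_y = [5.75, 6.75]

print(round(q2_loo(train_X, train_y), 3))
print(r2_external(train_X, train_y, test_X, test_y))
```

The leave-one-out loop is what distinguishes q² from an ordinary fitted R²: every prediction comes from a model that never saw the molecule being predicted.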

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1. Functional groups used for the classification of the molecules into chemical families for the development of local models. The amide group includes peptidic and nonpeptidic inhibitors with a central amide group.
Figure 2. Flowchart with the summary of the methods for QSAR model generation. MLR, multiple linear regression; KNN, k-nearest neighbor regression; RF, random forest regression; GA, genetic algorithm.
Figure 3. Distribution of pIC50 values for cruzain inhibitors. Molecules in the training set are shown in dark gray, and molecules in the test set are shown in light gray. The inhibitory potency of the test set falls within the interval of pIC50 values of the training set.
Figure 4. Selected molecules from the modeling data set.
Figure 5. Pairwise Tanimoto similarity matrices for compounds in each global and local data set based on MACCS keys fingerprints.
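
A pairwise Tanimoto matrix of the kind plotted in Figure 5 can be sketched as follows. The fingerprints here are toy sets of on-bit indices chosen for illustration; real MACCS keys would come from a cheminformatics toolkit, which this sketch deliberately does not depend on. Returning 0.0 for two empty fingerprints is a convention chosen here, not something stated in the source.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |a & b| / |a | b| for two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similarity_matrix(fingerprints):
    """Symmetric matrix of pairwise Tanimoto similarities."""
    return [[tanimoto(p, q) for q in fingerprints] for p in fingerprints]

# Hypothetical fingerprints: each set lists the bits that are switched on.
fingerprints = [{1, 2, 3}, {2, 3, 4}, {7}]
matrix = similarity_matrix(fingerprints)
for row in matrix:
    print(row)
```

The first two toy molecules share two of four distinct bits, so their similarity is 0.5; the third shares none with either and scores 0.0.
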
Figure 6. (A) Cyclic system retrieval curve for the global and local models. (B) Consensus diversity plot comparing the global (complete) and local data sets. Marker size is proportional to the number of molecules in each database.
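
As an illustration of panel A, a cyclic system retrieval curve is commonly built by ranking scaffolds from most to least populated and plotting the cumulative fraction of molecules recovered against the fraction of scaffolds considered. The sketch below follows that common definition; the scaffold labels are hypothetical placeholders, and whether the authors computed the curve exactly this way is an assumption.

```python
from collections import Counter

def csr_curve(scaffolds):
    """Return (fraction of scaffolds, fraction of molecules) points, most populated first."""
    counts = sorted(Counter(scaffolds).values(), reverse=True)
    n_scaffolds = len(counts)
    n_molecules = len(scaffolds)
    points = [(0.0, 0.0)]
    cumulative = 0
    for rank, count in enumerate(counts, start=1):
        cumulative += count
        points.append((rank / n_scaffolds, cumulative / n_molecules))
    return points

def area_under_curve(points):
    """Trapezoidal area under the curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical scaffold label for each of eight molecules.
scaffolds = ["A", "A", "A", "A", "B", "B", "C", "D"]
curve = csr_curve(scaffolds)
print(curve)
print(area_under_curve(curve))
```

An area near 0.5 means molecules are spread evenly over scaffolds (high scaffold diversity), whereas an area near 1.0 means a few scaffolds account for most of the molecules.
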
Figure 7. Regression plots of the best generated models. The first plot shows the values predicted by the KNN model for the training and test sets, and the plot below shows the results for the RF model. Q²_LOO and R²_ext are presented inside the graphs.
Figure 8. Histograms of the distribution of model residuals. The upper plots show the residuals for the KNN model, and the lower histograms plot these results for the RF model.
Figure 9. Williams plots for the calculated models. In the upper graph, leverages and residuals are shown for the KNN model, whereas the lower graph shows the results for the RF model.
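
The leverage axis of a Williams plot can be sketched as below, assuming the usual definition h = xᵀ(XᵀX)⁻¹x with X the training descriptor matrix, and the commonly used warning threshold h* = 3(p + 1)/n for p descriptors and n training molecules. The descriptor values are made up, no intercept column is added, and both the threshold convention and the helper names are assumptions rather than details taken from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def leverages(train_X, query_X):
    """Leverage h = x^T (X^T X)^-1 x of each query row with respect to the training set."""
    p = len(train_X[0])
    xtx = [[sum(row[a] * row[b] for row in train_X) for b in range(p)]
           for a in range(p)]
    values = []
    for x in query_X:
        z = solve(xtx, x)  # z = (X^T X)^-1 x
        values.append(sum(xi * zi for xi, zi in zip(x, z)))
    return values

def warning_leverage(n_train, n_descriptors):
    """Commonly used threshold h* = 3(p + 1)/n (an assumed convention)."""
    return 3 * (n_descriptors + 1) / n_train

# Hypothetical training descriptors (two descriptors per molecule).
train_X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
h_train = leverages(train_X, train_X)
h_far = leverages(train_X, [[10.0, 0.0]])[0]
h_star = warning_leverage(len(train_X), len(train_X[0]))
print(h_train, h_far, h_star)
```

A molecule whose leverage exceeds h* lies far from the descriptor space covered by the training set, so its prediction is an extrapolation and is usually flagged as outside the applicability domain.
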
Figure 10. Structure of CHEMBL409024. This molecule has a high leverage in the RF group.
Figure 11. Maximum Tanimoto similarity values for each molecule in the test set. MACCS keys were used to generate molecular fingerprints. Similarities are plotted against leverage values.
Figure 12. Results of the q² values for randomized models in comparison with the q² of the true model.
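
The randomization check summarized in Figure 12 can be sketched as follows: the activity values are shuffled, the model is rebuilt, and the resulting q² values are compared with that of the true model. The 1-nearest-neighbor scorer, the single toy descriptor, the number of rounds, and the seed are all assumptions chosen to keep the example short; they are not the settings used in the paper.

```python
import random

def q2_loo_1nn(xs, ys):
    """Leave-one-out q2 for a 1-nearest-neighbor model on a single descriptor."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        others = [j for j in range(len(xs)) if j != i]
        nearest = min(others, key=lambda j: abs(xs[j] - xs[i]))
        press += (ys[i] - ys[nearest]) ** 2
    ss_total = sum((v - mean_y) ** 2 for v in ys)
    return 1.0 - press / ss_total

def y_randomization(xs, ys, n_rounds=20, seed=0):
    """q2 values obtained after shuffling the activities, one value per round."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # breaks the descriptor-activity relationship
        scores.append(q2_loo_1nn(xs, shuffled))
    return scores

# Toy data: a hypothetical descriptor and made-up pIC50 values that depend on it.
xs = [float(i) for i in range(12)]
ys = [5.0 + 0.25 * x for x in xs]

true_q2 = q2_loo_1nn(xs, ys)
random_q2 = y_randomization(xs, ys)
print(round(true_q2, 3), round(sum(random_q2) / len(random_q2), 3))
```

If the true q² sits well above the randomized values, the model is unlikely to owe its performance to chance correlation.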
