Phytochem Anal. 2023 Oct;34(7):709-728.
doi: 10.1002/pca.3239. Epub 2023 Jul 1.

R software for QSAR analysis in phytopharmacological studies

Sanjoy Singh Ningthoujam et al. Phytochem Anal. 2023 Oct.

Abstract

Introduction: In recent decades, quantitative structure-activity relationship (QSAR) analysis has become an important method for drug design and natural product research. With the availability of bioinformatic and cheminformatic tools, a vast number of descriptors have been generated, making it challenging to select potential independent variables that are accurately related to the dependent response variable.

Objective: This study demonstrates descriptor selection procedures that can be used in QSAR studies: the Boruta approach, all-subsets regression, the ANOVA approach, the AIC method, stepwise regression, and the genetic algorithm. Additionally, we performed regression diagnostics using R software, covering normality, linearity, multicollinearity, and homoscedasticity, along with residual histograms and PP plots.

Results: The workflow designed in this study highlights the different descriptor selection procedures and regression diagnostics that can be used in QSAR studies. The Boruta approach and the genetic algorithm performed better than the other methods in selecting potential independent variables. The regression diagnostics performed in R software (checks of normality, linearity, multicollinearity, and homoscedasticity, together with residual histograms and PP plots) helped in identifying and diagnosing model errors, helping to ensure the reliability of the QSAR model.

Conclusion: QSAR analysis is vital in drug design and natural product research. To develop a reliable QSAR model, it is essential to choose suitable descriptors and perform regression diagnostics. This study offers an accessible, customizable approach for researchers to select appropriate descriptors and diagnose errors in QSAR studies.

Keywords: MLR; QSAR; R software; descriptor; feature selection; regression assumption; regression diagnostics.
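
As a rough illustration of the AIC-guided stepwise selection named in the Objective, the sketch below runs a forward selection over candidate descriptors for a multiple linear regression (MLR) model. It is not the authors' workflow: the paper works in R, whereas this is a language-neutral sketch in Python using only the standard library. The function names (`ols_rss`, `aic`, `forward_select`), the descriptor names, and all numeric values are hypothetical and made up for the example. AIC is computed for a Gaussian linear model up to an additive constant, as n·ln(RSS/n) + 2k.

```python
import math

def ols_rss(columns, y):
    """Fit y ~ intercept + columns by least squares; return the residual sum of squares."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant; k counts fitted coefficients."""
    return n * math.log(rss / n) + 2 * k

def forward_select(descriptors, y):
    """Add one descriptor at a time while doing so lowers AIC."""
    n = len(y)
    selected = []
    best = aic(ols_rss([], y), n, 1)  # intercept-only model
    while True:
        scores = {}
        for name in descriptors:
            if name in selected:
                continue
            cols = [descriptors[s] for s in selected + [name]]
            scores[name] = aic(ols_rss(cols, y), n, len(cols) + 1)
        if not scores:
            break
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best:
            break  # no remaining descriptor improves AIC
        selected.append(candidate)
        best = scores[candidate]
    return selected, best

# Hypothetical toy data: three made-up descriptors and a made-up activity value.
descriptors = {
    "logP": [0.5, 1.2, 1.9, 2.4, 3.1, 3.8, 4.4, 5.0],
    "TPSA": [40.0, 85.0, 60.0, 95.0, 55.0, 70.0, 45.0, 90.0],
    "nRotB": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
y = [2.1, 3.45, 4.7, 5.75, 7.3, 8.65, 9.7, 10.95]

selected, best_aic = forward_select(descriptors, y)
print("selected descriptors:", selected)
print("AIC of final model:", round(best_aic, 3))
```

The 2k penalty is what keeps the procedure from adding every descriptor: a new term is accepted only if it reduces the residual sum of squares enough to outweigh the extra coefficient. With real descriptor tables, a numerically robust least-squares routine would be a safer choice than the hand-rolled normal-equation solver used here for self-containment.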


