Machine learning methods for property prediction in chemoinformatics: Quo Vadis?
- PMID: 22582859
- DOI: 10.1021/ci200409x
Abstract
This paper focuses on modern approaches to machine learning, most of which are as yet used infrequently or not at all in chemoinformatics. Machine learning methods are characterized in terms of the "modes of statistical inference" and "modeling levels" nomenclature and by considering different facets of modeling with respect to input/output matching, data types, model duality, and model inference. Particular attention is paid to new approaches and concepts that may provide efficient solutions to common problems in chemoinformatics: improvement of the predictive performance of structure-property (activity) models, generation of structures possessing desirable properties, the model applicability domain, modeling of properties with functional endpoints (e.g., phase diagrams and dose-response curves), and accounting for multiple molecular species (e.g., conformers or tautomers).