Machine learning methods in chemoinformatics
- PMID: 25285160
- PMCID: PMC4180928
- DOI: 10.1002/wcms.1183
Abstract
Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure-activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, that is, predicting the unknown property values of a test set of instances (usually molecules) from the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machines, k-Nearest Neighbors and naïve Bayes classifiers.
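As a rough illustration of the supervised setting described above, the following minimal sketch applies a k-Nearest Neighbors classifier to molecules represented as sets of "on" fingerprint bits. It is not taken from the article: the Tanimoto similarity measure, the function names `tanimoto` and `knn_predict`, the choice of k = 3, and the toy fingerprints and labels are all assumptions made purely for illustration.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(query, training_set, k=3):
    """Predict a label by majority vote among the k most similar training molecules."""
    ranked = sorted(
        training_set,
        key=lambda item: tanimoto(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (fingerprint bits, known label).
# These values are invented for illustration, not real molecules.
training_set = [
    (frozenset({1, 2, 3, 4}), "active"),
    (frozenset({1, 2, 3, 5}), "active"),
    (frozenset({2, 3, 4, 6}), "active"),
    (frozenset({7, 8, 9, 10}), "inactive"),
    (frozenset({7, 8, 9, 11}), "inactive"),
    (frozenset({8, 9, 10, 12}), "inactive"),
]

# Hypothetical test set: fingerprints whose labels are treated as unknown.
test_set = [frozenset({1, 2, 3, 6}), frozenset({7, 8, 10, 12})]

predictions = [knn_predict(fp, training_set, k=3) for fp in test_set]
print(predictions)
```

In practice the fingerprints would come from a chemoinformatics toolkit and the value of k would be tuned by cross-validation; both are outside the scope of this sketch.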