Review

Int J Mol Sci. 2022 Mar 3;23(5):2797.

doi: 10.3390/ijms23052797.

Unsupervised Learning in Drug Design from Self-Organization to Deep Chemistry

Jaroslaw Polanski¹

PMID: 35269939
PMCID: PMC8910896
DOI: 10.3390/ijms23052797

Jaroslaw Polanski. Int J Mol Sci. 2022.

Affiliation

¹ Institute of Chemistry, Faculty of Science and Technology, University of Silesia, Szkolna 9, 40-006 Katowice, Poland.

Abstract

The availability of computers has brought novel prospects to drug design. Neural networks (NN) were an early tool that cheminformatics tested for converting data into drugs. However, the initial interest faded for almost two decades. The recent success of deep learning (DL) has inspired a renaissance of neural networks and of their potential application in deep chemistry. DL targets direct data analysis without any human intervention. Although the back-propagation NN is the main algorithm currently used in DL, unsupervised learning can be even more efficient. We review self-organizing maps (SOM) for mapping molecular representations, from the 1990s to current deep chemistry. We found SOM to be enormously efficient not only for features that humans could expect, but also for those that are not trivial to human chemists. We also review DL projects in the current literature, especially unsupervised architectures. DL appears to be efficient in pattern recognition (Deep Face) or chess (Deep Blue). However, efficient deep chemistry is still a matter for the future, because the availability of measured property data in chemistry is still limited.

Keywords: deep chemistry; deep learning; drug design; feature engineering; feature learning; molecular representation; self-organizing maps; supervised learning; unsupervised learning.

Conflict of interest statement

The author declares no conflict of interest.

Figures

**Figure 1**
The direct drug design problem can be defined as a mapping from property to structure, (P → S)₃. In practice, it is mostly realized indirectly, through structure-to-property mapping (S → P). Individual methods can cover different domains, (S → P)₁ or (S → P)₂. Domain diversity is indicated schematically by colors.

**Figure 2**
Supervised vs. unsupervised learning architectures. Both modes require optimization. In supervised learning, each input carries a label, and the error is estimated between that label and the output value; in unsupervised learning, the error is minimized by comparing the unlabeled inputs.

**Figure 3**
Propane and butane, colored by methyl fragments (yellow or blue in butane, yellow or green in propane) and methylene fragments (red or green in butane, red in propane) (a), give a series of two types of CoMSA (SOM) projections (b), depending on the SOM network regulation. The two types of patterns (b) can be explained by fuzzy topology (c). Details in text.

**Figure 4**
A series of CBG steroid surface data projected by CoMSA (SOM) without superimposition [21]. Without a single misinterpretation, H (high) and M (medium) activity compounds can be differentiated from the L (low) activity compounds. Details in text. Copyright © 1996 Polish Chemical Society.

**Figure 5**
Automatic chemical design using a data-driven continuous representation of molecules. In the critical operation of the latent space formation, the architecture analyzes the similarity of the SMILES codes of the candidate and the known inhibitor structures. A deep neural network involves three coupled functions: an encoder, a decoder (a) and a predictor (b) [45]. Copyright © 2018 American Chemical Society.
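
The contrast drawn in the Figure 2 caption can be made concrete with a small example. The block below is a minimal sketch of a generic Kohonen-style self-organizing map in plain Python; it is not the CoMSA procedure or any code from the reviewed work. The function names, the 1-D map layout, the learning-rate and neighborhood schedules, and the toy descriptor vectors are all assumptions chosen for illustration. Note that no label enters the update: each weight vector moves toward the unlabeled input itself.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(data, n_units=4, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D self-organizing map on unlabeled vectors (illustrative)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weights in [0, 1); the data are assumed scaled to [0, 1].
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighborhood width shrinks
        for x in rng.sample(data, len(data)):    # shuffled pass over the inputs
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood on the map: units near the winner
                # are pulled toward the input more strongly.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Made-up two-component "descriptor" vectors; there are no property labels.
data = [[0.10, 0.20], [0.15, 0.10], [0.20, 0.15],
        [0.90, 0.80], [0.85, 0.95], [0.80, 0.90]]
som = train_som(data)
for x in data:
    print(x, "-> unit", best_matching_unit(som, x))
```

A supervised network would instead compare its output with a measured property value for each input, which is exactly the kind of data the abstract describes as still limited in chemistry.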

References

1. Polanski J. Encyclopedia of Bioinformatics and Computational Biology. Elsevier; Amsterdam, The Netherlands: 2019. Chemoinformatics: From Chemical Art to Chemistry in Silico; pp. 601–618.
2. Schneider G. Automating Drug Discovery. Nat. Rev. Drug Discov. 2017;17:97–113. doi: 10.1038/nrd.2017.232.
3. Dreyfus H.L. What Computers Can’t Do—The Limits of Artificial Intelligence. Harper and Row; New York, NY, USA: 1979.
4. McCarthy J. What is AI?/Basic Questions. [(accessed on 26 February 2022)]. Available online: http://jmc.stanford.edu/artificial-intelligence/what-is-ai/index.html#:~....
5. Sutton R.S., Barto A.G. Reinforcement Learning: An Introduction. MIT Press; Cambridge, MA, USA: 2018.

Grants and funding

OPUS 2018/29/B/ST8/02303/National Science Center
