Machine learning for in silico virtual screening and chemical genomics: new strategies

Jean-Philippe Vert¹, Laurent Jacob

PMID: 18795887
PMCID: PMC2748698
DOI: 10.2174/138620708785739899

Free PMC article

Review

Jean-Philippe Vert et al. Comb Chem High Throughput Screen. 2008 Sep;11(8):677-85.

Affiliation

¹ Centre for Computational Biology, Mines ParisTech, 35 rue Saint-Honoré, France. jean-philippe.vert@ensmp.fr

Abstract

Support vector machines and kernel methods belong to the same class of machine learning algorithms that has recently become prominent in both computational biology and chemistry, although both fields have largely ignored each other. These methods are based on a sound mathematical and computationally efficient framework that implicitly embeds the data of interest, respectively proteins and small molecules, in high-dimensional feature spaces where various classification or regression tasks can be performed with linear algorithms. In this review, we present the main ideas underlying these approaches, survey how both the "biological" and the "chemical" spaces have been separately constructed using the same mathematical framework and tricks, and suggest different avenues to unify both spaces for the purpose of in silico chemogenomics.

Figures

**Fig. (1)**
Defining a kernel over a space X, such as the space of all small molecules or the space of all proteins, is equivalent to embedding X in a vector space F of finite or infinite dimension through a mapping Φ: X → F. The kernel between two points in X is equal to the inner product of their images in F, as shown in (1).
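Equation (1) is not reproduced in this record; the relation the caption describes is the standard identity k(x, x') = ⟨Φ(x), Φ(x')⟩. As a minimal sketch, assuming a toy space X of 2-dimensional real vectors rather than molecules or proteins, the following checks that identity for a homogeneous quadratic kernel whose feature map is known in closed form. The names `dot`, `quadratic_kernel`, and `feature_map` are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def quadratic_kernel(x, y):
    """Toy kernel on R^2: k(x, y) = (x . y)^2, computed without any embedding."""
    return dot(x, y) ** 2


def feature_map(x):
    """Explicit embedding Phi: R^2 -> R^3 associated with the quadratic kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Two illustrative points (assumed values, not data from the article).
x = (1.0, 2.0)
y = (3.0, -1.0)

k_direct = quadratic_kernel(x, y)                    # kernel evaluated in X
k_embedded = dot(feature_map(x), feature_map(y))     # inner product in F

print(k_direct, k_embedded)
```

Both values agree up to floating-point rounding, which is the equivalence the caption states: evaluating the kernel in X gives the same number as taking the inner product of the images in F.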

**Fig. (2)**
We can define the distance between two objects x₁ and x₂, such as two small molecules or proteins, as the Euclidean distance between their images Φ(x₁) and Φ(x₂). If the mapping Φ is defined by a valid kernel k, then this distance can be computed easily without computing Φ(x₁) and Φ(x₂), as shown in (2). This kernel trick can be extended to a variety of linear algorithms that only manipulate the data through inner products.
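Equation (2) is not reproduced in this record; expanding the squared norm ‖Φ(x₁) − Φ(x₂)‖² with inner products gives the standard form d(x₁, x₂)² = k(x₁, x₁) − 2 k(x₁, x₂) + k(x₂, x₂). The sketch below, again assuming toy 2-dimensional real vectors instead of molecules or proteins, computes the distance from kernel values alone and compares it with the distance obtained from an explicit embedding. It also applies the same function to a Gaussian kernel, whose feature space is infinite-dimensional, so only the kernel route is available there. All function names and the `sigma` parameter are hypothetical.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def quadratic_kernel(x, y):
    """Toy kernel on R^2: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def feature_map(x):
    """Explicit embedding for the quadratic kernel, used only as a check."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def kernel_distance(k, x1, x2):
    """Feature-space distance computed from kernel evaluations only."""
    squared = k(x1, x1) - 2.0 * k(x1, x2) + k(x2, x2)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors


# Two illustrative points (assumed values, not data from the article).
x1 = (1.0, 2.0)
x2 = (3.0, -1.0)

d_trick = kernel_distance(quadratic_kernel, x1, x2)
phi1, phi2 = feature_map(x1), feature_map(x2)
d_explicit = math.sqrt(sum((a - b) ** 2 for a, b in zip(phi1, phi2)))
d_gauss = kernel_distance(gaussian_kernel, x1, x2)

print(d_trick, d_explicit, d_gauss)
```

For the quadratic kernel the two routes give the same distance up to rounding. Because `kernel_distance` only calls `k`, any algorithm built on such distances or inner products can swap in a different kernel without ever constructing Φ.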
