Ligand prediction from protein sequence and small molecule information using support vector machines and fingerprint descriptors
- PMID: 19309114
- DOI: 10.1021/ci900004a
Abstract
Support vector machine (SVM) database search strategies are presented that aim at the identification of small molecule ligands for targets for which no ligand information is currently available. In pharmaceutical research and chemical biology, this situation is faced, for example, when studying orphan targets or newly identified members of protein families. To investigate methods for de novo ligand identification in the absence of known three-dimensional target structures or active molecules, we have focused on combining sequence and ligand information for closely and distantly related proteins. To provide a basis for these investigations, a set of 11 protease targets from different families was assembled together with more than 2000 inhibitors directed against individual proteases. We have compared SVM approaches that combine protein sequence and ligand information in different ways and utilize 2D fingerprints as ligand descriptors. These methodologies were applied to search for inhibitors of individual proteases not taken into account during learning. A target sequence-ligand kernel and, in particular, a linear combination of multiple target-directed SVMs consistently identified inhibitors with high accuracy including test cases where homology-based similarity searching using data fusion and conventional SVM ranking nearly or completely failed. The SVM linear combination and target-ligand kernel methods described herein are intuitive and straightforward to adopt for ligand prediction against other targets.
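The abstract names two ways of combining sequence and ligand information but gives no formulas. The sketch below illustrates one plausible reading of each, and every detail in it is an assumption rather than the authors' implementation: the ligand kernel is taken to be the Tanimoto coefficient on binary 2D fingerprints, the target-ligand kernel is taken to be the product of a target sequence similarity and that ligand kernel, and the linear combination is taken to weight each target-directed SVM score by the sequence similarity between its training target and the query target. The names `tanimoto`, `target_ligand_kernel`, and `combined_score` are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

# A binary 2D fingerprint, represented as the set of bit positions that are on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient of two binary fingerprints (assumed ligand kernel)."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def target_ligand_kernel(
    target_sim: Dict[Tuple[str, str], float],
    pair_x: Tuple[str, Fingerprint],
    pair_y: Tuple[str, Fingerprint],
) -> float:
    """Kernel between two (target, ligand) pairs.

    Assumption: product of a precomputed target sequence similarity and the
    Tanimoto ligand kernel. target_sim is keyed by alphabetically sorted
    target-name pairs.
    """
    (t1, l1), (t2, l2) = pair_x, pair_y
    key = (t1, t2) if t1 <= t2 else (t2, t1)
    return target_sim[key] * tanimoto(l1, l2)


def combined_score(
    per_target_scores: Dict[str, float],
    weights: Dict[str, float],
) -> float:
    """Weighted average of decision values from several target-directed SVMs.

    Assumption: weights are sequence similarities between each training
    target and the query target; they are normalized to sum to one.
    """
    total = sum(weights[t] for t in per_target_scores)
    return sum(weights[t] * s for t, s in per_target_scores.items()) / total


if __name__ == "__main__":
    fp1 = frozenset({1, 2, 3})
    fp2 = frozenset({2, 3, 4})
    sims = {("protease_A", "protease_B"): 0.4}  # made-up similarity value
    print(tanimoto(fp1, fp2))
    print(target_ligand_kernel(sims, ("protease_B", fp1), ("protease_A", fp2)))
    print(combined_score({"protease_A": 1.0, "protease_B": -1.0},
                         {"protease_A": 0.75, "protease_B": 0.25}))
```

In practice a kernel of this kind would be evaluated over all training pairs and passed to an SVM implementation that accepts a precomputed kernel matrix, while `combined_score` would be applied to each database compound to produce a ranking for the target left out during learning.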