Design and evaluation of a molecular fingerprint involving the transformation of property descriptor values into a binary classification scheme
- PMID: 12870906
- DOI: 10.1021/ci030285+
Abstract
A new fingerprint design concept is introduced that transforms molecular property descriptors into two-state descriptors and thus permits binary encoding. The transformation is based on the statistical medians of descriptor distributions in large compound collections and removes the need for value range encoding of these descriptors. Unlike in conventional fingerprint representations, bit positions that are set off in binary encoded property descriptors capture as much information as bit positions that are set on. Accordingly, a variant of the Tanimoto coefficient was defined for comparing these fingerprints. Following our design idea, a prototype fingerprint termed MP-MFP was implemented by combining 61 binary encoded property descriptors with 110 structural fragment-type descriptors. The performance of this fingerprint was evaluated in systematic similarity search calculations in a database containing 549 molecules belonging to 38 different activity classes and 5000 background molecules. In these calculations, MP-MFP correctly recognized approximately 34% of all similarity relationships, with only 0.04% false positives, and performed better than previous designs and MACCS keys. The results suggest that combinations of simplified two-state property descriptors have predictive value in the analysis of molecular similarity.
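The median-based encoding and the modified similarity measure can be illustrated with a minimal Python sketch. The abstract does not give the exact formula of the Tanimoto variant, so the version below is an assumption: it averages the conventional Tanimoto coefficient computed on bits set on with the same coefficient computed on bits set off, so that both states contribute equally. The function names (`descriptor_medians`, `encode`, `tanimoto`, `averaged_tanimoto`), the strict "greater than the median" rule, and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import median


def descriptor_medians(collection):
    """Median of each descriptor (column) over a reference compound collection.

    `collection` is a list of rows, one row of descriptor values per molecule.
    """
    return [median(column) for column in zip(*collection)]


def encode(values, medians):
    """Two-state encoding: bit is 1 if the value exceeds the median, else 0.

    Assumption: values equal to the median are encoded as 0.
    """
    return [1 if v > m else 0 for v, m in zip(values, medians)]


def tanimoto(a, b, state=1):
    """Tanimoto coefficient counting bits that are in the given state."""
    both = sum(1 for x, y in zip(a, b) if x == state and y == state)
    either = sum(1 for x, y in zip(a, b) if x == state or y == state)
    # Assumption: if neither fingerprint has a bit in this state,
    # the two are treated as identical with respect to that state.
    return both / either if either else 1.0


def averaged_tanimoto(a, b):
    """Assumed variant: mean of the on-bit and off-bit Tanimoto coefficients."""
    return 0.5 * (tanimoto(a, b, state=1) + tanimoto(a, b, state=0))


# Toy reference collection: 3 molecules x 2 property descriptors.
reference = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
medians = descriptor_medians(reference)

query_bits = encode([2.5, 5.0], medians)
similarity = averaged_tanimoto([1, 1, 0, 0], [1, 0, 0, 1])
```

In this sketch a descriptor contributes exactly one bit, which is what makes off bits as informative as on bits: a 0 means "at or below the median" rather than "feature absent".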
Similar articles
- Profile scaling increases the similarity search performance of molecular fingerprints containing numerical descriptors and structural keys. J Chem Inf Comput Sci. 2003 Jul-Aug;43(4):1218-25. doi: 10.1021/ci030287u. PMID: 12870914
- Design and evaluation of a novel class-directed 2D fingerprint to search for structurally diverse active compounds. J Chem Inf Model. 2006 Nov-Dec;46(6):2515-26. doi: 10.1021/ci600303b. PMID: 17125192
- Bayesian screening for active compounds in high-dimensional chemical spaces combining property descriptors and molecular fingerprints. Chem Biol Drug Des. 2008 Jan;71(1):8-14. doi: 10.1111/j.1747-0285.2007.00602.x. Epub 2007 Dec 7. PMID: 18069988
- Mini-fingerprints for virtual screening: design principles and generation of novel prototypes based on information theory. SAR QSAR Environ Res. 2003 Feb;14(1):27-40. doi: 10.1080/1062936021000058764. PMID: 12688414 Review.
- Electron-density descriptors as predictors in quantitative structure-activity/property relationships and drug design. Future Med Chem. 2011 Jun;3(8):969-94. doi: 10.4155/fmc.11.65. PMID: 21707400 Review.
Cited by
- Structure-activity exploration of a small-molecule Lipid II inhibitor. Drug Des Devel Ther. 2015 Apr 24;9:2383-94. doi: 10.2147/DDDT.S79504. eCollection 2015. PMID: 25987836 Free PMC article.
- JEDA: Joint entropy diversity analysis. An information-theoretic method for choosing diverse and representative subsets from combinatorial libraries. Mol Divers. 2006 Aug;10(3):333-9. doi: 10.1007/s11030-006-9042-4. Epub 2006 Sep 21. PMID: 17031536
- Introducing a Chemically Intuitive Core-Substituent Fingerprint Designed to Explore Structural Requirements for Effective Similarity Searching and Machine Learning. Molecules. 2022 Apr 4;27(7):2331. doi: 10.3390/molecules27072331. PMID: 35408730 Free PMC article.
- FINDSITE: a threading-based approach to ligand homology modeling. PLoS Comput Biol. 2009 Jun;5(6):e1000405. doi: 10.1371/journal.pcbi.1000405. Epub 2009 Jun 5. PMID: 19503616 Free PMC article.
- In silico methods for drug repurposing and pharmacology. Wiley Interdiscip Rev Syst Biol Med. 2016 May;8(3):186-210. doi: 10.1002/wsbm.1337. Epub 2016 Apr 15. PMID: 27080087 Free PMC article. Review.