Bit silencing in fingerprints enables the derivation of compound class-directed similarity metrics
- PMID: 18698839
- DOI: 10.1021/ci8002045
Abstract
Fingerprints are molecular bit string representations and are among the most popular descriptors for similarity searching. In key-type fingerprints, each bit position monitors the presence or absence of a prespecified chemical or structural feature. In contrast to hashed fingerprints, this keyed design makes it possible to evaluate individual bit positions and the associated structural features during similarity searching. Bit silencing is introduced as a systematic approach to assess the contribution of each bit in a fingerprint to similarity search performance. From the resulting bit contribution profile, a bit position-dependent weight vector is derived that determines the relative weight of each bit on the basis of its individual contribution. By merging this weight vector with the Tanimoto coefficient, compound class-directed similarity metrics are obtained that further increase fingerprint search performance compared to conventional calculations of Tanimoto similarity.
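The workflow described in the abstract (silence one bit at a time, record the change in search performance, turn the resulting profile into weights, and fold the weights into the Tanimoto coefficient) can be illustrated with a minimal Python sketch. The abstract does not state the performance measure, the weighting formula, or the exact form of the weighted coefficient, so everything below is an assumption made for illustration: performance is taken as the number of actives among the top k ranked compounds, a bit is silenced by giving it weight 0, and the function names and the weight rule in `weights_from_profile` are hypothetical rather than the authors' method.

```python
from typing import List, Sequence

Fingerprint = Sequence[int]  # one 0/1 entry per keyed bit position


def weighted_tanimoto(a: Fingerprint, b: Fingerprint, w: Sequence[float]) -> float:
    """Tanimoto coefficient with per-bit weights; all-ones weights give the usual Tc."""
    common = sum(wi for ai, bi, wi in zip(a, b, w) if ai and bi)
    on_a = sum(wi for ai, wi in zip(a, w) if ai)
    on_b = sum(wi for bi, wi in zip(b, w) if bi)
    denom = on_a + on_b - common
    return common / denom if denom > 0 else 0.0


def hits_in_top_k(query: Fingerprint, database: Sequence[Fingerprint],
                  is_active: Sequence[bool], w: Sequence[float], k: int) -> int:
    """Assumed performance measure: actives among the k best-ranked compounds."""
    order = sorted(range(len(database)),
                   key=lambda i: weighted_tanimoto(query, database[i], w),
                   reverse=True)
    return sum(is_active[i] for i in order[:k])


def bit_contribution_profile(query: Fingerprint, database: Sequence[Fingerprint],
                             is_active: Sequence[bool], k: int) -> List[int]:
    """For each bit: baseline hits minus hits obtained with that bit silenced."""
    n_bits = len(query)
    ones = [1.0] * n_bits
    baseline = hits_in_top_k(query, database, is_active, ones, k)
    profile = []
    for bit in range(n_bits):
        silenced = ones.copy()
        silenced[bit] = 0.0  # weight 0 removes the bit from numerator and denominator
        profile.append(baseline - hits_in_top_k(query, database, is_active, silenced, k))
    return profile


def weights_from_profile(profile: Sequence[float]) -> List[float]:
    """Hypothetical rule: up-weight bits whose silencing hurt, floor weights at 0."""
    return [max(0.0, 1.0 + c) for c in profile]


# Toy data (invented): bit 0 is the only feature shared by the query and the active.
query = [1, 1, 0, 0]
database = [[1, 0, 0, 0], [0, 1, 1, 1], [0, 0, 1, 0]]
is_active = [True, False, False]

profile = bit_contribution_profile(query, database, is_active, k=1)
weights = weights_from_profile(profile)
class_directed_scores = [weighted_tanimoto(query, fp, weights) for fp in database]
```

In this toy setup, silencing bit 0 pushes the active compound out of the top rank, so that bit receives a larger weight than the others. A real study would average the profile over many reference compounds of the same activity class and derive the weights on data separate from the data used to evaluate them.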