Mining a chemical database for fragment co-occurrence: discovery of "chemical clichés"
- PMID: 16562983
- DOI: 10.1021/ci050370c
Abstract
Nowadays millions of different compounds are known, their structures stored in electronic databases. Analysis of these data could yield valuable insights into the laws of chemistry and the habits of chemists. We have therefore explored the public database of the National Cancer Institute (>250,000 compounds) by pattern searching. We split the molecules of this database into fragments to find out which fragments exist, how frequent they are, and whether the occurrence of one fragment in a molecule is related to the occurrence of another, nonoverlapping fragment. It turns out that some fragments and combinations of fragments are so frequent that they can be called "chemical clichés". We believe that the fragment data can give insight into the chemical space explored so far by synthesis. The lists of fragments and their (co-)occurrences can help create novel chemical compounds by (i) systematically listing the most popular and therefore most easily used substituents and ring systems for synthesizing new compounds, (ii) being an easily accessible repository for rarer fragments suitable for lead compound optimization, and (iii) pointing out some of the yet unexplored parts of chemical space.
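The counting step described in the abstract (which fragments exist, how frequent they are, and whether two fragments tend to appear together) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `molecules` list, the fragment labels, and the use of lift (observed over expected co-occurrence) as the association measure. The abstract does not state which statistic the authors used, and a real analysis would first derive non-overlapping fragments from the stored structures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each molecule is already reduced to a set of
# non-overlapping fragment labels. These labels are illustrative only.
molecules = [
    {"benzene", "methyl", "hydroxyl"},
    {"benzene", "methyl"},
    {"benzene", "chloro"},
    {"pyridine", "methyl"},
    {"benzene", "methyl", "chloro"},
]


def fragment_counts(mols):
    """Count in how many molecules each fragment occurs."""
    counts = Counter()
    for frags in mols:
        counts.update(frags)
    return counts


def pair_counts(mols):
    """Count in how many molecules each unordered fragment pair co-occurs."""
    counts = Counter()
    for frags in mols:
        # Sorting gives every pair one canonical key, e.g. ("benzene", "methyl").
        counts.update(combinations(sorted(frags), 2))
    return counts


def lift(pair, single, pairs, n_molecules):
    """Observed co-occurrence divided by the count expected under independence.

    Values above 1 suggest the two fragments appear together more often than
    chance would predict; values below 1 suggest the opposite.
    """
    a, b = pair
    expected = single[a] * single[b] / n_molecules
    return pairs[pair] / expected


single = fragment_counts(molecules)
pairs = pair_counts(molecules)

# Print the most frequent fragments, then the most frequent pairs with their lift.
for fragment, count in single.most_common(3):
    print(fragment, count)
for pair, count in pairs.most_common(3):
    print(pair, count, round(lift(pair, single, pairs, len(molecules)), 3))
```

On a database of this size (more than 250,000 compounds), raw pair counts favor fragments that are common everywhere, which is why the sketch normalizes by the expected count. Any other association measure could be substituted in `lift` without changing the counting code.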