ChemmineR: a compound mining framework for R
- PMID: 18596077
- PMCID: PMC2638865
- DOI: 10.1093/bioinformatics/btn307
Abstract
Motivation: Software applications for structural similarity searching and clustering of small molecules play an important role in drug discovery and chemical genomics. Here, we present the first open-source compound mining framework for the popular statistical programming environment R. The integration with a powerful statistical environment maximizes the flexibility, expandability and programmability of the provided analysis functions.
Results: We discuss the algorithms and compound mining utilities provided by the R package ChemmineR. It contains functions for structural similarity searching, clustering of compound libraries with a wide spectrum of classification algorithms and various utilities for managing complex compound data. It also offers a wide range of visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine environment and allows bidirectional communications between the two services.
Availability: ChemmineR is freely available as an R package from the ChemMine project site: http://bioweb.ucr.edu/ChemMineV2/chemminer
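The structural similarity searching mentioned under Results can be illustrated with a minimal sketch. This is not ChemmineR's API or its implementation. The abstract does not name a descriptor or a similarity coefficient, so the sketch assumes each compound is already reduced to a set of structural feature strings and that similarity is scored with the Tanimoto coefficient. The function names, the feature labels, and the toy library are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty descriptor sets: define similarity as 0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


def similarity_search(query, library, cutoff=0.0):
    """Return (name, score) pairs with score >= cutoff, best match first."""
    scored = ((name, tanimoto(query, feats)) for name, feats in library.items())
    hits = [(name, score) for name, score in scored if score >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy library: compound name -> set of made-up feature labels.
library = {
    "cmp1": {"C-C:1", "C-O:1", "C-O:2"},
    "cmp2": {"C-C:1", "C-N:1"},
    "cmp3": {"C-C:1", "C-O:1", "C-O:2", "C-N:3"},
}
query = {"C-C:1", "C-O:1", "C-O:2"}

results = similarity_search(query, library, cutoff=0.5)
for name, score in results:
    print(f"{name}\t{score:.2f}")
```

With these toy inputs, cmp1 matches the query exactly, cmp3 shares three of four distinct features, and cmp2 falls below the 0.5 cutoff. A pairwise matrix of 1 minus these scores is the kind of distance input that a clustering step could start from.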
