BMC Res Notes. 2018 Jul 13;11(1):463.
doi: 10.1186/s13104-018-3535-y.

Feature optimization in high dimensional chemical space: statistical and data mining solutions


Jinuraj K R et al. BMC Res Notes.

Abstract

Objectives: The primary goal of this experiment is to prioritize the molecular descriptors that control the activity of active molecules, and thereby reduce the dimensionality encountered during the virtual screening process. It also aims to (1) develop a methodology for sampling large datasets and statistically verifying the sampling process, and (2) apply a screening filter to detect molecules with polypharmacological or promiscuous activity.
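As an illustration of what statistical verification of a sample can look like, the following is a minimal sketch of a one-sample, two-sided Z-test that checks whether the mean of a descriptor in a random sample is consistent with the mean in the full dataset. The function name `z_test_sample_mean`, the synthetic descriptor values, and the sample size are assumptions made for this example; the abstract does not specify how the authors set up their test.

```python
import math
import random
import statistics


def z_test_sample_mean(sample, pop_mean, pop_std):
    """Two-sided one-sample Z-test of a sample mean against a known
    population mean and standard deviation. Returns (z, p_value)."""
    n = len(sample)
    z = (statistics.fmean(sample) - pop_mean) / (pop_std / math.sqrt(n))
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Synthetic stand-in for one descriptor over a large dataset (hypothetical values).
random.seed(0)
population = [random.gauss(350.0, 60.0) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_std = statistics.pstdev(population)

# Draw a random sample and test whether its mean matches the population mean.
sample = random.sample(population, 1_000)
z, p_value = z_test_sample_mean(sample, pop_mean, pop_std)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

A large p-value gives no evidence that the sample mean differs from the population mean for that descriptor; in practice the check would be repeated for each descriptor of interest.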

Results: Sampling from a large dataset was verified by applying a Z-test. Molecular descriptors were prioritized using principal component analysis (PCA) by eliminating the least influential ones. Applying PCA reduced the original dimensions to one-twelfth. There was a significant improvement in the statistical parameter values of the virtual screening model, which in turn resulted in better screening results. The screened results were further improved by applying the Eli Lilly MedChem rules filter, which removed molecules with polypharmacological or promiscuous activity. It was also shown that similarities in the activity of compounds were due to molecular descriptors, which were not apparent in prima facie structural studies.
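One possible way to prioritize descriptors with PCA is sketched below. It standardizes the descriptor matrix, keeps the leading components that explain a chosen share of the variance, and scores each descriptor by its variance-weighted squared loadings on those components. The scoring rule, the 0.95 variance threshold, the function name `rank_descriptors_by_pca`, and the synthetic descriptor names are all assumptions for illustration; the abstract does not state which criterion the authors used to eliminate descriptors.

```python
import numpy as np


def rank_descriptors_by_pca(X, names, var_threshold=0.95):
    """Rank descriptors (columns of X) by their variance-weighted squared
    loadings on the leading principal components.
    Assumes no column of X is constant."""
    X = np.asarray(X, dtype=float)
    # Standardize so descriptors on different scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via SVD: rows of Vt are the principal axes (loadings).
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    ratio = S**2 / np.sum(S**2)  # explained variance ratio per component
    # Smallest number of components whose cumulative ratio reaches the threshold.
    k = min(int(np.searchsorted(np.cumsum(ratio), var_threshold)) + 1, len(ratio))
    scores = (ratio[:k, None] * Vt[:k] ** 2).sum(axis=0)
    order = np.argsort(scores)[::-1]
    return [(names[i], float(scores[i])) for i in order]


# Synthetic data: 200 molecules, 5 hypothetical descriptors driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = np.column_stack([
    latent[:, 0] + 0.1 * rng.normal(size=200),
    latent[:, 0] + 0.1 * rng.normal(size=200),
    latent[:, 1] + 0.1 * rng.normal(size=200),
    latent[:, 1] + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])
names = ["desc_a", "desc_b", "desc_c", "desc_d", "desc_e"]

ranking = rank_descriptors_by_pca(X, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Descriptors at the bottom of the ranking contribute least to the retained components and would be the first candidates for elimination under this scoring rule.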

Keywords: Eli Lilly MedChem rules; Molecular descriptors; Molecular similarity; Principal component analysis; PubChem bioassay; Self-organizing maps; Virtual screening; Z-test.


