Comparative Study
J Chem Inf Comput Sci. 1999 Sep-Oct;39(5):881-6.
doi: 10.1021/ci990308d.

Database searching for compounds with similar biological activity using short binary bit string representations of molecules

L Xue et al. J Chem Inf Comput Sci. 1999 Sep-Oct.

Abstract

In an effort to identify biologically active molecules in compound databases, we have investigated similarity searching using short binary bit strings with a maximum of 54 bit positions. These "minifingerprints" (MFPs) were designed to account for the presence or absence of structural fragments and/or aromatic character, flexibility, and hydrogen-bonding capacity of molecules. MFP design was based on an analysis of distributions of molecular descriptors and structural fragments in two large compound collections. The performance of different MFPs and a reference fingerprint was tested by systematic "one-against-all" similarity searches of molecules in a database containing 364 compounds with different biological activities. For each fingerprint, the most effective similarity cutoff value was determined. An MFP accounting for only 32 structural fragments showed less than 2% false positive similarity matches and correctly assigned on average approximately 40% of the compounds with the same biological activity to a query molecule. Inclusion of three numerical two-dimensional (2D) molecular descriptors increased the performance by 15%. This MFP performed better than a complex 2D fingerprint. At a similarity cutoff value of 0.85, the 2D fingerprint totally eliminated false positives but recognized less than 10% of the compounds within the same activity class.
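The "one-against-all" search described above can be illustrated with a minimal sketch. The abstract does not name the similarity metric, so the Tanimoto coefficient used here is an assumption, as are the function names, the four-compound toy database, and its 8-bit fingerprints (the paper's MFPs have up to 54 bit positions and its database holds 364 compounds). Each compound is used in turn as the query, and every other compound whose similarity reaches the cutoff counts as a match.

```python
from typing import Dict, List, Tuple

# (name, activity class, fingerprint stored as an int bit string)
Compound = Tuple[str, str, int]


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: bits set in both / bits set in either.

    Assumed metric; the abstract only mentions a 'similarity cutoff value'.
    """
    common = bin(a & b).count("1")
    total = bin(a | b).count("1")
    return common / total if total else 0.0


def one_against_all(db: List[Compound], cutoff: float) -> Dict[str, Tuple[float, int]]:
    """Use every compound as a query against all others.

    Returns, per query, (fraction of same-class compounds recovered,
    number of false positive matches from other classes).
    """
    results: Dict[str, Tuple[float, int]] = {}
    for q_name, q_cls, q_fp in db:
        others = [c for c in db if c[0] != q_name]
        hits = [c for c in others if tanimoto(q_fp, c[2]) >= cutoff]
        same_class = sum(1 for c in others if c[1] == q_cls)
        true_pos = sum(1 for c in hits if c[1] == q_cls)
        false_pos = len(hits) - true_pos
        recovered = true_pos / same_class if same_class else 0.0
        results[q_name] = (recovered, false_pos)
    return results


# Hypothetical toy data: two activity classes, 8-bit fingerprints.
TOY_DB: List[Compound] = [
    ("A1", "class_a", 0b11110000),
    ("A2", "class_a", 0b11100000),
    ("B1", "class_b", 0b00001111),
    ("B2", "class_b", 0b00000111),
]

if __name__ == "__main__":
    for cutoff in (0.70, 0.85):
        print(cutoff, one_against_all(TOY_DB, cutoff))
```

In this toy data the within-class similarity is 0.75, so a cutoff of 0.70 recovers every same-class compound, while 0.85 yields no matches at all. This is the same trade-off between false positives and recognition rate that the abstract reports for the strict cutoff.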
