Proteins. 2009 Aug 1;76(2):365-74.
doi: 10.1002/prot.22352.

Prediction of 3D metal binding sites from translated gene sequences based on remote-homology templates

Ronen Levy et al. Proteins. 2009.

Abstract

Database-scale analysis was performed to determine whether structural models based on remote homologues are effective in predicting 3D transition metal binding sites in proteins directly from translated gene sequences. The extent to which side chain modeling alone reduces sensitivity and selectivity is shown to be <10%. Surprisingly, selectivity was not dependent on the level of sequence homology between template and target, or on the presence of a metal ion in the structural template. With a modification of the CHED algorithm (Babor et al., Proteins 2008;70:208-217) and machine learning filters, a selectivity of approximately 90% was achieved for protein sequences using unrelated structural templates over a sequence identity range of 18-100%. Below approximately 18% identity, the number of analyzable target-template pairs and the predictability of metal binding sites fall off sharply. A full third of structural templates were found to have target partners only in the remote homology range of 18-30%. In this range, nonmetal-binding templates are calculated to be the majority and serve to predict with 50% sensitivity at the geometric level. Overall, sensitivity at the geometric level for targets having templates in the 18-30% sequence identity range is 73%, with an average of one false positive site per true site. Protein sequences described as "unknown" in the UniProt database and composed largely of unidentified genome project sequences were studied, and metal binding sites were predicted for them. A web server for prediction of metal binding sites from protein sequence is provided.
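
The abstract reports results as sensitivity and selectivity without defining them. The sketch below shows one common way such figures could be computed from counts of predicted and known sites. It assumes that sensitivity means the fraction of known sites recovered, TP / (TP + FN), and that selectivity means the fraction of predicted sites that are correct, TP / (TP + FP). These definitions, the `SiteCounts` container, and the example counts are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Hypothetical tally of predicted vs. known metal binding sites."""
    true_positives: int   # known sites that were predicted
    false_positives: int  # predicted sites with no matching known site
    false_negatives: int  # known sites that were missed


def sensitivity(counts: SiteCounts) -> float:
    """Fraction of known sites recovered (assumed definition)."""
    known = counts.true_positives + counts.false_negatives
    if known == 0:
        raise ValueError("no known sites to evaluate against")
    return counts.true_positives / known


def selectivity(counts: SiteCounts) -> float:
    """Fraction of predicted sites that are correct (assumed definition)."""
    predicted = counts.true_positives + counts.false_positives
    if predicted == 0:
        raise ValueError("no predicted sites to evaluate")
    return counts.true_positives / predicted


# Made-up counts for illustration only; not data from the study.
example = SiteCounts(true_positives=73, false_positives=10, false_negatives=27)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.2f}")
    print(f"selectivity = {selectivity(example):.2f}")
```

Under these assumed definitions, a rate of one false positive per correctly predicted site would correspond to a selectivity of 0.5, which is why the two measures are worth reporting separately.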
