Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2007 Apr 4:8:116.
doi: 10.1186/1471-2105-8-116.

ASH structure alignment package: sensitivity and selectivity in domain classification

Affiliations

ASH structure alignment package: sensitivity and selectivity in domain classification

Daron M Standley et al. BMC Bioinformatics. .

Abstract

Background: Structure alignment methods offer the possibility of measuring distant evolutionary relationships between proteins that are not visible by sequence-based analysis. However, the question of how structural differences and similarities ought to be quantified in this regard remains open. In this study we construct a training set of sequence-unique CATH and SCOP domains, from which we develop a scoring function that can reliably identify domains with the same CATH topology and SCOP fold classification. The score is implemented in the ASH structure alignment package, for which the source code and a web service are freely available from the PDBj website http://www.pdbj.org/ASH/.

Results: The new ASH score shows increased selectivity and sensitivity compared with values reported for several popular programs using the same test set of 4,298,905 structure pairs, yielding an area of .96 under the receiver operating characteristic (ROC) curve. In addition, weak sequence homologies between similar domains are revealed that could not be detected by BLAST sequence alignment. Also, a subset of domain pairs is identified that exhibit high similarity, even though their CATH and SCOP classification differs. Finally, we show that the ranking of alignment programs based solely on geometric measures depends on the choice of the quality measure.

Conclusion: ASH shows high selectivity and sensitivity with regard to domain classification, an important step in defining distantly related protein sequence families. Moreover, the CPU cost per alignment is competitive with the fastest programs, making ASH a practical option for large-scale structure classification studies.

PubMed Disclaimer

Figures

Figure 1
Figure 1
ROC Curves. The ROC curve for the NER score (black), the new ASH score, and the sequence similarity term (blue) are shown. The new ASH score was evaluated both on the full training set (red), on the reduced training set (orange), and the test set using parameters derived from the reduced training set (green). The area under each curve is indicated in the legend.
Figure 2
Figure 2
Geometric Analysis. The relationship between average NER score and average SAS score. The average was computed over 2,071 query-template pairs using the geometric test set.

References

    1. Holm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol. 1993;233:123–138. doi: 10.1006/jmbi.1993.1489. - DOI - PubMed
    1. Sierk ML, Pearson WR. Sensitivity and selectivity in protein structure comparison. Protein Sci. 2004;13:773–785. doi: 10.1110/ps.03328504. - DOI - PMC - PubMed
    1. Kolodny R, Koehl P, Levitt M. Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. J Mol Biol. 2005;346:1173–1188. doi: 10.1016/j.jmb.2004.12.032. - DOI - PMC - PubMed
    1. Orengo CA, Thornton JM. Protein families and their evolution-a structural perspective. Annu Rev Biochem. 2005;74:867–900. doi: 10.1146/annurev.biochem.74.082803.133029. - DOI - PubMed
    1. Mason SJ, Graham NE. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. QJR Meterol Soc. 2002;128:2145–2166. doi: 10.1256/003590002320603584. - DOI