BMC Proc. 2013 Dec 20;7(Suppl 7):S1.
doi: 10.1186/1753-6561-7-S7-S1. Epub 2013 Dec 20.

The impact of structural diversity and parameterization on maps of the protein universe

Daniel Asarnow et al. BMC Proc.

Abstract

Background: Low-dimensional maps of protein structure space (MPSS) provide a powerful global representation of all proteins. In such mappings, structural relationships are depicted through the spatial adjacency of points, each of which represents a molecule. MPSS can help in understanding the local and global topological characteristics of the structure space, and can elucidate structure-function relationships within and between sets of proteins. A number of meta- and method-dependent parameters are involved in creating MPSS. However, a systematic investigation of the influence of these parameters on MPSS construction has yet to be carried out. Further, while specific cases in which MPSS outperform pairwise distances for prediction of functional annotations have been noted, no general explanation for this phenomenon has yet been advanced.

Methods: We address the above questions within the technical context of creating MPSS by utilizing multidimensional scaling (MDS) for obtaining low-dimensional projections of structure alignment distances.
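As a rough illustration of the projection step, the sketch below applies classical (Torgerson) MDS to a small distance matrix using NumPy. It is a minimal example under stated assumptions, not the authors' pipeline: the function names `classical_mds` and `pairwise_euclidean` are hypothetical, and the toy Euclidean distances merely stand in for structure alignment distances such as those from Dali, CE or FATCAT.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points from a symmetric distance matrix via classical MDS."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Non-Euclidean inputs can give negative eigenvalues; clip them to zero.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def pairwise_euclidean(points):
    """Return the matrix of Euclidean distances between rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy data (an assumption for illustration): six random points in the plane.
rng = np.random.default_rng(0)
toy_points = rng.normal(size=(6, 2))
toy_dist = pairwise_euclidean(toy_points)
toy_map = classical_mds(toy_dist, n_components=2)
```

Because the toy distances are exactly Euclidean in two dimensions, the embedding reproduces them up to rotation and reflection. Real alignment distances are generally not Euclidean, so a low-dimensional map can only approximate them.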

Results and conclusion: MDS is demonstrated as an effective method for construction of MPSS where related structures are co-located, even when their functional and evolutionary proximity cannot be deduced from distributions of pairwise comparisons alone. In particular, we show that MPSS exceed pairwise distance distributions in predictive capability for those annotations of shared function or origin which are characterized by a high level of structural diversity. We also determine the impact of the choice of structure alignment and MDS algorithms on the accuracy of such predictions.
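One plausible way to compare the predictive capability of a distance measure, whether raw pairwise distances or distances within a map, is to treat "small distance" as a prediction that two structures share an annotation and to summarize the result as the area under the ROC curve. The sketch below does this for all unordered pairs. The abstract does not specify the evaluation protocol, so the pairing scheme, the function name `pair_auc` and the toy data are assumptions.

```python
import numpy as np


def pair_auc(dist, labels):
    """AUC for predicting 'same label' from small distance, over all pairs."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    # Consider each unordered pair of items exactly once.
    rows, cols = np.triu_indices(len(labels), k=1)
    pair_dist = dist[rows, cols]
    same = labels[rows] == labels[cols]
    pos = pair_dist[same]    # pairs sharing an annotation
    neg = pair_dist[~same]   # pairs with different annotations
    # AUC equals the probability that a same-label pair is closer than a
    # different-label pair, counting ties as one half.
    closer = (pos[:, None] < neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (closer + 0.5 * ties) / (pos.size * neg.size)


# Toy data (an assumption): two well-separated groups on a line.
coords = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
toy_dist = np.abs(coords[:, None] - coords[None, :])
grouped = ["a", "a", "a", "b", "b", "b"]
shuffled = ["a", "b", "a", "b", "a", "b"]

auc_grouped = pair_auc(toy_dist, grouped)
auc_shuffled = pair_auc(toy_dist, shuffled)
```

Passing the original distance matrix and then a matrix of distances measured within a map to the same function gives a like-for-like comparison of the two.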

Figures

Figure 1
MPSS by alignment algorithm and MDS method. (A)-(C) MPSS using classical MDS for Dali, CE and FATCAT, respectively. (D)-(F) MPSS using SMACOF for Dali, CE and FATCAT, respectively. Each point represents a single structure, colored by SCOP class.
Figure 2
Prediction of SCOP annotations by pairwise distances and MPSS proximity. ROC curves indicate the performance of a classifier. The solid black diagonal represents a random classifier; better performing classifiers bend towards the upper left of the plot. Parts (A) and (B) contain ROC curves for prediction of membership in SCOP superfamilies and SCOP families, respectively. Curves are given for each aligner (CE, Dali, FATCAT), using either raw distances, or proximity within MPSS constructed with either classical MDS or stress majorization. In particular, the plots demonstrate that MPSS distances are never significantly worse than pairwise alignment distances.
Figure 3
Prediction of CATH annotations by pairwise distances and MPSS proximity. As in Figure 2, but for CATH homologous superfamilies ("H").
Figure 4
Distance distributions of diverse and self-similar annotation groups. Each part displays histograms of FATCAT alignment distance between members of a group and all other structures (bottom, "vs. PDB25") and between group members only (top, "vs. self"). The left two groups for each classification level are considered structurally diverse, while the right two are considered structurally self-similar. SCOP superfamily, SCOP family and CATH superfamily ("H") (Parts (A)-(C) respectively) were selected because they are based on shared function or evolutionary origin rather than structure alone.
Figure 5
Differential classification performance for diverse and self-similar annotation groups. The left column shows ROC curves for prediction of annotation within diverse groups of the three classification types, using pairwise distances from all three alignment methods and both MDS methods, and the right column shows ROC curves for prediction of annotation within homogeneous or self-similar groups. While MPSS proximity is an effective classifier for both diverse and self-similar groups of structures, pairwise distances do not perform well for diverse groups, regardless of alignment method. MPSS produced using SMACOF are especially successful for diverse groups, but may very slightly over-fit the positions of structures within homogeneous groups.
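The captions above contrast classical MDS with SMACOF (stress majorization). For orientation, the sketch below shows the core SMACOF iteration with unit weights: each step applies the Guttman transform, which never increases the raw stress between target and embedded distances. This is a generic textbook form with a random start and a fixed iteration count, not the authors' configuration. The function names and toy data are assumptions.

```python
import numpy as np


def pairwise_euclidean(points):
    """Return the matrix of Euclidean distances between rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def raw_stress(target, points):
    """Sum of squared differences between target and embedded distances."""
    embedded = pairwise_euclidean(points)
    rows, cols = np.triu_indices(target.shape[0], k=1)
    return float(((target[rows, cols] - embedded[rows, cols]) ** 2).sum())


def smacof(target, n_components=2, n_iter=200, seed=0):
    """Minimize raw stress by repeated Guttman transforms (unit weights)."""
    target = np.asarray(target, dtype=float)
    n = target.shape[0]
    rng = np.random.default_rng(seed)
    points = rng.normal(size=(n, n_components))  # random starting layout
    history = [raw_stress(target, points)]
    for _ in range(n_iter):
        embedded = pairwise_euclidean(points)
        # Ratio target/embedded, defined as zero where embedded distance is zero.
        safe = np.where(embedded > 0, embedded, 1.0)
        ratio = np.where(embedded > 0, target / safe, 0.0)
        b = -ratio
        b[np.diag_indices(n)] = 0.0
        b[np.diag_indices(n)] = -b.sum(axis=1)
        # Guttman transform: the majorization update for unit weights.
        points = b @ points / n
        history.append(raw_stress(target, points))
    return points, history


# Toy data (an assumption): distances between eight random planar points.
rng = np.random.default_rng(1)
toy_target = pairwise_euclidean(rng.normal(size=(8, 2)))
toy_map, stress_history = smacof(toy_target)
```

Unlike classical MDS, which has a closed-form eigendecomposition solution, this iteration depends on its starting layout and can stop in a local minimum, so results may differ between runs with different seeds.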
