Nucleic Acids Res. 2012 Jul;40(Web Server issue):W334-9.
doi: 10.1093/nar/gks436. Epub 2012 May 25.

Super: a web server to rapidly screen superposable oligopeptide fragments from the protein data bank

James H Collier et al. Nucleic Acids Res. 2012 Jul.

Abstract

Searching for well-fitting 3D oligopeptide fragments within a large collection of protein structures is an important task central to many analyses involving protein structures. This article reports a new web server, Super, dedicated to the task of rapidly screening the Protein Data Bank (PDB) to identify all fragments that superpose with a query under a prespecified threshold of root-mean-square deviation (RMSD). Super relies on efficiently computing a mathematical bound on the commonly used structural similarity measure, RMSD of superposition. This allows the server to filter out a large proportion of fragments that are unrelated to the query: >99% of the total number of fragments in some cases. For a typical query, Super scans the current PDB, containing over 80,500 structures (with ∼40 million potential oligopeptide fragments to match), in under a minute. The Super web server is freely accessible at http://lcb.infotech.monash.edu.au/super.
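The filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which bound Super computes, so the code below swaps in a simple, well-known stand-in: the superposition RMSD of two equal-length point sets is never smaller than the absolute difference of their radii of gyration. The function names (`radius_of_gyration`, `rmsd_lower_bound`, `superposition_rmsd`, `screen`), the synthetic random-walk "structure", and the threshold value are all assumptions made for illustration; none of them come from the Super server.

```python
import numpy as np


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid."""
    centered = coords - coords.mean(axis=0)
    return np.sqrt((centered ** 2).sum() / len(coords))


def rmsd_lower_bound(a, b):
    """Cheap lower bound on superposition RMSD: |Rg(a) - Rg(b)|.

    Illustrative stand-in only; not necessarily the bound used by Super.
    """
    return abs(radius_of_gyration(a) - radius_of_gyration(b))


def superposition_rmsd(a, b):
    """Minimum RMSD over rigid-body motions (Kabsch algorithm via SVD)."""
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)
    u, s, vt = np.linalg.svd(a_c.T @ b_c)
    # Flip the smallest singular value if the best fit would be a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = ((a_c ** 2).sum() + (b_c ** 2).sum() - 2.0 * s.sum()) / len(a)
    return np.sqrt(max(msd, 0.0))


def screen(query, structure, threshold):
    """Slide a window over `structure` and return (hits, skipped).

    `hits` is a list of (start_index, rmsd) with rmsd <= threshold;
    `skipped` counts windows rejected by the bound alone.
    """
    n = len(query)
    hits = []
    skipped = 0
    for start in range(len(structure) - n + 1):
        fragment = structure[start:start + n]
        # If the lower bound already exceeds the threshold, the exact
        # RMSD must exceed it too, so the costly step can be skipped.
        if rmsd_lower_bound(query, fragment) > threshold:
            skipped += 1
            continue
        rmsd = superposition_rmsd(query, fragment)
        if rmsd <= threshold:
            hits.append((start, rmsd))
    return hits, skipped


if __name__ == "__main__":
    # Hypothetical data: a random-walk chain standing in for C-alpha atoms.
    rng = np.random.default_rng(0)
    structure = np.cumsum(rng.normal(size=(60, 3)), axis=0)

    # Query: residues 20..27 of the chain, rotated about z and translated.
    theta = 0.7
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta), np.cos(theta), 0.0],
                    [0.0, 0.0, 1.0]])
    query = structure[20:28] @ rot.T + np.array([5.0, -3.0, 2.0])

    hits, skipped = screen(query, structure, threshold=0.5)
    print("hits:", hits)
    print("windows rejected by the bound alone:", skipped)
```

Because the bound never overestimates the true RMSD, rejecting a window on the bound alone cannot discard a genuine match; it only avoids the SVD for windows that could not have passed. How much is filtered in practice depends on how tight the bound is, and the figure of >99% quoted in the abstract refers to Super's own bound, not to this sketch.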

Figures

Figure 1. Super web server information flow.
Figure 2. A screenshot of the results window showing some of the returned matches, one of which is selected for viewing in the top-left of the figure.
Figure 3. Timing results of screening the entire PDB using Super across two data sets containing 10 000 query fragments each: (a) Set 1, (b) Set 2. The graphs show the RMSD threshold on the X-axis and time on the Y-axis, giving the mean, the median, the maximum and the minimum time taken over the 10 000 queries for each set.
Figure 4. Timing results of screening the entire PDB using 10 000 randomly selected query fragments, running Super without RMSD lower bound filtering and other engineering optimizations. The X-axis gives the RMSD threshold and the Y-axis gives the runtime.
