BMC Bioinformatics. 2006 Jul 4:7:331.
doi: 10.1186/1471-2105-7-331.

PROMPT: a protein mapping and comparison tool

Thorsten Schmidt et al.

Abstract

Background: Comparison of large protein datasets has become a standard task in bioinformatics. Typically, researchers wish to know whether one group of proteins is enriched in certain annotation attributes or sequence properties compared to another group, and whether this enrichment is statistically significant. Such comparisons often require integrating molecular sequence data and experimental information from disparate, incompatible sources. While many specialized programs exist for comparisons of this kind in individual problem domains, such as expression data analysis, no generic software solution capable of addressing a wide spectrum of routine tasks in comparative proteomics is currently available.

Results: PROMPT is a comprehensive bioinformatics software environment which enables the user to compare arbitrary protein sequence sets, revealing statistically significant differences in their annotation features. It allows automatic retrieval and integration of data from a multitude of molecular biological databases as well as from a custom XML format. Similarity-based mapping of sequence IDs makes it possible to link experimental information obtained from different sources despite discrepancies in gene identifiers and minor sequence variation. PROMPT provides a full set of statistical procedures to address the following four use cases: i) comparison of the frequencies of categorical annotations between two sets, ii) enrichment of nominal features in one set with respect to another one, iii) comparison of numeric distributions, and iv) correlation of numeric variables. Analysis results can be visualized in the form of plots and spreadsheets and exported in various formats, including Microsoft Excel.

Conclusion: PROMPT is a versatile, platform-independent, easily expandable, stand-alone application designed to be a practical workhorse in analysing and mining protein sequences and associated annotation. The availability of the Java Application Programming Interface and scripting capabilities on one hand, and the intuitive Graphical User Interface with context-sensitive help system on the other, make it equally accessible to professional bioinformaticians and biologically-oriented users. PROMPT is freely available for academic users from http://webclu.bio.wzw.tum.de/prompt/.
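As an illustration of use case ii) from the Results (enrichment of a nominal feature in one set with respect to another), the following is a minimal Python sketch of a one-sided Fisher's exact test. The abstract does not state which test PROMPT applies for this use case, so the choice of test, the function name `enrichment_p_value`, and the example counts are assumptions made purely for illustration; this is not PROMPT's API.

```python
from math import comb


def enrichment_p_value(hits_a, size_a, hits_b, size_b):
    """One-sided Fisher's exact test (hypothetical helper, not part of PROMPT).

    Returns the probability of seeing at least `hits_a` annotated proteins
    in set A if the annotation were distributed at random over A and B.
    """
    total = size_a + size_b            # all proteins in both sets
    total_hits = hits_a + hits_b       # all proteins carrying the feature
    upper = min(size_a, total_hits)    # largest possible hit count in set A

    # Sum the hypergeometric probabilities for k = hits_a .. upper.
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, size_a - k)
        for k in range(hits_a, upper + 1)
    )
    return tail / comb(total, size_a)


if __name__ == "__main__":
    # Made-up counts: 12 of 50 proteins in set A carry the feature,
    # compared with 30 of 950 proteins in set B.
    p = enrichment_p_value(12, 50, 30, 950)
    print(f"one-sided enrichment p-value: {p:.3g}")
```

Exact integer arithmetic via `math.comb` avoids rounding inside the sum; for very large sets a log-space formulation would be preferable.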


Figures

Figure 1. General Workflow of PROMPT.
Figure 2. Example PROMPT XML File. The file contains a set definition property that encompasses all E. coli proteins together with their amino acid sequences. Additionally, annotation information stored in the numeric property transmembrane segments and in the symbolic property funcat is provided.
Figure 3. Data input and mapping workflow.
Figure 4. Graphical User Interface (GUI). Shown is a typical workspace session with input data and results. The information panel in the bottom part of the screen provides context-sensitive information related to the current user action.
Figure 5. Screenshots of a typical statistical test dialog. A. The Mann-Whitney test dialog in the simple handling mode with reduced parameters. B. The same test in the advanced view with all options allowing full control. C. The built-in help with general description of the test and its parameters. The statistical background information was derived from the R documentation.
Figure 6. Example of an interactive browsable figure. Shown is a comparison of EC numbers found in the annotation of two protein sets. By clicking on the bars the user can zoom in and out of the different levels of the Enzyme Nomenclature.
Figure 7. Example of a categorical comparison analysis. Frequency of SCOP folds in GroEL substrates compared with the whole E. coli lysate. Only folds that were found at least two times in both sets and that differed significantly at a significance level of 0.05 are shown. The stars on top of the red bars indicate the p-value of each difference: * p < 0.05, ** p < 0.01 and *** p < 0.001. The figure is a screenshot of an interactive built-in visualisation module provided by PROMPT. All interactive plots allow easy adjustments (changing font sizes, title, axis labels, etc.) and can be saved as graphic files.
Figure 8. Results of a correlation analysis. A. Scatter plot of protein abundance against steady-state mRNA expression levels in yeast. The solid and dotted lines show the local polynomial loess fitting curve and the linear regression, respectively. The axes are scaled logarithmically. The box plots visualise the value distribution of each variable. B. PROMPT's spreadsheet viewer with the Pearson correlation coefficient of 0.44, a highly significant p-value of 0.0 (values below 10^-300 are rounded to zero), and further statistical key values. All analysis results can be exported to tab-delimited, comma-separated, or Microsoft Excel files.
Figure 9. Comparison of two numeric distributions by PROMPT. Here normalized abundance distributions of all observed proteins (blue) and essential proteins only (green), as well as the relative difference (red) are shown. These distributions are significantly different (Kolmogorov-Smirnov p-value 6.2E-12, Mann-Whitney p-value 1.7E-13). Additionally, the stars on top of the red bars mark the specific intervals in which the difference is significant. The p-values are indicated by the number of stars: * p < 0.05, ** p < 0.01 and *** p < 0.001.
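
The star convention used in Figures 7 and 9 and the two-sample Kolmogorov-Smirnov comparison named in Figure 9 can be sketched in a few lines of Python. This is a hedged illustration only: the function names `significance_stars` and `ks_statistic` are hypothetical, the code computes just the KS distance D (not its p-value), and nothing here is taken from PROMPT's implementation, which the captions do not describe.

```python
from bisect import bisect_right


def significance_stars(p_value):
    """Map a p-value to the star notation given in the figure captions."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at the 0.05 level


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance D.

    D is the largest absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    # Made-up abundance values, for illustration only.
    all_proteins = [0.2, 0.5, 0.9, 1.4, 2.0, 3.1]
    essential = [1.1, 1.8, 2.6, 3.5, 4.2]
    print("KS distance D:", ks_statistic(all_proteins, essential))
    print("stars for p = 0.004:", significance_stars(0.004))
```

Turning D into a p-value requires the Kolmogorov distribution, which is omitted here to keep the sketch short.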
