BMC Bioinformatics. 2005 Oct 14;6:254.
doi: 10.1186/1471-2105-6-254.

Prediction of beta-barrel membrane proteins by searching for restricted domains

Oliver Mirus et al. BMC Bioinformatics.

Abstract

Background: The identification of beta-barrel membrane proteins out of a genomic/proteomic background is one of the rapidly developing fields in bioinformatics. Our main goal is the prediction of such proteins in genome/proteome wide analyses.

Results: For the prediction of beta-barrel membrane proteins within prokaryotic proteomes, a set of parameters was developed. We focused on two procedures, one with a low false positive rate and one with the lowest false prediction rate, to obtain high certainty for the predicted sequences. We demonstrate that the discrimination between beta-barrel membrane proteins and other proteins is improved by analyzing a length-limited region. The developed set of parameters is applied to the proteome of E. coli, and the results are compared to four other described procedures.

Conclusion: Analyzing the beta-barrel membrane proteins revealed the presence of a defined membrane-inserted beta-barrel region. This information can now be used to refine other prediction programs as well. So far, all tested programs fail to predict outer membrane proteins in the proteome of the prokaryote E. coli with high reliability. However, the reliability of the prediction is improved significantly by combining several programs. The consequences and usability of the developed scores are discussed.

Figures

Figure 1
The selection criteria. (A) Schematic view of the scores for prediction. (B) The number of sequences in all test pools with 0 or 1 transmembrane helix (TMH) predicted by TMHMM. (C) The number of sequences in the pools with NOM proteins (lane 1), of those greater than 79 amino acids (lane 2), and of those with fewer than 2 predicted transmembrane helices (lane 3).
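
The pool filtering described for panel (C) can be sketched as follows. This is a minimal illustration, assuming the length filter and the helix filter are applied one after the other; the `Protein` record and its `predicted_tmh` field (for example, filled from a TMHMM run) are hypothetical names and not part of the published method.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    name: str
    sequence: str
    predicted_tmh: int  # hypothetical field: helix count taken from a TMHMM run


MIN_LENGTH = 80  # caption: "greater than 79 amino acids"
MAX_TMH = 1      # caption: fewer than 2 predicted transmembrane helices


def preselect(pool):
    """Return the pool after the length filter, and after the additional TMH filter.

    Assumption: the two filters are applied in sequence (lane 2, then lane 3).
    """
    long_enough = [p for p in pool if len(p.sequence) >= MIN_LENGTH]
    few_helices = [p for p in long_enough if p.predicted_tmh <= MAX_TMH]
    return long_enough, few_helices
```

Calling `preselect(pool)` on a list of `Protein` records returns both intermediate pools, so the counts for each lane can be read off with `len`.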
Figure 2
The BSN selection. (A) The relationship between BSN calculated by the old and new procedures is shown for the sequences of E. coli (circle, bottom x-axis). The percentage of sequences with a certain BSN is shown as a line plot (top x-axis). (B) The dependence of the new BSN on sequence length (in amino acids) is shown for sequences of E. coli. (C-D) Sequences with a BSN value above 6 (solid), 8 (dashed), 10 (dashed-dotted) or 12 (dotted) were selected from the PDB (C) or PSort (D) pools. Subsequently, for the generated sequence pools, the percentage of false positive sequences selected from NOM protein pools (black lines) and of false negative sequences from OM protein pools (grey lines) was determined in relation to the BSN/aa cutoff. (E) The numbers of structurally determined strands and of predicted strands are shown; the line indicates a similar detection value. (F) The number of strands predicted at identical position (maximum 1 amino acid mismatch; identical), the number of strands predicted at identical or overlapping position (maximum 5 amino acids mismatch; overlap) and the number of false negative and false positive predicted strands (false) are shown.
Figure 3
Analysis of the BBS and BSHS. (A) The β-strand locations of an N. meningitidis (NalP) and an E. coli (OmpF) OM protein are shown. The window for calculating the BBS or a domain-based BBS (BBS-x) is indicated. (B-C) The false prediction rate for BBS-x (B) or BSHS-x (C) calculation using different amino acid windows and different cutoff scores is shown. The regions with the lowest false prediction rates (black) for the three times weighted pool of the NOM proteins are shown. (D) The percentage of sequences above a certain threshold value of BBS275 minus BBS is shown for the sequences of E. coli.
Figure 4
Score definition for the linear predictor. (A) The false positive rate for the NOM protein pool was calculated as a function of the BBS275 cutoff (grey) or BSHS225 cutoff (black). (B, C) The false positive selection rate for the NOM protein pool and the false negative rate of the OM protein pool were calculated for a sliding window for BBS275 and BSHS225 considering BSN>10 and BSN/aa>0.026 as preselection rule. The false prediction rate was calculated using a three times higher weight of the false positive rate of NOM proteins. In (B) the false prediction rates of the individual selection by BBS275 and BSHS225 are shown. In (C) the false prediction rate of the dependent selection by BBS275 and BSHS225 is shown.
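
The preselection rule and the weighted false prediction rate named in this caption can be illustrated with a short sketch. Only the thresholds (BSN>10, BSN/aa>0.026) and the three-fold weight on the false positive rate come from the caption. The normalisation as a weighted mean, the convention that a sequence is selected when its score is at or above the cutoff, and all function names are assumptions made for illustration.

```python
def passes_preselection(bsn, length):
    """Preselection rule from the caption: BSN > 10 and BSN/aa > 0.026."""
    return bsn > 10 and bsn / length > 0.026


def false_prediction_rate(fp_rate, fn_rate, fp_weight=3.0):
    """Weighted mean of both error rates; the normalisation is an assumption."""
    return (fp_weight * fp_rate + fn_rate) / (fp_weight + 1.0)


def scan_cutoffs(nom_scores, om_scores, cutoffs):
    """Return (cutoff, rate) with the lowest weighted false prediction rate.

    Assumption: a sequence counts as selected when its score is >= cutoff.
    """
    best = None
    for cutoff in cutoffs:
        # NOM sequences that pass the cutoff are false positives.
        fp = sum(s >= cutoff for s in nom_scores) / len(nom_scores)
        # OM sequences that fail the cutoff are false negatives.
        fn = sum(s < cutoff for s in om_scores) / len(om_scores)
        rate = false_prediction_rate(fp, fn)
        if best is None or rate < best[1]:
            best = (cutoff, rate)
    return best
```

Because false positives carry three times the weight, the scan tends to settle on stricter cutoffs than an unweighted error rate would.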
Figure 5
Identification of β-barrel protein sequences from the E. coli proteome. (A) Sequences were selected from the E. coli proteome by the three parameter sets developed (Table 2). The percentage of selected sequences in comparison to the proteome size is shown (bars 1–3). Also shown are the percentages of sequences selected by MCMBB (bar 4), by MCMBB filtered by TMHMM (bar 5, MCMBB*), by BOMP (bar 6; note that only two sequences were selected by BOMP with αTM >1 according to TMHMM), by TMB-Hunt with BBTM protein score >0 and E-value <1 (TMB-Hunt°, bar 7), by TMB-Hunt with BBTM protein score >0 and E-value <1 controlled by TMHMM (TMB-Hunt°*, bar 8) and by the global procedure (bar 9). (B) The sequences selected by the three procedures proposed here were analyzed for known or assigned function or localization. The percentage of the sequences classified as hypothetical, outer membrane, extra-cellular or soluble intracellular is shown. (C) The false positive rate is shown for the three sequence pools generated here (bars 1–3), for the sequence pool generated by MCMBB (bar 4), by MCMBB controlled by TMHMM (bar 5), by BOMP (bar 6), by TMB-Hunt with BBTM protein score >0 and E-value <1 (TMB-Hunt°, bar 7) or by TMB-Hunt with BBTM protein score >0 and E-value <1, controlled by TMHMM (TMB-Hunt°*, bar 8).
Figure 6
The performance of the combinatory approach. (A) The percentage of sequences selected from the E. coli proteome by our three methods in combination with MCMBB (black bar), BOMP (light grey bar) and TMB-Hunt with BBTM protein score >0 and E-value <1 (dark grey bar) is given. (B) The false positive rate for the combinatory approach performed as in (A) is shown. (C) The percentage of sequences selected by BOMP and our AND selection was analyzed in comparison to the BOMP selection sorted according to the BOMP rank (BR) assigned (top panel, grey). The percentage of the (putative) outer membrane β-barrel proteins (black bar) and (putative) non-outer membrane β-barrel proteins (white bar) in relation to the total number of rejected sequences is given on the bottom. (D, E) The percentage of sequences selected by TMB-Hunt° and our AND selection was analyzed in comparison to the TMB-Hunt° selection sorted according to the BB score (D) or E-value (E) assigned (explained in [17]; grey). In (E), the percentage of the (putative) outer membrane β-barrel proteins (black bar) and (putative) non-outer membrane β-barrel proteins (white bar) in relation to the total number of rejected sequences sorted according to the E-value is given on the bottom. (F) The percentage of the E. coli proteome selected by the combinatory approach between TMB-Hunt° & BOMP, TMB-Hunt° & MCMBB, and BOMP & MCMBB is given on the left side. The right side shows the false positive rate as explained in Fig. 5C.
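
The combinatory (AND) selection in this figure amounts to keeping only sequences that every method selects. The sketch below illustrates that idea with set intersection. The function names, the toy sequence IDs, and the definition of the false positive rate as the fraction of selected sequences annotated as non-barrel are assumptions for illustration, not the published evaluation protocol.

```python
def and_selection(*selections):
    """Keep only sequence IDs selected by every method (logical AND)."""
    sets = [set(s) for s in selections]
    return set.intersection(*sets) if sets else set()


def false_positive_rate(selected, known_non_barrel):
    """Fraction of selected IDs annotated as non-barrel (definition assumed)."""
    selected = set(selected)
    if not selected:
        return 0.0
    return len(selected & set(known_non_barrel)) / len(selected)


# Toy IDs, purely illustrative; they do not correspond to real results.
ours = {"seqA", "seqB", "seqC"}
other = {"seqA", "seqB", "seqD"}
non_barrel = {"seqC", "seqD"}

combined = and_selection(ours, other)
```

In this toy case each method alone selects one non-barrel sequence, while the intersection contains none, which mirrors the reduction in false positives that the combination is meant to achieve.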

References

    1. Ashurst JL, Collins JE. Gene annotation: prediction and testing. Annu Rev Genomics Hum Genet. 2003;4:69–88. doi: 10.1146/annurev.genom.4.070802.110300.
    2. Drewes G, Bouwmeester T. Global approaches to protein-protein interactions. Curr Opin Cell Biol. 2003;15:199–205. doi: 10.1016/S0955-0674(03)00005-X.
    3. Gerstein M, Hegyi H. Comparing genomes in terms of protein structure: surveys of a finite parts list. FEMS Microbiol Rev. 1998;22:277–304. doi: 10.1016/S0168-6445(98)00019-9.
    4. Schatz G, Dobberstein B. Common principles of protein translocation across membranes. Science. 1996;271:1519–1526.
    5. Emanuelsson O, von Heijne G. Prediction of organellar targeting signals. Biochim Biophys Acta. 2001;1541:114–119. doi: 10.1016/S0167-4889(01)00145-8.
