Bioinformatics. 2012 Oct 1;28(19):2425-30.
doi: 10.1093/bioinformatics/bts478. Epub 2012 Jul 27.

TMBB-DB: a transmembrane β-barrel proteome database


Thomas C Freeman Jr et al. Bioinformatics. 2012.

Abstract

Motivation: We previously reported the development of a highly accurate statistical algorithm for identifying β-barrel outer membrane proteins, or transmembrane β-barrels (TMBBs), from genomic sequence data of Gram-negative bacteria (Freeman,T.C. and Wimley,W.C. (2010) Bioinformatics, 26, 1965-1974). We have now applied this identification algorithm to all available Gram-negative bacterial genomes (over 600 chromosomes) and have constructed a publicly available, searchable, up-to-date database of all proteins in these genomes.

Results: For each protein in the database, there is information on (i) β-barrel membrane protein probability for identification of β-barrels, (ii) β-strand and β-hairpin propensity for structure and topology prediction, (iii) signal sequence score because most TMBBs are secreted through the inner membrane translocon and, thus, have a signal sequence, and (iv) transmembrane α-helix predictions, for reducing false positive predictions. This information is sufficient for the accurate identification of most β-barrel membrane proteins in these genomes. In the database there are nearly 50 000 predicted TMBBs (out of 1.9 million total putative proteins). Of those, more than 15 000 are 'hypothetical' or 'putative' proteins, not previously identified as TMBBs. This wealth of genomic information is not available anywhere else.

Availability: The TMBB genomic database is available at http://beta-barrel.tulane.edu/.

Contact: wwimley@tulane.edu.


Figures

Fig. 1.
Prediction of TMBBs using signal peptide prediction and TMBB structure prediction. The schematic of a TMBB-encoding protein shows the signal peptide predicted using SignalP (Petersen et al., 2011) and the TMBB domain predicted using Freeman–Wimley β-barrel analysis (Freeman, Jr and Wimley, 2010). The Freeman–Wimley algorithm is as follows: (i) Amino acid abundances are assigned to each residue within a 10-residue sliding window; the three terminal residues at either end are treated as interfacial residues and the remainder as bilayer core residues. (ii) The β-strand score is the sum of scores within the window, where peaks indicate the middle of predicted β-strands. (iii) The β-hairpin score is a sum of β-strand scores, where two β-strand peaks are separated by a five-residue gap (representing the hairpin turn). (iv) The topology prediction shown in the β-hairpin score is simplified to a single value called the β-barrel score.
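Steps (i) to (iii) of the caption can be sketched in a few lines of Python. This is a minimal illustration and not the published implementation: the per-residue score tables below are hypothetical placeholders (the real abundance values are given in Freeman and Wimley, 2010), and the reading of the "five-residue gap" as five residues between the end of one window and the start of the next is an assumption. Step (iv), the reduction to a single β-barrel score, is not attempted.

```python
WINDOW = 10    # sliding-window length (from the caption)
EDGE = 3       # interfacial residues at each end of the window (from the caption)
TURN_GAP = 5   # residues in the hairpin turn (from the caption)

# HYPOTHETICAL per-residue scores, for illustration only.
# These are NOT the published abundance tables.
CORE_SCORE = {"L": 1.0, "V": 1.0, "A": 0.5, "G": 0.3}
INTERFACE_SCORE = {"Y": 1.0, "W": 1.0, "F": 0.6}
DEFAULT_SCORE = 0.0  # assumed score for residues missing from a table


def strand_scores(seq):
    """Return one beta-strand score per window; index i is the window start."""
    scores = []
    for start in range(len(seq) - WINDOW + 1):
        total = 0.0
        for offset in range(WINDOW):
            residue = seq[start + offset]
            # First and last three positions are interfacial, the middle four are core.
            is_interface = offset < EDGE or offset >= WINDOW - EDGE
            table = INTERFACE_SCORE if is_interface else CORE_SCORE
            total += table.get(residue, DEFAULT_SCORE)
        scores.append(total)
    return scores


def hairpin_scores(strand):
    """Sum pairs of strand scores whose windows are separated by the turn gap."""
    shift = WINDOW + TURN_GAP  # assumed offset between the two window starts
    return [strand[i] + strand[i + shift] for i in range(len(strand) - shift)]


if __name__ == "__main__":
    toy = "YWFLVAGYWF" + "NNNNN" + "YWFLVAGYWF"  # two toy strands joined by a turn
    strands = strand_scores(toy)
    print(strands[0], hairpin_scores(strands))
```

Peaks in the list returned by `strand_scores` would correspond to candidate β-strands, and peaks in `hairpin_scores` to candidate hairpins, in the sense described in the caption.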
Fig. 2.
From β-barrel score to probability. (A) The probability that a particular β-barrel score is a positive prediction can be estimated from an assessment of the positive predictive value (PPV) as a function of the arbitrary β-barrel score for a given dataset. The dataset used to assess the PPV of the β-barrel score included the annotated genes from an E. coli chromosome. There were 40 TMBBs and 2378 non-TMBBs identified out of 5253 total sequences (see the text). Proteins annotated as hypothetical, putative, or predicted were excluded. The PPV was plotted as a function of β-barrel score and was fit with a sigmoidal function. (B) Histogram of β-barrel probability for the E. coli O157 genome. Based on our previous work, a protein with a probability value above 0.28 (β-barrel score above 45) is a strong candidate TMBB.
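The score-to-probability mapping in panel (A) can be illustrated as follows. This is a sketch under stated assumptions, not the authors' fit: the PPV is computed here with a simple "at or above threshold" rule, the sigmoid is assumed to be a two-parameter logistic with an upper asymptote of 1, and the slope is a hypothetical value. The only number taken from the caption is the anchor point, a β-barrel score of 45 corresponding to a probability of 0.28, which is used to place the midpoint.

```python
import math


def ppv(scores, is_tmbb, threshold):
    """Positive predictive value among sequences scoring at or above threshold.

    Returns None when no sequence reaches the threshold.
    """
    called = [label for score, label in zip(scores, is_tmbb) if score >= threshold]
    if not called:
        return None
    return sum(called) / len(called)  # true positives / all positive calls


# Anchor point from the caption: score 45 corresponds to probability 0.28.
ANCHOR_SCORE, ANCHOR_PROB = 45.0, 0.28
SLOPE = 0.1  # HYPOTHETICAL steepness; the fitted value is not given in the caption
# Midpoint chosen so that the logistic passes through the anchor point.
MIDPOINT = ANCHOR_SCORE + math.log((1 - ANCHOR_PROB) / ANCHOR_PROB) / SLOPE


def barrel_probability(score):
    """Map a beta-barrel score to a probability with the assumed logistic curve."""
    return 1.0 / (1.0 + math.exp(-SLOPE * (score - MIDPOINT)))


if __name__ == "__main__":
    toy_scores = [10, 50, 60, 70]
    toy_labels = [False, False, True, True]
    print(ppv(toy_scores, toy_labels, 45))
    print(round(barrel_probability(45), 2))
```

In practice the slope and midpoint would be obtained by fitting the curve to PPV values computed over a range of thresholds on an annotated genome, as the caption describes for E. coli.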
Fig. 3.
TMBB prediction analysis. Sample protein sequences were analyzed for their propensity to fold into TMBBs. The Freeman–Wimley prediction plots show the β-strand and β-hairpin prediction scores over the sequences of OmpW and ECS5270 (gi 38704255), a predicted TMBB from E. coli O157 (strain Sakai). Threshold values are indicated for each. The β-hairpin threshold is an empirical value. Most TMBBs have a significant portion of their sequence above the threshold. The structure of OmpW has been solved (Hong et al., 2006). It has eight transmembrane β-strands arranged in four hairpins. The topology prediction for the hypothetical protein closely resembles that of OmpW, which suggests that it has a similar structure. The signal peptide scores for both sequences indicate that a signal peptide is present. Although it has not been studied experimentally, ECS5270 is predicted with high confidence to be a TMBB using this orthogonal strategy of TMBB and signal peptide prediction.
Fig. 4.
Analysis of sample genomes. Three sample genomes of Gram-negative organisms were analyzed using the dual strategy of TMBB prediction and signal peptide prediction. The results for each protein in each genome are plotted in a two-dimensional scatter plot. The coloring of the plots indicates the density of points in an area, with red being the most dense and purple the least dense. The plot in the upper right shows a legend identifying where certain classes of proteins populate the scatter plots. In this panel, we also show values for the 40 known TMBBs of E. coli. These genomic data show that most proteins score near zero using both prediction methods (signal peptide and β-barrel). TMBBs, i.e. sequences with high β-barrel scores and high signal peptide prediction probability, make up between 2.1 and 10.6% of the proteins in these example genomes.
Fig. 5.
Overall database statistics. ‘Left’: Current database coverage. A positively predicted TMBB has a β-barrel probability >0.28 and a SignalP score >0.3. Unknown TMBBs are positively predicted proteins annotated as unknown or hypothetical. ‘Right’: Histogram of TMBB content as a percentage of each genome. ‘Inset’: The region above 5%, highlighting the few genomes with high TMBB content. Most genomes have between 1 and 5% TMBBs, and the median value is 2.5%.
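The classification rule stated in this caption is simple enough to write down directly. The sketch below applies the two cutoffs from the caption (β-barrel probability >0.28 and SignalP score >0.3) to a toy genome and reports the TMBB percentage. The `Protein` record, its field names, and the keyword match used to flag uncharacterized proteins are assumptions made for illustration; they do not describe the database schema or how the authors parsed annotations.

```python
from dataclasses import dataclass

PROB_CUTOFF = 0.28     # beta-barrel probability cutoff (from the caption)
SIGNALP_CUTOFF = 0.3   # SignalP score cutoff (from the caption)
# ASSUMED keywords for spotting uncharacterized proteins in an annotation string.
UNCHARACTERIZED = ("hypothetical", "putative", "unknown")


@dataclass
class Protein:
    """Hypothetical record; field names are illustrative, not the database schema."""
    annotation: str
    barrel_probability: float
    signalp_score: float


def is_predicted_tmbb(protein):
    """Apply both cutoffs; a protein must exceed each one to be called a TMBB."""
    return (protein.barrel_probability > PROB_CUTOFF
            and protein.signalp_score > SIGNALP_CUTOFF)


def is_unknown_tmbb(protein):
    """A predicted TMBB whose annotation marks it as uncharacterized."""
    text = protein.annotation.lower()
    return is_predicted_tmbb(protein) and any(word in text for word in UNCHARACTERIZED)


def tmbb_percent(genome):
    """Percentage of proteins in a genome that are predicted TMBBs."""
    if not genome:
        return 0.0
    return 100.0 * sum(is_predicted_tmbb(p) for p in genome) / len(genome)


if __name__ == "__main__":
    toy_genome = [
        Protein("outer membrane protein W", 0.90, 0.80),
        Protein("hypothetical protein", 0.50, 0.60),
        Protein("DNA polymerase III subunit", 0.01, 0.02),
        Protein("inner membrane transporter", 0.90, 0.10),
    ]
    print(tmbb_percent(toy_genome))
    print(sum(is_unknown_tmbb(p) for p in toy_genome))
```

Requiring both conditions reflects the orthogonal strategy described in the figures: a high β-barrel probability without a signal peptide, or a signal peptide without β-barrel character, does not count as a positive prediction.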


