BMC Genomics. 2012;13 Suppl 7(Suppl 7):S3.
doi: 10.1186/1471-2164-13-S7-S3. Epub 2012 Dec 13.

Bayesian prediction of bacterial growth temperature range based on genome sequences

Dan B Jensen et al. BMC Genomics. 2012.

Abstract

Background: The preferred habitat of a given bacterium can hint at which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to predict the preferred temperature range accurately from a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments.

Results: This study found a total of 40 protein families useful for distinguishing between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content, and genome size). Using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by combining protein families with structural features, rather than by using either of these alone. A dedicated computer program was created to perform these predictions.
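As a rough illustration of the approach described above, the following sketch trains a Bernoulli naïve Bayes classifier on presence or absence of protein families and scores predictions with a multi-class Matthews correlation coefficient. It is not the program created for the study: the family names (`F_hot`, `F_cold`), the toy genomes, the Laplace smoothing constant `alpha`, and the choice of the multi-class MCC formulation are all assumptions made for the example.

```python
import math
from collections import Counter


def train_naive_bayes(samples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of sets, each holding the (hypothetical) protein-family
             IDs present in one genome; labels: the matching class names.
    alpha:   Laplace smoothing constant (an assumption, not from the paper).
    """
    classes = sorted(set(labels))
    features = sorted(set().union(*samples))
    n_per_class = Counter(labels)
    present = {c: Counter() for c in classes}
    for feats, label in zip(samples, labels):
        present[label].update(feats)  # count genomes of this class with each family
    return {
        "classes": classes,
        "features": features,
        "log_prior": {c: math.log(n_per_class[c] / len(labels)) for c in classes},
        "p_present": {
            c: {f: (present[c][f] + alpha) / (n_per_class[c] + 2 * alpha)
                for f in features}
            for c in classes
        },
    }


def predict(model, feats):
    """Return the class with the highest log posterior for one genome."""
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for f in model["features"]:
            p = model["p_present"][c][f]
            # Features are treated as independent given the class.
            score += math.log(p if f in feats else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


def multiclass_mcc(truth, predicted):
    """Multi-class Matthews correlation coefficient from two label lists."""
    s = len(truth)
    c = sum(t == p for t, p in zip(truth, predicted))
    t_k = Counter(truth)      # how often each class truly occurs
    p_k = Counter(predicted)  # how often each class is predicted
    numerator = c * s - sum(p_k[k] * t_k[k] for k in t_k)
    denominator = math.sqrt(
        (s * s - sum(v * v for v in p_k.values()))
        * (s * s - sum(v * v for v in t_k.values()))
    )
    return numerator / denominator if denominator else 0.0


# Toy data, invented for illustration only.
samples = [{"F_hot"}, {"F_hot"}, set(), set(), {"F_cold"}, {"F_cold"}]
labels = ["thermophile", "thermophile", "mesophile", "mesophile",
          "psychrophile", "psychrophile"]
model = train_naive_bayes(samples, labels)
predictions = [predict(model, s) for s in samples]
```

On this toy set the classifier recovers every training label, so the coefficient is 1; real genomes would of course need held-out evaluation to give a meaningful score.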

Conclusions: This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between the genomes of thermophilic, mesophilic and psychrophilic bacteria.

Figures

Figure 1
Phylogenetic relationships of 117 bacteria from the four different thermophilicity classes. The relationship is based on predicted 16S rRNA sequences. Red tips are hyperthermophiles, orange are thermophiles, green are mesophiles and blue are psychrophiles. The purple lines indicate the species exemplifying evolutionary flexibility, as discussed in the text.
Figure 2
Pearson's correlation coefficients between thermophilicity class-associated protein families, shown as a heat map. Lighter colors indicate stronger correlations. The top seven protein families (2075, 4698, 1149, 6954, 11184 and 14495) were all found to be overrepresented in thermophile genomes. The remaining protein families were overrepresented in psychrophile genomes. Families associated with the same thermophilicity class tend to correlate moderately with each other and anti-correlate moderately with families associated with other classes.
Figure 3
Pearson's correlation coefficients of sequence features, presented as a heat map. Lighter color indicates stronger correlation. The blue lines separate the amino acids from the codons, while the green lines separate the codons from the genome size and the AT content (genomic and 16S rDNA).
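The heat maps in Figures 2 and 3 are built from pairwise Pearson's correlation coefficients between features. A minimal sketch of that calculation follows, assuming each feature is a list of values with one entry per genome; the feature names and presence/absence vectors are invented for illustration and do not come from the study.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # undefined when a feature never varies
    return cov / (sd_x * sd_y)


def correlation_matrix(columns):
    """Map each pair of feature names to their correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical presence (1) / absence (0) of three families in six genomes.
columns = {
    "fam_hot_1": [1, 1, 0, 0, 0, 0],
    "fam_hot_2": [1, 1, 1, 0, 0, 0],
    "fam_cold": [0, 0, 0, 0, 1, 1],
}
matrix = correlation_matrix(columns)
```

In this toy matrix the two families that occur in the same genomes correlate positively, while `fam_hot_1` and `fam_cold` anti-correlate, which mirrors the pattern the Figure 2 caption describes.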
