PLoS One. 2015 Jul 21;10(7):e0133183.
doi: 10.1371/journal.pone.0133183. eCollection 2015.

Genome Wide Re-Annotation of Caldicellulosiruptor saccharolyticus with New Insights into Genes Involved in Biomass Degradation and Hydrogen Production

Nupoor Chowdhary et al. PLoS One. 2015.

Abstract

Caldicellulosiruptor saccharolyticus has proven to be an excellent candidate for biological hydrogen (H2) production, but it still has major drawbacks, such as sensitivity to high osmotic pressure and low volumetric H2 productivity, which must be addressed before it can be used industrially. We carried out a whole genome re-annotation to update the incomplete genome information, which leaves gaps in knowledge, especially in metabolic engineering, with the aim of improving the H2-producing capabilities of C. saccharolyticus. The re-annotation was performed manually for 2,682 coding sequences (CDSs), using bioinformatics tools based on sequence similarity, motif search, phylogenetic analysis and fold recognition. Our methodology added functions for 409 hypothetical proteins (HPs) and 46 proteins previously annotated as putative, and assigned more accurate functions to known protein sequences. Homology-based gene annotation has been the standard method for assigning function to novel proteins, but in recent years many non-homology-based methods for protein function prediction, such as genomic context approaches, have been developed. Using non-homology-based functional prediction methods, we were able to assign cellular processes or physical complexes to 249 hypothetical sequences. Our re-annotation pipeline also added 231 new CDSs, generated by the MicroScope platform, to the original genome, with functional predictions for 49 of them. The re-annotation of HPs and new CDSs is stored in a relational database available on the MicroScope web-based platform. In parallel, comparative genome analyses were performed among members of the genus Caldicellulosiruptor to understand their function and evolutionary processes.
Further, based on results from the integrated re-annotation studies (homology and genomic context approaches), we strongly suggest that Csac_0437 and Csac_0424 encode glycoside hydrolases (GH), and we propose that they are involved in the decomposition of recalcitrant plant polysaccharides. Similarly, the HPs Csac_0732, Csac_1862, Csac_1294 and Csac_0668 are suggested to play a significant role in biohydrogen production. Function prediction for these HPs using our integrated approach will considerably enhance the interpretation of large-scale experiments targeting this industrially important organism.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Visualization of alignment of the 8 Caldicellulosiruptor genomes generated using Blast Ring Image Generator (BRIG) showing (from inner to outer), % G+C, GC skew and the homology based on BLASTn.
The deep purple circle represents the reference sequence, C. saccharolyticus. Outer rings show shared identity (according to BLASTn) with various other Caldicellulosiruptor genomes. BLASTn matches between 50% and 100% nucleotide identity are colored from lightest to darkest shade respectively, according to the graduated scale on the right of the circular BLAST image. Matches with less than 50% identity appear as blank spaces in each ring.
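
The shading rule described in this caption can be sketched as a small mapping from percent identity to colour intensity. This is only an illustration, not BRIG's actual implementation: the function name `identity_to_shade` and the linear 0-to-1 intensity scale are assumptions made for the example.

```python
from typing import Optional


def identity_to_shade(identity_pct: float) -> Optional[float]:
    """Map a BLASTn percent identity to a shade intensity in [0, 1].

    Returns None for matches below 50% identity, which the caption says
    are left as blank space. The linear scaling between 50% (lightest)
    and 100% (darkest) is an assumption; the real gradient may differ.
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity must be between 0 and 100")
    if identity_pct < 50.0:
        return None  # drawn as a gap in the ring
    return (identity_pct - 50.0) / 50.0


if __name__ == "__main__":
    for pct in (42.0, 50.0, 75.0, 100.0):
        print(pct, identity_to_shade(pct))
```

Under this assumed scale, a match at 75% identity would sit halfway between the lightest and darkest shade of its ring.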
Fig 2. Statistics of Known, Hypothetical and Putative genes before and after re-annotation.
Before re-annotation, the categories are: Known (1854), Hypothetical (781) and Putative (47). After re-annotation, the categories are: Known (2285); Hypothetical, old CDSs (372) + new CDSs (182); and Putative, old CDSs (25) + new CDSs (49).
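
The counts in this caption can be cross-checked against the totals given in the abstract with a few lines of arithmetic. The sketch below uses only the numbers printed in the caption; the variable names are illustrative and not taken from the paper.

```python
# Category counts as printed in the Fig 2 caption.
before = {"known": 1854, "hypothetical": 781, "putative": 47}
after_old = {"known": 2285, "hypothetical": 372, "putative": 25}  # original CDSs
after_new = {"hypothetical": 182, "putative": 49}                 # newly added CDSs

total_before = sum(before.values())        # original CDSs before re-annotation
total_after_old = sum(after_old.values())  # the same CDSs after re-annotation
total_new = sum(after_new.values())        # CDSs added during re-annotation

# Hypothetical proteins among the original CDSs that left that category.
hps_reassigned = before["hypothetical"] - after_old["hypothetical"]

print(total_before, total_after_old, total_new, hps_reassigned)
```

The two totals for the original CDSs both come to 2,682, the new CDSs sum to 231, and the drop in hypothetical proteins is 409, which agrees with the figures stated in the abstract.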
Fig 3. Network of predicted associations for a group of proteins related to COG2006 (containing the hypothetical protein Csac_1294, shown in red), which is predicted to be involved in the valine, leucine and isoleucine biosynthesis pathway and in oxidation-reduction processes.
The network edges represent the predicted functional associations. An edge may be drawn with lines of several colours: a red line indicates fusion evidence; a green line, neighborhood evidence; a blue line, co-occurrence evidence; a black line, co-expression evidence; a yellow line, text-mining evidence; and a light blue line, database evidence.
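
The colour legend in this caption amounts to a lookup table. A minimal sketch follows, assuming an edge is represented simply as a list of colour names; the names `EDGE_COLOUR_TO_EVIDENCE` and `evidence_for` are hypothetical and do not belong to any tool's API.

```python
# Colour legend as described in the Fig 3 caption.
EDGE_COLOUR_TO_EVIDENCE = {
    "red": "fusion",
    "green": "neighborhood",
    "blue": "co-occurrence",
    "black": "co-expression",
    "yellow": "text mining",
    "light blue": "database",
}


def evidence_for(colours):
    """Return the sorted evidence types behind one edge.

    `colours` is an iterable of line colours drawn between two proteins
    (an assumed representation). Unknown colours raise KeyError.
    """
    return sorted({EDGE_COLOUR_TO_EVIDENCE[c] for c in colours})


if __name__ == "__main__":
    # An edge drawn with both a green and a red line.
    print(evidence_for(["green", "red"]))
```

An edge drawn with several colours is therefore supported by several independent evidence types at once.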

