Nucleic Acids Res. 2014 Dec 16;42(22):13525-33.
doi: 10.1093/nar/gku1147. Epub 2014 Nov 14.

An integrated approach for genome annotation of the eukaryotic thermophile Chaetomium thermophilum

Thomas Bock et al. Nucleic Acids Res. 2014.

Abstract

The thermophilic fungus Chaetomium thermophilum holds great promise for structural biology. To increase the efficiency of its biochemical and structural characterization and to explore its thermophilic properties beyond those of individual proteins, we obtained transcriptomics and proteomics data and integrated them with computational annotation methods and a multitude of biochemical experiments conducted by the structural biology community. We considerably improved the genome annotation of Chaetomium thermophilum and characterized the transcripts and expression of thousands of genes. We furthermore show that the composition and structure of the expressed proteome of Chaetomium thermophilum are similar to those of its mesophilic relatives. Data were deposited in a publicly available repository and provide a rich resource for the structural biology community.

Figures

Figure 1.
Literature overview of PDB-deposited structures and scientific publications derived from or referring to Chaetomium thermophilum proteins. Structures deposited before initial genome sequencing in 2011 were enabled by access to partial genome information.
Figure 2.
Summary and added value of the analyses performed on the experimental transcriptomics and proteomics data, which included: refinement of intron/exon structures, analysis of previously predicted ORFs that contain a stop codon, de novo ORF/peptide analysis and expression analysis of protein termini.
Figure 3.
Example of experimental validation of gene sequence reannotation. (A) Reverse transcriptase polymerase chain reaction (RT-PCR) of the ct IOC4 (CTHT_0009460) gene. The gel band was larger than expected from the annotated gene model (> 1722 nt instead of 1635 nt). Gene sequencing (RT-PCR) revealed 87 nt of additional sequence extending exon 3. The extended sequence partially matches intron 3 of the originally annotated ct IOC4 sequence. (B) Original ct IOC4 coding sequence (top) and MS-identified peptides overlaid on the Protter protein sequence view (bottom). Expression of the assumed exon 3 extension initially found by RT-PCR (red box) was verified by MS-based identification of the peptide sequence “.ATSEEDEDVEMEDAPSATETSAK.” (blue bar), which covers the MS-detectable part of the assumed exon 3 extension. The insertion site of the extended exon 3 sequence is indicated (red dot) in the Protter (29) protein sequence image of ct IOC4, together with all other MS-identified peptides (highlighted in blue; N- and C-terminus and potential tryptic cleavage sites for MS are indicated). An alternative splice variant of ct IOC4 containing the extended exon 3 sequence is included in the reannotated Ct genome.
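The numbers in the Figure 3 caption can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the function names are hypothetical, and the only inputs are the lengths and the peptide sequence quoted in the caption. It confirms that the 87 nt extension is a whole number of codons, so it does not shift the reading frame, and that the reported peptide ends in a residue after which trypsin cleaves (K or R).

```python
# Values quoted in the Figure 3 caption.
ANNOTATED_LENGTH_NT = 1635   # originally annotated gene model
EXTENSION_NT = 87            # additional sequence extending exon 3
PEPTIDE = "ATSEEDEDVEMEDAPSATETSAK"  # MS-identified peptide


def extension_summary(annotated_nt, extension_nt):
    """Return (extended length, in-frame flag, number of added codons)."""
    extended_nt = annotated_nt + extension_nt
    in_frame = extension_nt % 3 == 0   # whole codons only -> no frameshift
    added_codons = extension_nt // 3
    return extended_nt, in_frame, added_codons


def has_tryptic_c_terminus(peptide):
    """Trypsin cleaves C-terminal to lysine (K) or arginine (R)."""
    return peptide[-1] in "KR"


if __name__ == "__main__":
    print(extension_summary(ANNOTATED_LENGTH_NT, EXTENSION_NT))
    print(len(PEPTIDE), has_tryptic_c_terminus(PEPTIDE))
```

The extended length matches the 1722 nt quoted in the caption, and the 23-residue peptide is shorter than the 29 added codons, consistent with the caption's statement that the peptide covers only the MS-detectable part of the extension.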
Figure 4.
Abundance of functional protein category levels. (A) Qualitative and quantitative overview of the fraction of the genome, transcriptome and proteome dedicated to defined functional categories. ‘Genome’ refers to the number of genes, ‘Transcriptome’ to the number of identified mRNAs, ‘Proteome’ to the number of identified proteins, ‘Quantitative transcriptome’ to mRNA abundances and ‘Quantitative proteome’ to protein abundances within any given functional category. (B) Comparison of protein abundance changes in selected functional protein categories between the thermophile, Chaetomium thermophilum, and the mesophile, Chaetomium globosum. Functional categories were selected from the three main categories available in eggNOG. (C) Comparison of protein abundance in selected major protein complexes and proteins of similar function between Chaetomium thermophilum and Chaetomium globosum. Protein abundance is based on “intensity-based absolute quantification” (iBAQ) scores.
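For readers unfamiliar with iBAQ, the following is a minimal sketch of the idea as it is commonly described: the summed intensity of a protein's identified peptides divided by the number of peptides that could theoretically be observed after tryptic digestion. Everything here is an assumption for illustration and is not taken from the paper: the 6 to 30 residue window for "observable" peptides, the simple cleavage rule (after K or R, not before P), the toy sequence and the intensity values.

```python
def tryptic_peptides(sequence):
    """In silico digest: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):          # trailing peptide without K/R
        peptides.append(sequence[start:])
    return peptides


def ibaq(sequence, peptide_intensities, min_len=6, max_len=30):
    """Summed peptide intensity / number of theoretically observable peptides.

    The length window (min_len, max_len) is an assumed convention.
    """
    observable = [p for p in tryptic_peptides(sequence)
                  if min_len <= len(p) <= max_len]
    if not observable:
        raise ValueError("no theoretically observable peptides")
    return sum(peptide_intensities) / len(observable)


if __name__ == "__main__":
    toy_protein = "MAAAAAKGGGGGGRPLLLLKTTTTTTTK"   # hypothetical sequence
    toy_intensities = [3.0e6, 1.5e6]               # hypothetical MS intensities
    print(tryptic_peptides(toy_protein))
    print(ibaq(toy_protein, toy_intensities))
```

Normalizing by the number of observable peptides is what makes scores comparable across proteins of different length, which a comparison such as panel (C) relies on.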

References

    1. Perutz M.F., Raidt H. Stereochemical basis of heat stability in bacterial ferredoxins and in haemoglobin A2. Nature. 1975;255:256–259.
    2. Perutz M.F. Electrostatic effects in proteins. Science. 1978;201:1187–1191.
    3. Amlacher S., Sarges P., Flemming D., van Noort V., Kunze R., Devos D.P., Arumugam M., Bork P., Hurt E. Insight into structure and assembly of the nuclear pore complex by utilizing the genome of a eukaryotic thermophile. Cell. 2011;146:277–289.
    4. Thierbach K., von Appen A., Thoms M., Beck M., Flemming D., Hurt E. Protein interfaces of the conserved Nup84 complex from Chaetomium thermophilum shown by crosslinking mass spectrometry and electron microscopy. Structure. 2013;21:1672–1682.
    5. Monecke T., Haselbach D., Voß B., Russek A., Neumann P., Thomson E., Hurt E., Zachariae U., Stark H., Grubmüller H., et al. Structural basis for cooperativity of CRM1 export complex formation. Proc. Natl Acad. Sci. U.S.A. 2013;110:960–965.
