PLoS Genet. 2009 Jul;5(7):e1000569.
doi: 10.1371/journal.pgen.1000569. Epub 2009 Jul 17.

A strand-specific RNA-Seq analysis of the transcriptome of the typhoid bacillus Salmonella typhi

Timothy T Perkins et al. PLoS Genet. 2009 Jul.

Abstract

High-density, strand-specific cDNA sequencing (ssRNA-seq) was used to analyze the transcriptome of Salmonella enterica serovar Typhi (S. Typhi). By mapping sequence data to the entire S. Typhi genome, we analyzed the transcriptome in a strand-specific manner and further defined transcribed regions encoded within prophages, pseudogenes, previously un-annotated regions, and 3'- or 5'-untranslated regions (UTRs). An additional 40 novel candidate non-coding RNAs were identified beyond those previously annotated. Proteomic analysis was combined with transcriptome data to confirm and refine the annotation of a number of hypothetical genes. ssRNA-seq was also combined with microarray and proteome analysis to further define the S. Typhi OmpR regulon and identify novel OmpR-regulated transcripts. Thus, ssRNA-seq provides a novel and powerful approach to the characterization of the bacterial transcriptome.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Genome-wide assessment.
(A) Circular plot of the reads mapping to the S. Typhi Ty2 genome. The outer circle is marked in megabases (0–4). The outermost circles represent CDS on the forward (outermost) and reverse (second outermost) strands, respectively, coloured according to the functional class assigned in the CT18 annotation. The inner jagged circle represents a plot of mapped sequence reads with a minimum quality score of 30. Dark shading marks coverage greater than the average (green) and lower than the average (purple). Each base is represented as a pileup of reads, averaged over a window size of 10000 bp. Peaks represent highly sequenced transcripts such as fliC (1013788..1015308), the viaB locus (4494169..4506949) and sdhCDABsucABCD (2198361..2208317). (B) Identification of highly expressed genes on the coding and non-coding strands. Log10 of the AM of reads mapped to the coding strand minus log10 of the AM of reads mapped to the corresponding reverse strand (y-axis) for each S. Typhi Ty2 CDS (x-axis). The 20 genes with the greatest and the lowest values are identified by locus tag or gene name.
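The per-CDS statistic plotted in panel (B) can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes AM is a non-negative per-CDS coverage value for each strand (this record does not define it), and the function names, the input layout and the pseudocount used to avoid log10(0) are all hypothetical.

```python
import math


def strand_log_difference(am_coding, am_reverse, pseudocount=1e-3):
    """Return log10(AM coding) - log10(AM reverse) for one CDS.

    The pseudocount is an assumption of this sketch (not from the paper);
    it only prevents log10(0) when a strand has no mapped reads.
    """
    return math.log10(am_coding + pseudocount) - math.log10(am_reverse + pseudocount)


def rank_extremes(per_cds, n=20):
    """Return (lowest n, greatest n) locus tags by strand log difference.

    per_cds maps a locus tag to a (am_coding, am_reverse) pair.
    """
    scores = {tag: strand_log_difference(c, r) for tag, (c, r) in per_cds.items()}
    ordered = sorted(scores, key=scores.get)  # ascending by score
    lowest = ordered[:n]
    greatest = ordered[-n:][::-1]  # largest first
    return lowest, greatest
```

A strongly negative value means more reads map opposite to the annotated coding strand than to it, which is how an outlier such as a possible mis-annotation would stand out.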
Figure 2. ssRNA–seq sequence data mapped to the S. Typhi Ty2 genome and visualised using Artemis software.
(A) Salmonella pathogenicity island 1. Sequence data are represented as a plot aligned with the annotation after strand-specific filtering (annotation is shown above or below the genes; N.B., not all gene annotations are represented; forward strand blue and reverse strand red; window size = 200 bp). (B) An exemplar genomic region with multiple divergently transcribed genes supports the strand-specific mapping of sequence data and the previously published annotation. The histidine utilisation operon hutHUCGI is transcribed from the reverse strand, followed by three hypothetical genes conserved in E. coli. The molybdenum transport system, which has been characterised in E. coli, is encoded by two divergently transcribed operons and is followed by the galactose operon galETKM (forward strand blue and reverse strand red; window size = 200 bp). (C) An example of a potential mis-annotation. Hypothetical gene t2145, identified as an outlier in Figure 1B, exhibits significant sequence reads mapped to the opposite strand and to the upstream region of gltA. Forward strand blue and reverse strand red; window size = 200 bp.
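The windowed plots in these panels can be approximated as follows. This is a rough sketch under stated assumptions: per-base read depth is held in one list per strand, and the depth is averaged over consecutive non-overlapping windows for simplicity (the plotting software may instead use a sliding window). The function name and input layout are hypothetical.

```python
def windowed_mean(coverage, window=200):
    """Average per-base read depth over consecutive non-overlapping windows.

    coverage is a list of read depths, one value per base, for one strand.
    The last window may be shorter than `window` and is averaged over
    the bases it actually contains.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    means = []
    for start in range(0, len(coverage), window):
        chunk = coverage[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


# Forward and reverse strands are summarised separately, as in the plots.
forward = windowed_mean([1, 3, 5, 7, 9], window=2)
reverse = windowed_mean([0, 0, 4, 4, 2], window=2)
```

Keeping the two strands in separate tracks is what lets divergently transcribed operons appear as distinct blue and red signals.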
Figure 3. Overview of the S. Typhi Ty2 transcriptome generated by ssRNA–seq according to functional classification.
The total number of reads/bp mapped to each CDS is assigned to the functional class described previously. These data were then normalised by the number of CDS for each function encoded within the entire genome. A ratio of 1 represents transcription of a functional class on par with its genome content. A ratio of more than one represents a transcriptionally over-active class, and a ratio of less than one an under-active class.
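One way to read this normalisation is as a ratio of two shares: a class's share of the total reads/bp divided by its share of the CDS count. The sketch below follows that reading, which is an assumption rather than the authors' stated formula; the function name and input layout are hypothetical.

```python
from collections import defaultdict


def class_activity_ratio(cds_records):
    """Return {functional_class: ratio} from (functional_class, reads_per_bp) pairs.

    ratio = (class share of total reads/bp) / (class share of CDS count).
    A ratio of 1 means the class is transcribed on par with its genome content.
    """
    signal = defaultdict(float)  # summed reads/bp per class
    count = defaultdict(int)     # number of CDS per class
    for functional_class, reads_per_bp in cds_records:
        signal[functional_class] += reads_per_bp
        count[functional_class] += 1

    total_signal = sum(signal.values())
    total_count = sum(count.values())
    if total_signal == 0:
        raise ValueError("no mapped reads in any class")

    return {
        cls: (signal[cls] / total_signal) / (count[cls] / total_count)
        for cls in count
    }
```

Under this reading, a class holding half of the CDS but 60% of the mapped signal would score 1.2 and count as over-active.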
Figure 4. Overview of S. Typhi Ty2 transcriptome and proteome assigned to functional class.
The percentage of CDS in each functional class with an AM≥1 (grey bar). The percentage of hypothetical CDS with at least one mapped sequenced peptide (FDR<0.076) (red).
Figure 5. Annotation of a variable region of S. Typhi Ty2.
Alignment of ssRNA–seq sequence data to an unannotated region of Ty2 provided evidence for putative CDS. This region is encoded between annotated genes t0869 and t0874. It may have been missed in the published annotation due to the variation between Ty2 and the previously sequenced S. Typhi CT18 strain. Resequencing of 21 S. Typhi isolates identified 2 different configurations of this variable region, which were annotated as ST20a and ST20b. This figure represents ST20b. Colour code: white, previously annotated genes; pale green, hypothetical genes; orange, conserved hypothetical genes; pink, genes involved in integration or mobilisation; red, DNA modification genes. DR: direct repeats. Transcript represented by plot (log scale; forward strand blue and reverse strand red; window size 200 bp). Note that the previously annotated CDS t0872 (white) represents only the 3′ end of the gene. The remaining segment of the gene is indicated (red).
Figure 6. Previously identified and novel putative ncRNA.
The AM for each intergenic feature (mean and range) derived over three biological replicates. (A) Previously identified ncRNA and (B) ncRNA elements identified by this study. (C) Putative ncRNA elements in SPI-1. Identification of 4 intergenic regions of sequenced transcript within the cell invasion locus SPI-1: 3 predicted to be cis-acting 5′ elements (upstream of sprA (RUF_220c, 1), sprB (RUF_219c, 2) and iagA (RUF_221, 4)) and 1 possible 3′ UTR (downstream of sprB; RUF_218c, 3). Transcript represented by plot (predicted ncRNA represented by red box; log scale; forward strand blue and reverse strand red; window size 200 bp).
Figure 7. Transcriptionally active prophage genes.
(A) Genetic organisation of the SopE prophage aligned with mapped sequence reads illustrates “expression” of the sopE moron (AM = 283) and another putative cargo region (t4323–t4325; AM = 21.4, 40.3 and 18.6, respectively). Transcription of the cI repressor is required for maintaining lysogeny, and this region had an AM of 5.15, compared with a median AM of 0.93 for the entire phage. (B) Genetic organisation of the ST35 prophage. The low GC region shows significant sequence coverage compared with the prophage “machinery”, putatively identifying it as cargo. Putative prophage cargo in (C) ST2-27 and (D) ST46 with transcriptionally active low GC regions. (All plots: forward strand blue and reverse strand red; window size = 200 bp.)

