J Mol Biol. 2010 Jan 29;395(4):671-85.
doi: 10.1016/j.jmb.2009.10.062. Epub 2009 Nov 3.

Molecular evolution of multisubunit RNA polymerases: sequence analysis

William J Lane et al. J Mol Biol.

Abstract

Transcription in all cellular organisms is performed by multisubunit, DNA-dependent RNA polymerases that synthesize RNA from DNA templates. Previous sequence and structural studies have elucidated the importance of shared regions common to all multisubunit RNA polymerases. In addition, RNA polymerases contain multiple lineage-specific domain insertions involved in protein-protein and protein-nucleic acid interactions. We have created comprehensive multiple sequence alignments using all available sequence data for the multisubunit RNA polymerase large subunits, including the bacterial beta and beta' subunits and their homologs from archaebacterial RNA polymerases, the eukaryotic RNA polymerases I, II, and III, the nuclear-cytoplasmic large double-stranded DNA virus RNA polymerases, and plant plastid RNA polymerases. To overcome technical difficulties inherent to the large-subunit sequences, including large sequence length, small and large lineage-specific insertions, split subunits, and fused proteins, we created an automated and customizable sequence retrieval and processing system. In addition, we used our alignments to create a more expansive set of shared sequence regions and bacterial lineage-specific domain insertions. We also analyzed the intergenic gap between the bacterial beta and beta' genes.

Figures

Fig. 1
Sequence retrieval, processing, and alignment methodology. Creating the bacterial β/β′ and All RNAP Large Subunit alignments required several steps. First, BlaFA (gray dashed region) was used to retrieve and process the sequences, which were then aligned using PCMA, followed by manual correction of the alignment. For the All RNAP Large Subunit alignment, the RNAP classes also had to be reassigned and merged.
Fig. 2
Phylogenetic analysis of the All RNAP Large Subunits MSA. The two All RNAP Large Subunit alignments were combined by species, and the residue positions were pruned to keep only the regions shared among all the sequences. The phylogenetic trees were calculated using PhyML v3.0 and analyzed using TreeDyn (see Materials and Methods). Due to the large number of sequences, only the boundaries for each group of leaves are shown, colored by RNAP class: bRNAP (red), pRNAP (yellow), eRNAP I (green), eRNAP II (blue), and eRNAP III (cyan). The branches for each leaf region are colored by taxonomy: bacteria (yellow), eukaryota (green), archaea (orange), and viruses (magenta). Due to their diversity, the proteobacteria (gray dashed region) and firmicutes (light blue dashed region) taxonomy subdivisions have been individually labeled. Selected branch support values are indicated in light gray. A. All RNAP Classes tree. For each class, the total number of complete β/β′ homolog sequences is shown. The second number in parentheses is the number of sequences nonredundant within the shared regions. B. eRNAP-like RNAPs (eRNAP I, II, III, aRNAP, vRNAPs). C. Bacterial and plastid RNAPs.
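The pruning and redundancy count described in the Fig. 2 caption can be illustrated with a small Python sketch. This is not the authors' procedure: here a "shared" position is assumed to be any alignment column with no gap in any sequence, whereas the paper defines shared regions from its curated alignments. The function names and the toy sequences are hypothetical.

```python
def prune_to_shared_columns(alignment, gap="-"):
    """Keep only alignment columns in which no sequence has a gap.

    Assumption: gap-free columns stand in for the "regions shared
    among all the sequences"; the paper's own criterion may differ.
    `alignment` maps a sequence name to its aligned (gapped) string.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*rows) iterates over the alignment column by column.
    keep = [i for i, column in enumerate(zip(*rows)) if gap not in column]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


def count_nonredundant(alignment):
    """Count distinct sequences, e.g. after pruning to shared columns."""
    return len(set(alignment.values()))


# Toy alignment (made-up residues, for illustration only).
toy = {"seqA": "MK-LV", "seqB": "MKALV", "seqC": "MR-LV"}
pruned = prune_to_shared_columns(toy)
total, nonredundant = len(pruned), count_nonredundant(pruned)
```

In this toy case the third column is dropped because two sequences have a gap there, after which seqA and seqB become identical. That is the sense in which a class can have fewer nonredundant sequences within the shared regions than complete sequences.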
Fig. 3
Bacterial rpoB and rpoC intergenic gap. The distance between the stop codon of the rpoB gene (encoding bRNAP β) and the start codon of the rpoC gene (encoding bRNAP β′) was analyzed. The number of sequences vs. intergenic gap is plotted as a blue line. The x-axis has been split between 500 and 900 bp. The red vertical line indicates an intergenic gap of zero, with negative values indicating overlapping rpoB and rpoC genes. Species with fused β/β′ subunits are not shown since they do not have a true intergenic gap. Please refer to the supplemental information on our website for additional details.
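The gap measure in the Fig. 3 caption can be sketched in Python as follows. The coordinate convention is an assumption, not taken from the paper: positions are 1-based and inclusive, both genes lie on the forward strand, and `rpob_stop_end` is the last base of the rpoB stop codon. The function names and coordinates are hypothetical.

```python
from collections import Counter


def intergenic_gap(rpob_stop_end, rpoc_start):
    """Bases between the end of rpoB and the start of rpoC.

    Assumed convention: 1-based inclusive coordinates, same strand.
    Returns 0 when the genes are directly adjacent and a negative
    number when they overlap (the size of the overlap in bases).
    """
    return rpoc_start - rpob_stop_end - 1


def gap_histogram(gene_pairs):
    """Count how many genomes show each intergenic gap value."""
    return Counter(intergenic_gap(end, start) for end, start in gene_pairs)


# Made-up coordinates: two adjacent gene pairs and one 4-base overlap.
histogram = gap_histogram([(100, 101), (200, 201), (100, 97)])
```

Plotting the counts in `histogram` against the gap value would give a curve of the kind the caption describes, with overlapping gene pairs falling to the left of zero.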
Fig. 4
Shared sequence regions common to multi-subunit RNAPs. The vertical bars represent the primary sequence of the Tth (or Taq) bRNAP large subunit (β/β′) sequences. For both β (top, blue) and β′ (bottom, pink), three representations are shown. On top are the originally defined sequence regions for β and β′. Below are the regions common to all multi-subunit RNAPs (Lane – all), and regions common to bRNAPs (Lane – bact). Structural features are labeled above. The locations of the lineage-specific inserts (see Figs. 7, 8) are indicated below. Evolutionarily conserved domains are superimposed on the sequences according to Iyer et al. Domain designations are as follows: DPBB, double-psi-β-barrel; SBHM, sandwich barrel hybrid motif; ZNR, zinc ribbon; ATL, AT-hook like motif; BBM1, β-β′ specific module 1; BBM2, β-β′ specific module 2.
Fig. 5
Structural mapping of shared sequence regions on the bRNAP structure; Bottom, Back, Channel, and Front views. The Tth bRNAP ternary elongation complex structure (PDB 2O5J) is shown as backbone ribbons. The color-coding is shown in the key. MgI chelated at the active site is shown as a yellow sphere. The small black arrow points in the downstream direction of the template DNA. The views (Bottom, Back, Channel, Front) are as defined by Cramer et al.
Fig. 6
Structural mapping of shared sequence regions on the bRNAP structure; Bottom, β′-side, β-side, and Top views. Representation and color-coding are the same as in Fig. 5.
Fig. 7
bRNAP β lineage-specific domain insertions. The locations of the β Inserts (βIn1 - βIn12) are indicated using numbered light green circles. Red text or lines indicate inserts or lineage details identified in our study. The light gray boxes indicate the identities of previously well studied inserts. The taxonomy lineage details are as inclusively broad as possible. Where subfamily taxonomy is given, the root taxonomy name to which it belongs is given in square brackets (Proteobacteria and Firmicutes are given taxonomy names one level more specific). The individual bacteria species name is given if it is the only member of a number of related bacteria to contain the insert. Missing in some Mollicutes. βIn4 and βIn10 contain the same Mollicutes species and are mutually exclusive with βIn3 in terms of Mollicutes species. Missing in some Firmicutes species. Some of the Firmicutes missing this insert represent the top 8 species with the smallest combined β/β′ sequence lengths. The Wolbachia species, which also have fused β/β′, have an additional 69 amino acid extension at the N-term of this insert. Please refer to the supplemental information on our website for additional details about all of the inserts.
Fig. 8
bRNAP β′ lineage-specific domain insertions. Same as Fig. 7, but with the locations of the β′ Inserts (β′In1 - β′In7). Missing in Symbiobacterium subfamily species. Missing in Acholeplasmatales subfamily species. Missing a region of sequence in the middle of the insert that removes domains b and c, which interestingly both extend past the σ subunit and therefore lack interactions at the interface between this insert and σ. The ε-Proteobacteria subfamily inserts contain ~150 additional amino acids. Missing in Candidatus Kuenenia stuttgartiensis. Please refer to the supplemental information on our website for additional details about all of the inserts.
Fig. 9
Structural mapping of bRNAP lineage-specific domain insertions on the bRNAP structure; Bottom, β-side, and Top views. The Taq core bRNAP structure, including the complete structure of Taq β′In2, is shown as backbone ribbons, color-coded as follows: αI, αII, ω, grey; β, cyan; β′, pink. The locations of the bRNAP lineage-specific domain insertions are labeled (according to Figs. 4, 5) and shown as spheres (β insertions, blue; β′ insertions, red), except that three insertions with known structures are shown in blue (βIn12, found in the Tth and Taq bRNAP structures) and red (Taq β′In2 and Eco β′In6). The attachment of Eco β′In6 in the trigger loop is schematically denoted by dashed lines.

References

    1. Cramer P. Multisubunit RNA polymerases. Curr. Opin. Struct. Biol. 2002;12:89–97.
    2. Darst SA. Bacterial RNA polymerase. Curr. Opin. Struct. Biol. 2001;11:155–162.
    3. The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815.
    4. Hu J, Troxler RF, Bogorad L. Maize chloroplast RNA polymerase: the 78-kilodalton polypeptide is encoded by the plastid rpoC1 gene. Nucleic Acids Res. 1991;19:3431–3434.
    5. Hu J, Bogorad L. Maize chloroplast RNA polymerase: the 180-, 120-, and 38-kilodalton polypeptides are encoded in chloroplast genes. Proc. Natl. Acad. Sci. USA. 1990;87:1531–1535.
