Genetics. 2024 Oct 24;228(4):iyae172.
doi: 10.1093/genetics/iyae172. Online ahead of print.

Higher-order epistasis within Pol II trigger loop haplotypes


Bingbing Duan et al. Genetics.

Abstract

RNA polymerase II (Pol II) has a highly conserved domain, the trigger loop (TL), that controls transcription fidelity and speed. We previously probed pairwise genetic interactions between residues within and surrounding the TL to understand functional interactions between residues and how individual mutants might alter TL function. We identified widespread incompatibility between TLs of different species when placed in the Saccharomyces cerevisiae Pol II context, indicating species-specific interactions between otherwise highly conserved TLs and their surroundings. These interactions represent epistasis between TL residues and the rest of Pol II. We sought to understand why certain TL sequences are incompatible with S. cerevisiae Pol II and to dissect the nature of genetic interactions within multiply substituted TLs as a window on higher-order epistasis in this system. We identified both positive and negative higher-order residue interactions within example TL haplotypes. Intricate higher-order epistasis formed by TL residues was sometimes only apparent from analysis of intermediate genotypes, emphasizing the complexity of epistatic interactions. Furthermore, we distinguished TL substitutions with distinct classes of epistatic patterns, suggesting specific TL residues that potentially influence TL evolution. Our examples of complex residue interactions suggest possible pathways for epistasis to facilitate Pol II evolution.

Keywords: deep mutational scanning; epistasis; haplotypes; trigger loop.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Fig. 1.
Systematic detection of TL-internal epistasis with natural TL alleles and intermediates. a) The Pol II TL is the key domain balancing transcription speed and fidelity in the Pol II active site. Left panel: yeast Pol II structure (PDB: 5C4X). Right panel: the open, catalytic disfavoring (PDB: 5C4X; Barnes et al. 2015b) (Barnes et al. 2015a) and closed, catalytic favoring (PDB: 2E2H; Wang et al. 2006a) (Wang et al. 2006b) conformations of the TL in the active site. The figure was generated using PyMol (Schrodinger, LLC 2024). b) We selected 632 TL haplotypes representing TL alleles from bacterial, archaeal, and the 3 conserved eukaryotic msRNAPs to detect TL-internal epistasis (bacterial TLs n = 182, archaeal TLs n = 116, Pol I TLs n = 144, Pol II TLs n = 94, Pol III TLs n = 95). c) The selected TL alleles were synthesized and transformed into yeast Pol II to form chimeric Pol II enzymes. Yeast chimeric Pol II enzymes were phenotyped under selective conditions to detect growth defects, which are represented by fitness (see Methods). d) Seven Pol II TL alleles were selected to construct intermediate haplotypes representing all possible combinations of substitutions of the 7 selected alleles. The intermediates were transformed into yeast to measure growth defects as in c). e and f) Analytical scheme of primary deviation score e) and secondary deviation score f). Details are in Methods.
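The construction of intermediates in panel d) can be pictured as enumerating every non-empty subset of a haplotype's substitutions. The following is a minimal sketch of that enumeration only; the function name `intermediate_haplotypes` is hypothetical, and the three substitution names are borrowed from the Fig. 5 caption purely for illustration, not as a statement of how the library was actually designed.

```python
from itertools import combinations


def intermediate_haplotypes(substitutions):
    """Yield every non-empty combination of the given substitutions.

    Hypothetical helper: a haplotype with n substitutions has
    2**n - 1 non-empty combinations, including the singles and
    the full haplotype itself.
    """
    for k in range(1, len(substitutions) + 1):
        for combo in combinations(substitutions, k):
            yield combo


# Illustrative input only: three substitution names from the Fig. 5 caption.
subs = ["F1086H", "V1089T", "S1091E"]
haplotypes = list(intermediate_haplotypes(subs))

for h in haplotypes:
    print("/".join(h))
```

For a haplotype with 8 substitutions, the same enumeration gives 2^8 − 1 = 255 combinations, which is why intermediates are usually grouped by their number of substitutions when plotted.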
Fig. 2.
Phenotypes fluctuate with changes in substitution composition within haplotype groups. a) Fitness of constituent single substitutions, expected and observed fitness of the haplotypes, and deviation scores are shown in the heatmap for the 7 selected haplotypes. The deviations shown in the heatmap are the primary deviation scores calculated by comparing observed fitness to expected fitness. b–h) Distribution of growth fitness across all intermediate substitution combinations categorized by number of amino acid substitutions for each selected TL. Each spot on the graph represents a haplotype. The y-axis indicates the fitness of each haplotype. The x-axis indicates the number of amino acid substitutions within these haplotypes. For example, consider a point in b), which represents a haplotype from the substitution combinations in the TL of E. invadens IP1. A point positioned at x = 3 and y = −0.5 represents a haplotype that contains 3 substitutions and has a fitness value of −0.5.
Fig. 3.
Greater TL-internal epistasis is detectable with closer evolutionary distance to eukaryotic Pol II. a–e) Fitness and primary deviation score heatmaps of TL haplotypes from 5 msRNAP evolutionary groups. The x-axis of each heatmap is the 31 residue positions of the Pol II TL (1,076–1,106). The y-axis of each heatmap is the TL haplotypes belonging to each group clustered by hierarchical clustering with Euclidean distance. Each row represents 1 haplotype with several single substitutions. Light gray blocks in each row mark positions where the haplotype residue is the same as the yeast Pol II TL residue, in other words, positions with no substitution. Colored blocks represent residues in the haplotype that differ from the yeast Pol II TL (substitutions). The color of the block represents the growth fitness of the single substitution in the yeast Pol II TL background. Expected fitness of the haplotypes was calculated from individual substitution fitnesses using the log additive model. Observed fitness was measured in the screening experiments, and deviation scores were calculated by comparing the observed and expected scores (shown at the right end of each row). Sequence logos were generated with MSA of the 5 groups individually in Weblogo 3.7.12 (Crooks et al. 2004). The labeled numbers of the sequence logo represent yeast Pol II TL residue position (1,076–1,106). Bacteria n = 465. Archaea n = 426. Pol I n = 605. Pol II n = 405. Pol III n = 444. f) Lethal haplotypes can contain 3 distinct types of lethality: lethality attributed to synthetic lethal interactions among substitutions, representing negative interactions that result in lethality beyond expectation; additive lethality expected from the combination of individual substitutions; and lethality arising directly from the presence of an individually lethal single substitution. 
The calculated ratio specifically reflects the proportion of lethality due to negative interactions and additive lethality within all types of lethal haplotypes. g) The ratio of viable haplotypes in haplotypes containing lethal substitutions, representing the ratio of positive interactions (suppression) in TL haplotypes of each group. Haplotypes containing lethal substitutions are expected to be lethal based on the additive model. If haplotypes with lethal substitutions are observed to be viable, it suggests other substitutions suppress the lethal substitution in the haplotypes, implying positive epistasis (suppression). Approximately 20% of examined eukaryotic TL haplotypes containing individual lethal substitutions are viable, whereas only roughly 2% of bacterial and archaeal TL haplotypes are viable, suggesting positive interactions in eukaryotic TLs can buffer negative effects in more closely related TL sequences.
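The expected-fitness and deviation logic described in this caption can be sketched as follows. This is a sketch under stated assumptions, not the authors' pipeline: it assumes fitness scores are already on a log scale so that the log additive expectation is a plain sum, it assumes the primary deviation is observed minus expected, and the lethality cutoff, substitution names, and fitness values are all made up for illustration.

```python
def expected_fitness(single_fitness, haplotype):
    """Log additive expectation: sum of single-substitution fitness scores.

    Assumes the scores are already on a log scale.
    """
    return sum(single_fitness[s] for s in haplotype)


def primary_deviation(observed, single_fitness, haplotype):
    """Assumed form of the deviation score: observed minus expected."""
    return observed - expected_fitness(single_fitness, haplotype)


def suppression_ratio(observed_by_haplotype, single_fitness, lethal_cutoff):
    """Fraction of viable haplotypes among those carrying a lethal single.

    lethal_cutoff is a hypothetical threshold; returns None when no
    haplotype carries an individually lethal substitution.
    """
    with_lethal = [
        h for h in observed_by_haplotype
        if any(single_fitness[s] <= lethal_cutoff for s in h)
    ]
    if not with_lethal:
        return None
    viable = [h for h in with_lethal if observed_by_haplotype[h] > lethal_cutoff]
    return len(viable) / len(with_lethal)


# Made-up single-substitution fitness values ("L" is individually lethal).
single = {"A": -0.5, "B": -1.0, "C": 0.25, "L": -6.0}
cutoff = -5.0  # hypothetical lethality threshold

# Made-up observed fitness values for three haplotypes.
observed = {("A", "L"): -6.0, ("C", "L"): -1.0, ("A", "B"): -1.5}

dev = primary_deviation(-0.25, single, ("A", "B", "C"))  # expected is -1.25
ratio = suppression_ratio(observed, single, cutoff)
print(dev, ratio)
```

In this toy data the haplotype A/B/C is healthier than its additive expectation (a positive deviation), and one of the two haplotypes carrying the lethal substitution is viable, which is the pattern panel g) interprets as suppression.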
Fig. 4.
The epistasis landscape provides a comprehensive view of primary and secondary deviation scores, emphasizing substitutions with notable epistatic effects. a) Fitness of 8 E. invadens IP1 TL single substitutions in the yeast Pol II background, and the expected and observed fitness and the primary deviation score are shown in the heatmap. b) The epistasis landscape of E. invadens IP1 TL substitutions. The heatmap illustrates the fitness and epistasis of all unique intermediate haplotypes coming from combinations of 8 substitutions. Intermediate haplotypes are grouped by number of substitutions from 1 to 8. The fitness values are displayed in the upper panel, and the epistasis, represented by primary and secondary deviation scores, is displayed in the lower panel. The colors of substitution names indicate their phenotype profiles as single mutants.
Fig. 5.
Correlations between deviation scores reflect specific residue interactions in E. invadens IP1 TL substitutions. a) Correlations between secondary deviation scores of all 8 substitutions (y-axis) and the primary deviation score (x-axis). Linear regression was applied to each comparison of secondary deviation scores against primary deviation scores to check the correlation. Substitutions with an R2 value exceeding 0.5 are annotated on the xy plot, indicating their substantial impact on primary epistasis of the haplotypes. b–d) Correlations between secondary deviations of the other 7 substitutions (y-axis) vs V1089T b), S1096E c), and S1091E d) on the x-axes, respectively. e) The fitness landscape of intermediate combinations with fitnesses in the ultrasick/lethal range. Their observed fitness levels are in the lethal range, as indicated with black blocks in the heatmap, while the expected fitness scores calculated from the additive model are in the viable range (light blue blocks), indicating potential negative interactions. The fitness scores of the constituent single mutants within lethal haplotypes are shown in the heatmap. F1086H, V1089T, S1091E, and K1093T are present in each lethal haplotype while S1096E is absent. Names of substitutions are colored based on their phenotype profiles as single mutants. f) When S1096E is incorporated, the fitness of all ultrasick to lethal haplotypes is no longer in the ultrasick/lethal range. g) Scheme of specific residue interactions within substitutions of E. invadens IP1 TL.
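The R2 screen in panel a) can be sketched as a simple linear regression of each substitution's secondary deviation scores on the primary deviation scores, keeping substitutions whose R2 exceeds 0.5. The sketch below takes both sets of scores as given inputs, since their definitions are in Methods and not reproduced here; the substitution labels "X" and "Y" and all numbers are hypothetical.

```python
def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (with intercept).

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical primary deviation scores across five haplotypes.
primary = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical secondary deviation scores for two substitutions
# across the same five haplotypes.
secondary = {
    "X": [-1.0, -0.5, 0.0, 0.5, 1.0],   # tracks the primary deviation
    "Y": [1.0, -1.0, 0.0, -1.0, 1.0],   # unrelated to the primary deviation
}

# Keep substitutions whose R^2 exceeds the 0.5 threshold used in the caption.
flagged = [name for name, scores in secondary.items()
           if r_squared(primary, scores) > 0.5]
print(flagged)
```

A substitution that passes this screen is one whose secondary deviation rises and falls with the haplotype-level deviation, which is the sense in which the caption describes a substantial impact on primary epistasis.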
Fig. 6.
Intricate higher-order epistasis observed in substitutions of P. persalinus (Ciliate) TL haplotype. a) The heatmap displays the fitness of 9 single substitutions in P. persalinus (Ciliate) TL in the yeast Pol II background, along with the epistasis between them represented by the primary deviation score. b) Similar to Fig. 5a, we checked correlations between secondary deviation scores (y-axis) and the primary deviation score (x-axis) to identify substitutions with substantial impact on primary deviation scores. Simple linear regression was applied to each comparison. Substitutions with R2 > 0.5 are annotated in the plot. c) The fitness and deviation scores of substitution combinations related to group I are shown in the heatmap. Names of substitutions are colored based on their phenotypic profiles as single mutants. Each line shows the fitness and deviation scores of substitutions in a certain combination. Left: the fitness of individual substitutions, and the expected and observed fitness. Right: the primary deviations calculated by comparing observed and expected fitness and the secondary deviation scores of each constituent substitution. A1076T/A1090S/S1091D/S1096L/I1104L is in the first line. Its observed fitness is smaller than expected, which results in a negative primary deviation score, representing a negative interaction. The secondary deviation scores of the constituent substitutions are all negative, indicating that each shows a negative interaction when added to the corresponding compound mutant. The following 4 lines represent the 4 combinations where V1089S, K1092R, K1093N, and K1102Q are incorporated into A1076T/A1090S/S1091D/S1096L/I1104L, respectively. All observed fitnesses for these combinations are healthier than that of A1076T/A1090S/S1091D/S1096L/I1104L, and the secondary deviation scores of V1089S, K1092R, K1093N, and K1102Q are all positive, implying a positive effect (suppression) on each combination, respectively. 
d) Scheme illustrating the substitution interaction network observed in c). e) Similar to c), the fitness and deviation scores of combinations within group II are shown in the heatmap. The first row shows the fitness and deviation scores detected within the combination A1076T/A1090S/S1091D/K1092R/K1093N. The following rows display the corresponding fitness and deviation scores when the other 4 substitutions are incorporated. Notably, the effect of V1089S on A1076T/A1090S/S1091D/K1092R/K1093N cannot be determined because the observed fitness of the combination (V1089S + A1076T/A1090S/S1091D/K1092R/K1093N) is in the ultrasick/lethal range, and its expected fitness calculated from the log additive model is also in the lethal range due to additivity. In this case, the secondary deviation of V1089S is represented by a black block in the heatmap. Moreover, the effect of A1076T on V1089S/A1090S/S1091D/K1092R/K1093N cannot be determined because the observed fitness falls within the lethal range. The expected fitness of (A1076T + V1089S/A1090S/S1091D/K1092R/K1093N) is also in the lethal range because the combination contains the lethal compound V1089S/A1090S/S1091D/K1092R/K1093N. The secondary deviation score of A1076T therefore cannot be determined either and is indicated by a dark gray block in the heatmap. Similarly, the scores of A1090S and S1091D could not be determined for the same reason as A1076T and are also marked with dark gray blocks. f) Scheme representing the substitution interaction networks observed in e).
Fig. 7.
Different classes of epistatic effects. a) Histogram of mutants' epistatic effects, represented by their respective maximum likelihood estimate (σ2) of secondary deviation scores. A higher epistatic effect indicates a greater impact of a certain substitution. b) Medians of secondary deviation scores of substitutions were plotted against their corresponding σ2. Substitutions are colored based on their phenotypes. c) Comparing epistatic effects of mutants in each category. Each scatter plot shows the measured fitness of haplotypes without (x-axis) vs with (y-axis) a substitution incorporated. The colors of the plots represent the mutants' phenotypes. The colored line marks the simple linear regression of the spots, representing the observed epistatic effect of the substitution. r2 values of the regressions are labeled in the plots. The black line indicates the additive (nonepistatic) expectation.


References

    Bakerlee CW, Nguyen Ba AN, Shulgina Y, Rojas Echenique JI, Desai MM. 2022. Idiosyncratic epistasis leads to global fitness-correlated trends. Science. 376(6593):630–635. doi:10.1126/science.abm4774.
    Bank C, Hietpas RT, Jensen JD, Bolon DN. 2015. A systematic survey of an intragenic epistatic landscape. Mol Biol Evol. 32(1):229–238. doi:10.1093/molbev/msu301.
    Bar-Nahum G, Epshtein V, Ruckenstein AE, Rafikov R, Mustaev A, Nudler E. 2005. A ratchet mechanism of transcription elongation and its control. Cell. 120(2):183–193. doi:10.1016/j.cell.2004.11.045.
    Barnes CO, Calero M, Malik I, Graham BW, Spahr H, Lin G, Cohen AE, Brown IS, Zhang Q, Pullara F, et al. 2015a. Crystal structure of a transcribing RNA polymerase II complex reveals a complete transcription bubble. Mol Cell. 59(2):258–269. doi:10.1016/j.molcel.2015.06.034.
    Barnes CO, Calero M, Malik I, Spahr H, Zhang Q, Pullara F, Kaplan CD, Calero G. 2015b. Crystal structure of a transcribing RNA Polymerase II complex reveals a complete transcription bubble. PDB. doi:10.2210/pdb5C4X/pdb.
