EMBO Rep. 2023 Sep 6;24(9):e57020.
doi: 10.15252/embr.202357020. Epub 2023 Jul 10.

An extended transcription factor regulatory network controls hepatocyte identity

Julie Dubois-Chevalier et al. EMBO Rep.

Abstract

Cell identity is specified by a core transcriptional regulatory circuitry (CoRC), typically limited to a small set of interconnected cell-specific transcription factors (TFs). By mining global hepatic TF regulons, we reveal a more complex organization of the transcriptional regulatory network controlling hepatocyte identity. We show that tight functional interconnections controlling hepatocyte identity extend to non-cell-specific TFs beyond the CoRC, which we call hepatocyte identity (Hep-ID)CONNECT TFs. Besides controlling identity effector genes, Hep-IDCONNECT TFs also engage in reciprocal transcriptional regulation with TFs of the CoRC. In homeostatic basal conditions, this translates into Hep-IDCONNECT TFs being involved in fine-tuning CoRC TF expression, including their rhythmic expression patterns. Moreover, a role for Hep-IDCONNECT TFs in the control of hepatocyte identity is revealed in dedifferentiated hepatocytes, where Hep-IDCONNECT TFs are able to reset CoRC TF expression. This is observed upon activation of NR1H3 or THRB in hepatocarcinoma or in hepatocytes subjected to inflammation-induced loss of identity. Our study establishes that hepatocyte identity is controlled by an extended array of TFs beyond the CoRC.

Keywords: cell identity; core regulatory network; hepatocyte dedifferentiation; liver disease; transcription factors.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
Figure 1. Promoter‐centric mining of the hepatic TF network using ProTFnet
  1. A

    The cistromes of 8 Hep‐ID TFs (CEBPA, FOXA2, HNF4A, NR1H4, NR5A2, ONECUT1, PPARA, PROX1; Dataset EV2) were used to define the percentage of those TFs binding to Hep‐ID TF (n = 13 genes) or non‐Hep‐ID TF (control TF genes; n = 13) encoding gene promoters. The control group used was selected for providing data representative of those obtained with 1,000 reiterations of this analysis. Box plots are composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. A two‐sided Wilcoxon rank sum test with continuity correction was used to assess statistical significance. *P < 0.05.

  2. B

    Overview of the ProTFnet strategy implemented in this study where (identity) TF binding to TF‐encoding gene promoters is monitored and subsequently used to define distinct clusters of promoters through SOM and hierarchical clustering. Clusters are subsequently characterized using multi‐omics (cistromic, epigenomic and transcriptomic) data mining to explore the complexity of the identity TF network.

  3. C

    Planar view of the toroidal map issued from the SOM analysis was used here to display clusters A–G and hereafter to visualize different features of these clusters (panels D, E and Appendix Fig S2E). The dendrogram issued from the hierarchical clustering analysis is shown on the right.

  4. D

    The map issued from the SOM analyses was used to show the average number of co‐recruited TFs at gene promoters contained in individual cells. Bold orange lines indicate the borders of clusters A–G.

  5. E

    The map issued from the SOM analyses was used to show the average ChIP‐seq signal for mouse liver H3K27ac at gene promoters contained in individual cells. Bold black lines indicate the borders of clusters A–G.

  6. F

    Heatmap showing the occurrence (percentage) of individual transcriptional regulators in the core co‐recruitment nodes of clusters A–G, that is, binding combinations found in at least 50% of the promoters of a given cluster.

  7. G, H

    (G) All TF‐encoding genes were grouped into deciles based on increasing expression levels in the mouse liver. Then, the distribution of TF‐encoding genes from cluster G within these deciles was plotted. A two‐sided two‐sample Kolmogorov–Smirnov test was used to assess statistically significant bias in the distribution of genes from cluster G when compared to all other TF‐encoding genes. *P < 0.05. (H) All TF‐encoding genes were grouped into deciles based on increasing liver‐specific expression levels (i.e., expression in mouse liver compared to average expression in other organs). Then, the distribution of TF‐encoding genes from cluster G within these deciles was plotted. Statistical analysis was performed as in panel G.

  8. I

    Principal component analysis (PCA) of TF gene expression in mouse (n = 39) and human (n = 126) primary cell‐types (see Materials and Methods). Individual TFs are displayed as dots projected on the first two components and the three main clusters issued from hierarchical clustering are shown and labeled as UBQ (ubiquitous), CTE (cell‐type enriched) and CTS (cell‐type specific; Fig EV2A and B).

  9. J

    The data from panel I were used to selectively display TF genes from cluster G.
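
The box-plot convention used throughout these legends (box from the 25th to the 75th percentile, whiskers reaching the most extreme data point no more than 1.5 times the interquartile range from the box) can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `box_plot_summary`, the example values and the quartile interpolation method are assumptions.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return quartiles, whisker ends and outliers for one box plot."""
    data = sorted(values)
    # Quartiles by linear interpolation; the exact method is an assumption.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observed points inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical values: one point lies far above the rest.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

In this toy input the upper whisker ends at the largest value inside the upper fence, and the remaining point is reported as an outlier rather than stretching the whisker.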

Figure EV1
Figure EV1. Hep‐ID TF cistromes at example TF‐encoding gene loci
  1. A, B

    The Integrated Genome Browser (IGB) was used to display the cistromes of the indicated eight Hep‐ID TFs together with levels of H3K4me3 and H3K27ac from mouse liver ChIP‐seq data (Dataset EV2). Example Hep‐ID TF (A) and control TF‐encoding genes (B) are shown. The promoters are highlighted by green boxes. The scales of the individual ChIP‐seq tracks were kept constant for all analyzed genes.

Figure EV2
Figure EV2. Characterization of CTS, CTE, and UBQ TF genes
  1. Data were displayed as in Fig 1J to show the Tau index of tissue‐specific expression for individual TF genes within the CTS (cell‐type specific), CTE (cell‐type enriched), and UBQ (ubiquitous) clusters.

  2. Density plot showing the distribution of the expression rank of CTS, CTE, and UBQ TF‐encoding genes in primary mouse cell‐types (n = 39). All TF genes were ranked from high to low expression (i.e., from 0 to 1,009 in each cell‐type) and the distributions of TFs from the CTS, CTE and UBQ groups are shown. As expected, CTS TFs display low ranks in a very limited subset of cell‐types while having high ranks in most cell‐types, which is the opposite of the pattern obtained for UBQ TFs.
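
For readers unfamiliar with the Tau index mentioned in the first panel, the sketch below uses its commonly cited definition, Tau = sum(1 - x_i / x_max) / (n - 1), where x_i is the expression in sample i. The function name and toy profiles are hypothetical, and any normalization or log transformation applied by the authors beforehand is not stated in this legend.

```python
def tau_index(expression):
    """Tau specificity index: 0 for uniform expression, 1 for a single expressing sample."""
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two cell types or tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("expression must be positive in at least one sample")
    # Each sample contributes its shortfall relative to the maximum.
    return sum(1 - x / x_max for x in expression) / (n - 1)


# Hypothetical expression profiles across cell types.
specific = tau_index([10, 0, 0, 0])   # expressed in one cell type only
ubiquitous = tau_index([5, 5, 5, 5])  # same level everywhere
graded = tau_index([10, 5, 0])
print(specific, ubiquitous, graded)
```

Values close to 1 would correspond to the CTS pattern and values close to 0 to the UBQ pattern described above.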

Figure 2
Figure 2. Identification and characterization of Hep‐IDCONNECT TFs
  1. A

    The CTE/UBQ TFs from cluster G (Fig 1K) were plotted based on their expression in mouse primary hepatocytes (MPH) when compared to other primary mouse cell‐types (n = 38). For each individual TF, cells were first ranked according to decreasing gene expression and the rank of MPH was plotted on the x axis (i.e., rank 1 indicates highest expression is in MPH). Second, expression in MPH was divided by the average expression in all other primary cells and plotted on the y axis as log2 fold difference. Hep‐IDCONNECT TF‐encoding genes were defined as those preferentially expressed in MPH (rank ≤ 10 and FC > 0). For comparison, Hep‐ID TF genes were plotted in an additional box on the right of the one highlighting Hep‐IDCONNECT TFs.

  2. B–E

    Expression of Hep‐ID (n = 13 genes), Hep‐IDCONNECT (n = 26 genes) and remaining TF‐encoding genes from cluster G (Others; n = 82 genes) was monitored in the indicated transcriptomic data (Dataset EV2). Box plots show log2 fold changes between adult versus newborn mouse livers (B), MPH of Hnf4a hep−/− (Hnf4a KO) versus wild‐type mice (C), a meta‐analysis of severe mouse liver injuries versus control livers (D; see Materials and Methods and Appendix Fig S6) and microdissected hepatocytes from alcohol‐related human liver cirrhosis (alcoholic steatohepatitis) versus control livers (E). Box plots are composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. Statistical significance was assessed using a one‐sided Wilcoxon rank sum test with Benjamini–Hochberg correction for multiple testing to determine if the mean log2 FC was statistically lower (B, D, E) or higher (C) than 0. *q < 0.05.

  3. F

    Distribution of H3K4me3 domain length at the TSS of Hep‐ID (n = 13 genes), Hep‐IDCONNECT (n = 26 genes) and remaining TF‐encoding genes from cluster G (Others; n = 82 genes) as defined through broad peak calling on mouse liver H3K4me3 ChIP‐seq data. Box plots are composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. Statistical difference between groups was defined using Kruskal–Wallis with Wilcoxon pairwise comparison tests followed by Benjamini–Hochberg correction for multiple testing. *q < 0.05.

  4. G

    Mouse phenotypes associated with Hep‐ID, Hep‐IDCONNECT and remaining TF‐encoding genes from cluster G (Others) were defined using ToppCluster. Dendrograms of hierarchical clustering are shown. ToppCluster uses hypergeometric tests and Bonferroni correction.

  5. H

    Enrichment for identity effector genes among the top 1,000 transcriptionally dysregulated genes in the MPH/livers of indicated genetically deficient mice. Log2 odds ratios were computed to compare the proportion of dysregulated versus non‐dysregulated genes within identity effector genes or control non‐TF‐encoding genes. Then a two‐sided Fisher exact test was performed to assess if the proportion of dysregulated genes was significantly different within the identity and control gene groups with Benjamini–Hochberg correction. *q < 0.05.

  6. I

    Binding of indicated Hep‐ID and Hep‐IDCONNECT TFs to the promoter of identity effector genes and a control group of non‐TF‐encoding genes of similar size (n = 424) was monitored using mouse liver ChIP‐seq data. A control group of non‐TF‐encoding genes (n = 424) matched for their promoter mouse liver activity was also used (Appendix Fig S9A and B). The control groups used were selected for providing data representative of those obtained with 1,000 reiterations of this analysis (see Materials and Methods; Appendix Fig S9A and B). The distribution of ChIP‐seq signals is shown using box plots composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. Pairwise one‐sided Wilcoxon rank sum tests with Benjamini–Hochberg correction were used to define whether the binding at identity effector genes versus control genes was significantly greater for each analyzed TF recruitment. *q < 0.05.

  7. J

    Correlation between Hep‐ID TFs, Hep‐IDCONNECT TFs and CTCF recruitment, as judged through the mining of mouse liver ChIP‐seq data, to identity effector genes. The dendrogram is issued from hierarchical clustering analysis.

Source data are available online for this figure.
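
The selection rule described for panel A (rank of MPH among cell types by decreasing expression, plus log2 fold difference of MPH over the average of all other cell types) can be sketched as below. This is an illustration under stated assumptions: the function names, the cell-type labels and the expression values are hypothetical, "FC > 0" is read here as log2 fold difference > 0, and expression values are assumed to be strictly positive.

```python
import math


def mph_rank_and_log2fc(expression, reference="MPH"):
    """expression: dict mapping cell-type name to the expression of one TF gene."""
    # Rank 1 means the reference cell type has the highest expression.
    # Ties are broken by dictionary order, which is an arbitrary choice.
    ordered = sorted(expression, key=expression.get, reverse=True)
    rank = ordered.index(reference) + 1
    others = [value for name, value in expression.items() if name != reference]
    log2_fc = math.log2(expression[reference] / (sum(others) / len(others)))
    return rank, log2_fc


def is_preferentially_expressed(expression, max_rank=10, reference="MPH"):
    """Apply the two thresholds quoted in the legend (rank <= 10, log2 FC > 0)."""
    rank, log2_fc = mph_rank_and_log2fc(expression, reference)
    return rank <= max_rank and log2_fc > 0


# Hypothetical TFs measured in three cell types.
tf_high = {"MPH": 8.0, "cell_a": 2.0, "cell_b": 2.0}
tf_low = {"MPH": 1.0, "cell_a": 4.0, "cell_b": 4.0}
print(mph_rank_and_log2fc(tf_high), is_preferentially_expressed(tf_high))
print(mph_rank_and_log2fc(tf_low), is_preferentially_expressed(tf_low))
```

With only three cell types the rank threshold is never limiting, so the second toy gene is rejected on its negative log2 fold difference alone; with the 39 cell types of the study both criteria would matter.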
Figure EV3
Figure EV3. CTE and UBQ TF genes with privileged expression in MPH
The right shows a zoomed view of CTE (black) and UBQ (green) TFs from cluster G comprised within the framed area from Fig 2A (shown again on the left).
Figure 3
Figure 3. Hep‐IDCONNECT TF binding to and regulation of Hep‐ID TF‐encoding genes in basal conditions
  1. Binding of indicated Hep‐ID and Hep‐IDCONNECT TFs to the promoter of Hep‐ID TF genes and a control group of non‐Hep‐ID TF‐encoding genes matched for their promoter activity (Appendix Fig S10A) of similar size (n = 13) was monitored using mouse liver ChIP‐seq data. The control group used was selected for providing data representative of those obtained with 1,000 reiterations of this analysis (see Materials and Methods; Appendix Fig S10A). The distribution of ChIP‐seq signals is shown using box plots composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. One‐sided Wilcoxon rank sum tests with Benjamini–Hochberg correction were used to define whether the binding on Hep‐ID TF gene promoters was greater than on control genes for each individual TF ChIP‐seq dataset. *q < 0.05.

  2. Transcriptional modulation of Hep‐ID TFs and a control group of non‐Hep‐ID TF‐encoding genes matched for their promoter activity (Appendix Fig S10A) of similar size (n = 13) in mouse liver/MPH of mice deleted for the indicated Hep‐ID or Hep‐IDCONNECT TF genes. The control group used was selected for providing data representative of those obtained with 1,000 reiterations of this analysis (see Materials and Methods; Appendix Fig S10A). The distribution of log2 fold changes is shown using box plots composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. One‐sided Wilcoxon rank sum tests with Benjamini–Hochberg correction were used to define whether log2 fold changes for Hep‐ID TF genes were lower than those of the control genes for each individual transcriptomic dataset. *q < 0.05.

  3. 12 h gene expression oscillation analyses in the mouse liver performed by Meng et al (2020) from WT and XBP1 hep−/− animals were used to identify XBP1‐dependent oscillating expression patterns for Hep‐ID TF genes (Table EV2).

  4. Average gene expression levels of Mlxipl in the livers of WT and XBP1 hep−/− mice across circadian time (n = 2 mice per group). Error bars show standard deviations.

Source data are available online for this figure.
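
Several legends report q-values obtained with the Benjamini–Hochberg correction. The standard step-up procedure is sketched below for orientation; the function name and the example P-values are hypothetical, and this is not the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical P-values from four independent comparisons.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in q_values]
print(q_values, significant)
```

Each q-value is the raw P-value scaled by the number of tests over its rank, capped so that a smaller P-value never receives a larger q-value than a bigger one.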
Figure 4
Figure 4. Resetting Hep‐ID TF gene expression through Hep‐IDCONNECT TF activation in hepatocellular carcinoma
  1. A

    Comparison of the average expression of Hep‐IDCONNECT and Hep‐ID TF genes in HCC (n = 367). The Spearman correlation coefficient r and associated P‐value are indicated.

  2. B

    Overall survival of patients with HCC expressing low (n = 289) or high (n = 73) levels of the Hep‐IDCONNECT TF‐encoding genes. Differential overall survival analysis was assessed by Kaplan–Meier (KM) log rank adjusted for 100 permutations (Cheng et al, 2022).

  3. C

    Analyses similar to that described in panel A using individual Hep‐IDCONNECT TF genes. Hep‐IDCONNECT TFs which have been ascribed with HCC suppressive functions in the literature are indicated at the bottom.

  4. D

    Hep‐IDCONNECT TF cistromes in HepG2 cells (n = 13 independent cistromes) were mined for binding to Hep‐ID TF‐encoding genes. Peaks localized ± 10 kilobases from transcriptional start sites were considered in these analyses.

  5. E

    Transcriptional modulation of Hep‐ID TFs (n = 13 genes) and a control group of non‐Hep‐ID TF‐encoding genes matched for their promoter activity (Appendix Fig S10A) of similar size (n = 13 genes) in Huh7 or HepG2 cells treated with GW3965, T3 or GC‐1, respectively. The control group used was selected for providing data representative of those obtained with 1,000 reiterations of this analysis (see Materials and Methods; Appendix Fig S10A). The distribution of log2 fold changes is shown using box plots. Kruskal–Wallis with two‐sided Wilcoxon pairwise comparison tests followed by Benjamini–Hochberg correction were used to define whether transcriptional regulation of Hep‐ID TF genes was different from that of the control genes for each individual transcriptomic dataset. *q < 0.05.

  6. F–H

    mRNA expression of the indicated genes was monitored using RT‐qPCR in HepG2 cells treated with T3 or GC‐1 for 24 h or 96 h. Bar graphs show mean ± SD (n = 3 biological replicates) of log2 fold changes in treated versus untreated HepG2 cells. For Hnf4a, the log2 fold change in the ratio of P1 over P2 promoter‐derived isoforms is also shown. Gray dots show the results obtained from the three independent biological replicates (each performed in technical triplicates). One‐sample t‐test with Benjamini–Hochberg correction for multiple testing was used to determine if the mean log2 FC was statistically different from 0. *q < 0.05.

  7. I

    Western blot assays performed using antibodies against the indicated proteins on extracts from HepG2 cells treated or not with T3 for 24 h. Rep#1–3 indicates the three independent biological replicates analyzed.

  8. J

    Modulation of Hep‐ID TF gene expression in precancerous nodules compared to control rat livers and in liver nodules of rats treated with T3 compared to nodules of non‐treated rats. A control group of non‐Hep‐ID TF‐encoding genes matched for their promoter activity (Appendix Fig S10A) of similar size (n = 13) is also shown. The control group used was selected for providing data representative of those obtained with 1,000 reiterations of this analysis (see Materials and Methods; Appendix Fig S10A). Box plots show the log2 fold changes. Two‐sided Wilcoxon rank sum tests with Benjamini–Hochberg correction were used to define whether transcriptional regulation of Hep‐ID TF genes was different from that of the control genes for each individual transcriptomic dataset. *q < 0.05.

  9. K

    Dot plots showing the transcriptional regulation of individual Hep‐ID TF gene expression in precancerous nodules compared to control rat livers (left; Nodules/Control) and in liver nodules of rats treated with T3 compared to nodules of non‐treated rats (right; Nodules + T3/Nodules). No data were recovered for Onecut2, Prox1 and Mlxipl.

  10. L

    Identity effector genes significantly downregulated in Nodules/Control (q < 0.05) were split into three groups according to their log2 fold changes (i.e., low, intermediate, and high repression; pink boxes) and then monitored for induction in Nodules + T3/Nodules (green boxes). Statistical differences between the High repression and the other groups regarding the Nodules/Control comparison, on the one hand, or the Nodules + T3/Nodules comparison, on the other hand, were defined using Kruskal–Wallis with two‐sided Wilcoxon pairwise comparison tests followed by Benjamini–Hochberg correction for multiple testing. *q < 0.05.

Data information: Box plots in panels D, E, and J are composed of a box from the 25th to the 75th percentile with the median as a line. Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box. Source data are available online for this figure.
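
Panels A and C report Spearman correlation coefficients. As a reminder of what that statistic measures, the sketch below computes it as the Pearson correlation of average ranks. The function names and example vectors are hypothetical, the associated P-value is not computed, and constant inputs are not handled (they would cause a division by zero).

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical average expression values in five tumours.
increasing = spearman_r([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
decreasing = spearman_r([1, 2, 3, 4, 5], [25, 16, 9, 4, 1])
print(increasing, decreasing)
```

Because only ranks enter the calculation, any monotonic relationship between the two gene sets yields a coefficient of 1 or -1, whether or not it is linear.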
Figure EV4
Figure EV4. T3‐mediated regulation of Hep‐ID TF gene expression in human IHH cells
  1. A–C

    mRNA expression of the indicated genes was monitored using RT‐qPCR in IHH cells treated with T3 or GC‐1 for 24 or 96 h. Bar graphs show mean ± SD (n = 3 biological replicates) of log2 fold changes in treated versus untreated IHH cells. For Hnf4a, the log2 fold change in the ratio of P1 over P2 promoter‐derived isoforms is also shown. Gray dots show the results obtained from the three independent biological replicates. One‐sample t‐test with Benjamini–Hochberg correction for multiple testing was used to determine if the mean log2 FC was statistically different from 0. *q < 0.05.

Figure EV5
Figure EV5. Acute IL1B challenge triggers partial hepatic loss of identity
  1. Analysis similar to that shown in Appendix Fig S6 showing that IL1B treatment induced a hepatic transcriptomic profile leaning towards that of not fully mature hepatocytes pointing to partial dedifferentiation.

  2. Dot plots showing the transcriptional regulation of individual Hep‐ID TF gene expression in livers of IL1B‐challenged mice compared to non‐treated animals issued from transcriptomic analyses.

  3. Correlation between Dio1 and Ccl2 mRNA expression levels assessed using RT‐qPCR in the livers of all mice treated with IL1B + T3 from Fig 5. Gene expression values are log2 FC relative to control (PBS‐injected) mice. Linear regression and the coefficient of determination (r²) are shown.

  4. Gene expression levels of Ly6g (neutrophil marker) and Ptprc (also known as CD45; broad immune cell marker) were analyzed as described for Fig 5B–D. Box plots are composed of a box from the 25th to the 75th percentile with the median as a line (n = 13 mice for the PBS group, 17 for the IL1B group and 10 for the other groups). Whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box.

Figure 5
Figure 5. Resetting Hep‐ID TF gene expression through THRB activation in inflammation‐induced hepatocyte dedifferentiation
  1. A

    Experimental protocol for acute inflammation‐induced loss of hepatocyte identity in vivo. Mice were injected with IL1B (IL1B; n = 17), IL1B followed by T3 (IL1B + T3; n = 30) or a control group (PBS; n = 14). All livers were collected 6 h after the initial injection.

  2. B–D

    mRNA expression of the indicated genes was monitored in mouse livers using RT‐qPCR. Mice treated with IL1B + T3 were subdivided into tertiles based on the mean expression of the Dio1 and Hectd3 genes and defined as low, intermediate or high T3 responsiveness groups (Low resp., Int. resp. and High resp., respectively). Fold change relative to the mean of the control group is shown using box plots composed of a box from the 25th to the 75th percentile with the median as a line (n = 13 mice for the PBS group, 17 for the IL1B group and 10 for the other groups). Whiskers extend to the maximum and minimum values. Statistical differences between the High resp. and the other IL1B‐treated groups were defined using Kruskal–Wallis with two‐sided Wilcoxon pairwise comparison tests followed by Benjamini–Hochberg correction for multiple testing. *q < 0.05.

Source data are available online for this figure.
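
The tertile grouping used in panels B–D (mice ordered by the mean expression of Dio1 and Hectd3, then split into three equal groups) can be sketched as follows. Mouse identifiers and expression values are hypothetical, the function name is an assumption, and the treatment of ties or of group sizes not divisible by three is a choice made here, not something stated in the legend.

```python
def split_into_tertiles(expression, genes=("Dio1", "Hectd3")):
    """Label each mouse by tertile of the mean expression of the marker genes."""
    labels = ("Low resp.", "Int. resp.", "High resp.")
    # Responsiveness score: mean expression of the marker genes per mouse.
    scores = {
        mouse: sum(values[gene] for gene in genes) / len(genes)
        for mouse, values in expression.items()
    }
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    # Position in the sorted list decides the tertile (0, 1 or 2).
    return {mouse: labels[position * 3 // n] for position, mouse in enumerate(ordered)}


# Hypothetical fold changes for six IL1B + T3 treated mice.
expression = {
    "m1": {"Dio1": 0.5, "Hectd3": 0.7},
    "m2": {"Dio1": 2.0, "Hectd3": 1.8},
    "m3": {"Dio1": 1.0, "Hectd3": 1.2},
    "m4": {"Dio1": 3.0, "Hectd3": 2.6},
    "m5": {"Dio1": 0.8, "Hectd3": 0.6},
    "m6": {"Dio1": 1.5, "Hectd3": 1.3},
}
groups = split_into_tertiles(expression)
print(groups)
```

With the 30 IL1B + T3 mice of the protocol, the same rule would give three groups of 10, matching the group sizes quoted in the legend.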
Figure 6
Figure 6. Proposed model for the control of hepatocyte identity by an extended transcription factor network
Schematic summarizing the main findings of our study pointing to an extended hepatic TF identity network which includes THRB. See Discussion for greater details.
