Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2023 Apr 13;3(5):100301.
doi: 10.1016/j.xgen.2023.100301. eCollection 2023 May 10.

Single-nucleus RNA sequencing of pre-malignant liver reveals disease-associated hepatocyte state with HCC prognostic potential

Affiliations

Single-nucleus RNA sequencing of pre-malignant liver reveals disease-associated hepatocyte state with HCC prognostic potential

Rodrigo Carlessi et al. Cell Genom. .

Abstract

Current approaches to staging chronic liver diseases have limited utility for predicting liver cancer risk. Here, we employed single-nucleus RNA sequencing (snRNA-seq) to characterize the cellular microenvironment of healthy and pre-malignant livers using two distinct mouse models. Downstream analyses unraveled a previously uncharacterized disease-associated hepatocyte (daHep) transcriptional state. These cells were absent in healthy livers but increasingly prevalent as chronic liver disease progressed. Copy number variation (CNV) analysis of microdissected tissue demonstrated that daHep-enriched regions are riddled with structural variants, suggesting these cells represent a pre-malignant intermediary. Integrated analysis of three recent human snRNA-seq datasets confirmed the presence of a similar phenotype in human chronic liver disease and further supported its enhanced mutational burden. Importantly, we show that high daHep levels precede carcinogenesis and predict a higher risk of hepatocellular carcinoma development. These findings may change the way chronic liver disease patients are staged, surveilled, and risk stratified.

Keywords: HCC; cancer surveillance; chronic liver disease; daHep; disease-associated hepatocyte; hepatocellular carcinoma; pre-malignancy; prognostic biomarker; snRNA-seq.

PubMed Disclaimer

Conflict of interest statement

M.A.F. is the founder and shareholder of Celesta Therapeutics.

Figures

None
Graphical abstract
Figure 1
Figure 1
A single-nucleus atlas of the healthy and pre-malignant mouse liver (A) Experimental design and workflow to discovery of disease staging and predictive transcriptional signatures. (B) t-SNE visualization and unsupervised clustering of 40,748 single hepatic nuclei. Nine major liver cell types were annotated based on cell-specific marker expression and are displayed in order of abundance. Hepatocytes (Hep), mesenchymal cells, endothelial cells (Endo), biliary epithelial cells (BECs), myeloid cells, B cells, T and NK cells (T/NK cells), plasmacytoid dendritic cells (pDCs) and mesothelial cells (Meso) are shown. (C) Heatmap showing average expression of the top 25 genes in each cluster. Five cell type-specific canonical markers found within the top 25 genes per cluster are displayed in brackets. (D) Expression of individual marker genes of each cluster in the t-SNE space. (E) Absolute cell type counts in each sample. CDE and TAA mice showed an increase in non-parenchymal cell types associated with inflammation and tissue repair and a relative reduction in hepatocyte numbers. (F) t-SNE visualization split by experimental condition. Healthy (normal chow); choline-deficient, ethionine-supplemented (CDE) diet; and thioacetamide (TAA) in the drinking water (n = 3 per group) are shown, all at the 3-week time point. (G) Correlation heatmap between mouse and human cells. Normalized gene expression values in each cell type were used to calculate Pearson correlation coefficient values.
Figure 2
Figure 2
Identification of a disease-associated hepatocyte (daHep) signature (A) UMAP visualization of hepatocyte subclustering. Four subsets were identified: three representing normal hepatocyte zonation (Zone_1_Hep, Zone_2_Hep, Zone_3_Hep) and one cluster of hepatocytes with a disease-associated signature (daHep). (B) Expression of zonation marker genes in the UMAP space (left). Immunohistochemistry panels for HAL, CYP2E1, and GLUL from the Human Protein Atlas depicting zone-specific expression (right). (C) Dot plot of top differently expressed genes across hepatocyte subsets. Zonated hepatocyte clusters are defined by well-characterized zonation markers. Circle size denotes detection frequency and color denotes expression level. (D) UMAP visualization split by experimental condition. Disease-associated hepatocyte cluster is found in CDE and TAA and is nearly absent in healthy mice. Highlighted by red ellipses. (E) Disease-associated signature is enriched with stress response, cell death, cell-cycle arrest, and senescence markers, while normal hepatocyte function and identity genes are downregulated. (F) Overrepresentation analysis (ORA) of upregulated genes in daHep with KEGG terms. The dotted line shows the adjusted false discovery rate (FDR) cutoff of ≤0.05. (G) As in (F) with downregulated genes in daHep. (H) Gene set enrichment analysis (GSEA) of ranked daHep differentially expressed genes (DEGs) using co-expression modules generated from The Cancer Genome Atlas (TCGA) hepatocellular carcinoma RNA-seq dataset (TCGA-LIHC) (top). Enrichment plots for top overrepresented and underrepresented modules, M246 and M333, respectively (bottom). (I) Boxplots depicting expression levels of indicated modules in tumor samples from TCGA-LIHC patients (tumor, n = 369) in comparison with adjacent non-involved tissue from TCGA and healthy human liver from the GTEx datasets (normal, n = 160), ∗p < 0.0001 by one-way ANOVA. (J) Kaplan-Meier survival analysis of TCGA-LIHC patients ranked high (top quartile, n = 91) and low (bottom quartile, n = 91) in terms of their expression of indicated modules. Hazard ratio (HR) and p values were calculated by the log rank test; 95% confidence intervals are denoted by the dotted curves. Analyses in (F), (G), and (H) were performed at WebGestalt and in (I) and (J) using the GEPIA2 web server.
Figure 3
Figure 3
Identification of daHeps in situ and characterization of their genomic mutational landscape (A) Expression distribution of Anxa2 and G6pc in the UMAP space of hepatocyte subsets. Red ellipses highlight the location of daHeps (left). RNA in situ hybridization (RNAscope) images of healthy, CDE, and TAA mice at the indicated time points (right). G6pc (green) and Anxa2 (purple) can distinguish healthy hepatocytes and daHep areas, respectively. White scale bar, 500 μm; yellow scale bar, 100 μm. (B) Expression distribution of Anxa2 and G6pc in the UMAP space of all liver cells. Black arrows indicate the hepatocyte cluster; red ellipses highlight the location of daHeps (left). High-magnification RNAscope images of the fibrotic area of a TAA-treated mouse at 3 weeks (right). G6pc, green; Anxa2, purple; and DAPI, blue. Scale bars, 20 μm; white arrows, normal hepatocytes; yellow arrows, daHeps. (C) RNAscope image of a tumor-bearing CDE-treated mouse at 32 weeks. Anxa2, purple; G6pc green; scale bar, 1,000 μm; red dashed line, tumor area. (D) Immunostaining of ANXA2 and G6PC from the Human Protein Atlas, showing expression in healthy liver and HCC samples. Scale bar, 100 μm. (E) ORA of transcription factor targets with daHep DEGs. (F) Genome-wide copy number profiles determined by ichorCNA analysis of microdissected TAA-treated mouse liver sectors as indicated. Plotted are log2 ratio of read counts from 1 Mb bins. Copy neutral, blue; deletion, green; gain, brown; and duplication, red. Predicted ploidy estimates from ichorCNA analysis are indicated above each graph.
Figure 4
Figure 4
High frequency of daHeps is a hallmark of chronic liver disease and correlates with disease stage (A) Gene expression deconvolution by CIBERSORTx was used to estimate daHep frequencies in publicly available RNA-seq datasets of chronic liver disease and hepatocellular carcinoma. (B) Bar plot of CIBERSORTx output showing frequencies of each hepatocyte subtype in individual mice fed normal chow or a NASH-inducing diet from Xiong et al. (left). Summarized data of daHep frequencies (right). Bars indicate mean ± SD; ∗∗∗∗p < 0.0001 by unpaired t test. (C) As in (B), for individual human subjects grouped according to stage in the non-alcoholic fatty liver disease (NAFLD) spectrum from Suppli et al. (left). Summarized data of daHep frequencies (center). Bars indicate mean ± SD; ∗p < 0.05, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001 by one-way ANOVA with Dunnett’s multiple comparisons test vs. normal weight. Receiver operating characteristic (ROC) curve assessing the power of daHep frequencies to discriminate NASH patients vs. patients in earlier stages of NAFLD and healthy normal-weight individuals (right). AUC, area under the curve. (D) daHep frequencies in the SteatoSITE dataset (n = 679). Patients were categorized according to the histological fibrosis scoring system NASH CRN (left) and Ishak scores (center). Bars indicate mean ± SD. ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001 by Kruskal-Wallis with Dunn’s post hoc test vs. fibrosis stage F1; # for p vs. F2; and % for p vs. F3; and by one-way ANOVA with Tukey’s multiple comparisons test vs. Ishak score 0, # for p vs. 1; % for p vs. 2; & for p vs. 3, and ! for p vs. 4. ROC curve assessing the power of daHep frequencies to discriminate patients between fibrosis stages 1 and 4 (right). (E) Violin plots depicting log-normalized expression levels of indicated fibrosis-, inflammation-, and ductular reaction-associated genes in SteatoSITE subjects grouped according to high (90th percentile) or low (10th percentile) daHep frequencies. (F) Visualization of individuals with high and low daHep levels in the SteatoSITE dataset in the UMAP space. UMAP was implemented using the top 25 principal components calculated using the 2,000 most variable genes in the dataset. (G–L) Summarized data of daHep frequencies in the indicated datasets. Bars indicate mean ± SD. ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001 by one-way ANOVA with Dunnett’s multiple comparisons test vs. NAS score 0 in (G) and vs. control in (L); by Mann-Whitney test in (H) and (K); and by Wilcoxon matched-pairs signed-rank test in (I) and (J).
Figure 5
Figure 5
Human daHeps identified in public snRNA-seq datasets show an exacerbated mutational burden (A) UMAP visualizations and unsupervised clustering of 117,123 single hepatic nuclei from integrated GSE185477, GSE174748, GSE192742, and GSE212046 datasets. Six liver cell types were annotated based on cell-specific marker expression (left). Hepatocytes (Hep) and endothelial (Endo), lymphoid, myeloid, mesenchymal, and biliary epithelial cells (BECs) are shown. Nuclei were labeled according to sample IDs (right). (B) Relative frequencies of cell types in each sample. (C) Consensus k-means clustered dot plot showing expression of top genes in each cell type. Circle size, detection frequency; color, expression levels. (D) UMAP visualizations and unsupervised clustering of 78,250 hepatocyte nuclei from integrated GSE185477, GSE174748, GSE192742, and GSE212046 datasets. Five hepatocyte subsets were annotated based on gene expression and liver pathology metadata (left). UMAP visualizations split by liver pathology metadata (right). Red ellipses highlight the daHep cluster region. (E) Relative frequencies of hepatocyte subsets in each sample. Liver pathology metadata of sample groups are highlighted above the bar plot. (F) Frequencies of daHeps (top) and combined HCC clusters (bottom) in each liver pathology group. Bars indicate mean ± SD. ∗p < 0.05 by one-way ANOVA with Dunnett’s multiple comparisons test vs. healthy for each group. (G) Consensus k-means clustered dot plot showing expression of top genes in each hepatocyte subset. Circle size, detection frequency; color, expression levels. (H) Boxplots depicting expression levels of genes upregulated in daHeps and HCC vs. normal hepatocytes (top) and downregulated in daHeps and HCC vs. normal hepatocyte (bottom) in tumor samples from TCGA-LIHC patients (tumor, n = 369) in comparison with adjacent non-involved tissue and healthy human liver from the GTEx dataset (normal, n = 160). Log2 FC cutoff 0.25, ∗p < 0.0001 by one-way ANOVA. (I) ORA of upregulated (top) and downregulated (bottom) genes in human daHeps with WikiPathways terms. (J) Heatmap showing inferCNV output of downsampled (10,000 nuclei) human hepatocyte subsets. Normal hepatocytes were set as reference and compared with daHep, HCC_1, HCC_2, and HCC_3 clusters. Rows, single nuclei; columns, genes. Genes were ordered according to genomic positioning, and individual chromosomes labeled in the x axis are delineated by black vertical lines. (K) Distribution of inferCNV hidden Markov model (HMM) predictions on the UMAP of human hepatocyte clusters. Nuclei are labeled according to proportions of expressed genes within CNVs for the indicated chromosomes. (L) Ridgeline plots depicting the fraction of genome with inferred CNVs across hepatocyte nuclei grouped according to liver pathology metadata (top) and cluster identity (bottom).
Figure 6
Figure 6
Transcriptional signature of daHeps is a predictor of future HCC development (A) Schematic representation of the hepatocellular carcinoma-predictive study using a partial penetrance model: MUP-uPA mice fed a high-fat-diet (HFD). Representative H&E image of MUP-uPA HFD-fed mice at 24 weeks. Scale bar, 100 μm. (B) Bar plot of individual MUP-uPA HFD-fed mice grouped according to tumor development outcome at 40 weeks (TF, tumor-free, and TB, tumor-bearing), showing frequencies of each hepatocyte subtype (left). Summarized data of daHep frequencies at 24 weeks (right). Bars indicate mean ± SD; ∗∗p < 0.01 by unpaired t test. (C) Receiver operating characteristic (ROC) curve assessing the power of daHep frequencies to predict tumor development outcome at 40 weeks. AUC, area under the curve. (D) Alanine aminotransferase (ALT) levels at 24 weeks, grouped according to tumor development outcome at 40 weeks. (E) Scatterplot showing log2 fold change of all 2,014 DEGs in daHeps (x axis) compared with their fold changes in MUP-uPA HFD TB vs. TF mice at 24 weeks (y axis). Pearson’s correlation analysis p < 0.0001. (F) Heatmap showing normalized average expression of top 80 DEGs in TB vs. TF across hepatocyte subsets in the snRNA-seq dataset. (G) Counts per million (CPM) values of two top upregulated (Tinag and Cyp2a4) and downregulated (C6 and Cyp7b1) genes in MUP-uPA HFD TB vs. TB mice at 24 weeks (top), and their expression in the UMAP space of hepatocytes in the mouse snRNA-seq dataset (bottom). Red ellipses, daHep cluster region. (H) CDKN1A expression in the UMAP space of human hepatocyte subsets. Red ellipse, daHep cluster region (top left). Representative P21 immunohistochemistry in biopsies of HCV patients that progressed vs. did not progress to HCC. Black dashed line marks magnified area (right). Summary of P21 count data. Error bars indicate mean ± SD; ∗∗∗p < 0.001 by Mann-Whitney test (bottom left).
Figure 7
Figure 7
Trem2 macrophages are spatially associated with the daHep niche (A) UMAP visualization of myeloid reclustering reveals six subsets, including Kupffer cells, monocytes, Trem2 macrophages, two subsets of conventional dendritic cells (cDC1 and cDC2), and “mature DCs enriched in immunoregulatory molecules” (Mreg_DCs). (B) Frequencies of myeloid subsets in each experimental group. Bars indicate mean ± SEM; ∗p < 0.05 by one-way ANOVA with Dunnett’s multiple comparisons test vs. healthy for each subset. (C) Heatmap showing expression of the top 10 marker genes in each cluster. (D) Expression of marker genes of each subset in the UMAP space. (E) UMAP visualization split by experimental condition. Trem2 macrophages are enriched in disease models and nearly absent in healthy mice (red ellipses). (F) RNA in situ hybridization (RNAscope) images of healthy and TAA mice. Anxa2, purple; Gpnmb, white; DAPI, blue. White scale bar, 200 μm; red scale bar, 100 μm; PV, portal vein; CV, central vein.

References

    1. Sung H., Ferlay J., Siegel R.L., Laversanne M., Soerjomataram I., Jemal A., Bray F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA. Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660. - DOI - PubMed
    1. Moon A.M., Singal A.G., Tapper E.B. Contemporary epidemiology of chronic liver disease and cirrhosis. Clin. Gastroenterol. Hepatol. 2020;18:2650–2666. doi: 10.1016/j.cgh.2019.07.060. - DOI - PMC - PubMed
    1. Rawla P., Sunkara T., Muralidharan P., Raj J.P. Update in global trends and aetiology of hepatocellular carcinoma. Contemp. Oncol. 2018;22:141–150. doi: 10.5114/wo.2018.78941. - DOI - PMC - PubMed
    1. Ramachandran P., Matchett K.P., Dobie R., Wilson-Kanamori J.R., Henderson N.C. Single-cell technologies in hepatology: new insights into liver biology and disease pathogenesis. Nat. Rev. Gastroenterol. Hepatol. 2020;17:457–472. doi: 10.1038/s41575-020-0304-x. - DOI - PubMed
    1. Halpern K.B., Shenhav R., Matcovitch-Natan O., Toth B., Lemze D., Golan M., Massasa E.E., Baydatch S., Landen S., Moor A.E., et al. Single-cell spatial reconstruction reveals global division of labour in the mammalian liver. Nature. 2017;542:352–356. doi: 10.1038/nature21065. - DOI - PMC - PubMed