Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2024 Aug 8;25(1):211.
doi: 10.1186/s13059-024-03354-z.

Transcriptional and epigenetic characterization of a new in vitro platform to model the formation of human pharyngeal endoderm

Affiliations

Transcriptional and epigenetic characterization of a new in vitro platform to model the formation of human pharyngeal endoderm

Andrea Cipriano et al. Genome Biol. .

Abstract

Background: The Pharyngeal Endoderm (PE) is an extremely relevant developmental tissue, serving as the progenitor for the esophagus, parathyroids, thyroids, lungs, and thymus. While several studies have highlighted the importance of PE cells, a detailed transcriptional and epigenetic characterization of this important developmental stage is still missing, especially in humans, due to technical and ethical constraints pertaining to its early formation.

Results: Here we fill this knowledge gap by developing an in vitro protocol for the derivation of PE-like cells from human Embryonic Stem Cells (hESCs) and by providing an integrated multi-omics characterization. Our PE-like cells robustly express PE markers and are transcriptionally homogenous and similar to in vivo mouse PE cells. In addition, we define their epigenetic landscape and dynamic changes in response to Retinoic Acid by combining ATAC-Seq and ChIP-Seq of histone modifications. The integration of multiple high-throughput datasets leads to the identification of new putative regulatory regions and to the inference of a Retinoic Acid-centered transcription factor network orchestrating the development of PE-like cells.

Conclusions: By combining hESCs differentiation with computational genomics, our work reveals the epigenetic dynamics that occur during human PE differentiation, providing a solid resource and foundation for research focused on the development of PE derivatives and the modeling of their developmental defects in genetic syndromes.

Keywords: Epigenomics; Human Development; Pharyngeal Endoderm; Retinoic Acid; Transcription Factors; Transcriptomics.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1
Fig. 1
Bulk RNA-Seq analysis reveals extensive transcriptomic changes during the in vitro differentiation of PE cells. A Schematic representation of the protocol for the in vitro differentiation of hESCs into Pharyngeal Endoderm cells. Created with BioRender.com. B RT-qPCR time course analysis showing the relative expression of DE and PE specific markers in the presence (black) or absence (grey) of RA. Data were normalized on PDGB expression and represent means ± SEM of three independent time-course experiments. *p < 0.05, **p < 0.01, ***p < 0.001, unpaired Student’s t-test. C Heatmap showing the expression in DE (d2), AFE (d5 -RA), and PE (d5 +RA) samples of the greatly varying (strict) DEGs identified in DE vs AFE and DE vs PE contrasts, as well as their separation in ten clusters produced via k-means clustering. Hierarchical clustering of the ten clusters is also shown; see also Additional File 3. Known markers belonging to each cluster are shown in the boxes on the right. The expression values reported in the heatmap correspond to row-scaled (Z-score) rlog-transformed count data. D Heatmap showing the results of the GO BP term enrichment analysis performed on the ten gene clusters shown in C; the color intensity in each cell is proportional to the Enrichment Ratio. The heatmap reports only a set of the significantly enriched categories (FDR < 0.05 in at least one cluster), selected in order to reduce redundancy; Enrichment Ratio is plotted only when FDR < 0.25; see also Additional File 4. E Spearman correlation matrices showing the similarity between hESCs, DE, AFE, and PE cells (rows) and embryonic mouse foregut endodermal cell type clusters identified by Han and colleagues (columns), based on the expression of the top 10 transcription factors enriched in each cluster. Text and circle width in each cell of the matrices are proportional to the absolute value of the Spearman correlation. Each matrix corresponds to a different murine developmental stage (day 8.5, day 9.0, and day 9.5)
Fig. 2
Fig. 2
Single-cell RNA-Seq analysis demonstrates the transcriptional homogeneity of the in vitro-derived PE (d5 +RA) cells. A UMAP-based visualization of scRNA-Seq data produced from hESCs, DE (d2), AFE (d3), AFE (d5 -RA), and PE (d5 +RA) cells, with cells colored based on the differentiation stage (top panel) and Leiden clustering (with resolution 0.4) (bottom panel). B Matrix plots showing the log2(FC) of the top 50 DEGs identified in the hESCs, DE (d2), AFE (d3), AFE (d5 -RA), and PE (d5 +RA) clusters (log2(FC) ≥ 4). Known markers for the hESCs, DE, and PE stages are highlighted in blue, orange and green, respectively. C Diffusion map of scRNA-Seq data colored according to the differentiation stage. We show the second and the fourth diffusion components (DC) versus the first in the top and bottom panels, respectively. D UMAP plot representation of the single-cell gene expression (log[norm.counts + 1]) of the marker genes highlighted in B, in hESCs, DE (d2), AFE (d3), AFE (d5 -RA), and PE (d5 +RA) cells
Fig. 3
Fig. 3
ATAC-Seq analysis reveals functional differentially accessible regions among DE, AFE, and PE cell types. A Heatmap showing the ATAC-Seq signal in DE, AFE, and PE samples of the DARs identified in all contrasts, as well as their separation in six clusters produced via k-means clustering. Hierarchical clustering of the six clusters is also shown. Average PhyloP conservation scores, calculated for each genomic position within DARs and Common peaks, are shown in the plots on the right. The ATAC-Seq signal values reported in the heatmap correspond to row-scaled (Z-score) log2-transformed library size-normalized count data. B Bar plot showing the genomic annotation of DARs belonging to each cluster shown in Fig. 3A and Common ATAC-Seq peaks. Each genomic feature is represented by a specific color shown in the legend. C Table showing the Normalized Enrichment Scores (NES) calculated performing GSEA on each differential gene expression contrast (DE vs AFE, DE vs PE, and AFE vs PE) and using sets of expressed protein-coding genes having a TSS in proximity (< 50 kb) of cluster-specific DARs. Positive NES: the gene set is enriched among the upregulated genes; Negative NES: the gene set is enriched among the downregulated genes. D Heatmap showing the results of the GREAT analysis performed on the six DAR clusters shown in A; the color intensity in each cell is proportional to the adjusted p-value. The heatmap reports a set of the significantly enriched GO BP terms (adjusted p-value < 0.01 in at least one cluster), selected in order to reduce redundancy
Fig. 4
Fig. 4
Chromatin accessibility and expression data allow the inference of cell type-specific transcription factor activity. Heatmaps showing the results of the integrated analysis of cell type-specific TF activity. The TF motifs here reported were selected based on maelstrom Z-score (heatmap on the left), BiFET adjusted p-value (heatmap in the middle), and TF expression (heatmap on the right), and on the correlation between these measures (see Methods). The motifs spanning multiple rows are associated with multiple TFs having expression correlated with enrichment; see also Additional File 8
Fig. 5
Fig. 5
ChIP-Seq analysis reveals functional chromatin state changes between DE and PE stages. A Emission probabilities of the 10-state ChromHMM model. Each row represents a chromatin state and reports the frequency of occurrence of each HM in that state. Red and orange boxes indicate promoter and enhancer states, respectively. B Heatmaps showing the fold enrichment of each ChromHMM state for different genomic features (left panel) and at fixed positions relative to TSS (right panel) in DE and PE cells. The fold enrichments are calculated as the ratio between observed and expected number of genomic bins for each overlap, except for the Genome % column, which reports the percentage of genomic bins occupied by each state. The color intensities in the left panel are normalized within each column between its minimum value (white) and its maximum value (blue), while those in the right panel are normalized between the minimum value (white) and the maximum value (blue) of the whole matrix. Red and orange boxes indicate promoter and enhancer states, respectively. C Heatmap showing how many genomic bins transition from a chromatin state to another in the differentiation from DE to PE, as well as the fold enrichment of each transition (see Methods). Only cells corresponding to transitions with fold enrichment > 1.5 are colored, the color intensity being proportional to the fold enrichment. Poorly represented transitions (< 200 bins) are masked using dark grey color. D Heatmap showing the chromatin state transitions that are enriched in upregulated (red) or downregulated (blue) nearest genes. The color intensity is proportional to the difference between the number of upregulated and downregulated nearest genes, divided by the total number of nearest genes, while the digits within each cell correspond to the –log10(adjusted p-value) of the enrichment, with a – sign when the enrichment is towards downregulated genes. E Heatmap showing the chromatin state transitions that are enriched in Gain (red) or Lose (blue) ATAC-Seq peaks. The color intensity is proportional to the difference between the number of Gain and Lose overlapping peaks, divided by the total number of bins involved in the transition, while the digits within each cell correspond to the –log10(adjusted p-value) of the enrichment, with a – sign when the enrichment is towards Lose peaks. F Dot plot showing the GO BP terms enriched among the genes whose promoter transition from a bivalent state in DE (TssBiv, EnhBiv) to a TSS state in PE (TssA, Tss, TssFlnk) and overlap with a Common or Gain ATAC-Seq peak. The plot reports only a set of the significantly enriched categories (FDR < 0.05 in at least one class of TSS), selected in order to reduce redundancy
Fig. 6
Fig. 6
Multi-omics data integration allows to infer a PE-specific transcription factor network. A Heatmaps showing RARA ChIP-Seq and ATAC-Seq signal, measured in AFE (d5 -RA) and PE (d5 +RA) cells, within 3 kb-long regions centered on the summits of Enriched, Equal, and Depleted RARA-ChIP-Seq peaks. ChIP-Seq Signal was calculated as log2-transformed fold change of the RPGC values of IP over input, with a bin size of 50 bp. ATAC-Seq Signal was calculated on merged replicates as RPGC values with a bin size of 50 bp. Summary plots reporting the position-specific average signal calculated for each cell type and peak category are shown on top. B PE-specific TF-TF activation network. Nodes represent TFs that are specifically active and upregulated in PE (d5 +RA) with respect to DE (d2). Thin directed edges indicate the presence of a Gain ATAC-Seq peak harboring a source TF-specific FP located less than 25 kb from the TSS of the target TF; their color reflects the minimum distance between the target TSS and a Gain peak with a source-specific FP. Thick directed black edges connect RARA to its TF targets, identified via ChIP-Seq. Node color intensity is proportional to the log2(FC) in the expanded AFE vs PE comparison. See also Additional File 12. C Heatmap showing the expression in AFE (d5 -RA), PE-RAi (d5 +RA +AGN193109), and PE (d5 +RA) samples from Differentiation_2 experiment (see Methods) of the TFs belonging to the TFN, stratified based on their dependence on RA. Hierarchical clustering within each group is also shown. The expression values reported in the heatmap correspond to row-scaled (Z-score) rlog-transformed count data with batch effect correction

References

    1. Resto Irizarry AM, Nasr Esfahani S, Fu J. Bioengineered pluripotent stem cell models: new approaches to explore early human embryo development. Curr Opin Biotechnol. 2020;66:52–8. 10.1016/j.copbio.2020.06.005 - DOI - PubMed
    1. Sonnen KF, Janda CY. Signalling dynamics in embryonic development. Biochem J. 2021;478:4045–70. 10.1042/BCJ20210043 - DOI - PMC - PubMed
    1. Zheng GX, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049. 10.1038/ncomms14049 - DOI - PMC - PubMed
    1. Murry CE, Keller G. Differentiation of embryonic stem cells to clinically relevant populations: lessons from embryonic development. Cell. 2008;132:661–80. 10.1016/j.cell.2008.02.008 - DOI - PubMed
    1. Magaletta ME, Siller R, Maehr R. Differentiation of human pluripotent stem cells toward pharyngeal endoderm derivatives: Current status and potential. Curr Top Dev Biol. 2020;138:175–208. 10.1016/bs.ctdb.2020.01.004 - DOI - PubMed