Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2025 Jun 17;6(6):102188.
doi: 10.1016/j.xcrm.2025.102188. Epub 2025 Jun 10.

Cell states and neighborhoods in distinct clinical stages of primary and metastatic esophageal adenocarcinoma

Affiliations

Cell states and neighborhoods in distinct clinical stages of primary and metastatic esophageal adenocarcinoma

Josephine Yates et al. Cell Rep Med. .

Abstract

Esophageal adenocarcinoma (EAC) is a highly lethal cancer of the upper gastrointestinal tract with rising incidence in western populations. To decipher EAC disease progression and therapeutic response, we perform multiomic analyses of a cohort of primary and metastatic EAC tumors, incorporating single-nuclei transcriptomic and chromatin accessibility sequencing along with spatial profiling. We recover tumor microenvironmental features previously described to associate with therapy response. We subsequently identify five malignant cell programs, including undifferentiated, intermediate, differentiated, epithelial-to-mesenchymal transition, and cycling programs, which are associated with differential epigenetic plasticity and clinical outcomes, and for which we infer candidate transcription factor regulons. Furthermore, we reveal diverse spatial localizations of malignant cells expressing their associated transcriptional programs and predict their significant interactions with microenvironmental cell types. We validate our findings in three external single-cell RNA sequencing (RNA-seq) and three bulk RNA-seq studies. Altogether, our findings advance the understanding of EAC heterogeneity, disease progression, and therapeutic response.

Keywords: bioinformatics; computational biology; epigenetics; esophageal adenocarcinoma; gastrointestinal cancer; oncology; single cell; spatial transcriptomics; transcriptomics.

PubMed Disclaimer

Conflict of interest statement

Declaration of interests E.M.V.A.—advisory/consulting: Enara Bio, Manifold Bio, Monte Rosa, Novartis Institute for Biomedical Research, Serinus Bio, and TracerDx; research support: Novartis, BMS, Sanofi, and NextPoint; equity: Tango Therapeutics, Genome Medical, Genomic Life, Enara Bio, Manifold Bio, Microsoft, Monte Rosa, Riva Therapeutics, Serinus Bio, Syapse, and TracerDx; travel reimbursement: none; patents: institutional patents filed on chromatin mutations and immunotherapy response, and methods for clinical interpretation; intermittent legal consulting on patents for Foaley & Hoag; editorial boards: Science Advances. A.J.A. has consulted for Anji Pharmaceuticals, Affini-T Therapeutics, Arrakis Therapeutics, AstraZeneca, Boehringer Ingelheim, Kestrel Therapeutics, Merck & Co., Inc., Mirati Therapeutics, Nimbus Therapeutics, Oncorus, Inc., Plexium, Quanta Therapeutics, Revolution Medicines, Reactive Biosciences, Riva Therapeutics, Servier Pharmaceuticals, Syros Pharmaceuticals, T-knife Therapeutics, Third Rock Ventures, and Ventus Therapeutics. A.J.A. holds equity in Riva Therapeutics and Kestrel Therapeutics. A.J.A. has research funding from Amgen, AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, Deerfield, Inc., Eli Lilly, Mirati Therapeutics, Nimbus Therapeutics, Novartis, Novo Ventures, Revolution Medicines, and Syros Pharmaceuticals.

Figures

None
Graphical abstract
Figure 1
Figure 1
EAC primary and metastatic samples show a diverse landscape of TME and malignant cells in transcriptomic and epigenetic data (A) Schematic representation of the study workflow. Biopsies from 10 patients in our discovery cohort, including normal adjacent tissue (NAT), primary tissue, and metastatic samples, were subjected to single-nuclei RNA and ATAC sequencing using 10X Chromium technology. For a subset of these patients as well as three additional patients, matched primary and metastatic samples were profiled with 10X Visium and 10X Xenium spatial transcriptomics (ST) technologies. For single-nuclei data, cells were annotated by cell type and categorized into malignant and TME components. TME subtypes were linked to metastasis, with validation against an external pan-cancer fibroblast atlas. The malignant cell components underwent analysis using consensus non-negative matrix factorization (cNMF) to uncover malignant programs, which were further characterized for transcriptional and epigenetic heterogeneity at a single-cell and spatial level and candidate master transcription factors. External validation was performed in two single-cell validation cohorts,, and associations with clinical and molecular characteristics, as well as survival, were assessed in three bulk validation cohorts.,, (B) Uniform manifold approximation and projection (UMAP) representation of the full cohort in Harmony-corrected integrated transcriptomic data, with major cell type compartments labeled and cell counts indicated. (C) Proportion of major cell types in each sample based on transcriptomic data, with percentages for compartments representing over 5% of the total sample composition. (D) UMAP representation of the full cohort in Harmony-corrected integrated ATAC data, with cell type annotations transferred from the RNA annotations. “NA” denotes cells without paired associated RNA information. (E) Proportion of major cell types in each sample based on ATAC data, with percentages for compartments representing over 5% of the total sample composition.
Figure 2
Figure 2
The EAC TME contains several pro- and anti-inflammatory populations of macrophages and RUNX1/RUNX2/PRRX1/BNC2-regulated inflammatory cancer-associated fibroblasts enriched in metastatic samples (A) Uniform manifold approximation and projection (UMAP) representation of the myeloid compartment in Harmony-corrected integrated transcriptomic data, with annotated subtypes indicated. (B) Proportion of myeloid subtypes per patient. (C) Distribution of Milo fold change scores between normal-adjacent and tumor samples for myeloid cells; Milo scores measure differential abundances of specific cell subtypes by assigning cells to overlapping neighborhoods in a k-nearest neighbor graph. (D) Marker genes of annotated myeloid subtypes, with cells grouped by subtype and expression information provided. (E) UMAP representation of the fibroblast compartment in Harmony-corrected integrated transcriptomic data, with annotated subtypes indicated. (F) Proportion of fibroblast subtypes per patient. (G) Distribution of Milo fold change scores between metastatic and primary tumor samples for fibroblast subtypes, with labeling and exclusion criteria similar to (C). (H) Marker genes of annotated fibroblast subtypes, with cells grouped by subtype and expression information provided. (I) Distribution of the inflammatory cancer-associated fibroblast (CAF) score in the stromal compartment of Carroll et al.’s cohort, stratified by response to immune checkpoint inhibitor (ICI) therapy: clinical benefit (CB) and no clinical benefit (NCB). The inflammatory CAF program is scored on the entire cohort. Paired measurements of patients were made before treatment (PreTx) and after a 4-week ICI treatment window (ICI-4W). The distribution of the inflammatory CAF score is compared among the CB and NCB groups across PreTx and ICI-4W time points. Significance testing is conducted using a Mann-Whitney test to assess differences between the CB and NCB groups. (J–L) Results for SCENIC+-derived transcription factor (TF) candidates for inflammatory fibroblasts, with cells grouped by subtype and Z scores of TF expression (J), eRegulon gene-based expression (K), and eRegulon region-based expression (L) are shown. (M) TF gene expression correlation with inflammatory CAF score in the external pan-cancer fibroblast validation cohort of Luo et al., with candidate TFs identified with the SCENIC+ analysis highlighted. (N) Correlation of all available TFs’ gene expression and SCENIC-estimated gene-based eRegulon score with the inflammatory CAF score in the pan-cancer fibroblast atlas. Only PRRX1’s eRegulon activity, but not BNC2 and RUNX1/2, was estimated using SCENIC.
Figure 3
Figure 3
Five recurrent transcriptomic programs characterize EAC malignant cells with distinct RNA profiles (A) Illustration of the methodology employed for identifying transcriptomic programs. For each patient, consensus non-negative matrix factorization (cNMF) is performed on the malignant cell compartment, followed by manual filtration to retain high-quality programs characterized by gene weightings. Pairwise cosine similarity between programs across all patients is computed to cluster programs using hierarchical clustering with average linkage. (B) Cosine similarity matrix representing the similarity between cNMF-derived programs across all samples, clustered using hierarchical clustering with average linkage. The five identified programs (cNMF1 through cNMF5) are delineated. (C) UMAP representation of the malignant cell compartment using unintegrated transcriptomic data, colored according to their program score (cNMF1 through cNMF5) and sample ID. (D) GSEA enrichment of the five programs in the 50 hallmarks of cancer, based on genes ranked according to their weight contribution to cNMF programs. Hallmarks are grouped according to category. Enrichments that did not reach significance (FDR = 0.05) are blanked out. (E) GSEA enrichment plots for selected programs described by Nowicki et al. in Barrett’s esophagus. (F) GSEA enrichment plots for hallmarks G2M checkpoint in cNMF2 and epithelial-to-mesenchymal transition in cNMF3. (G) Distribution of the five program scores in metastatic and primary samples. Significance is computed using the Mann-Whitney U test. The difference in median score is indicated as Δ. (H and I) Cosine similarity between programs derived with cNMF in external datasets and cNMF1 through cNMF5 programs, derived in the Carroll et al. dataset (H) and in the Croft et al. dataset (I). The cosine similarity is computed between the cNMF-derived gene weights of programs for all patients in the external datasets and the median gene weight associated with each cNMF program derived in the discovery set.
Figure 4
Figure 4
EAC malignant cell programs display unique ATAC profiles and epigenetic plasticity (A) UMAP representation of the malignant cell compartment using unintegrated snATAC-seq data, color-coded according to their cNMF gene signature score (cNMF1 through cNMF5) and sample ID. The program score is transferred from the RNA annotation. (B) Number of open chromatin regions significantly correlated with each program (FDR < 0.05, Pearson’s R > 0.1). (C) Heatmap illustrating chromatin accessibility in cNMF-associated regions for representative program cells. Cells are scored using cNMF signatures derived from RNA, with the top 5% unique cells in each score selected as representative cells. The top 200 regions with the higher correlation between chromatin accessibility and each program are represented. (D) Chromatin accessibility of representative cNMF program cells for genes of interest. Genes are selected based on their association with the regions of the highest correlation between chromatin accessibility and gene signature scores of cNMF programs. Chromatin accessibility of promoters for AKT2, MKI67, SPARC, BHLHE41, and ANXA11 is depicted for representative cells of cNMF1 through cNMF5 and all remaining carcinoma cells. (E) This heatmap illustrates the accuracy of classifying cells into cNMF programs based on chromatin accessibility profiles. Cells are scored using the average Z score of chromatin accessibility across the top 200 cNMF-associated regions, and the highest score determines their chromatin accessibility identity. Each point represents the percentage of cells with a given transcriptional identity (e.g., cNMF3) classified under a specific chromatin accessibility identity (e.g., cNMF1). Off-diagonal points indicate cells whose transcriptional and chromatin accessibility identities differ, suggesting potential plasticity or transitions between programs. (F) Distribution of the epigenetic plasticity scores across representative cells of cNMF1 to cNMF5. Average Z scores of ATAC accessibility vectors are transformed into a probability distribution using a softmax transformation with temperature, and the plasticity score is computed as the Shannon entropy over the resulting probability distribution. (G) Representation of the candidate master transcription factors (mTFs) associated with programs consistent across datasets. We jointly model chromatin accessibility and gene expression to obtain candidate master transcription factors for each cNMF program in the discovery cohort that are subsequently validated in the two external validation cohorts. The identified mTFs consistent across datasets are represented.
Figure 5
Figure 5
Single-nuclei-derived transcriptional programs highlight different spatial regions of EAC tumors (A) Spatial transcriptomics (ST) slides of P8 primary tumor A, colored according to cNMF program score and the CNV-derived label. For each spot, we infer the CNV profile with inferCNV and assign spots to tumor, mixed, and normal status. cNMF scores are computed as the average Z score of signature genes using the deconvolved carcinoma-specific gene expression profile of spots derived with Cell2Location. (B) Average cNMF score according to the position of the spots compared to the tumor-leading edge. For each tumor spot, we compute the distance to the edge as the shortest path to a normal or mixed spot. The distribution of cNMF scores with standard error is represented for normal spots, mixed spots, and spots of a certain distance to the edge. (C) cNMF scores for carcinoma cells and cell type annotations in a subset of Xenium-profiled samples (P4_A, P11_B, and P12_D). cNMF scores are computed as the average Z score across all carcinoma cells of signature genes included in the Xenium panel. (D) CellCharter cluster assignments for a subset of Xenium-profiled samples (P4_A, P11_B, and P12_D). (E) CellCharter cluster cell type proportion: each cell is assigned a cluster and we represent the proportion of the different cell types in each of the CellCharter clusters. (F) Distribution of cNMF scores of carcinoma cells belonging to the CellCharter clusters with a substantial amount of carcinoma cells (>5%), CC1, CC2, CC3, and CC4. For all comparisons, Mann-Whitney U p < 0.000005.
Figure 6
Figure 6
Discovered malignant programs have different clinical characteristics and predicted drug sensitivity (A and B) Link between uncovered programs and (A) N stage, i.e., proxy of the number of nearby lymph nodes that have cancer, and (B) T stage, i.e., size and extent of the main tumor in the TCGA bulk cohort. Patients are scored using single-sample gene set enrichment analysis (ssGSEA) with a cancer-specific gene signature. Statistical testing is performed using the Mann-Whitney U test. (C–E) Hazard ratio associated with scores in bulk validation cohorts of (C) TCGA, (D) Hoefnagel et al., and (E) Carroll et al. Cox proportional hazard univariate models are employed using disease-specific survival for TCGA and overall survival for Hoefnagel et al. and Carroll et al. The p values are estimated using the Wald test. (F) Distribution of the cNMF5 score in the malignant cell compartment of Carroll et al.’s cohort, stratified by response to immune checkpoint inhibitor (ICI) therapy: clinical benefit (CB) and no clinical benefit (NCB). The cNMF5 program is scored on the full cohort. Paired measurements of patients were made before treatment (PreTx) and after a 4-week ICI treatment window (ICI-4W). The distribution of the cNMF5 score is compared among the CB and NCB groups across PreTx and ICI-4W time points. Significance testing is conducted using a Mann-Whitney U test. (G) Predicted drug sensitivity by program. scTherapy is used to infer which drugs may be effective against EAC cells exhibiting specific tumor programs. The upset plot represents the total number of drugs predicted to target cells expressing a specific program on the left, as well as the size of the intersection represented in the middle on the top. Drugs of interest are indicated on the plot. (H) Selected predicted candidate combination therapies that could target cells in all five programs at a time are represented.
Figure 7
Figure 7
Uncovered malignant programs show associations with clinical and molecular characteristics, prognosis, and distinct ecotypes (A–D) Pathways significantly correlated with TME and malignant programs. Using COMMOT, sender and receiver activity scores were assigned to each spot for specific pathways and correlated with program scores. The heatmap displays the median Pearson’s r across samples for pathways significantly correlated with a program in at least 4/5 samples; (A) and (B) depict significant sender and receiver signals for cell-cell contact pathways, while (C) and (D) show significant signals for secreted pathways. (E) Directionality of THY1, MK, and MIF signaling in sample P8_A. Spots are colored based on sender activity for each pathway, with arrows indicating the estimated direction of signaling. Yellow outlines mark tumor/mixed tissue boundaries. Bottom graphs display program scores significantly correlated with THY1, MK, and MIF sender signals. (F) Receiver activity of the THY1, MK, and MIF pathways in two Xenium samples (P4_A and P13_C). The top illustrates the average expression of receptor genes included in the Xenium panel. The left inset shows the distribution of average receptor gene expression across relevant cell populations, while the right inset highlights cell types predicted to participate in these pathways based on COMMOT analysis. (G) Schematic representation of inferred interactions between malignant cells expressing cNMF programs and iCAFs, TAM1 macrophages, and DCs. Recurrent interactions between cells are depicted with colored arrows.

Update of

References

    1. Nowicki-Osuch K., Zhuang L., Jammula S., Bleaney C.W., Mahbubani K.T., Devonshire G., Katz-Summercorn A., Eling N., Wilbrey-Clark A., Madissoon E., et al. Molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition. Science. 2021;373:760–767. - PubMed
    1. Que J., Garman K.S., Souza R.F., Spechler S.J. Pathogenesis and Cells of Origin of Barrett’s Esophagus. Gastroenterology (New York, N. Y., 1943) 2019;157:349–364.e1. - PMC - PubMed
    1. Jiang M., Li H., Zhang Y., Yang Y., Lu R., Liu K., Lin S., Lan X., Wang H., Wu H., et al. Transitional basal cells at the squamous-columnar junction generate Barrett’s oesophagus. Nature. 2017;550:529–533. - PMC - PubMed
    1. Pennathur A., Gibson M.K., Jobe B.A., Luketich J.D. Oesophageal carcinoma. Lancet. 2013;381:400–412. - PubMed
    1. Derakhshan M.H., Arnold M., Brewster D.H., Going J.J., Mitchell D.R., Forman D., McColl K.E.L. Worldwide Inverse Association between Gastric Cancer and Esophageal Adenocarcinoma Suggesting a Common Environmental Factor Exerting Opposing Effects. Am. J. Gastroenterol. 2016;111:228–239. - PubMed

Supplementary concepts