Blood. 2025 Sep 25;146(13):1533-1545.
doi: 10.1182/blood.2024028055.

Association of Epstein-Barr virus genomic alterations with human pathologies

Htet Thiri Khine et al.

Abstract

Epstein-Barr virus (EBV) infects >90% of humans and is associated with both hematological and epithelial malignancies. Here, we analyzed 990 EBV genomes (319 newly sequenced and 671 from public databases) from patients with various diseases to comprehensively characterize genomic variations, including single nucleotide variations (SNVs) and structural variations (SVs). Although most SNVs were a result of conservative evolution and reflected the geographical origins of the viral genomes, we identified several convergent SNV hot spots within the central homology domain of EBNA3B, the transactivation domain of EBNA2, and the second transmembrane domain of LMP1. These convergent SNVs seem to fine-tune viral protein functionality and immunogenicity. SVs, particularly large deletions, were frequently observed in chronic active EBV disease (28%), EBV-positive diffuse large B-cell lymphoma (48%), extranodal natural killer/T-cell lymphoma (41%), and Burkitt lymphoma (25%), but were less common in infectious mononucleosis (11%), posttransplant lymphoproliferative disorder (7%), and epithelial malignancies (5%). In hematological malignancies, deletions often targeted viral microRNA clusters, potentially promoting viral reactivation and lymphomagenesis. Nondeletion SVs, such as inversions, were also prevalent, with several inversions disrupting the C promoter to suppress latent gene expression, thereby maintaining viral dormancy. Furthermore, recurrent EBNA3B deletions suggested that this viral transcription factor functions as a tumor suppressor. EBNA3B knockout experiments in vitro revealed downregulation of human tumor suppressors, including PTEN and RB1, which could explain the enhanced lymphomagenesis observed in EBNA3B-deficient lymphoblastoid cell line xenografts. Our findings highlight both disease-specific and general contributions of EBV genomic alterations to human cancers, particularly in hematological malignancies.

Conflict of interest statement

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Figures

Figure 1.
Genomic landscape of EBV strains and SNVs. (A) Concordance of viral genome sequencing across institutions. Eight cell lines were sequenced at ≥2 institutions (Jijoye and Namalwa were sequenced at 3 institutions, whereas the remaining 6 were sequenced at 2). Blue bars represent discordant SNVs, whereas the red box indicates a deletion. The numbers denote the total genomic differences observed between institutions. (B) Hierarchical clustering of 990 EBV genomes based on nucleotide variations. Alongside the hierarchical clustering, 2 PCs, countries of origin, associated diseases, assigned cluster numbers, EBV types (1/2), BALF2 mutations, and the presence of SVs are displayed. (C) Frequency of type 2 EBV genomes across different disease categories. B-cell–associated diseases exhibit a higher frequency of type 2 EBV compared with T/NK and epithelial cell (epi) diseases. (D) LD mapping across the EBV genome, with genome coordinates referenced to NC_007605.1. The W repeat region is masked in gray. (E) LD trend as a function of genomic distance. (F) Geometric distribution of single nucleotide variant frequency, the fraction of synonymous variants, and nucleotide alteration patterns. ∗P < .05. AITL, angioimmunoblastic T-cell lymphoma; GC, gastric carcinoma; HLH, hemophagocytic lymphohistiocytosis; LD, linkage disequilibrium; PC, principal component; T/NK-cell, T cell/natural killer cell.
Figure 2.
Convergent mutations and their hot spots. (A) Distribution of convergent mutations in cluster 1 (comprising samples primarily from Japan). Convergent SNVs are illustrated in red, whereas other SNVs are in gray. (B) Classification of convergent and other SNVs. Synonymous SNVs were less frequent among convergent SNVs compared with other SNVs. (C) Distribution of convergent SNVs within EBNA3B. (D) Amino acid substitutions resulting from convergent SNVs in the core central homology domain of EBNA3B. Identical amino acids among the 3 EBNA3 homologs are illustrated in red and green. Blue and yellow lines represent predicted antigenic peptides and critical residues for RBP/J binding, respectively. (E) Mutual exclusivity of convergent SNVs in the EBNA3B core central homology domain. Ten distinct SNVs (each affecting 2-35 samples) were distributed across 91 samples without any overlap. (F) Monte Carlo simulation of mutual exclusivity. Random distribution of 10 distinct SNVs across 366 samples (in cluster 1) resulted in the expectation that some samples might harbor 2 or more SNVs. Under 10 million simulation trials, the likelihood of no overlap was calculated as 4.92 × 10⁻⁵. (G) Convergent SNVs affecting LMP1. A relatively large number of convergent SNVs were found within the transmembrane (TM) 2 domain. (H) Amino acid substitutions from convergent SNVs in the TM 2 domain. Hydrophobic amino acid residues are indicated in red. (I) Convergent SNVs upstream of LMP1. Among the 3 known promoters, pED-L1 was most frequently affected by convergent SNVs. (J) Convergent SNVs in EBNA3A. The upper histogram illustrates convergent SNVs found in cluster 1 (mainly Japan), whereas the lower histogram illustrates those found in cluster 2 (mainly China). GC, gastric carcinoma.
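The mutual-exclusivity test described in panels E and F can be illustrated with a minimal sketch. This is not the authors' code. It assumes each SNV is placed independently and uniformly at random among the 366 cluster 1 samples. The per-SNV carrier counts in `HYPOTHETICAL_COUNTS` are invented for illustration: the caption gives only the range (2-35) and the total (91). Under the same assumption the no-overlap probability also has a closed form, which is included as a cross-check.

```python
import random
from math import comb


def simulate_no_overlap(counts, n_samples, n_trials, seed=0):
    """Estimate P(no sample carries 2 or more SNVs) by Monte Carlo.

    counts    -- number of carrier samples for each distinct SNV
    n_samples -- total number of samples the SNVs are scattered over
    """
    rng = random.Random(seed)
    population = range(n_samples)
    hits = 0
    for _ in range(n_trials):
        seen = set()
        overlap = False
        for k in counts:
            # Place this SNV in k distinct, uniformly chosen samples.
            carriers = rng.sample(population, k)
            if any(c in seen for c in carriers):
                overlap = True
                break
            seen.update(carriers)
        if not overlap:
            hits += 1
    return hits / n_trials


def exact_no_overlap(counts, n_samples):
    """Closed-form probability under the same uniform-placement assumption."""
    p = 1.0
    used = 0
    for k in counts:
        # Each SNV must land entirely in samples not yet occupied.
        p *= comb(n_samples - used, k) / comb(n_samples, k)
        used += k
    return p


# Hypothetical carrier counts: 10 SNVs, each 2-35 samples, summing to 91.
HYPOTHETICAL_COUNTS = [35, 20, 10, 7, 5, 4, 3, 3, 2, 2]

if __name__ == "__main__":
    print("exact:", exact_no_overlap(HYPOTHETICAL_COUNTS, 366))
    print("simulated:", simulate_no_overlap(HYPOTHETICAL_COUNTS, 366, 100_000))
```

Because the true per-SNV counts are not stated in the caption, the values printed here need not match the reported 4.92 × 10⁻⁵. With a probability this small, far more trials than the 100,000 used above (the caption reports 10 million) are needed for a stable simulated estimate.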
Figure 3.
Intragenic deletions shape EBV genomes across disease landscapes. (A) Comprehensive overview of intragenic deletions (>50 bases) identified across diverse diseases. Each gray line represents the EBV genome of an individual patient. Filled bars denote complete deletions (with no remaining alleles), whereas open bars represent partial deletions (retaining some alleles). The affected viral components are indicated to the right of each EBV genome. A histogram summarizes the frequency of deletions in specific genomic regions, with blue and gray bars representing complete and partial deletions, respectively. The color codes for diseases (consistent with those used in Figure 1A) and EBV genome components are also indicated. (B) Distribution of deletion lengths across various disease categories, visualized with violin plots. Box plots within the violins illustrate interquartile ranges (the edges represent the 25th and 75th percentiles, whereas the inner bar indicates the median). ∗P < .01 (.00014 for CAEBV+DLBCL+ENKL+BL+HL vs IM+PTLD; .00197 for CAEBV+DLBCL+ENKL+BL+HL vs epithelial cell malignancies; .00181 for CAEBV+DLBCL+ENKL+BL+HL vs healthy donors). Other, other hematological diseases; epithelial, epithelial cell malignancies; healthy, healthy donors. (C) Disease-specific frequencies of deletions illustrating both overall deletions and those affecting specific viral components. (D-I) Zoomed-in views of specific EBV genomic regions highlighting deletion patterns. Numbers in these panels correspond to the individual miRNAs within the BARTmiR clusters. Color coding follows the same scheme as in panel A for consistency and clarity. BARTmiR, BART miRNAs; essential, essential genes for virion production; OriP, replication origin involved in latent infection; oriLyt, replication origin involved in lytic infection.
Figure 4.
SVs in EBV genomes. (A-C) Representative examples of intragenic deletions. Integrative genomics viewer images display intragenic deletions within EBV genomes from 3 CAEBV patients. Histograms depict read coverage, with individual gray lines representing sequence reads. (A) In patient UPN257, a near-complete deletion is observed, with minimal evidence of retained alleles. (B) In patient UPN404, partial retention of reads within the deleted region suggests the presence of undeleted alleles. Mean read coverage within the deletion (150×) amounts to 3.5% of the genome-wide average (4264×). (C) Patient UPN417 exhibits a different pattern, with mean read coverage within the deletion reaching 16.8% of the genomic average. (D) Partial deletions affecting EBNA1. Five partial deletions targeting EBNA1 were identified, each involving at least half of the EBV genome. All deletions retained the essential replication origin, oriP. (E) Schematic of EBNA1 interaction between viral and host genomes. EBNA1 binds to both the oriP region within the viral genome and a specific site in the host genome. This binding is crucial for viral genome stability, ensuring that viral replication occurs alongside host DNA during cell division. Consequently, every EBV genome copy must possess oriP, and at least 1 functional EBNA1 copy per cell is required for viral persistence. (F) Coexistence of complete and partial genomes. In patients UPN920, UPN1213, and eBL-Tumor-0007, ∼50% of the remaining allele fraction suggests a balanced coexistence of complete and partial EBV genomes within each cell. (G) Predominance of partial genomes. In contrast, patients UPN1301 and UPN1306 exhibited a lower remaining allele fraction (∼10%), indicating a predominance of partial genomes over complete ones. (H) Genomic inversions. Horizontal lines represent individual EBV genomes, with arcs indicating inversions. In total, 12 EBV genomes harbored inversions. 
The locations of the C promoter and EBNA genes are also illustrated. In the top 6 genomes (marked with asterisks), inversions seem to disrupt the connection between the C promoter and the EBNA genes. (I) Integration into the human genome. Four EBV genomes, all derived from Burkitt lymphoma patients, were found integrated into the human genome. The GK_BL29 EBV genome integrated into intron 3 of the ANKMY1 gene, potentially forming a fusion between RPMS1 within the viral genome and ANKMY1 in the host genome. The GK_Farage EBV genome integrated within the immunoglobulin heavy chain (IGH) locus. The Namalwa EBV genome integrated into intron 4 of the PPIEL gene. Instances where repetitive sequences obscured the precise location of the SV are marked with an asterisk. Last, the eBL-Tumor-0031 EBV genome integrated into chromosome 12q21.31, a region devoid of annotated genes within a 100-kb radius. GC, gastric carcinoma.
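The retained-allele percentage quoted for patient UPN404 in panel B is the ratio of mean read coverage inside the deletion to the genome-wide mean. A minimal sketch of that arithmetic follows; the function name is hypothetical and this is not the authors' pipeline.

```python
def remaining_allele_fraction(deletion_coverage, genome_coverage):
    """Fraction of viral genomes that still carry the deleted region,
    approximated as coverage inside the deletion over genome-wide coverage."""
    if genome_coverage <= 0:
        raise ValueError("genome-wide coverage must be positive")
    return deletion_coverage / genome_coverage


# Values for patient UPN404 taken from the caption (150x vs 4264x).
fraction = remaining_allele_fraction(150, 4264)
print(f"{fraction:.1%}")  # prints 3.5%
```

The same ratio underlies the ∼50% and ∼10% remaining allele fractions described in panels F and G.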
Figure 5.
Transcriptional analysis of EBNA3B. (A) Summary of deletions and amino acid changes identified in EBNA3B. Color coding for diseases is the same as in Figure 3A. (B) Summary of nonsense and frameshift mutations identified in EBNA3B. The c.2653dupG and c.2804_2807delCGAA mutations were identified in 2 samples. (C) t-SNE plot illustrating gene expression profiles of LCLs established with WT EBV (gray), EBNA3B-deficient EBV (dEBNA3B, blue), and revertant EBV (orange). (D) Volcano plot comparing gene expression between WT- and dEBNA3B-LCLs. Red circles represent genes with at least one EBNA3B ChIP-seq peak within 100 kb of their transcription start sites. (E) Frequency of EBNA3B ChIP-seq peaks in upregulated and downregulated genes. (F) Gene set enrichment analysis comparing WT- and dEBNA3B-LCLs, highlighting the enrichment of a gene set frequently deleted in glioblastoma (TCGA_GLIOBLASTOMA_COPY_NUMBER_DN). (G) Heat map and dot plots illustrating differential gene expression among the 3 LCLs. Gene symbols highlighted in yellow denote established tumor suppressor genes. (H) Enriched gene set (SOTIRIOU_BREAST_CANCER_GRADE_1_VS_3_UP) identified in WT- and dEBNA3B-LCLs. (I) Gene set enrichment analysis (PTEN_DN.V2_UP) suggesting possible downregulation of the PTEN pathway in dEBNA3B-LCLs. (J) Gene set enrichment analysis (RB_P107_DN.V1_UP) suggesting downregulation of the RB pathway in dEBNA3B-LCLs. FDR, false discovery rate; FWER, family-wise error rate; NES, normalized enrichment score.
