Nat Commun. 2013:4:2513.
doi: 10.1038/ncomms3513.

The landscape of viral expression and host gene fusion and adaptation in human cancer

Ka-Wei Tang et al. Nat Commun. 2013.

Abstract

Viruses cause 10-15% of all human cancers. Massively parallel sequencing has recently proved effective for uncovering novel viruses and virus-tumour associations, but this approach has not yet been applied to comprehensive patient cohorts. Here we screen a diverse landscape of human cancer, encompassing 4,433 tumours and 19 cancer types, for known and novel expressed viruses based on >700 billion transcriptome sequencing reads from The Cancer Genome Atlas Research Network. The resulting map confirms and extends current knowledge. We observe recurrent fusion events, including human papillomavirus insertions in RAD51B and ERBB2. Patterns of coadaptation between host and viral gene expression give clues to papillomavirus oncogene function. Importantly, our analysis argues strongly against viral aetiology in several cancers where this has frequently been proposed. We provide a virus-tumour map of unprecedented scale that constitutes a reference for future studies of tumour-associated viruses using transcriptome sequencing data.

Figures

Figure 1. Unbiased detection of viral expression in 4,433 tumours.
(a) Analysis pipeline. Non-human reads were matched to a database of 3,590 RefSeq viral genomes, which was complemented with 12 additional known and 2 partial novel genomes detected by de novo assembly of viral reads. (b) Included cancer types and statistics. Bar graphs show the fraction of tumours with strong viral expression (>10 p.p.m. viral reads in library) as well as weaker detections (2–10 p.p.m.). (c) Relative numbers of positive tumours for major virus categories, with strong and weak detections shown separately.
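The detection thresholds in panel b can be illustrated with a minimal sketch. It assumes that viral expression is simply the number of viral reads per million library reads; the function names and the handling of the exact boundary values (2 and 10 p.p.m.) are illustrative assumptions, not details taken from the paper.

```python
def viral_ppm(viral_reads: int, total_reads: int) -> float:
    """Viral reads per million library reads (p.p.m.)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads * 1_000_000 / total_reads


def classify_detection(ppm: float) -> str:
    """Label a tumour using the thresholds quoted in the caption.

    Assumption: values of exactly 2 p.p.m. count as negative and values of
    exactly 10 p.p.m. count as weak; the caption does not specify this.
    """
    if ppm > 10:
        return "strong"
    if ppm > 2:
        return "weak"
    return "negative"


if __name__ == "__main__":
    # Hypothetical library of 50 million reads, 400 of which match a virus.
    ppm = viral_ppm(400, 50_000_000)
    print(ppm, classify_detection(ppm))
```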
Figure 2. RNA expression and host–virus fusion for 28 viruses detected in 178 tumours.
(a) RNA-seq-derived expression levels for 28 viruses (vertical axis) detected at >2 p.p.m. of total library reads in at least one tumour, across 178 virus-positive tumours from 19 cancer types (horizontal axis). Viruses identified only because of sequence similarity with related strains were not included. (b) In addition to viral gene expression, genomic viral integration may have functional consequences. A large fraction of the positive tumours identified in panel a carried viral integrations (top row), as evidenced by host–virus fusion transcripts in paired-end RNA-seq. Some genes showed recurrent integration in multiple tumours (six bottom rows). Integrations were quasi-randomly distributed across the genome (bottom chromosome plot) with some preferred loci. Select genes are shown for cytobands with recurrent integrations (number of tumours in parentheses). n/a, no paired-end data available.
Figure 3. Distribution of viral expression levels across HPV- and HBV-positive tumours.
The histograms show viral expression levels (FVR) for 138 HPV-positive (a) and 12 HBV-positive (b) tumours in 100-p.p.m. intervals.
Figure 4. Integrations in HNSC colocalize with DNA copy-number breakpoints.
One hundred and ten HPV integration clusters (31 unique integrations) were compared with copy-number breakpoints determined using segmented Affymetrix SNP6 microarray data from TCGA. The distance to the nearest breakpoint was calculated for each cluster, and the observed distribution was tested for non-random colocalization by comparing with a uniform random integration model (P<1e−8 based on 1e8 randomizations; 100 shown). Of the integration clusters, 41.8% were within 10 kb of a breakpoint, whereas the random expectation was <0.5%. Ten kilobases is close to the SNP6 mapping resolution (average probeset spacing ~3 kb).
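The colocalization test described in this caption can be sketched as a simple randomization procedure. The version below is a simplified illustration, not the authors' implementation: it assumes a single linear coordinate system instead of separate chromosomes, uses the fraction of integrations within a 10 kb window as the test statistic, and applies a +1 correction to the empirical P-value. All function names, parameters and coordinates are hypothetical.

```python
import bisect
import random


def nearest_distance(pos, sorted_breakpoints):
    """Distance from pos to the closest breakpoint (list must be sorted)."""
    if not sorted_breakpoints:
        raise ValueError("at least one breakpoint is required")
    i = bisect.bisect_left(sorted_breakpoints, pos)
    candidates = []
    if i < len(sorted_breakpoints):
        candidates.append(sorted_breakpoints[i] - pos)
    if i > 0:
        candidates.append(pos - sorted_breakpoints[i - 1])
    return min(candidates)


def fraction_within(positions, sorted_breakpoints, window):
    """Fraction of positions lying within `window` bases of a breakpoint."""
    hits = sum(
        1 for p in positions
        if nearest_distance(p, sorted_breakpoints) <= window
    )
    return hits / len(positions)


def colocalization_p_value(integrations, breakpoints, genome_length,
                           window=10_000, n_random=1000, seed=0):
    """Compare the observed fraction with a uniform random integration model."""
    rng = random.Random(seed)
    bps = sorted(breakpoints)
    observed = fraction_within(integrations, bps, window)
    at_least_as_extreme = 0
    for _ in range(n_random):
        # Place the same number of integrations uniformly at random.
        simulated = [rng.randrange(genome_length) for _ in integrations]
        if fraction_within(simulated, bps, window) >= observed:
            at_least_as_extreme += 1
    # The +1 correction keeps the empirical P-value above zero.
    return observed, (at_least_as_extreme + 1) / (n_random + 1)


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    breakpoints = [1_000_000, 5_000_000, 9_000_000]
    integrations = [1_000_500, 5_002_000, 8_999_000, 3_000_000]
    print(colocalization_p_value(integrations, breakpoints, 10_000_000))
```

With this statistic, the smallest P-value that can be reported is bounded by the number of randomizations, which is why a very large number of randomizations is needed to support a very small P-value.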
Figure 5. Fusion is associated with altered expression of recurrent target genes.
(a) Expression levels of ERBB2 (n=2), PVT1 (n=3), LOC727677 (n=3) and RAD51B (n=3) were typically altered in CESC tumours with HPV integration, as evidenced by host–virus fusion. P-values were calculated using Student’s t-test. (b) Similar to a, but for LIHC samples with and without HBV integration in MLL4 (n=3) and FN1 (n=2). In the box plots, the central mark is the median and the box edges are the 25th and 75th percentiles.
Figure 6. Host gene expression and virus–host coadaptation.
(a) Five hundred and ninety-seven host genes were associated with HPV status in HNSC, at a false discovery rate (q)<0.05 and with an absolute log2 median expression ratio >2. Known cancer genes in the Cancer Gene Census are indicated. The colour code indicates log2-transformed mRNA levels relative to the overall median. (b) Principal component analysis (PCA) of tumour mRNA expression profiles in CESC, HNSC and BLCA. Although there were systematic expression differences between cancer types, HPV-positive tumours clustered together regardless of type. (c) HPV-positive CESC tumours were subdivided by their viral gene expression patterns: E7-, E6/E7- and E4/E5/E7-expressing tumour subsets were tested for differential expression of host genes relative to the remaining samples. One hundred and twenty host genes were differentially expressed in the E6/E7 subset, using the criteria described above. (d) Validation of the E6/E7 signature. Most of the 120 genes were consistently induced/repressed in E6/E7 compared with E7 samples, also when only considering HPV16 (red)- or HPV18 (green)-positive tumours. In addition, most genes in the signature showed consistent expression changes in HNSC E6/E7 compared with E6 tumours (blue). E6*, truncated and probably non-functional E6 open reading frame.
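The gene selection criteria in panel a (q<0.05 and absolute log2 median expression ratio >2) can be sketched as follows. This is a minimal illustration under stated assumptions: the caption does not say how q-values were computed, so the Benjamini-Hochberg procedure is used here as a stand-in, per-gene P-values are assumed to come from some upstream test, and the gene names and expression values are made up.

```python
import math
from statistics import median


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values (assumed FDR method)."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values


def log2_median_ratio(positive, negative):
    """log2 of median expression in positive over negative tumours."""
    return math.log2(median(positive) / median(negative))


def select_genes(genes, q_cutoff=0.05, min_abs_log2=2.0):
    """genes maps name -> (p_value, positive_expr, negative_expr)."""
    names = list(genes)
    q_values = benjamini_hochberg([genes[name][0] for name in names])
    selected = []
    for name, q in zip(names, q_values):
        _, positive, negative = genes[name]
        ratio = log2_median_ratio(positive, negative)
        if q < q_cutoff and abs(ratio) > min_abs_log2:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical genes with made-up P-values and expression levels.
    toy = {
        "GENE_A": (0.0001, [80, 100, 120], [8, 10, 12]),
        "GENE_B": (0.0002, [10, 11, 12], [9, 10, 11]),
        "GENE_C": (0.8, [100, 100, 100], [1, 1, 1]),
    }
    print(select_genes(toy))
```

Requiring both criteria excludes genes with a significant but small shift (GENE_B in the toy data) as well as genes with a large but non-significant shift (GENE_C).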

