Nat Commun. 2021 Feb 26;12(1):1316.
doi: 10.1038/s41467-021-21612-7.

Retroviral integrations contribute to elevated host cancer rates during germline invasion

Gayle K McEwen et al. Nat Commun.

Abstract

Repeated retroviral infections of vertebrate germlines have made endogenous retroviruses ubiquitous features of mammalian genomes. However, millions of years of evolution obscure many of the immediate repercussions of retroviral endogenisation on host health. Here we examine retroviral endogenisation during its earliest stages in the koala (Phascolarctos cinereus), a species undergoing germline invasion by koala retrovirus (KoRV) and affected by high cancer prevalence. We characterise KoRV integration sites (IS) in tumour and healthy tissues from 10 koalas, detecting 1002 unique IS, with hotspots of integration occurring in the vicinity of known cancer genes. We find that tumours accumulate novel IS, with proximate genes over-represented for cancer associations. We detect dysregulation of genes containing IS and identify a highly-expressed transduced oncogene. Our data provide insights into the tremendous mutational load suffered by the host during active retroviral germline invasion, a process repeatedly experienced and overcome during the evolution of vertebrate lineages.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Overview of IS detected in healthy and tumour tissues of ten koalas.
The name and sex of each koala are given, with age and year of birth in square brackets (age and year of birth are estimated from tooth wear). The numbers of IS detected in all tissues are shown in green for each koala and represent ERVs, whereas tumour-specific IS (shown in orange circles) represent somatic integrations in tumour tissues. Koalas sharing many IS are grouped (see Fig. 3), with the numbers of shared IS given in blue boxes (dark blue = total IS shared, including IS shared with other koalas; light blue = IS shared only between given koalas; lines connect koalas that share IS and do not represent specific family relationships). Outside of these groups, koalas share nine or fewer IS. Examples of genes containing IS shared only between indicated koalas are given (with genes containing IS in exons shown in boldface).
Fig. 2. qPCR expression of genes containing tumour-specific KoRV IS and structure of KoRV containing transduced oncogene.
a Expression of a set of genes containing tumour-specific IS. Quantitative PCR expression levels (∆∆Ct values) are shown, per gene, for tumour tissue from the individual with the tumour-specific IS (square icon), and for tumour (triangles) or healthy tissues (circles) from other koalas (colour-coded). All qPCRs were performed in triplicate and the mean used for analysis. Boxplots indicate the interquartile range (IQR) with the centre line showing the median; whiskers show 1.5 × IQR. Two zoo koalas (Bilyarra and Mirali) not collected for the study of neoplasms were also used as controls, with Bilyarra used as a baseline for ∆∆Ct values (i.e. for each gene all ∆Ct values were divided by that of Bilyarra, so Bilyarra does not appear in the graph). Genes with significant changes in expression were: RASSF2 (P = 0.0023), RNF216 (P = 0.0096), KIF26B (P = 0.0015) and TIAM2 (P = 0.0425) based on t tests of ∆Ct values. The expression of the transduced oncogene BCL2L1 (Bcl-XL) shown on the right side of the graph was also significantly increased (P = 0.0004). The expression levels of BCL2L1 were 500-fold higher in Kathy than the other koalas tested (fold change calculated as 2^(−∆∆Ct)). b Structure of the KoRV containing transduced BCL2L1 detected in Kathy. The two BCL2L1 exons are joined together precisely at the splice junction and are flanked by the beginning and end of the KoRV env gene. The BCL2L1 3′ UTR joins the 5′ and 3′ ends of env via micro-homologies (CAG and CCTCCC, respectively). Most of gag and pol are also replaced by a sequence similar to the 3′ UTR of koala gene ZBTB18.
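The fold-change arithmetic mentioned in the Fig. 2 caption can be illustrated with a minimal sketch. It only converts a ∆∆Ct value to a fold change with the 2^(−∆∆Ct) formula quoted above, and back again; the function names are hypothetical and no values from the study are reproduced. How the authors derived ∆∆Ct from raw ∆Ct values is not modelled here.

```python
import math


def fold_change(ddct: float) -> float:
    """Convert a delta-delta-Ct value to a fold change using 2^(-ddCt)."""
    return 2.0 ** (-ddct)


def ddct_for_fold_change(fold: float) -> float:
    """Inverse of fold_change: the ddCt that corresponds to a given fold change."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return -math.log2(fold)


# A ddCt of 0 means no change relative to the baseline sample.
print(fold_change(0.0))
# Each unit decrease in ddCt doubles the inferred expression.
print(fold_change(-1.0))
# Illustrative only: the ddCt that a 500-fold increase would correspond to.
print(round(ddct_for_fold_change(500.0), 2))
```

Under this formula, a 500-fold increase corresponds to a ∆∆Ct of roughly −9, that is, about nine fewer amplification cycles than the baseline after normalisation.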
Fig. 3. Sharing of KoRV IS.
a Numbers of IS unique to one koala (either tumour-specific IS or IS in healthy tissue in a single koala) compared to those shared in PCA groups (shown in c). The single IS shared by ten koalas is located in the gene SLC29A1, an equilibrative nucleoside transporter. The three IS shared by five koalas were located 1 kb downstream of MAP4K4, in a 140 kb region between the genes CTNNBL1 and SRC, and in an intron of RCAN2. Inset shows percentages in the categories according to genomic location. Shared (other) refers to all IS shared between two or more koalas that are not in PCA groups. IS shared within the PCA groups are more likely to be in exons than those shared outside the groups. b From published RNA-seq data, SLC29A1 was found to be differentially expressed in lymph nodes of northern koalas from Queensland (QLD) compared to koalas from South Australia (SA); this was the most significantly differentially expressed annotated gene between the two koala populations (using the Limma R package for differential expression analysis with false discovery rate used for multiple test correction: log fold change = 3.65, P = 1.35 × 10^−5). Boxplots indicate the interquartile range (IQR) with the centre line showing the median; whiskers show 1.5 × IQR. c Sharing reflects the geographic proximity of the koalas, with principal component analysis (PCA) showing groups which correspond to proximity. Note: PCA was carried out without prior knowledge of the geographic origin of each koala and led us to request the location data of the samples. d Provenance of the koalas shown on a map of the coastal region bordering Queensland and New South Wales, Australia (~130 km north and south of Brisbane is shown). e Plot of number of shared IS versus geographical distance with pairs of koalas sharing the highest numbers of IS highlighted.
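The kind of counting summarised in Fig. 3 (IS unique to one koala versus IS shared by several, and pairwise sharing) can be sketched with plain set operations. This is an illustration under stated assumptions, not the authors' pipeline: each IS is represented as a hypothetical (contig, position) tuple, matching is exact rather than tolerance-based, and the koala labels and coordinates below are invented toy data.

```python
from collections import Counter
from itertools import combinations


def pairwise_shared(is_by_koala):
    """Number of integration sites shared by each pair of koalas."""
    return {
        (a, b): len(is_by_koala[a] & is_by_koala[b])
        for a, b in combinations(sorted(is_by_koala), 2)
    }


def sharing_histogram(is_by_koala):
    """Map 'number of koalas carrying a site' -> 'number of such sites'."""
    carriers = Counter(
        site for sites in is_by_koala.values() for site in sites
    )
    return Counter(carriers.values())


# Toy data (hypothetical labels and coordinates, not from the study).
toy = {
    "koala_A": {("contig1", 100), ("contig1", 500), ("contig2", 40)},
    "koala_B": {("contig1", 100), ("contig2", 40), ("contig3", 7)},
    "koala_C": {("contig1", 100)},
}

print(pairwise_shared(toy))
print(dict(sharing_histogram(toy)))
```

In the toy data, one site is carried by all three animals, one by two, and two are unique to a single animal. Pairwise counts such as these could then be plotted against geographic distance, as in panel e.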
Fig. 4. Regions with significantly high IS density (hotspots) in the koala genome.
IS locations are shown with arrows. Regions longer than the identified hotspot windows are displayed to show the surrounding genes; locations of hotspots are indicated by a thick black line on the contig scale bar. Long non-coding RNAs are labelled “lnc” and genes without annotation are unlabelled. All regions contain multiple IS that are found in both healthy and tumour tissue (ERVs) (red arrows). Regions (a–c) contain IS that are conserved in multiple koalas (purple arrows). Regions (a, d, e) contain tumour-specific IS (pink arrows). All regions contain genes with associations to cancer, namely a, TM9SF2; b, PPFIBP1; c, SPOCK1; d, AHI1 and MYB; e, MYC. There may also be benefits from increased expression of some genes in these regions; for example, a also contains a gene involved in vitamin metabolism (CLYBL), and the cancer genes PPFIBP1, SPOCK1 and AHI1 in regions b–d, respectively, have roles in body weight and feeding behaviour. Two other identified hotspots are not shown here because they were not in close proximity to protein coding genes: one was surrounded by non-coding RNAs and the other was at the very start of a contig (see Supplementary Table 11).
Fig. 5. Overview of KoRV transformation processes in koalas.
All northern koalas carry endogenous KoRVs (red koala images) which are subject to Mendelian inheritance (purple arrow). Having multiple potentially active ERVs in every cell, with reintegration (blue arrows) leading to additional IS in somatic cells (blue triangles), would confer a tremendous mutational load on the population. Both inherited and new somatic IS can potentially dysregulate oncogenes, leading to an increased risk of transformation, with additional new integrations occurring in the transformed population of cells promoting cancer and metastasis. The potential for KoRV to transduce oncogenes also increases cancer risk. Koalas without endogenous KoRV (grey koala images) will only develop cancers (at a much lower rate) through other mechanisms.
