Proc Natl Acad Sci U S A. 2018 Dec 11;115(50):12565-12572.
doi: 10.1073/pnas.1814589115. Epub 2018 Nov 19.

ERVmap analysis reveals genome-wide transcription of human endogenous retroviruses

Maria Tokuyama et al. Proc Natl Acad Sci U S A. 2018.

Abstract

Endogenous retroviruses (ERVs) are integrated retroviral elements that make up 8% of the human genome. However, the impact of ERVs on human health and disease is not well understood. While select ERVs have been implicated in diseases, including autoimmune disease and cancer, the lack of tools to analyze genome-wide, locus-specific expression of proviral autonomous ERVs has hampered progress in the field. Here we describe a method called ERVmap, consisting of an annotated database of 3,220 human proviral ERVs and a pipeline that allows for locus-specific genome-wide identification of proviral ERVs that are transcribed based on RNA-sequencing data, and provide examples of the utility of this tool. Using ERVmap, we revealed cell-type-specific ERV expression patterns in commonly used cell lines as well as in primary cells. We identified 124 unique ERV loci that are significantly elevated in the peripheral blood mononuclear cells of patients with systemic lupus erythematosus that represent an IFN-independent signature. Finally, we identified additional tumor-associated ERVs that correlate with cytolytic activity represented by granzyme and perforin expression in breast cancer tissue samples. The open-source code of ERVmap and the accompanying web tool are made publicly available to quantify proviral ERVs in RNA-sequencing data with ease. Use of ERVmap across a range of diseases and experimental conditions has the potential to uncover novel disease-associated antigens and effectors involved in human health that are currently missed by focusing on protein-coding sequences.

Keywords: RNA sequencing; cancer; endogenous retroviruses; lupus; retroelements.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Cell-type–specific ERV expression in cell lines. (A) Description of cell lines used in the ERV analysis. (B) Histogram of the number of reads attributed to each of the 3,220 ERV loci sorted in order of highest to lowest expressed ERVs for each cell line. (C) Sum of all ERV reads per cell line compared across cell types. (D) Heatmap of ERVs that are expressed across indicated cell types. ERVs with zero reads across all cell lines were excluded. A total of 1,704 ERVs are displayed. Two-dimensional t-SNE analysis (E) and PCA (F) of ERVs expressed by indicated cell types using the same set of 1,704 ERVs as in D. t-SNE analysis was performed using a perplexity of 30 and maximum iteration of 1,000. N/A, cell assignment not possible due to multiple cell lines expressing exactly the same amount of the particular ERV.
Fig. 2.
Cell-type–specific ERV expression in primary cells. (A) Cell types and associated information for each sample used in the ERV analysis. (B) Histogram of the number of reads attributed to each of the 3,220 ERV loci sorted in order of highest to lowest expressed ERVs for each cell type. For cell types with multiple samples, the average number of reads per locus was plotted. (C) Sum of all ERV reads per sample compared across cell types. For cell types with multiple datasets, the average and SEM are graphed. (D) Heatmap of ERVs that are expressed across indicated cell types. The 500 most varying ERVs were used for the analysis to reduce noise. Two-dimensional t-SNE analysis (E) and PCA (F) of ERVs expressed by indicated cell types using the same set of 500 ERVs as in D. t-SNE analysis was performed using a perplexity of 30 and maximum iteration of 1,000.
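
The caption for Fig. 2 mentions restricting the analysis to the 500 most varying ERVs. A minimal sketch of that kind of filter is given here, assuming "most varying" means highest variance of read counts across samples; the caption does not state the measure, so this is an assumption. The function name `most_variable_loci`, the locus names, and the counts are hypothetical and are not taken from ERVmap.

```python
from statistics import pvariance


def most_variable_loci(counts, top_n):
    """Return the top_n locus names ranked by variance across samples.

    counts maps a locus name to its list of per-sample read counts.
    Variance is an assumed stand-in for "most varying".
    """
    ranked = sorted(counts, key=lambda locus: pvariance(counts[locus]), reverse=True)
    return ranked[:top_n]


# Toy data: hypothetical locus names and counts, not values from the paper.
toy_counts = {
    "erv_a": [0, 0, 1, 0],
    "erv_b": [5, 50, 2, 80],
    "erv_c": [10, 11, 10, 12],
    "erv_d": [0, 30, 0, 30],
}

# Keep the two most variable loci (the caption uses 500 of 3,220).
selected = most_variable_loci(toy_counts, top_n=2)
print(selected)
```

With real data, `top_n` would be 500 and the retained loci would feed the heatmap, t-SNE, and PCA steps described in the caption.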
Fig. 3.
Patients with SLE have elevated ERV expression. (A) A volcano plot depicting differential expression of all 3,220 ERVs. Red ERVs are significantly elevated in SLE patients compared with healthy controls (padj < 0.05, log2 fold-change > 1.0). The top 30 significantly elevated ERVs are indicated by their names. (B) Comparison of the sum of all significantly different ERV reads between healthy and SLE donors (SLE, n = 20; healthy, n = 6). Error bars represent SEM and nonparametric Mann–Whitney U test was performed to calculate significance. ***P < 0.001. (C) Heatmap of the 124 significantly elevated ERVs in SLE patients compared with healthy controls as determined by using a cut-off of padj < 0.05. (D) Heatmap of the sum of reads for significantly elevated ERVs and the sum of reads for ISGs per patient sample.
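
The thresholds quoted in the Fig. 3 caption (padj < 0.05, log2 fold-change > 1.0) amount to a simple filter over a table of differential expression results. The sketch below illustrates that filter only; it assumes the adjusted P values and fold changes have already been computed elsewhere, and the `DiffResult` type, function name, and toy values are hypothetical.

```python
from typing import NamedTuple


class DiffResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    locus: str
    log2_fold_change: float
    padj: float


def significantly_elevated(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return loci passing both cut-offs, using strict inequalities as in the caption."""
    return [
        r.locus
        for r in results
        if r.padj < padj_cutoff and r.log2_fold_change > lfc_cutoff
    ]


# Toy rows: made-up numbers, not results from the paper.
toy_results = [
    DiffResult("erv_a", 2.3, 0.001),   # passes both cut-offs
    DiffResult("erv_b", 0.4, 0.010),   # significant, but fold change too small
    DiffResult("erv_c", 3.1, 0.200),   # large fold change, but not significant
    DiffResult("erv_d", -2.0, 0.001),  # significant, but repressed
]

elevated = significantly_elevated(toy_results)
print(elevated)
```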
Fig. 4.
ERVmap reveals additional breast cancer-associated ERVs that correlate with cytolytic activity. (A) A volcano plot depicting all 3,220 ERVs. Blue ERVs are significantly repressed and red ERVs are significantly elevated in breast cancer tissues compared with healthy controls (padj < 0.05, log2 fold-change < −1.5 or > 1.5). The top 30 significantly elevated and repressed ERVs are indicated by their names. (B) Normalized DESeq ERV read counts for previously reported TSERVs or the top 10 significantly elevated or significantly repressed ERVs identified in this report (C) are plotted as dot plots for the indicated ERVs for each sample (brca, breast cancer n = 1,246, red; normal n = 221, gray). (D) Spearman’s r correlation was calculated between the significantly elevated or repressed ERVs (padj < 0.05, log2 fold-change > 1.5 or < −1.5) and the average expression level of granzyme and perforin (CYT) for all breast cancer tissue samples. (*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001).
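
Panel D of Fig. 4 correlates ERV expression with CYT, the average expression of granzyme and perforin. The sketch below shows one way such a Spearman correlation could be computed in plain Python. It assumes CYT is the arithmetic mean of the two genes, which the caption does not specify, and all function names and numbers are hypothetical toy values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's r is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-sample expression values, not data from the paper.
granzyme = [1.0, 4.0, 2.0, 8.0, 6.0]
perforin = [2.0, 5.0, 2.5, 9.0, 5.0]
cyt = [(g + p) / 2 for g, p in zip(granzyme, perforin)]  # assumed arithmetic mean

erv_counts = [3.0, 20.0, 9.0, 50.0, 15.0]
rho = spearman(erv_counts, cyt)
print(round(rho, 3))
```

In practice a statistics library would also supply the P value behind the significance stars in the caption; this sketch computes the coefficient only.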

