BMC Genomics. 2017 Apr 8;18(1):286.
doi: 10.1186/s12864-017-3669-7.

A comprehensive hybridization model allows whole HERV transcriptome profiling using high density microarray

Jérémie Becker et al. BMC Genomics. 2017.

Abstract

Background: Human endogenous retroviruses (HERVs) have received much attention for their implications in the etiology of many human diseases and their profound effect on evolution. Notably, recent studies have highlighted associations between HERV expression and cancers (Yu et al., Int J Mol Med 32, 2013), autoimmunity (Balada et al., Int Rev Immunol 29:351-370, 2010) and neurological conditions (Christensen, J Neuroimmune Pharmacol 5:326-335, 2010). Their repetitive nature makes them particularly challenging to study, and expression studies have largely focused on individual loci (De Parseval et al., J Virol 77:10414-10422, 2003) or general trends within families (Forsman et al., J Virol Methods 129:16-30, 2005; Seifarth et al., J Virol 79:341-352, 2005; Pichon et al., Nucleic Acids Res 34:e46, 2006).

Methods: To refine our understanding of HERV activity, we introduce a new microarray, HERV-V3. This work was made possible by the careful detection and annotation of genomic HERV/MaLR sequences and by the development of a new hybridization model that allows probe performance to be optimized and cross-reactions to be controlled.

Results: HERV-V3 offers almost complete coverage of HERVs and their ancestors (mammalian apparent LTR-retrotransposons, MaLRs) at the locus level, along with four other repertoires (active LINE-1 elements, lncRNA, a selection of 1559 human genes and common infectious viruses). We demonstrate that the analytical performance of HERV-V3 is comparable with that of commercial Affymetrix arrays and that, for a selection of tissue- or pathology-specific loci, the expression patterns measured on HERV-V3 are consistent with those reported in the literature.

Conclusions: Given its large HERV/MaLR coverage and additional repertoires, HERV-V3 opens the door to multiple applications, such as the identification of enhancers, alternative promoters and biomarkers, as well as the characterization of gene and HERV/MaLR modulation caused by viral infection.

Keywords: Biostatistics; Microarray; Repetitive elements; Transcriptomics.

Figures

Fig. 1
Main steps of the HERV-V3 array design. The design involved three steps: (a) database creation, where HERV copies were either detected by RepeatMasker using 42 prototypes or reconstructed from Dfam predictions; (b) development of a hybridization model, illustrated by model predictions and observed intensities on the Affymetrix probeset associated with the CD59 gene; and (c) design of probes and probesets. The difference in annotation level between consensus and prototypes is shown: LTR subregions and ORFs are only identified in prototypes. Note also that the agreement between observed and predicted intensities increases with the k-mer size and the complexity of the spatial information (a more thorough description is provided in Additional file 3: Figure S1)
Fig. 2
Platform evaluation. (a) Pre-processing methods were evaluated on the whole array using the titration response as a function of the fold-change between samples A and B. Probesets were binned according to the fold-change values between A and B. Unlike GCBG-RMA, the three methods RMA-TPRN, RMA and Li-Wong present narrow titration curves, indicative of good performance. The two confounding factors, (b) intensity and (c, same colour code as in 2b) probeset size distribution, are represented in the HERVs/MaLRs, gU133/gHTA and gPEHM compartments: the intensities are lower in HERVs/MaLRs than in genes (gPEHM, gU133/gHTA), reflecting a smaller proportion of expressed loci in the former. The three compartments (HERVs/MaLRs, gU133/gHTA and gPEHM) and downsized gPEHM (dgPEHM) are compared on (d) repeatability (CV) and on accuracy, measured both by (e) the titration response and (f) the estimated dilution mixture (β̂_C, β̂_D). The grey horizontal lines in (f) mark the theoretical mixture values β_C and β_D. Only probesets differentially expressed between samples A and B (fold-change A/B or B/A > 2, P < 0.01) were used to generate the boxplots in (f). The gene repertoires show similar levels of repeatability and accuracy (similar median CVs, titration curves and β̂_C, β̂_D distributions), whereas HERVs/MaLRs performance is slightly lower, owing to smaller probesets
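The repeatability and filtering criteria named in this caption can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the use of linear-scale intensities, the sample standard deviation and the example values are all hypothetical and are not taken from the paper, which does not specify how the CV or the P values were computed.

```python
from statistics import mean, stdev

def coefficient_of_variation(replicates):
    """CV = sample standard deviation / mean of replicate intensities.

    Assumes linear-scale (not log-transformed) intensities.
    """
    return stdev(replicates) / mean(replicates)

def is_differential(mean_a, mean_b, p_value, min_fold=2.0, max_p=0.01):
    """True when A/B or B/A exceeds min_fold and the P value is below max_p.

    The P value is taken as an input; the statistical test is not modelled here.
    """
    fold = max(mean_a / mean_b, mean_b / mean_a)
    return fold > min_fold and p_value < max_p

# Hypothetical replicate intensities for one probeset (not data from the paper).
sample_a = [100.0, 110.0, 90.0]
cv_a = coefficient_of_variation(sample_a)
keep = is_differential(mean_a=100.0, mean_b=40.0, p_value=0.001)
```

A lower CV indicates better repeatability across replicates. The filter is symmetric in A and B, so a probeset passes whichever sample has the higher intensity.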
Fig. 3
Consistency with Affymetrix design and model validation. Gene expression variation is compared across the three gene compartments based on fold-change correlation (a–c) and intersections of genes differentially expressed in the gene repertoires (d). The hybridization model PEHM is evaluated by correlating predicted and observed intensities on gU133 probes (e) and the HERV-V2 training set (f)
Fig. 4
Biological validation. (a) Intensity heatmap of tissue- and pathology-specific loci in seven HERV-V3 arrays: the observed intensities correlate well with the expected loci specificity. For each of the eight loci, the family and probeset names are indicated (the family name and the sub-region annotation are abbreviated in the probeset name). (b) Distribution of differentially expressed loci (DELs) between hPSCs and embryoid bodies. While most DELs are found in MaLR-Dfam, HERV-Dfam and HERV-H, when normalized within family, the proportion of DELs is higher in HERV-H and HERV-XA34, consistent with Wang et al. [13]. (c) Intersection between pluripotent loci identified by HERV-V3 and NGS (Wang et al.): despite a small number of shared loci (115), 55.7% of HERV-V3 loci coverage is contained in this intersection
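The within-family normalization mentioned in panel (b) can be illustrated as follows. This is a minimal sketch, assuming each locus carries a single family label. The function name, locus identifiers and counts are hypothetical and chosen only to show how a family with fewer DELs in absolute terms can still have the higher proportion.

```python
from collections import Counter

def del_proportion_by_family(all_loci, del_loci):
    """Return {family: fraction of that family's loci that are DELs}.

    all_loci: iterable of (locus_id, family) pairs covering every locus on the array.
    del_loci: set of locus_ids called differentially expressed.
    """
    all_loci = list(all_loci)
    totals = Counter(family for _, family in all_loci)
    hits = Counter(family for locus, family in all_loci if locus in del_loci)
    return {family: hits[family] / n for family, n in totals.items()}

# Hypothetical toy annotation: 8 MaLR-Dfam loci and 2 HERV-H loci.
all_loci = [(f"malr_{i}", "MaLR-Dfam") for i in range(8)]
all_loci += [(f"hervh_{i}", "HERV-H") for i in range(2)]
del_loci = {"malr_0", "malr_1", "hervh_0"}

proportions = del_proportion_by_family(all_loci, del_loci)
```

In this toy case MaLR-Dfam contributes more DELs in absolute terms (2 versus 1), but the normalized proportion is higher for HERV-H (0.5 versus 0.25), which mirrors the kind of reversal the caption describes.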

References

    1. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Evans GA, Athanasiou M, Schultz R, Patrinos A, Morgan MJ. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
    2. Sperber GO, Airola T, Jern P, Blomberg J. Automated recognition of retroviral sequences in genomic data–RetroTector. Nucleic Acids Res. 2007;35(15):4964–4976. doi: 10.1093/nar/gkm515.
    3. Mager DL, Medstrand P. Retroviral repeat sequences. Chichester: eLS. Wiley; 2005.
    4. Gifford R, Tristem M. The evolution, distribution and diversity of endogenous retroviruses. Virus Genes. 2003;26(3):291–315. doi: 10.1023/A:1024455415443.
    5. Bannert N, Kurth R. Retroelements and the human genome: new perspectives on an old relation. Proc Natl Acad Sci U S A. 2004;101:14572–14579. doi: 10.1073/pnas.0404838101.