Nat Commun. 2019 Feb 4;10(1):575.
doi: 10.1038/s41467-019-08333-8.

Physical principles of retroviral integration in the human genome

D Michieletto et al. Nat Commun.

Abstract

Certain retroviruses, including HIV, insert their DNA in a non-random fraction of the host genome via poorly understood selection mechanisms. Here, we develop a biophysical model for retroviral integration as stochastic and quasi-equilibrium topological reconnections between polymers. We discover that physical effects, such as DNA accessibility and elasticity, play important and universal roles in this process. Our simulations predict that integration is favoured within nucleosomal and flexible DNA, in line with experiments, and that these biases arise due to competing energy barriers associated with DNA deformations. By considering a long chromosomal region in human T-cells during interphase, we discover that at these larger scales integration sites are predominantly determined by chromatin accessibility. Finally, we propose and solve a reaction-diffusion problem that recapitulates the distribution of HIV hot-spots within T-cells. With few generic assumptions, our model can rationalise experimental observations and identifies previously unappreciated physical contributions to retroviral integration site selection.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
DNA elasticity biases HIV integration in nucleosomes. a Model for tDNA and vDNA as diffusing bead-spring polymers with bending rigidity. The potentials associated with bending (Ub(θ)), steric/attractive interactions (ULJ) and stretching of the bonds (Uh) contribute to the energy E of a given configuration. b A quasi-equilibrium stochastic integration event takes into account the energy before and after the integration event (E(Ω) and E(Ω′), respectively) to determine a successful integration probability p = min{exp(−ΔE/kBT), 1}. c The integration profile Pint(x) as a function of the relative tDNA site, x = n/L, displays a ~4-fold enhancement in the region wrapped around the histone-like protein. The same behaviour is observed when a kinked site (corresponding to the intasome flanked by LTR) is included in the vDNA. Considering flexible tDNA (lp = 30 nm) weakens this preference. d Direct quantitative comparison of relative integration enhancement with the data reported in ref. . The integration profiles in c are averages over 1000 independent simulations and the dynamics of the simulated process can be seen in Supplementary Movies 1 and 2. Source data are provided as a Source Data file
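The acceptance rule in panel b, p = min{exp(−ΔE/kBT), 1}, can be illustrated with a minimal sketch. The function names and the default kBT = 1 (energies in units of kBT) are assumptions made for illustration only; the authors' simulation code is not reproduced here.

```python
import math
import random

def acceptance_probability(delta_e, kbt=1.0):
    """Return p = min{exp(-delta_e / kbt), 1} for an energy change delta_e."""
    if delta_e <= 0.0:
        # Moves that lower (or keep) the energy are always accepted.
        return 1.0
    return math.exp(-delta_e / kbt)

def attempt_integration(e_before, e_after, kbt=1.0, rng=random):
    """Accept or reject a trial integration event.

    e_before and e_after stand for E(Omega) and E(Omega'), the configuration
    energies before and after the trial reconnection (hypothetical inputs).
    """
    p = acceptance_probability(e_after - e_before, kbt)
    return rng.random() < p
```

With this rule, a trial event that costs 1 kBT is accepted with probability exp(−1), roughly 37% of the time, while energetically favourable events are always accepted.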
Fig. 2
Local chromatin structure affects nucleosomal integration. a Snapshot of an open chromatin fibre composed of 10 nucleosomes. Nearest-neighbour (nn) attractions between the histone-like particles induce partially folded (nnp, (b)) and fully condensed (nnf, (c)) structures. Next-nearest-neighbour attractions lead to zig–zagging fibres (nnn, (d)). e The integration probability Pint(x) as a function of the relative tDNA site x = n/L displays peaks whose locations depend on the compaction level. Open fibres are integrated mostly within nucleosomes while folded arrays also display peaks within linker DNA. f The bending energy profile shows that fibres with nearest-neighbour attractions (nnp and nnf), but not those with next-nearest-neighbour attractions, display stress within linker DNA. This only partially explains why these regions are targeted within these chromatin structures. g The ratio of nucleosomal versus linker DNA integrations suggests that not only the energy barrier but also dynamic accessibility plays a role in determining the integration profiles (the expected value for random integration, 200/90 = 2.2, is shown as a dashed line). h The number of successful integration events over the total simulation time, nint, decreases with chromatin condensation. In all cases, the fibre is reconstituted independently before performing the quasi-equilibrium stochastic integration. Data are generated by averaging over 2000 independent integration events. See Supplementary Movies 3–6 for the full dynamics. Source data are provided as a Source Data file
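The random-integration baseline quoted in panel g (200/90 = 2.2) is simply the ratio of nucleosomal to linker DNA length, so an observed ratio above it indicates a nucleosomal preference. The sketch below makes that comparison explicit; the function names and the example counts are hypothetical, and only the lengths 200 and 90 come from the caption.

```python
def random_expectation(nucleosomal_len=200, linker_len=90):
    """Nucleosomal/linker integration ratio expected if sites are chosen uniformly."""
    return nucleosomal_len / linker_len

def nucleosome_to_linker_ratio(n_nucleosomal, n_linker):
    """Observed ratio of integration counts in nucleosomal versus linker DNA."""
    if n_linker == 0:
        raise ValueError("no linker integrations: ratio is undefined")
    return n_nucleosomal / n_linker

# Hypothetical counts, for illustration only (not data from the paper).
observed = nucleosome_to_linker_ratio(1500, 500)
prefers_nucleosomes = observed > random_expectation()
```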
Fig. 3
Large-scale 3D chromatin folding enhances euchromatic integration. a Pictorial representation of our coarse-grained model which describes chromatin as a fibre with epigenetic marks. These marks dictate 3D folding by self-association through proteins and transcription factors. b Snapshots of our polymer model where the fraction of heterochromatin is set at ϕhet = 70%. We show two typical configurations, before and after integration events. c The integration probability displays a strong enrichment in euchromatic regions. d–f Simulations of a 5 Mbp region of human chromosome 11 (46–51 Mbp) modelled at 1 kb resolution with a polymer N = 5000 beads long. d In this model, expression level in Jurkat T-cells and GC content are used to label beads as euchromatic (red) or heterochromatic (blue), respectively. We assign attractive interactions (ϵ = 3 kBT) between heterochromatin beads so that the fixed epigenetic pattern guides the folding of the chromatin fibre (see snapshots). Steady-state conformations are then used as hosts for n = 500 integration events of a 10 kbp viral DNA. e Comparison between the distribution of integration sites in folded and unfolded chromatin conformations. The latter is obtained by assigning no self-attraction between heterochromatin beads. Viral integration within unfolded chromatin is uniform (Pint = 1/n, dashed line) while it is not uniform (i.e. non-random) on folded chromatin. f Comparison between simulated and experimentally measured distribution of integration sites in Jurkat T-cells. The agreement between simulations and experiments is highly significant, with a p-value p < 0.001 when a Spearman rank test is used to test the null hypothesis that the distributions are independent. This result can be compared with the p-value p = 0.6 obtained when the same test is performed to test independence of the integration profiles in experiments and unfolded chromatin. The dynamics corresponding to one of our simulations is shown in full in Supplementary Movie 7. Source data are provided as a Source Data file
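The Spearman rank test mentioned in panel f can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the caption does not say how the p-value was computed, so a permutation test is swapped in here, and all function names are hypothetical.

```python
import random

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the null hypothesis of independence."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Here x and y would be the simulated and measured integration profiles binned over the same genomic windows; a small p-value rejects the hypothesis that the two profiles are independent.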
Fig. 4
Integration is slowed down in cells with large heterochromatin content. a, b Snapshots and probability distribution for a system with ϕhet = 30 and 80%, respectively. c Fraction of integrated loops fint as a function of time, for different levels of heterochromatin. In the inset, the integration time Tint defined as fint(Tint) = 0.5 is shown to (super-)exponentially increase as a function of ϕhet. d Integration probability in state s, with s being either euchromatin (red) or heterochromatin (blue), normalised by the fraction of the host polymer in s, ϕs. These show that the larger the fraction of heterochromatin, the more likely it is for a virus to be integrated in euchromatin. Source data are provided as a Source Data file
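The integration time in panel c is defined implicitly by fint(Tint) = 0.5. A minimal sketch of how such a half-time could be read off a sampled curve is given below; linear interpolation between samples is an assumption made here, and the function name and example values are hypothetical.

```python
def integration_time(times, f_int, threshold=0.5):
    """Estimate T_int such that f_int(T_int) = threshold.

    times and f_int are equal-length sequences sampling the fraction of
    integrated loops over time. Returns None if the threshold is never reached.
    """
    if len(times) != len(f_int):
        raise ValueError("times and f_int must have the same length")
    if not times:
        return None
    if f_int[0] >= threshold:
        return times[0]
    for i in range(1, len(times)):
        if f_int[i] >= threshold:
            t0, t1 = times[i - 1], times[i]
            f0, f1 = f_int[i - 1], f_int[i]
            # f0 < threshold <= f1 here, so f1 - f0 is strictly positive.
            return t0 + (threshold - f0) * (t1 - t0) / (f1 - f0)
    return None
```

Repeating this for curves obtained at different ϕhet would give the dependence of Tint on heterochromatin content shown in the inset.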
Fig. 5
HIV integration hot-spots are affected by nuclear organisation. a, c, e Different cell lines display different chromatin organisations at the nuclear scale. a shows a typical differentiated cell, modelled as a sphere with 3 concentric shells of equal volume. Zones 1 and 3 are populated by facultative and constitutive heterochromatin, respectively. Zone 2, the middle layer, is populated by euchromatin. This configuration may be viewed as an angularly averaged model and it is appropriate to study HIV integration in population averages. c shows the model for a ‘retinal’ cell, where the two outer layers are inverted. e shows the model for a realistic population of T-cells (typical configuration of a single cell is shown in the inset). Here the location of the boundaries between zones 1 and 2, and between zones 2 and 3, is varied to account for local density variations and cell-to-cell fluctuations (see text). b, d, f Nuclear distribution of HIV integration sites in b differentiated cells, d retinal cells and f T-cells. The result with uniform D and κ (yielding l ≃ 2.23 μm) is shown in grey in each panel. The number of integrations at distance r, ρint(r), is divided by the area of the shell, g(r) = 4πr² dr, and normalised to unity. Filled squares in f denote data from ref. . Source data are provided as a Source Data file
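The normalisation described for panels b, d and f (counts at distance r divided by g(r) = 4πr² dr, then normalised to unity) can be sketched as below. This is an illustration only: the binning scheme, the function name, and the reading of "normalised to unity" as "values sum to one" are assumptions, not details given in the caption.

```python
import math

def radial_integration_profile(radii, r_max, n_bins):
    """Histogram integration-site radii, correcting for shell size.

    radii: radial positions of integration sites (same units as r_max).
    Returns (bin centres, normalised profile) with the profile summing to one.
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    for r in radii:
        if 0.0 <= r < r_max:
            # min() guards against float rounding at the outer edge.
            counts[min(int(r / dr), n_bins - 1)] += 1
    centres = [(i + 0.5) * dr for i in range(n_bins)]
    # Divide each count by g(r) = 4 * pi * r^2 * dr evaluated at the bin centre.
    density = [c / (4.0 * math.pi * rc ** 2 * dr) for c, rc in zip(counts, centres)]
    total = sum(density)
    if total == 0:
        raise ValueError("no integration sites fall inside r_max")
    return centres, [d / total for d in density]
```

Dividing by g(r) removes the purely geometric bias towards outer shells, which contain more space, so that peaks in the profile reflect a genuine preference for a nuclear zone.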

References

    1. Craigie R, Bushman FD. HIV DNA integration. Cold Spring Harb. Persp. Med. 2012;2:1–18.
    2. Alberts B, Johnson A, Lewis J, Morgan D, Raff M. Molecular Biology of the Cell. Taylor & Francis, United Kingdom; 2014.
    3. Lusic M, Siliciano RF. Nuclear landscape of HIV-1 infection and integration. Nat. Rev. Microbiol. 2017;15:69–82. doi: 10.1038/nrmicro.2016.162.
    4. Dewannieux M, et al. Identification of an infectious progenitor for the multiple-copy HERV-K human endogenous retroelements. Genome Res. 2006;16:1548–1556. doi: 10.1101/gr.5565706.
    5. Griffiths DJ. Endogenous retroviruses in the human genome sequence. Genome Biol. 2001;2:reviews1017.1–1017.5. doi: 10.1186/gb-2001-2-6-reviews1017.
