Nat Struct Mol Biol. 2014 Nov;21(11):969-75.
doi: 10.1038/nsmb.2895. Epub 2014 Oct 5.

A genome-wide map of adeno-associated virus-mediated human gene targeting

David R Deyle et al. Nat Struct Mol Biol. 2014 Nov.

Abstract

To determine which genomic features promote homologous recombination, we created a genome-wide map of gene targeting sites. We used an adeno-associated virus vector to target identical loci introduced as transcriptionally active retroviral vectors. A comparison of ~2,000 targeted and untargeted sites showed that targeting occurred throughout the human genome and was not influenced by the presence of nearby CpG islands, sequence repeats or DNase I-hypersensitive sites. Targeted sites were preferentially located within transcription units, especially when the target loci were transcribed in the opposite orientation to their surrounding chromosomal genes. We determined the impact of DNA replication by mapping replication forks, which revealed a preference for recombination at target loci transcribed toward an incoming fork. Our results constitute the first genome-wide screen of gene targeting in mammalian cells and demonstrate a strong recombinogenic effect of colliding polymerases.

Figures

Figure 1. Genome-wide gene targeting
(a) Experimental design with the inset showing the structures of the AAV2-HSN5′ targeting vector with a neo gene truncated at bp 629, MLV-LHSN63Δ53O target site provirus containing a 53 bp deletion at bp 63 of neo, and control vector MLV-LHSNO with a wild-type neo gene. The locations of the AAV inverted terminal repeats (ITR), retrovirus long terminal repeats (LTR), simian virus 40 (SV40) and Tn5 promoters, transcriptional start sites (arrows), hph and neo genes, and p15A replication origin are indicated. (b) Localized targeted (n = 2,015) and untargeted (n = 1,928) provirus sites are graphed per chromosome as a percentage of all mapped sites. There were no significant differences (P > 0.05, Chi-square test). (c) The locations of mapped sites are shown as red (targeted) or blue (untargeted) circles adjacent to each human chromosome ideogram.
Figure 2. Transcriptional effects on targeting
(a) The percent of targeted and untargeted sites (per kb) found within each interval relative to RefSeq gene transcription start sites. 1,180 targeted sites were found in 895 genes, and 1,042 untargeted sites were found in 889 genes. (b) The percent of intragenic sites found in genes binned into different expression levels by global gene expression ranking of HT-1080 cells, with a low percentile rank indicating a higher expression level. (c) The percent of intragenic targeted and control provirus sites found in the opposite transcriptional orientation to the chromosomal gene in which they are embedded, shown after ranking and binning genes by expression level. P values were determined by Chi-square test, and significant values (*P < 0.05, **P < 0.01, ***P < 0.002) are shown by asterisks.
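The Chi-square comparisons used throughout these captions can be illustrated with a minimal sketch. It runs a Pearson Chi-square test on a 2x2 table of intragenic versus non-intragenic sites. The intragenic counts (1,180 and 1,042) come from panel (a); using the Figure 1b totals (2,015 and 1,928) as denominators is an assumption made here for illustration, and the helper name chi_square_2x2 is hypothetical. This is not necessarily the exact test configuration the authors used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Intragenic counts are from the Figure 2a caption.
# ASSUMPTION: the Figure 1b totals serve as the denominators.
targeted_in, targeted_total = 1180, 2015
untargeted_in, untargeted_total = 1042, 1928

stat, p = chi_square_2x2(
    targeted_in, targeted_total - targeted_in,
    untargeted_in, untargeted_total - untargeted_in,
)
print(f"chi-square = {stat:.2f}, P = {p:.4f}")
```

The closed-form p-value works only because a 2x2 table has one degree of freedom; larger tables, such as the per-chromosome comparison in Figure 1b, would need a general Chi-square distribution function.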
Figure 3. Genome-wide replication fork mapping
(a) Repli-Seq replication timing results are shown for chromosome 1 and aligned to the positions of random, targeted and untargeted sites used in our study. Sequence read tracings are shown for each cell cycle phase (late G1, four subsets of S phase, and early G2), as are the weighted averages of these reads. (b) A 15 Mb close-up of these results is shown in the same format, except that the target site transcription directions of targeted and untargeted proviruses are shown with replication fork directions underneath, and transcript orientation relative to fork movement is indicated (opposing in red, same in blue).
Figure 4. DNA replication effects on targeting
(a) The ranked replication time distribution is shown for targeted, untargeted, and random sites, along with the difference between targeted and untargeted sites, which shows slightly earlier replication of targeted sites. (b) The percent of sites found at different distances from replication initiation zones. There were no statistically significant differences between targeted and untargeted sites, except for the 100–200 kb window (P = 0.03, Chi-square test). (c) The proportion of sites transcribed in the opposite direction of fork movement is shown at different distances from initiation zones. *P < 0.05, **P < 10⁻⁵, Chi-square test. (d) The numbers of sites transcribed in the opposite or same direction as fork movement are shown when replication timing differed by at least 10%, 20% or 30%, to increase confidence in fork direction calls. The total numbers of targeted, untargeted and random sites analyzed in a–c were 2,007, 1,909 and 2,001, respectively.
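The thresholded fork direction calls in panel (d) can be sketched as follows. The sketch assumes that a fork travels from the earlier-replicating flank toward the later-replicating flank, and that a call is made only when the two flanks differ by at least a chosen threshold. The function names, the 0 to 1 timing scale (smaller means earlier), and the '+'/'-' strand convention are all hypothetical choices for illustration; the source does not describe the authors' procedure at this level of detail.

```python
def call_fork_direction(timing_left, timing_right, min_diff):
    """Infer fork direction at a site from flanking replication timing.

    ASSUMPTIONS: timing is a rank on a 0-1 scale where smaller means earlier,
    and the fork moves from the earlier flank toward the later flank.
    Returns '+' (fork moving left to right), '-' (right to left),
    or None when the difference is below min_diff.
    """
    if abs(timing_left - timing_right) < min_diff:
        return None  # too ambiguous to call
    return "+" if timing_left < timing_right else "-"


def relative_orientation(transcript_strand, fork_direction):
    """Classify transcription relative to fork movement.

    'opposing' means transcription runs toward the incoming fork;
    'same' means it runs in the direction of fork movement.
    """
    if fork_direction is None:
        return None
    return "same" if transcript_strand == fork_direction else "opposing"


# Hypothetical site: left flank replicates earlier, target transcribed on '-'.
fork = call_fork_direction(0.20, 0.55, min_diff=0.10)
print(fork, relative_orientation("-", fork))
```

Raising min_diff discards more sites but leaves calls that are less likely to be wrong, which mirrors the 10%, 20% and 30% tiers described in the caption.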
Figure 5. Targeting frequencies in subclones with specific, mapped integration sites
Examples of Repli-Seq data for target sites transcribed in the opposite (a) or same (b) direction as replication fork movement. Site numbers refer to Supplementary Table 3. (c) Average targeting frequencies are shown for all target sites with discernible replication fork directions (error bars, s.e.m.; n = 7 for each group). The two groups were significantly different (P < 0.05, Student's t-test).
Figure 6. Stalled replication forks may promote vector pairing at target loci
(a) Model of a target locus transcribed in the opposite direction to an incoming replication fork, which stalls the incoming fork and produces a chicken-foot structure. This exposes single-stranded regions in the target locus in three possible ways, with distinct consequences for vector pairing. (b) An MVM targeting system is shown with vectors containing sense or antisense targeting strands (MVM-s and MVM-as) that can correct an ALPP (Alk Phos) reporter gene with a 4 bp deletion at bp 375 of its reading frame that was introduced by gammaretroviral vector MLV-LAP375Δ4SP with puromycin selection. Arrows indicate transcription start sites. The HT-1080 Repli-Seq data for this portion of human chromosome 9 are shown below the vector maps, with the target site located at bp 114,505,235. MSCV, murine stem cell virus promoter; SV40, SV40 viral promoter; GFP, green fluorescent protein gene; puro, puromycin resistance gene. (c) Average targeting frequencies (with standard deviations) of the sense and antisense MVM vectors when infections were done at the indicated multiplicities of infection. The “mixture” contained 2.5 × 10⁵ vector genomes per cell of both MVM-s and MVM-as.

