Cell. 2015 Apr 9;161(2):228-39.
doi: 10.1016/j.cell.2015.03.026.

Ubiquitous L1 mosaicism in hippocampal neurons

Kyle R Upton et al. Cell. 2015.

Abstract

Somatic LINE-1 (L1) retrotransposition during neurogenesis is a potential source of genotypic variation among neurons. As a neurogenic niche, the hippocampus supports pronounced L1 activity. However, the basal parameters and biological impact of L1-driven mosaicism remain unclear. Here, we performed single-cell retrotransposon capture sequencing (RC-seq) on individual human hippocampal neurons and glia, as well as cortical neurons. An estimated 13.7 somatic L1 insertions occurred per hippocampal neuron and carried the sequence hallmarks of target-primed reverse transcription. Notably, hippocampal neuron L1 insertions were specifically enriched in transcribed neuronal stem cell enhancers and hippocampus genes, increasing their probability of functional relevance. In addition, bias against intronic L1 insertions sense oriented relative to their host gene was observed, perhaps indicating moderate selection against this configuration in vivo. These experiments demonstrate pervasive L1 mosaicism at genomic loci expressed in hippocampal neurons.

Figures

Figure 1
Single-Cell RC-Seq Workflow (A) NeuN+ hippocampal nuclei were first purified by FACS (see also Figure S1). (B) Nuclei were then picked using a self-contained microscope and micromanipulator. (C) DNA was extracted from nuclei and subjected to linear WGA, followed by exponential PCR in two separate reactions for each nucleus, using different enzymes. (D) Exponential WGA products for each nucleus were combined, used to prepare Illumina libraries, and analyzed via WGS to assess genome coverage and possible amplification biases. (E) Libraries prepared in (D) were enriched via hybridization to L1-Ta LNA probes. (F) Enriched libraries were sequenced with 2 × 150-mer Illumina reads and analyzed to identify novel L1 integration sites (see also Figure S2).
Figure 2
Single-Cell WGS and RC-Seq Analyses of 92 Hippocampal Neurons (A) Chromosome copy number in each amplified genome, assessed by WGS. Box-and-whisker plots indicate median chromosomal copy number and quartiles across all neurons. Empty circles represent chromosomes with copy number >1.5 IQR from the median. Sex chromosomes for CTRL-36 (female, ♀) and CTRL-42, CTRL-45, and CTRL-55 (male, ♂) are presented separately. Six autosomes, marked in red, had copy number ≤ 1. Two sex chromosomes with log2 copy number < −2 are colored purple. (B) WGS indicated 16.2 Mb and 9.4 Mb regions of localized AD (indicated by red bars) on chromosome 6 of neuron CTRL-45-HN-#2. Each blue diamond corresponds to a 600 kb “bin”. One bin with log2 copy number < −5 is colored purple. (C) Percentages of LD (dark gray) and AD (light gray) bins in each neuron, assessed by WGS. (D) Percentage of reference genome L1-Ta copies detected by single-cell RC-seq in each neuron. (E) Percentage of polymorphic L1-Ta insertions found in the corresponding bulk RC-seq libraries for each individual and also detected by single-cell RC-seq. (F) Somatic L1 insertion counts observed in each neuron by single-cell RC-seq. Note: in (C-F) yellow, brown, blue, and green histogram columns correspond to individuals CTRL-36, CTRL-42, CTRL-45, and CTRL-55, respectively. See also Figures S3 and S4 and Tables S1 and S2.
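The outlier rule described in (A) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `flag_outlier_chromosomes`, the example values, and the choice of the "inclusive" quantile method are all assumptions, and the rule is read literally from the caption (a value more than 1.5 × IQR from the median).

```python
import statistics

def flag_outlier_chromosomes(copy_numbers, k=1.5):
    """Return indices of values lying more than k * IQR from the median."""
    # Quartiles computed with the "inclusive" method (an assumption; the
    # caption does not say how quartiles were estimated).
    q1, median, q3 = statistics.quantiles(copy_numbers, n=4, method="inclusive")
    iqr = q3 - q1
    return [i for i, cn in enumerate(copy_numbers) if abs(cn - median) > k * iqr]

# Hypothetical copy numbers of one chromosome across five neurons
# (illustrative values only, not data from the study).
example = [2.0, 1.9, 1.0, 2.1, 2.0]
outliers = flag_outlier_chromosomes(example)
```

With these made-up values, only the neuron with copy number 1.0 is flagged; standard box plots often measure the 1.5 × IQR distance from the quartiles rather than the median, which would need a small change to the comparison.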
Figure 3
PCR Validation of Somatic L1 Insertions (A–F) Validated examples from hippocampal neuron single-cell RC-seq data included: (A) a full-length L1 insertion in neuron CTRL-42-HN-#13; (B) a truncated L1 insertion in neuron CTRL-42-HN-#11; (C) a heavily truncated L1 insertion in neuron CTRL-55-HN-#15; and (D) a very heavily truncated L1 insertion yielding a 3′ transduction in neuron CTRL-42-HN-#4, also validated in neuron CTRL-42-HN-#3, and traced to a donor L1-Ta on chromosome 3; (E) a very heavily truncated L1 insertion detected in CTRL-42-HN-#13 and validated in 10/21 CTRL-42 hippocampal neurons tested. Asterisks denote neurons where validation succeeded; (F) a very heavily truncated L1 insertion detected in CTRL-42-HN-#4 and also validated in CTRL-42-HN-#22. Note: in (A–F) the 3′ L1-genome junction was detected by single-cell RC-seq, while the 5′ L1-genome junction was identified by insertion-site PCR (using primers indicated by α and β) and sequencing. Green triangles indicate TSDs. Numbers below the 5′ L1-genome junction indicate the equivalent L1-Ta consensus position. See also Table S2 and Data S1.
Figure 4
L1 Mobilization in Diverse Neural Cell Types (A) Somatic L1 insertion counts observed by single-cell RC-seq applied to hippocampal glia. (B) As for (A) except for cortical neurons. Seven pyramidal neurons are indicated by an asterisk. (C) As for (A) except for AGS-1 hippocampal neurons. (D) L1 qPCR indicated lower L1 copy number in AGS-1 hippocampus versus controls (p < 0.002, two-tailed t test, df = 23). Data represent the mean of 5 technical replicates ± SD. (E) Mean somatic L1 insertion counts detected by single-cell RC-seq in each hippocampus strongly correlated (R² = 0.93) with L1 copy number quantified by qPCR (D). See also Figure S5 and Table S2.
Figure 5
Single-Cell RC-Seq Efficiently Excludes Molecular Artifacts (A) Distribution of read “peaks” indicating possible somatic L1 insertions detected by single-neuron L1 insertion profiling (L1-IP) (Evrony et al., 2012). (B) As for (A), except for all single-cell RC-seq data presented here. Peaks were annotated as chimeric or as likely genuine L1 insertions by sequence analysis of RC-seq reads. (C) Distribution of read peak height for L1 insertions selected for validation by Evrony et al. The L1 insertion successfully validated by TSD discovery is colored black. The remaining insertions not validated to this standard are colored red. (D) As for (C), except for L1 insertions detected by single-cell RC-seq and selected at random for validation.
Figure 6
Hallmarks of TPRT Revealed by Bulk RC-Seq (A) A 6 kb L1-Ta element incorporates 5′ and 3′ UTRs and two ORFs. ORF2p presents EN and RT domains. Methylation of a CpG island present in the 5′ UTR regulates L1 promoter activity. The locations of two capture probes used by RC-seq are indicated below the L1. Note: TSDs and probes are not drawn to scale. See also Figure S2. (B) TPRT hallmark features, including TSDs and an L1 EN recognition motif, can be identified by RC-seq, including for insertions detected at only a 5′ or 3′ L1-genome junction. (C) Consensus L1 EN motifs for polymorphic and somatic L1 insertions detected at their 5′ and 3′ L1-genome junctions, and somatic L1 insertions found at only a 3′ L1-genome junction. (D) Observed TSD size distributions for polymorphic and somatic L1 insertions, normalized to random expectation. See also Figure S6.
Figure 7
Genome-Wide Somatic L1 Insertion Patterns (A) Somatic L1 insertions detected by single-cell RC-seq in hippocampal neurons and glia were enriched in genes differentially upregulated in hippocampus. Liver-specific L1 insertions detected by bulk RC-seq were moderately enriched in genes upregulated in liver. No enrichment was observed for cortical neurons. Color intensity is based on the absolute log2 transformed p value determined by Fisher’s exact test (Benjamini-Hochberg correction) with blue and orange colors representing depletion and enrichment, respectively. Note: in each matrix pairwise comparison, the more highly expressed tissue is on the y axis. (B) Hippocampal somatic L1 insertions were statistically enriched in genes upregulated in hippocampus versus liver (black) or hippocampus versus heart (gray), as shown in (A). However, as previously filtered molecular chimeras (see Figure 5B) were re-introduced into this dataset, enrichment rapidly became no longer significant. (C) Of the transcribed cell-type specific enhancers defined by FANTOM5, only those of neuronal stem cells were enriched (observed/expected) for somatic L1 insertions detected by bulk hippocampus RC-seq, compared with other enhancers (p < 1.0 × 10⁻⁴, Fisher’s exact test, Bonferroni correction). (D) Somatic L1 insertion enrichment in neuronal stem cell enhancers (black) extended 500 bp from enhancer boundaries. No enrichment was observed for astrocyte (gray) or hepatocyte (red) enhancers. See also Tables S2, S3, S4, S5, and S6.
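The Benjamini-Hochberg correction named in (A) can be illustrated with a short sketch. The function name `benjamini_hochberg` and the input p values are hypothetical, and this shows only the standard step-up adjustment, not the authors' actual analysis code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p values from four pairwise tissue comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

Each raw p value is scaled by the number of tests divided by its rank, then capped by the adjusted value of the next larger p value; a heat map like the one described would then color each cell by the absolute log2 of the adjusted value.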
Figure S1
Identification and Purity Confirmation of NeuN+ Hippocampal Nuclei via Fluorescence Activated Cell Sorting, Related to Figure 1 (A) Events were first gated on forward scatter of cells (FSC) and side scatter of cells (SSC). (B) A Sytox blue control confirmed clear separation of fluorescent spectra, and an absence of events in the sorting gate. (C) As for (B), except including a secondary antibody control. (D) NeuN+/Sytox+ events (in polygonal gate) were sorted into PBS. (E) A sample of sorted nuclei was re-analyzed by FACS to confirm sort purity. Note: expected photobleaching reduced signal intensities.
Figure S2
RC-Seq Capture Design and L1 Insertion Scenarios Detected, Related to Figure 1 (A) A full-length L1-Ta structure indicates the positions of two RC-seq probes designed to detect the 5′ or 3′ L1-genome junction of a given L1 insertion. Three categories of RC-seq reads are therefore generated, namely those that detect: the 5′ L1-genome junction of a full-length L1, the 5′ L1-genome junction of a heavily truncated L1 and the 3′ L1-genome junction of any L1. (B) L1 detection scenarios as outlined in (A). Insertions are either 1) full-length or heavily truncated and detected at only a 5′ L1-genome junction, 2) of any length and detected at only a 3′ L1-genome junction, 3) full-length or heavily truncated and detected at both L1-genome junctions. Note the percentages given in brackets, indicating the relative occurrence of each scenario in the single-cell RC-seq data presented.
Figure S3
Heatmap Representing Sequence Coverage across the Genome, Related to Figure 2 For each sample, sequence alignments were binned by alignment start position into 600 kb intervals across the human genome, excluding unplaced contigs, extra haplotypes, and the mitochondrial genome. Counts were quantile normalized across all samples before plotting. Chromosomes are indicated on the vertical axis. Sample brain region location (cortex, hippocampus), cell type (glial, neuron) and individual ID are indicated for groups of columns on the horizontal axis. For each individual, single cells are ordered numerically. Note: low and high coverage bins are indicated in yellow and blue, respectively.
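The binning and normalization steps described for this heatmap can be sketched as below. This is an assumption-laden illustration: the helper names `bin_counts` and `quantile_normalize`, the toy coordinates, and the tie-handling rule are not from the paper, and only the 600 kb bin size comes from the caption.

```python
from collections import Counter

BIN_SIZE = 600_000  # 600 kb intervals, as stated in the caption

def bin_counts(starts, n_bins, bin_size=BIN_SIZE):
    """Count alignments per bin, keyed by alignment start position."""
    counts = Counter(pos // bin_size for pos in starts)
    return [counts.get(i, 0) for i in range(n_bins)]

def quantile_normalize(samples):
    """Quantile normalize equal-length count vectors, one per sample.

    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by bin order (a simplification).
    """
    n = len(samples[0])
    sorted_cols = [sorted(s) for s in samples]
    rank_means = [sum(col[r] for col in sorted_cols) / len(samples) for r in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # stable sort
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

# Toy alignment start positions on one chromosome (hypothetical).
cell_a = bin_counts([0, 599_999, 600_000, 1_800_001], n_bins=4)
# A second sample given directly as per-bin counts (hypothetical).
cell_b = [4, 0, 2, 6]
norm_a, norm_b = quantile_normalize([cell_a, cell_b])
```

After normalization every sample has the same distribution of values, so differences between columns of the heatmap reflect where coverage falls along the genome rather than overall sequencing depth.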
Figure S4
WGS Revealed Limited Amplification Bias in Individual Neuronal Genomes, Related to Figure 2 (A) Median copy number for bins across all single-cell (gray circles) and bulk liver (black circles) WGS libraries, versus the percentile position of bins along the length of their corresponding chromosomes. (B) Fractions of LD and AD bins versus bin chromosome percentile position. Note: LD and AD fractions are highest at telomeres. (C) High resolution analysis of two localized AD regions on chromosome 6 of CTRL-45 hippocampal neuron 2 (CTRL-45-HN-#2), also presented at 600 kb resolution in Figure 2B. Copy number is displayed for ∼60 kb bins (black diamonds). Bins with absolute log2(copy number) ≥ 5 are colored in purple. Dropout regions are indicated by red bars. (D) Observed RC-seq read counts across reference 5′ L1-genome junctions for bulk liver and single-cell WGS libraries, normalized as a ratio to counts obtained by randomly sampling 10⁷ sequences of 220 bp from the human reference genome, revealing minimal dropout of L1-genome junctions due to WGA. (E) As for (D) except at 3′ L1-genome junctions.
Figure S5
WGA Quality Control Analyses for Hippocampal Glia, Cortical Neurons and AGS-1 Hippocampal Neurons, Assessed by WGS, Related to Figure 4 (A) Chromosome copy number in each amplified genome. Box-and-whisker plots indicate median chromosome copy number and quartiles across all neurons. No examples of chromosome-wide AD were observed. (B) Percentages of LD (dark gray) and AD (light gray) bins in each cell. (C) Percentage of reference genome L1-Ta copies detected by RC-seq in each cell. (D) Percentage of polymorphic L1-Ta insertions found in the corresponding bulk RC-seq libraries for each individual and also detected by single-cell RC-seq.
Figure S6
Signatures of L1 Mobilization via TPRT Detected by Hippocampus and Liver Bulk RC-Seq, Related to Figure 6 (A) TSD size distribution for polymorphic L1 insertions detected at their 5′ and 3′ L1-genome junctions. (B) As for (A), except for hippocampal somatic L1 insertions. (C) TSD size distribution for hippocampal somatic L1 insertions detected at only a 5′ L1-genome junction. (D) Consensus L1 EN motif for liver somatic L1 insertions detected at only a 3′ L1-genome junction by RC-seq. Expected values for (A) and (B) were calculated by randomizing sense and antisense RC-seq read cluster genomic coordinates, to ascertain how many overlapping clusters in the opposing orientation and detecting opposite ends of an L1 insertion were found, using the same bioinformatics process as used for observed clusters. Expected values for (C) were calculated by random sampling of genomic coordinates and searching for the nearest upstream L1 EN motif, again following the same string matching process as for observed values. Note: the corresponding TSD size distribution for liver somatic L1 insertions detected at only their 5′ L1-genome junction contained insufficient data (n = 7) to make a meaningful comparison with hippocampal somatic L1 insertions.

References

    1. Andersson R., Gebhard C., Miguel-Escalada I., Hoof I., Bornholdt J., Boyd M., Chen Y., Zhao X., Schmidl C., Suzuki T., FANTOM Consortium. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455–461.
    2. Baillie J.K., Barnett M.W., Upton K.R., Gerhardt D.J., Richmond T.A., De Sapio F., Brennan P.M., Rizzu P., Smith S., Fell M. Somatic retrotransposition alters the genetic landscape of the human brain. Nature. 2011;479:534–537.
    3. Beck C.R., Collier P., Macfarlane C., Malig M., Kidd J.M., Eichler E.E., Badge R.M., Moran J.V. LINE-1 retrotransposition activity in human genomes. Cell. 2010;141:1159–1170.
    4. Beck C.R., Garcia-Perez J.L., Badge R.M., Moran J.V. LINE-1 elements in structural variation and disease. Annu. Rev. Genomics Hum. Genet. 2011;12:187–215.
    5. Boissinot S., Entezam A., Furano A.V. Selection against deleterious LINE-1-containing loci in the human lineage. Mol. Biol. Evol. 2001;18:926–935.

Supplemental References

    1. Dombroski B.A., Scott A.F., Kazazian H.H., Jr. Two additional potential retrotransposons isolated from a human L1 subfamily that contains an active retrotransposable element. Proc. Natl. Acad. Sci. USA. 1993;90:6513–6517.
    2. Feng Q., Moran J.V., Kazazian H.H., Jr., Boeke J.D. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87:905–916.
    3. Jiang Y., Matevossian A., Huang H.S., Straubhaar J., Akbarian S. Isolation of neuronal chromatin from brain tissue. BMC Neurosci. 2008;9:42.
    4. Kiełbasa S.M., Wan R., Sato K., Horton P., Frith M.C. Adaptive seeds tame genomic sequence comparison. Genome Res. 2011;21:487–493.
    5. Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25.