Cell. 2014 Aug 28;158(5):1187-1198.
doi: 10.1016/j.cell.2014.07.034.

The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development

Xiao Chen et al. Cell. 2014.

Abstract

Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci often interweave with each other. Single gene segments can contribute to multiple, distinct somatic loci. Terminal precursor segments from neighboring somatic loci map extremely close to each other, often overlapping. This genome assembly provides a draft of a scrambled genome and a powerful model for studies of genome rearrangement.

Figures

Figure 1. Development of the Oxytricha Macronuclear Genome from the Micronuclear Genome
In the micronucleus (MIC), macronuclear-destined sequences (MDSs) are interrupted by internal eliminated sequences (IESs); MDSs may be disordered (e.g., MDS 3, 4, and 5) or inverted (e.g., MDS 4). During development after conjugation, IESs, as well as other MIC-limited DNA, are removed. MDSs are stitched together, some requiring inversion and/or unscrambling. Pointers are short identical sequences at consecutive MDS-IES junctions. One copy of the pointer is retained in the new macronucleus (MAC). The old macronuclear genome degrades. Micronuclear chromosome fragmentation produces gene-sized nanochromosomes (capped by telomeres) in the new macronuclear genome. DNA amplification brings nanochromosomes to a high copy number. See also Figure S1.
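
The stitching step described in this caption can be illustrated with a toy sketch. It assumes each MDS is given as a half-open coordinate range on the MIC sequence that includes its flanking pointers, and that the pointer length at each junction is known in advance. The `MDS` class, the `unscramble` function, and the example sequence are hypothetical names and data chosen for illustration; they are not the authors' pipeline.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class MDS:
    mac_index: int          # order of this segment in the MAC product (1-based)
    start: int              # 0-based start on the MIC sequence (pointer included)
    end: int                # exclusive end on the MIC sequence (pointer included)
    inverted: bool = False  # True if the segment is reversed in the MIC


def unscramble(mic: str, segments: list[MDS], pointer_lens: list[int]) -> str:
    """Reorder and re-orient MDSs, keeping one pointer copy per junction.

    pointer_lens[i] is the assumed pointer length at the junction between
    the MDSs with MAC order i + 1 and i + 2.
    """
    ordered = sorted(segments, key=lambda s: s.mac_index)
    if len(pointer_lens) != len(ordered) - 1:
        raise ValueError("need exactly one pointer length per junction")

    pieces = []
    for seg in ordered:
        piece = mic[seg.start:seg.end]
        pieces.append(reverse_complement(piece) if seg.inverted else piece)

    product = pieces[0]
    for piece, plen in zip(pieces[1:], pointer_lens):
        # The end of the growing product and the start of the next MDS
        # must carry the same pointer; only one copy is retained.
        if product[-plen:].upper() != piece[:plen].upper():
            raise ValueError("pointer mismatch at junction")
        product += piece[plen:]
    return product


# Toy MIC: MDS1, IES, MDS3, IES, inverted MDS2 (IESs in lowercase).
toy_mic = "AAACGttTATTTccTACCCCG"
toy_segments = [MDS(1, 0, 5), MDS(3, 7, 12), MDS(2, 14, 21, inverted=True)]
toy_mac = unscramble(toy_mic, toy_segments, pointer_lens=[2, 2])
```

In this toy case the pointers are `CG` (MDS 1 to MDS 2) and `TA` (MDS 2 to MDS 3), and `toy_mac` is `AAACGGGGTATTT`. Real pointer lengths vary by junction, and telomere addition and amplification are not modeled here.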
Figure 2. The MIC Genome Is Fragmented into Hundreds of Thousands of Segments with Massive Levels of Scrambling
(A) Comparison of chromosome length and number of MDS segments between 13,910 nonscrambled (blue) and 2,310 scrambled nanochromosomes (red) completely covered on single MIC contigs. (B) A chord diagram mapping MIC ctg7180000068801 to its rearranged form, MAC Contig17454.0. Lines connect precursor (MIC) and product (MAC) MDS locations; black dotted line, an inverted MDS; all 245 MDSs (242 are scrambled) drawn in a blue to yellow MAC color gradient; IESs, gray. (C) Length distribution of 44,191 nonscrambled and 9,841 scrambled MDSs <100 nt (excluding pointers). Inset: 150,615 nonscrambled MDSs and 16,350 scrambled MDSs excluding pointers, showing the most typical length of scrambled MDSs is <50 nt. (D) Length distribution of 101,345 nonscrambled and 8,333 scrambled high-confidence IESs <100 nt (excluding pointers and IESs that contain other MDSs). Inset: 147,122 nonscrambled versus 9,040 scrambled IESs excluding pointers and IESs that contain other MDSs. We identified six strong cases of 0 bp MDSs (four nonscrambled [Contig7827.0 MDS3, Contig11190.0.1 MDS18, Contig13633.0 MDS3, and Contig9208.0.0 MDS18] and two scrambled [Contig6325.0.0 MDS58 and Contig1267.1 MDS7]). (E) Length distribution of 112,125 nonscrambled IESs (excluding those that contain other MDSs) <150 nt, with one copy of the pointer included (i.e., the total length of DNA deleted). (F) MIC genomic distance between scrambled MDSs that are consecutive in the MAC (n = 12,197); distance calculated from the pointer flanking MDS N to its paired pointer flanking MDS N+1. See also Figure S2.
Figure 3. Gene Segments for Multiple Distinct MAC Chromosomes Are Sometimes Interwoven or Reused
(A) Germline map of MIC ctg7180000067411 (drawn to scale), containing precursor MDSs (bars, orientation as shown, including pointers) for five MAC chromosomes (purple, Contig1267.1; green, Contig18709.0; red, Contig20652.0; gold, Contig18297.0; blue, Contig6980.0) whose MDSs are scrambled and interwoven with each other. IES regions are gaps. MDS numbers are consecutive in the MAC. (B) Germline map of a MIC region (ctg7180000067243) with four shared MDSs that assemble into five distinct MAC chromosomes with identical 5′ ends (red, scrambled Contig14686.0; green, Contig7507.0; blue, Contig7395.0; gray, Contig15152.0; gold, Contig4858.0); start/stop codons annotated in blue and red, respectively. (C) Germline map (ctg7180000068430) depicting a scrambled MAC chromosome (gold, Contig19716.0) that arose by recombination between MDSs from two different gene loci (green, Contig16277.0; blue, Contig22490.0) at a new pointer (11 bp direct repeat, magenta triangles). Note that the green Contig16277.0 is an alternatively processed chromosome, itself, with two predicted stop codons; the shorter, more abundant isoform (not shown) terminates at an alternative telomere addition site between MDS 12-13, upstream of an intron 3′ splice site. This creates an earlier, in-frame stop codon within the retained portion of the unspliced intron (Swart et al., 2013).
Figure 4. The Distance between Adjacent Terminal MDSs Is Much Smaller Than that between Terminal MDSs and Adjacent Internal MDSs for a Different MAC Chromosome
Negative values represent the length of overlapping regions, with the peak distance between terminal MDSs from −1 to −10 bp (10,006 pairs, black) and the peak between terminal MDS to internal MDS between 10–19 bp (2,863 pairs, gray). See also Figure S3.
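
The sign convention in this caption (negative values are overlap lengths) can be sketched as a signed gap between two MDS intervals on the same MIC contig. The half-open coordinate convention and the function name `signed_gap` are assumptions made for illustration; the caption does not state the coordinate system the authors used.

```python
def signed_gap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Signed distance between two half-open intervals (start, end).

    Positive: number of bases separating the intervals.
    Zero: the intervals abut.
    Negative: minus the number of bases the intervals share.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have start < end")
    return max(a_start, b_start) - min(a_end, b_end)


# Hypothetical terminal MDSs from two neighboring MAC loci.
overlapping = signed_gap((100, 200), (195, 300))   # share 5 bases
separated = signed_gap((100, 200), (212, 300))     # 12 bases apart
```

Under this convention, `overlapping` is -5 and `separated` is 12, matching the caption's use of negative values for overlaps.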
Figure 5. MIC-Limited Genes and Transposons Are Preferentially Expressed during Conjugation, while IES-less Genes Are More Constitutive but Show Universally High Expression during Conjugation
(A) Clustered expression profile of 810 germline-limited nonrepetitive genes across different time points (vegetative stage (fed); 0, 10, 20, 40, and 60 hr during the conjugating time course). Gene expression levels are represented by log2 (100,000 × normalized RNA-seq counts/coding sequence length). (B) Mass spectrometry validated 208 MIC-limited genes (outer, pink circle) and 103 were found to contain posttranslational modifications (PTMs) (inner, purple circle). Representative members of each group are shown within the circles. (C) Clustered expression profiles of 530 IES-less genes. (D) Clustered expression profiles of MIC-limited transposon-associated genes. Upper: 275 reverse transcriptase and endonuclease domain proteins encoded by LINEs; Middle: 21 Helitron-associated helicases; Lower: 12 DDE_Tnp_IS1595 (ISXO2-like transposase) domain proteins from insertion sequences (ISs). See also Tables S2, S3, S4, S5, S6, S7, and S8 and Figure S4.
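
The expression measure given in panel (A), log2 (100,000 × normalized RNA-seq counts/coding sequence length), can be written out as a small sketch. The function name and the example values are hypothetical, and rejecting zero counts is an assumption made here, since the caption does not say how genes without reads were handled.

```python
import math


def expression_level(normalized_count: float, cds_length: int) -> float:
    """log2(100,000 * normalized RNA-seq count / coding sequence length)."""
    if cds_length <= 0:
        raise ValueError("coding sequence length must be positive")
    if normalized_count <= 0:
        # Assumption: log2 is undefined here, so zero counts are rejected
        # rather than replaced by a pseudocount.
        raise ValueError("normalized count must be positive")
    return math.log2(100_000 * normalized_count / cds_length)


# Hypothetical gene: normalized count 512 over a 50,000 nt coding sequence.
level = expression_level(512, 50_000)
```

Here the scaled ratio is 1,024, so `level` is 10.0. Dividing by coding sequence length makes genes of different sizes comparable within one time point.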
