PLoS Genet. 2009 Jun;5(6):e1000516.
doi: 10.1371/journal.pgen.1000516. Epub 2009 Jun 12.

Change of gene structure and function by non-homologous end-joining, homologous recombination, and transposition of DNA

Wolfgang Goettel et al. PLoS Genet. 2009 Jun.

Abstract

An important objective in genome research is to relate genome structure to gene function. Sequence comparisons among orthologous and paralogous genes and their allelic variants can reveal sequences of functional significance. Here, we describe a 379-kb region on chromosome 1 of maize that enables us to reconstruct chromosome breakage, transposition, non-homologous end-joining, and homologous recombination events. Such a high-density composition of various mechanisms in a small chromosomal interval exemplifies the evolution of gene regulation and allelic diversity in general. It also illustrates the evolutionary pace of changes in plants, where many of the above mechanisms are of somatic origin. In contrast to animals, somatic alterations can easily be transmitted through meiosis because the germline in plants is contiguous to somatic tissue, permitting the recovery of such chromosomal rearrangements. The analyzed region contains the P1-wr allele, a variant of the genetically well-defined p1 gene, which encodes a Myb-like transcriptional activator in maize. The P1-wr allele consists of eleven nearly perfect P1-wr 12-kb repeats that are arranged in a tandem head-to-tail array. Although sequencing such a structure by a shotgun approach is technically challenging, we overcame this problem by subcloning each repeat and ordering the repeats based on nucleotide variations. These polymorphisms were also critical for recombination and expression analysis in the presence and absence of the trans-acting epigenetic factor Ufo1. Interestingly, chimeras of the p1 and p2 genes, p2/p1 and p1/p2, frame the P1-wr cluster. Reconstruction of sequence amplification steps at the p locus showed the evolution from a single Myb-homolog to the multi-gene P1-wr cluster. It also demonstrates how non-homologous end-joining can create novel gene fusions. Comparisons to orthologous regions in sorghum and rice also indicate a greater instability of the maize genome, probably due to diploidization following allotetraploidization.


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. p1 alleles.
p1 gives rise to phlobaphenes in female floral tissues (pericarp, cob, husks, and silk) and tassel glume margins of the male inflorescence. However, pigmentation is most obvious in pericarp (hence the name of the gene) and in glumes, palea and lemma of the cob. Pericarp or seed coat is the outermost layer of the kernel that is derived from the ovary wall and accordingly is maternal tissue. Glumes, palea and lemma are bracts enclosing the ovary and are also of maternal origin.
Figure 2. P1-wr BACs bridge an FPC and sequence gap.
The p cluster sequence of 379 kb is represented by a yellow rectangle. Each P1-wr repeat is illustrated by a red triangle pointing in the transcriptional orientation of the copy. Individual BACs are displayed as green and blue rectangles, and grey rectangles stand for fingerprinted contigs (FPCs). BACs shown in green were sequenced for this analysis, while BACs in blue were sequenced by a shotgun approach as part of the public maize-sequencing project. Due to the high similarity and large size of each P1-wr copy, gaps remain in the p cluster for the FPC map and the publicly available maize sequence as of November 2008. Our sequencing effort bridged the gaps and resolved the structural arrangement of all P1-wr repeats. BAC names, their accession numbers and sizes are given on top of each rectangle. Nucleotide positions written underneath the rectangles refer to the p cluster sequence that is covered by the BACs. Because BACs shown in blue are not fully assembled, their contig numbers are indicated in the rectangles, and the BAC size given corresponds to the sum of all contigs. The calculated BAC size, which is written under the rectangle when available, can be smaller or larger depending on overlaps or sequence gaps. Whether the BACs shown in green were fingerprinted and assembled in an FPC map is indicated underneath the rectangles. The vertical lines in two BACs represent a sequence gap in a CACTA and retro element.
Figure 3. Representation of the P1-wr cluster and flanking sequences.
P1-wr repeats, as well as flanking p genes are depicted as red pentagons with the apex pointing in the direction of transcription. Two predicted genes (pink pentagons) that encode a calmodulin binding protein (g1) and an expressed protein (g2) are positioned downstream of p1/p2. Regions containing probable pseudogenes are illustrated as pink hexagons. The fragmented genes downstream of the predicted genes are associated with a Helitron 3′ terminal sequence. Class I and class II transposable elements (drawn as rectangles and rounded rectangles, respectively) include mostly nested LTR retrotransposons, two CACTA elements (misfit and doppia), one hAT element, one LINE element and several MITEs (not shown). LTRs of retrotransposons are represented as triangles indicating the transcriptional orientation. Notice that the 3′ end of p1/p2 is separated from the coding region by a large retroelement block. Transposons depicted in white are not well conserved. P1-wr repeats are displayed in transcriptional orientation from left to right, while p2/p1 is proximal and p1/p2 is distal to the centromere.
Figure 4. Schematic alignment of p genes.
While P1-wr is the only described p1 allele with a multi-copy structure, only one copy is shown here. The P1-wr 5′ region aligns well with other p1 alleles. Regulatory elements, i.e. distal and proximal enhancer and basal promoter, depicted as blue arrows, were only determined for P1-rr. In other p genes or alleles, the arrows merely refer to sequence homology to P1-rr. Functional homology has not been investigated. A Heartbreaker MITE (purple bar) and a Mu-like element (tan bar) are part of the proximal enhancer. The p2 sequences upstream of the transcription start site depicted as blue rectangles are nearly identical. Notice that p2 shares the initial promoter sequences (orange rectangle) with p1 alleles. Upstream of p2, maize and teosinte differ in their composition of retrotransposons (not shown). The transcribed component of p1 and p2 genes (with the exception of the P1-rr allele) consists of 3 exons (illustrated in red) and 2 introns. The fourth exon of P1-rr is not displayed. P1-rr differs from P1-wr in the 3′ UTR. The p2 genes from maize (dotted) and parviglumis (horizontal lines) are very similar to p1 alleles in their transcribed regions. All other sequences shown are hybrids between p1 and p2. The 5′ region of p2/p1 containing exons 1 and 2 (horizontal red lines) is of p2 origin, while the 3′ end including exon 3 (full red) is derived from P1-wr. p1/p2 switches from a p1 to a p2 sequence in exon 2. The 3′ UTR of p1/p2 was separated by retrotransposon insertions as indicated by two parallel lines. Intron 2 comprises numerous transposable elements of various kinds: a hAT-like element (light green) and several MITEs or repeat elements ((S) Stowaway, (H) Heartbreaker, (P) Pilgrim, unnamed MITE).
Figure 5. Polymorphisms among P1-wr repeats.
This plot displays polymorphisms in individual P1-wr repeats compared to a P1-wr consensus sequence. Polymorphisms are subdivided into separate classes: SNPs are depicted in red, insertions in yellow, and deletions in green. Exons are shaded in red, a putative basal promoter region in green, and potential enhancer sequences in blue. Notice that exon 1 is split by a hAT-like transposable element in P1-wr repeats 6 and 11.
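The three polymorphism classes in Figure 5 can be illustrated with a minimal sketch. It assumes that a repeat and the consensus have already been aligned to equal length with `-` as the gap character; the function name `classify_polymorphisms` and the toy sequences are hypothetical and are not taken from the paper, whose actual alignment and counting procedure is not described here. The sketch counts alignment columns, so a multi-base indel is counted once per base.

```python
from collections import Counter


def classify_polymorphisms(consensus: str, repeat: str) -> Counter:
    """Count SNP, insertion, and deletion columns in a pairwise alignment.

    Both strings must be pre-aligned (same length, '-' marks a gap).
    An insertion is a base present in the repeat but not in the consensus;
    a deletion is a base present in the consensus but not in the repeat.
    """
    if len(consensus) != len(repeat):
        raise ValueError("aligned sequences must have the same length")

    counts = Counter()
    for c, r in zip(consensus.upper(), repeat.upper()):
        if c == r:
            continue  # identical column (or gap in both): no polymorphism
        if c == "-":
            counts["insertion"] += 1
        elif r == "-":
            counts["deletion"] += 1
        else:
            counts["snp"] += 1
    return counts


# Hypothetical toy alignment, for illustration only.
TOY_CONSENSUS = "ACG-TAC"
TOY_REPEAT = "ATGCT-C"
toy_counts = classify_polymorphisms(TOY_CONSENSUS, TOY_REPEAT)
```

For the toy alignment, the second column is a substitution, the fourth is an insertion in the repeat, and the sixth is a deletion from the repeat.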
Figure 6. Origin of the chimeric p2/p1 gene by NHEJ.
The recombinant p2/p1 gene is represented as a rectangle above the filler DNA sequence shown in a yellow box. Exons 1 and 2 from p2/p1 are similar to p2, but the third exon is derived from P1-wr. The complex filler DNA (yellow rectangle) originated from two nearby downstream sequences as indicated by two gray balloons. A potential ancestral sequence, consisting of a putative p2 gene (tan rectangle) and a neighboring P1-wr gene (gray rectangle), is shown on top. DNA double-strand break, deletion and repair events resulting in the p2/p1 recombinant are explained in the main text.
Figure 7. Retrotransposon insertions displaced the p1/p2 3′ UTR.
The structural organization of the p1/p2 gene (including exons and putative regulatory sequences) is shown at the bottom of this figure. The first retroelement that inserted approximately 1.38 million years ago (mya) in the 3′ UTR of p1/p2 was Eninu. Shadowspawn transposed “shortly” after (1.31 mya), in a region 4.1 kb downstream of Eninu. Ji jumped into Eninu 0.77 mya. Based on the nested nature of insertions, an Opie element that was truncated later must have inserted after Eninu but before Huck, which entered Opie 0.62 mya. Similarly, Diguus, now being a solo LTR, must have inserted into Eninu before Zeon, which was integrated into Diguus 0.58 mya. Finally, Opie and Diguus jumped into Huck 0.35 mya and 0.19 mya, respectively, pushing both ends of p1/p2 a total of 68 kb apart. The order of the transposition events can be inferred from the nested nature of insertions, which is consistent with all computed insertion dates.
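The consistency claim in the Figure 7 caption (a nested element must be younger than the element it sits in) can be checked with a short sketch. The ages and nesting relations below are transcribed from the caption; the dictionary layout, the labels `Opie_truncated` and `Diguus_soloLTR`, and the function names are hypothetical. One relation is an assumption: the caption says the truncated Opie inserted "after Eninu" without stating outright that it lies inside Eninu, so that host assignment is marked in the code.

```python
from typing import Dict, List, Optional, Tuple

# element -> (host element or None, insertion age in mya or None if undated)
Insertions = Dict[str, Tuple[Optional[str], Optional[float]]]

FIG7_INSERTIONS: Insertions = {
    "Eninu": (None, 1.38),
    "Shadowspawn": (None, 1.31),          # 4.1 kb downstream of Eninu, not nested
    "Ji": ("Eninu", 0.77),
    "Opie_truncated": ("Eninu", None),    # ASSUMPTION: host inferred, undated
    "Huck": ("Opie_truncated", 0.62),
    "Diguus_soloLTR": ("Eninu", None),    # undated solo LTR inside Eninu
    "Zeon": ("Diguus_soloLTR", 0.58),
    "Opie": ("Huck", 0.35),
    "Diguus": ("Huck", 0.19),
}


def nearest_dated_host(name: str, insertions: Insertions) -> Optional[str]:
    """Walk up the nesting chain until a host with a known age is found."""
    host = insertions[name][0]
    while host is not None and insertions[host][1] is None:
        host = insertions[host][0]
    return host


def inconsistent_pairs(insertions: Insertions) -> List[Tuple[str, str]]:
    """Return (element, host) pairs where the nested element is not younger."""
    bad = []
    for name, (_, age) in insertions.items():
        if age is None:
            continue  # undated elements cannot be tested directly
        host = nearest_dated_host(name, insertions)
        if host is not None and age >= insertions[host][1]:
            bad.append((name, host))
    return bad


violations = inconsistent_pairs(FIG7_INSERTIONS)
```

With the caption's dates, `violations` is empty: every dated element is younger than its nearest dated host, which matches the caption's statement that the nesting order agrees with the computed insertion dates.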
Figure 8. Synteny is maintained in maize, sorghum, and rice.
The genomic comparison is centered around the maize p genes and their orthologs and includes five flanking genes. Chromosome segments of maize, rice and sorghum that contain p or orthologous genes are mostly collinear. However, a break in synteny occurred adjacent to the p orthologous gene on maize chr. 9. While rice has only one p ortholog, y and p genes in sorghum and maize chr. 1 are amplified. A gene encoding a conserved hypothetical protein in sorghum is duplicated as well. The expansion of the maize genome compared to rice and sorghum is obvious even in this small genomic region. Genes are displayed as color-coded rectangles, and their transcriptional orientation is indicated by + and −. Undefined gene fragments are depicted in black. A dotted line indicates a sequence that consists of several contigs; the orientation of and distance between genes located in these contigs therefore cannot yet be determined. The gene orientation chosen in this figure is based on orthologous rice and sorghum genes. Lightning bolts stand for mutation events that generated pseudogenes.
