Nat Ecol Evol. 2021 Oct;5(10):1382-1393.
doi: 10.1038/s41559-021-01523-y. Epub 2021 Aug 19.

Concerted genomic and epigenomic changes accompany stabilization of Arabidopsis allopolyploids

Xinyu Jiang et al. Nat Ecol Evol. 2021 Oct.

Abstract

During evolution, successful allopolyploids must overcome 'genome shock' between hybridizing species, but the underlying process remains elusive. Here, we report concerted genomic and epigenomic changes in resynthesized and natural Arabidopsis suecica (TTAA) allotetraploids derived from Arabidopsis thaliana (TT) and Arabidopsis arenosa (AA). A. suecica shows conserved gene synteny and content with more gene family gain and loss in the A and T subgenomes than in the respective progenitors, although the A. arenosa-derived subgenome has more structural variation and transposon distributions than the A. thaliana-derived subgenome. These balanced genomic variations are accompanied by pervasive convergent and concerted changes in DNA methylation and gene expression among allotetraploids. The A subgenome is hypomethylated rapidly from F1 to resynthesized allotetraploids and convergently to the T-subgenome level in natural A. suecica, despite many other methylated loci being inherited from F1 to all allotetraploids. These changes in DNA methylation, including small RNAs, in allotetraploids may affect gene expression and phenotypic variation, including flowering, silencing of self-incompatibility and upregulation of meiosis- and mitosis-related genes. In conclusion, concerted genomic and epigenomic changes may improve stability and adaptation during polyploid evolution.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Conservation and diversification of A. suecica genome.
a, Diagram of resynthesized allotetraploids and natural A. suecica (Asu). Allo733 and Allo738 are two stable A. suecica-like allotetraploids derived from tetraploid A. thaliana (Ath, Ler4) and A. arenosa (Aar, Care-1). Natural Asu was formed ~300,000 years ago. b, Genomic synteny of A. thaliana Col genome (Col-T), A (Aar-related) subgenome of Allo738, sT and sA subgenomes of Asu and A. lyrata (Aly). Syntenic blocks with 30 or more genes are shown. c, Rearrangements between sT (1–5) and sA (1–8) subgenomes of natural Asu and extant progenitors, A. thaliana (Col, T1–T5) and A. arenosa (A subgenome of Allo738, A1–A8). Ribbons indicate translocation between subgenomes (TLb, black), within a subgenome (TLs, blue) and transposition in the same chromosomes (TP, red). d, A large (~5 Mb) translocation is present between sT1 and sA1 relative to T1 and A1 chromosomes, which was validated by chromatin contact (Hi-C) maps. e, Proportion of sequence variation in sA and sT subgenomes of Asu relative to extant progenitors. INV, inversion; TP, transpositions; TLs, translocations within a subgenome; TLb, translocations between subgenomes. f, Boxplots of the estimated time for intact LTR insertion (million years ago, Ma) in A (Aar-related) subgenome of 738, Col genome (Col-T), sT and sA subgenomes of Asu, diploid Ler genome (Ler2-T), T (Ler4) subgenome of Allo738 and A. lyrata (Aly) genome. Single and double asterisks indicate statistical significance levels of P < 0.05 and 0.01, respectively (permutation test using 1,000 permutations).
Fig. 2. Gene family expansion and contraction.
a, Venn diagram of orthologue clusters among A. thaliana (Ath, Col), A. arenosa (A subgenome of 738) and A. suecica (sT and sA subgenomes). b, GO enrichment terms of the genes specific to T lineage (Col, sT subgenome of A. suecica) and A lineage (A. arenosa, sA subgenome of A. suecica, A. lyrata, A. halleri). Dashed line indicates onefold enrichment. c, Expansion and contraction of gene families in Arabidopsis-related species with the numbers in parentheses, indicating gene families subject to expansion (+red) and contraction (–blue), respectively. Black dots indicate node T (ancestor of A. thaliana) and node A (ancestor of A. arenosa), respectively. Bar graphs show the proportion of single-copy (blue), two-copy (red) and multi-copy (green) genes among all (All) and orphan (Orphan) genes in corresponding species. d, Micro-colinearity patterns between FLC and flanking genes in Col, Aar (A subgenome of 738), A. suecica (sT and sA subgenomes) and A. lyrata (Aly). Ribbons indicate colinearity of FLC genes (purple) and their flanking genes (grey). e, FLC expression is correlated with CHH methylation and siRNA levels in A. thaliana (Ath), F1, resynthesized allotetraploids (Allo733 and Allo738) and natural A. suecica (Asu). Scales indicate mRNA (0–100), CHH methylation density (0–1) and 24-nucleotide siRNA (0–1,050) levels. Different letters in mRNA (transcripts per kilobase per million) indicate statistical significance of P < 0.05 (analysis of variance (ANOVA) test, n = 3).
Fig. 3. DNA methylation dynamics during the formation and evolution of allotetraploid Arabidopsis.
a, Chromosome features and methylation distributions. Notes in circos plots: (1) chromosomes, (2) gene and (3) TE density and (4) CG, (5) CHG and (6) CHH methylation levels using 100-kb windows in Ath (Ler4) or Aar, F1, Allo733, Allo738 and A. suecica (in that order from outside to inside in each methylation context). Note that strain identity is omitted in naming T and A chromosomes. b,c, CG methylation levels in the gene body and flanking (2-kb) sequences of the A (b) and T (c) subgenomes in F1, Allo733 (733), Allo738 (738) and A. suecica (Asu), relative to A. thaliana (Ath) and A. arenosa (Aar), respectively. d, Numbers of DMRs between T subgenome and Ath (Col) or A subgenome and Aar in F1, 733, 738 and Asu, respectively. e, Expression ratio log2TPM(Asu/(Aar or Ath)) of the genes flanking a 2-kb region of hypo- or hyper-DMRs between A. suecica (Asu) and A. arenosa (Aar) or A. thaliana (Ath, Ler4). Three asterisks indicate a statistical significance level of P < 0.001 (Mann–Whitney U-test). TPM, transcripts per kilobase per million.
Fig. 4. Convergence and inheritance of CG methylation levels between two subgenomes in allotetraploids.
a, CG methylation levels of homologues in A. thaliana (Ler4, T), A. arenosa (Aar, A), sT and sA subgenomes of A. suecica. b, Numbers of DMRs between T subgenome and Ath (Col) or A subgenome and Aar in F1, 733, 738 and Asu, respectively. Note that Allo733 and Allo738 may be treated as biological replicates of resynthesized allotetraploids. c, Clustering analysis of CG hyper-DMRs (A-T) in Aar/Ath and their changes in F1, Allo733 (733), Allo738 (738) and A. suecica (Asu), respectively. Dashed black box indicates 4,486 convergent DMRs where hyper-DMRs between A and T (Ler4) were conserved in newly formed allotetraploids and reduced to the sT subgenome level in Asu. Note that the upper portion (white dashed box) indicates the overlap group (1,875) with conserved DMRs (also see e). d, Clustering of CG hypo-DMRs in F1 and their changes in 733, 738 and Asu relative to Aar/Ath. Black dashed boxes indicate hypo-DMRs between T subgenome and Ath (upper panel) and between A subgenome and Aar (lower panel) in F1 were conserved in Allo733, Allo738 and Asu. e, Fraction of conserved CG hypo-DMRs in F1, Allo733, Allo738 and all three lines and their inheritance in Asu relative to Aar, with the numbers (conserved/total) shown to the left of each column. f, Venn diagram of the genes that overlapped with convergent (blue) and conserved (red) CG hypo-DMRs in Asu relative to Aar. Absolute values of CG methylation change thresholds were 0.5 in c,d.
Fig. 5. Association of CG methylation with expression of reproduction-related genes in Arabidopsis allotetraploids.
a, Clustering analysis of expression levels of reproduction-related genes (GO:0000003) in A. arenosa (Aar, A) and A. thaliana (Ler4, T) (Aar/Ath), F1, Allo733 (733), Allo738 (738) and A. suecica (Asu). b, Density plot of correlation coefficients between expression and CG methylation levels of the reproduction-related genes from clusters 1, 2 and 4 in the A subgenome. c, CG methylation near genic regions of SMC3, PDS5A and AFB3 and their mRNA expression patterns in Aar, F1, Allo733 (733), Allo738 (738) and Asu. Black arrows indicate the orientation of genes. SMC3, STRUCTURAL MAINTENANCE OF CHROMOSOMES3; PDS5A, PHYTOENE DESATURASE; AFB3, AUXIN SIGNALING F-BOX3. Scales indicate mRNA (0–20 and 0–100) and CG methylation density (0–1) levels. Different letters in mRNA (TPM) indicate statistical significance of P < 0.05 (ANOVA test, n = 3).
Extended Data Fig. 1. Synteny and gene content of two subgenomes in A. suecica.
a, Proportion of genes and TEs (left) and TE classes (right) in A. thaliana (Col) and in A and T subgenomes of Allo738 (738) and sA and sT subgenomes of A. suecica (Asu). b, Distances of the nearest TE from each gene in sT and sA subgenomes of Asu (upper panel) and in Col (cT) and A subgenome (Aar-related) of Allo738 (lower panel). c, d, Genomic features of A. suecica (c) and A. arenosa (d) genome. (1) Chromosome ideogram; (2) Gene density; (3) TE density; (4)-(6) SNP (4), indel (5) identity, and aligned regions (6) between A. suecica and Col or A (c) or between A and A. lyrata genomes (d); (7) synteny of sA and sT homologous gene pairs (c) or paralogous gene pairs (d). Colours in (1) indicate T (blue) and A (green) or related subgenomes; colour scales in (2) and (3) indicate high (dark purple) to low (white) density; densities (2)-(5) were shown per 100-kb windows; only gene pairs in syntenic blocks spanning 30 genes were shown in (7).
Extended Data Fig. 2. Analysis of Allo733 and A. suecica genome assemblies using reference genomes.
a, Dotplots of allotetraploid Allo738 (738) and A. suecica (Asu) assemblies with reference genomes of A. thaliana (Ler and Col) and A. lyrata, respectively. Two genomes were co-linear (red line) with disruptions (blue lines or dots) and inversions (blue line) or translocations (black circles). b, Heatmaps of Hi-C-seq chromosome contacts to show the location of structural variation in different chromosomes.
Extended Data Fig. 3. Genomic variation between A. suecica, A. thaliana, and A. arenosa genomes.
a, Inversions between sT of Asu and Col-T (Ath) or between sA of Asu and A (Aar-related) of Allo738. b, Proportion of co-linear and non-co-linear regions between A and T. Three asterisks indicate statistical significance level of P < 0.001 (Fisher’s exact test). c, Proportion of SNP and indel distributions in sA and sT subgenomes of Asu relative to A and cT genomes. Non-co-linear: not collinear; INV: inversion; TP: transpositions; TLs: translocations within a subgenome; TLb: translocations between T and A subgenomes. d, Distribution of Ks values for a set of 14,668 single-copy genes among A subgenome (Aar-related) of 738, cT (Ath, Col), sT and sA subgenomes of Asu, and A. lyrata (Aly). e, Distribution of Ks values between genes in co-linear and TLb regions (co-linear regions: sA vs A, sT vs cT; TLb regions: sT vs A, sA vs cT). One asterisk indicates statistical significance level of P < 0.05 (Mann–Whitney U-test). f, Distribution of Ka values between A and Col-T (cT), A. lyrata (Aly), and sT and sA subgenomes of Asu. g, Distribution of Ka/Ks values as in (f). h, Boxplots of the estimated time (million years ago, MYA) for intact LTR insertions in 25 A. thaliana ecotypes. The red dashed line indicates the median of time in sT (ANOVA test).
Extended Data Fig. 4. Expansion and contraction of gene families in A and T-related genomes.
a, Heatmap showing expansion and contraction of gene families in A subgenome (Aar-related) of Allo738, Col-T (cT), and sA and sT subgenomes of A. suecica. b, Domain enrichment of gene family expansion/contraction (+/-) in A, sA, and their nearest ancestor (Node A in Fig. 2c), and cT, sT, and their nearest ancestor (Node T in Fig. 2c) (Fisher’s exact test).
Extended Data Fig. 5. Comparative genomics of FLC loci in A. suecica and related species.
a, Gene structure of FLC in different species: A. thaliana (Col, AtFLC; Ler_AtFLC), T subgenome of Allo738 (738_AtFLC), A subgenome (Aar) (AaFLC1, AaFLC2, AaFLC3), A. suecica (As_AaFLC1, As_AaFLC2, As_AaFLC3), A. lyrata (AlFLC1 and AlFLC2). b, Phylogenetic tree of FLC genes. c, CG, CHG and CHH methylation and mRNA expression patterns of FLC genes and their vicinity in A (Aar), F1, 733, 738 and A. suecica (Asu). The y axis scales shown above gBrowse tracks indicate mRNA (1–100) and methylation (0–1) levels. Differentially methylated regions are shown in a dashed box. Shown below are diagrams of three FLC loci (arrows indicating transcription direction) and TEs. Different letters in mRNA (TPM) indicate statistical significance of P < 0.05 (ANOVA test, n = 3).
Extended Data Fig. 6. DNA methylation levels in F1, Allo733 (733), Allo738 (738) and A. suecica (Asu) allotetraploids and their related progenitors (Aar and Ath Ler4).
a, b, Average methylation levels with two biological replicates in A (a) and T (b) related genomes or subgenomes in allotetraploids. Two asterisks indicate statistical significance level of P < 0.01 (Mann–Whitney U-test). c, Heatmap of pairwise comparisons between correlation coefficients of methylated cytosines in CG context of Aar, Ath Ler4 (T), and respective subgenomes of F1, 733, 738 and Asu. d, e, CG methylation levels in A1 (d) or T1 (e) related chromosomes of A. arenosa (Aar), A. thaliana (Ath Ler4), F1, Allo733 (733), Allo738 (738), and A. suecica (Asu). f, g, CHG and CHH methylation levels in genic regions of Ath Ler4 (T) and Aar (A) relative to respective subgenomes in allotetraploids (F1, Allo733, Allo738, and Asu).
Extended Data Fig. 7. Association of gene expression with hypo-DMRs between sA subgenome of A. suecica and A. arenosa (Aar, A) and between sT subgenome and A. thaliana Ler4 (Ath, T).
a, Upset diagram of hypo-DMRs in CG, CHG and CHH context between sA and A. Numbers and percentages specific to each context were shown in red. b, Upset diagram of hypo-DMR-overlapping genes in CG, CHG and CHH context between sA and A. An asterisk indicates the fraction of the unique CHG or CHH DMRs (unique number / total number of DMRs) compared to that of their unique overlapping genes (unique number / total number of associated genes) was significantly reduced (P < 0.05, Fisher’s exact test). c, Heatmap of CG, CHG, and CHH DMRs (sT-T or sA-A) in Aar/Ath, F1, 733, 738, and Asu. Boxed regions show maintenance of DMRs from F1 to Allo733, Allo738, and A. suecica. d, Expression ratio (log2[TPM(Asu/Aar)]) of the genes as shown in (b). Colours (from left to right) indicate all genes (pale blue) and the genes flanked (2 kb) with hypo-DMRs in CG only (yellow), CHG only (purple), CHH only (red), all three (dark blue), both CG and CHG (orange), both CHG and CHH (green), and both CHH and CG (pink). One and three asterisks indicate statistical significance levels of P < 0.05 and P < 0.001, respectively (Mann–Whitney U-test). e, f, Number of CHG (e) and CHH (f) differentially methylated regions (DMRs) (sT-T or sA-A) in F1, 733, 738, and Asu, respectively.
Extended Data Fig. 8. Differential expression of methylation pathway genes including AtROS1, AaROS1–1 and AaROS1–2 in allotetraploids.
a, Heatmap of transcript levels (TPM: transcripts per kilobase per million) of methylation pathway genes in A and T subgenomes of Arabidopsis allotetraploids, respectively. From parents (Aar/Ath) to A. suecica (Asu), the upregulated genes were marked orange, while down-regulated genes were marked blue. b, c, gBrowse tracks showing CG (green), CHG (red), and CHH (blue) methylation and mRNA expression (grey) levels in AtROS1 (b) and AaROS1–1 and AaROS1–2 (c) genes and their vicinity in A. thaliana (Ath) and A. arenosa (Aar), F1, 733, 738 and A. suecica (Asu). The regions associated with a TE in the 5′ sequence are shown in a dashed box. The y axis scales shown above the gBrowse tracks indicate mRNA (0–30) and methylation (0–1) levels. Different letters in mRNA (TPM) indicate statistical significance of P < 0.05 (ANOVA test, n = 3).
Extended Data Fig. 9. Gene ontology (GO) enrichment terms for hypo-DMR-associated genes and association of CG methylation with non-additive gene expression in allotetraploids.
a, The first two columns show CG DMR (A-T) levels (difference threshold > 0.5) of homologous genes in A. thaliana (Ler4, T) and A. arenosa (Aar/Ath) and in A. suecica (Asu). The remaining four columns indicate expression levels (TPM) of these genes in Aar/Ath (Ler4 and Aar) and Asu, respectively. b, Correlation between expression fold changes of shared differentially expressed genes (DEGs) between Aar and Ath (Wang et al., 2016) (y axis) and methylation differences (x axis). c, Correlation between expression fold changes of shared DEGs between Allo738 and MPV (Wang et al., 2016) (y axis) and methylation differences between A (red) and T (blue) subgenomes in Allo738 against Aar/Ath (x axis). The red dashed line indicates a methylation difference of 0.2; the percentages indicate the fraction of genes with a methylation difference greater than 0.2 in each quadrant. The genes used in (b) and (c) were the shared DEGs of Aar vs. Ler4 and Allo738 vs. MPV. d, GO term overrepresentation for the genes showing conserved (red), convergent (blue), and other (grey) CG hypo-DMRs between sA of A. suecica and A. arenosa (Aar, A). Dashed line indicates onefold enrichment.
Extended Data Fig. 10. Association of CG methylation with expression of reproduction-related genes in Arabidopsis allotetraploids.
a-d, CG methylation near genic regions of PDS5B (a), SMC6B (b), SMC1 (c), and SMC5 (d) and their mRNA expression patterns in A. arenosa (Aar), F1, Allo733 (733), Allo738 (738), and natural A. suecica (Asu). Black arrows indicate the orientation of genes. PDS5B: One of 5 PO76/PDS5 cohesion cofactor orthologs of Arabidopsis; SMC: STRUCTURAL MAINTENANCE OF CHROMOSOMES. Scales indicate mRNA (0–10 and 0–8) and CG methylation density (0–1) levels. Different letters in mRNA (TPM) indicate statistical significance of P < 0.05 (ANOVA test, n = 3).
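
The caption to Fig. 1f reports significance from a permutation test using 1,000 permutations, but this record does not state the test statistic or give the underlying data. The following is a minimal Python sketch of how such a two-sample test could be run. The choice of the difference in medians as the statistic, the function name permutation_test, the seed and the example values are all assumptions made for illustration and are not taken from the paper.

```python
import random
import statistics


def permutation_test(sample_a, sample_b, n_permutations=1000, seed=0):
    """Two-sided permutation test on the difference in medians.

    Assumption: the statistic (difference in medians) and the fixed seed are
    illustrative choices; the record only states that 1,000 permutations
    were used.
    """
    rng = random.Random(seed)
    observed = abs(statistics.median(sample_a) - statistics.median(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(
            statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        )
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated P value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical LTR insertion times (Ma) for two genomes; not data from the paper.
times_a = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.6]
times_b = [1.8, 2.0, 2.3, 2.5, 2.9, 3.1, 3.4, 3.8]
p_value = permutation_test(times_a, times_b)
print(f"P = {p_value:.4f}")
```

With 1,000 permutations and the add-one correction, the smallest P value this sketch can return is 1/1,001, so it can resolve the P < 0.05 and P < 0.01 levels used in the caption but nothing much finer.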

