[Preprint]. 2025 Jan 24:2025.01.22.633974.
doi: 10.1101/2025.01.22.633974.

Extensive genome evolution distinguishes maize within a stable tribe of grasses

Michelle C Stitzer et al. bioRxiv.

Abstract

Over the last 20 million years, the Andropogoneae tribe of grasses has evolved to dominate 17% of global land area. Domestication of these grasses in the last 10,000 years has yielded our most productive crops, including maize, sugarcane, and sorghum. The majority of Andropogoneae species, including maize, show a history of polyploidy - a condition that, while offering the evolutionary advantage of multiple gene copies, poses challenges to basic cellular processes, gene expression, and epigenetic regulation. Genomic studies of polyploidy have been limited by sparse sampling of taxa in groups with multiple polyploidy events. Here, we present 33 genome assemblies from 27 species, including chromosome-scale assemblies of maize relatives Zea and Tripsacum. In maize, the after-effects of polyploidy have been widely studied, showing reduced chromosome number, biased fractionation of duplicate genes, and transposable element (TE) expansions. While we observe these patterns within the genus Zea, 12 other polyploidy events deviate significantly. Those tetraploids and hexaploids retain elevated chromosome number, maintain nearly complete complements of duplicate genes, and have only stochastic TE amplifications. These genomes reveal variable outcomes of polyploidy, challenging simple predictions and providing a foundation for understanding its evolutionary implications in an ecologically and economically important clade.

Figures

Figure 1: Assemblies of Andropogoneae from throughout the phylogenetic and geographic range have varying divergence, ploidy, and genome size.
A) Species phylogeny built from 7,725 syntenic genes, including multiple copies in polyploids, using ASTRAL-PRO3. Pie charts at nodes show the quartet support for the main topology in blue, indicated by the tree, the first alternative topology in teal, and the second alternative in yellow. Polyploidy events are shown as diamonds, with colors corresponding to ploidy. Throughout all figures, diploids are shown in yellow, tetraploids in purple, paleotetraploids in blue, and hexaploids in red. The x-axis position of each diamond reflects the timing of divergence of the parental genomes, and so may predate the estimated species divergence. The WGD shared by all Tripsacinae is shown with one point at the median parental divergence of all taxa except obligate annual subspecies in Zea mays. Subtribes are shown as gray boxes, with names listed when we sampled more than one representative. Single representative subtribes are: 1. Germainiinae, 2. Sorghinae, 3. Ischaeminae, 4. Apludinae, 5 and 6. Incertae sedis pending taxonomic revision, and 7. Chrysopogoninae. B) Haploid size in gigabases of assembly (circle) and TEs and tandem repeats (TR) (triangle), colored by ploidy as in A, with diploids in yellow. Scaffolded assembly size, including N's, is shown with a gray circle. Individuals with * after the sample name represent haploid assemblies. Across all assemblies, average assembly size is 1.9 Gb, and average repeat size is 1.5 Gb. C) Map showing collection locations for our 33 samples as points colored by ploidy, as in A. The range of all Andropogoneae species is shown in green, constructed from wild occurrences in AuBuchon-Elder et al. (2023). Digitized collections are limited in the Indian subcontinent, although Andropogoneae are abundant there (Welker et al., 2020). D) Heterozygosity between alleles versus synonymous substitutions between homeologs for each assembly, with circles designating allopolyploids, triangles autopolyploids, and squares diploids. Diploids, which do not have homeologs, are assigned a Ks value of 0. The gray line indicates a 1:1 relationship between heterozygosity and synonymous substitution rate. Chrysopogon serrulatus is excluded from this plot, as it showed elevated nucleotide substitutions arising from nanopore sequencing. As it can be difficult to associate an individual plant with a point, figures with each assembly highlighted are available at https://mcstitzer.github.io/panand_assemblies/.
Figure 2: Polyploids in Andropogoneae are abundant.
A) Haploid chromosome number of each sampled individual, versus ploidy. Diploid median and multipliers to tetraploid and hexaploid expectations are shown with dotted lines. For A-C, statistical comparisons of each polyploid group to the diploids were performed using a Wilcoxon rank-sum test. P-values for each comparison are shown at the top of each polyploid group, with *** p<0.001, ** p<0.01, * p<0.05, and ns for nonsignificant. B) Number of syntenic genes found in each individual, versus ploidy. Diploid median and multipliers to tetraploid and hexaploid expectations are shown with dotted lines. C) Megabases of repeats in each sampled individual, versus ploidy. Diploid median and multipliers up to 7x the diploid median are shown with dotted lines. D) Relatedness of duplicate copies in each polyploid across 7,725 gene trees. Each bar matches the labels in E. Purple indicates gene trees where all tips of the given species are monophyletic, and pink indicates gene trees where the tips are found in paraphyletic or polyphyletic (non-monophyletic) arrangements. As the Tripsacinae paleotetraploidy is shared by multiple species, we downsampled gene trees so that only the focal Tripsacinae sample was present in the tree. E) Median synonymous substitutions (Ks) between syntenic homologs in polyploids by alignment block, colored by ploidy.
Figure 3: Chromosome stability is higher in polyploids than diploids in Andropogoneae.
A-D) Genespace riparian plots showing synteny of four genomes of each ploidy level, with Paspalum on the bottom. Syntenic blocks of each Paspalum chromosome are shown as ribbons for A) Diploids, B) Tetraploids, C) Hexaploids, and D) Paleotetraploids. E) Dotplot of syntenic anchor genes in blocks of 20 or more genes from Paspalum chromosomes versus diploid A. virginicum chromosomes. The A. virginicum assembly is haploid, so only homologous regions are present for each P. vaginatum chromosome. The inset in the top left shows a riparian plot comparing chromosomes, and the inset in the bottom right shows the karyotype of A. virginicum with 2n=2x=20, scale bar 10 μm. F) Dotplot of syntenic anchor genes in blocks of 20 or more genes from Paspalum chromosomes versus hexaploid B. laguroides scaffolds. The B. laguroides assembly has all six alleles assembled, so each homologous region can be present six times for each P. vaginatum chromosome. The inset in the top left shows a riparian plot comparing assemblies, and the inset in the bottom right shows the karyotype of B. laguroides with 2n=6x=60, scale bar 10 μm. G) Chromosomal rearrangements show a negative relationship to haploid chromosome number. Each assembly is represented by a point, with diploids in yellow, tetraploids in purple, paleotetraploids in blue, and hexaploids in red, as in Figure 1. H) Rearrangements are not significantly related to time since divergence of polyploid parents. For G and H, the median value within each of the genera Zea and Tripsacum was used for calculating the relationship, because this polyploidy event is sampled multiple times.
Figure 4: Genes are stagnant while noncoding regulatory sequence turns over rapidly.
A-D) Retention of syntenic genes in 100-gene windows along Paspalum chromosome 1 in A) diploids with one subgenome, B) tetraploids with two subgenomes, C) hexaploids with three subgenomes, and D) paleotetraploids with two subgenomes. The density plot to the right of each genome shows the genome-wide distribution of these values. E) Relationship between gene retention and time since polyploidy, with overall loss (solid line), but a positive relationship within tetraploids and hexaploids (dashed line). Regression for the solid line uses the median value across the more densely sampled genera Zea and Tripsacum. F) Enrichment of Z. mays genomic features relative to genomic abundance in our non-coding sequence most conserved across Andropogoneae. Absolute fold enrichment is displayed above each point. G) Turnover of predicted transcription factor binding sites (TFBS) and genes between Z. mays and other Andropogoneae species by genomic divergence. A single representative subgenome was used for TFBS turnover calculations in polyploid species. Loess-smoothed local regression lines with 95% confidence intervals are shown. Genomic divergence was calculated using alignments to all fourfold degenerate sites in Z. mays.
Figure 5: Transposable elements react stochastically to polyploidy.
A) Proportion of the genome in repeat sequence versus the median distance between syntenic genes. Points are colored by ploidy, with diploids in yellow, tetraploids in purple, paleotetraploids in blue, and hexaploids in red. The dashed line shows the regression excluding paleotetraploids, with a strong positive correlation (r = 0.71, p = 0.00016). B) Repeat proportion related to ploidy level. Statistical comparisons of each polyploid group to the diploids were performed using a Wilcoxon rank-sum test. P-values for each comparison are shown at the top of each polyploid group, with **** p<0.0001 and ns for nonsignificant. The diploid median is shown as a dotted horizontal line. C) Divergence between parental subgenomes versus repeat base pairs in each assembly. The solid line includes the median value of Zea and Tripsacum, which are positively correlated (r = 0.51, p = 0.008). D) Proportion of repeats in each genome belonging to different TE superfamilies. TR is Tandem Repeats of all length classes, red colors show LTR retrotransposons (RLX-Unknown; RLC-Ty1/Copia; RLG-Ty3), and blue colors show DNA transposons (DTT-Tc1/Mariner; DTM-Mutator; DTH-pIF/Harbinger; DTC-CACTA; DTA-hAT; DHH-Helitron). E) Pielou's evenness metric versus number of families in each assembly. Both are calculated using only families with at least 10 copies in the genome. The evenness metric ranges from 0 (one family contributing all copies) to 1 (all families equally sized). Points are scaled by the amount of repeat base pairs in the genome, and colored by ploidy. F) Divergence between parental subgenomes and median timing of amplification of LTR retrotransposons. Point size is scaled by the number of structurally intact LTR retrotransposons identified in the genome, and points are colored by ploidy. G) Mean TE base pairs in 100 bp windows away from the transcriptional start site (TSS), left, and transcriptional termination site (TTS), right, of all Helixer genes, colored by ploidy.
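The evenness metric in panel E of Figure 5 can be illustrated with a minimal sketch. It assumes the standard definition of Pielou's evenness, J = H / ln(S), where H is the Shannon entropy of family copy-number proportions and S is the number of families; the preprint's own implementation is not shown here. The function name, the input format (a list of copy counts per TE family), and the choice to return 0 when fewer than two families pass the filter are illustrative assumptions.

```python
import math


def pielou_evenness(family_copy_counts, min_copies=10):
    """Pielou's evenness J = H / ln(S) over TE family copy counts.

    family_copy_counts: copies per TE family (hypothetical input format).
    min_copies: families below this count are dropped, mirroring the
                "at least 10 copies" filter described in the caption.
    """
    counts = [c for c in family_copy_counts if c >= min_copies]
    n_families = len(counts)
    if n_families < 2:
        # J is undefined for a single family (ln(1) = 0); returning 0 is an
        # assumption matching the caption's "one family contributing all copies".
        return 0.0
    total = sum(counts)
    # Shannon entropy of the family proportions (natural log).
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    return shannon / math.log(n_families)


if __name__ == "__main__":
    # Hypothetical copy counts, not data from the preprint.
    print(pielou_evenness([100, 100, 100, 100]))  # equally sized families
    print(pielou_evenness([5000, 40, 25, 12]))    # one dominant family
```

Under this definition, a genome dominated by a single amplified family scores near 0, while one whose families are similar in size scores near 1.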

References

    1. Alexa A., & Rahnenführer J. (2009). Gene set enrichment analysis with topGO. Bioconductor Improv, 27, 1–26.
    2. Alix K., Gérard P. R., Schwarzacher T., & Heslop-Harrison J. S. (Pat). (2017). Polyploidy and interspecific hybridization: Partners for adaptation, speciation and evolution in plants. Annals of Botany, 120(2), 183–194. doi: 10.1093/aob/mcx079
    3. Armstrong J., Hickey G., Diekhans M., Fiddes I. T., Novak A. M., Deran A., Fang Q., Xie D., Feng S., Stiller J., Genereux D., Johnson J., Marinescu V. D., Alföldi J., Harris R. S., Lindblad-Toh K., Haussler D., Karlsson E., Jarvis E. D., … Paten B. (2020). Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature, 587(7833), 246–251. doi: 10.1038/s41586-020-2871-y
    4. AuBuchon-Elder T., Minx P., Bookout B., & Kellogg E. A. (2023). Plant conservation assessment at scale: Rapid triage of extinction risks. Plants, People, Planet, 5(3), 386–397. doi: 10.1002/ppp3.10355
    5. Baduel P., Bray S., Vallejo-Marin M., Kolář F., & Yant L. (2018). The “Polyploid Hop”: Shifting Challenges and Opportunities Over the Evolutionary Lifespan of Genome Duplications. Frontiers in Ecology and Evolution, 6. doi: 10.3389/fevo.2018.00117
