Science. 2021 Aug 6;373(6555):655-662.
doi: 10.1126/science.abg5289.

De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes

Matthew B Hufford et al.

Abstract

We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The number of pan-genes in these diverse genomes exceeds 103,000, with approximately a third found across all genotypes. The results demonstrate that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres revealed additional variation in major cytological landmarks. We show that combining structural variation with single-nucleotide polymorphisms can improve the power of quantitative mapping studies. We also document variation at the level of DNA methylation and demonstrate that unmethylated regions are enriched for cis-regulatory elements that contribute to phenotypic variation.


Figures

Figure 1.
Pan-genome analysis of the gene space. A) Pan-genes categorized by annotation method and phylostrata. Genes annotated with evidence have mRNA support, whereas ab initio genes are predicted from DNA sequence alone. Genes within progressing phylostrata (species Zea mays (maize), tribe Andropogoneae, family Poaceae, kingdom Viridiplantae) are more conserved. B) Number of pan-genes added with each additional genome assembly. The order in which genomes were added to the pan-genome was bootstrapped 1000 times. Tropical lines are CML52, CML69, CML103, CML228, CML247, CML277, CML322, CML333, Ki3, Ki11, NC350, NC358, and Tzi8; temperate lines are B73, B97, Ky21, M162W, Ms71, Oh43, Oh7B, HP301, P39, and Il14H. C) Proportion of pan-genes in the core, near-core, dispensable, and private fractions of the pan-genome. For B and C, tandem duplicates were considered a single pan-gene, and coordinates were filled in when a gene was not annotated but an alignment with greater than 90% coverage and 90% identity was present within the correct homologous block. D) Number of tissues with expression (RPKM>1) for each gene in each genome, grouped by pan-genome classification. Tissues in this analysis are root, shoot, V11 base, V11 middle, V11 tip, anther, tassel, and ear.
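The accumulation curve in panel B and the fractions in panel C can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy presence/absence data, and the near-core threshold are all assumptions made for illustration (the caption gives no cut-offs), and the sketch skips the tandem-duplicate collapsing and coordinate fill-in described above. The caption's "bootstrapped" ordering is modeled here as random reshuffling of the genome order.

```python
import random
from collections import Counter
from statistics import mean


def accumulation_curve(presence, n_orders=1000, seed=0):
    """Mean pan-gene count after adding 1..N genomes in random order.

    presence: dict mapping genome name -> set of pan-gene IDs (hypothetical input).
    """
    rng = random.Random(seed)
    genomes = sorted(presence)
    sizes = [[] for _ in genomes]
    for _ in range(n_orders):
        order = genomes[:]
        rng.shuffle(order)
        seen = set()
        for k, genome in enumerate(order):
            seen |= presence[genome]
            sizes[k].append(len(seen))
    return [mean(s) for s in sizes]


def classify(presence, near_core_min=None):
    """Label each pan-gene by how many genomes carry it.

    near_core_min is an assumed threshold; the default (N - 2) is a guess,
    not a value stated in the caption.
    """
    n = len(presence)
    if near_core_min is None:
        near_core_min = n - 2
    counts = Counter(gene for genes in presence.values() for gene in genes)
    labels = {}
    for gene, c in counts.items():
        if c == n:
            labels[gene] = "core"
        elif c == 1:
            labels[gene] = "private"
        elif c >= near_core_min:
            labels[gene] = "near-core"
        else:
            labels[gene] = "dispensable"
    return labels


# Toy data: four genomes, four pan-genes.
toy = {
    "A": {"g1", "g2", "g3"},
    "B": {"g1", "g2"},
    "C": {"g1", "g4"},
    "D": {"g1", "g2", "g3"},
}
curve = accumulation_curve(toy, n_orders=200)
labels = classify(toy, near_core_min=3)
```

Averaging over many random orders removes the dependence of the curve on which genome happens to be added first, which is the reason the ordering is resampled.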
Figure 2.
The tempo of fractionation in maize. A) Schematic showing how genes were categorized. 16,195 conservatively chosen orthologs were subdivided into classes representing retained pairs, ancient fractionation, and recent fractionation. B) Unfolded site frequency spectrum (SFS) of segregating exon loss and non-coding SNPs (genic and non-genic) using sorghum to define the ancestral state. C) Heatmap of the number of co-retained exons between any two NAM lines. Lines with mixed ancestry (M37W, Mo18W, Tx303) are excluded. Colors indicate the Z-score (the difference measured in standard deviations between a single pairwise comparison and all others in the row).
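The row-wise Z-score used to color panel C can be sketched as follows. This is one possible reading of the caption, not the authors' code: it assumes the mean and standard deviation are taken over all off-diagonal entries in a row, and that the sample standard deviation is used. The function name and the toy matrix are hypothetical.

```python
from statistics import mean, stdev


def row_zscores(counts):
    """Z-score each off-diagonal entry against the other lines in its row.

    counts: square matrix (list of lists) of co-retained exon counts
    between pairs of lines; diagonal entries are ignored.
    """
    n = len(counts)
    z = [[None] * n for _ in range(n)]
    for i, row in enumerate(counts):
        off_diag = [v for j, v in enumerate(row) if j != i]
        mu = mean(off_diag)
        sd = stdev(off_diag)  # assumption: sample standard deviation
        for j, v in enumerate(row):
            if j != i:
                z[i][j] = (v - mu) / sd if sd > 0 else 0.0
    return z


# Toy symmetric matrix for four lines.
toy_counts = [
    [0, 10, 20, 30],
    [10, 0, 10, 10],
    [20, 10, 0, 40],
    [30, 10, 40, 0],
]
z = row_zscores(toy_counts)
```

Because each row is scaled separately, the resulting matrix is generally not symmetric even when the input counts are.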
Figure 3.
Structural variation in the NAM founders. A) Pairwise alignments among Ki11, B73, and Il14H on chromosome 8. Grey links represent syntenic aligned regions; gaps of unknown size (scaffold gaps) are marked by dashed lines. B) Large (>100 kbp) structural variants (SVs), centromeres, and knobs across the NAM lines versus the B73 reference. The subset of SVs larger than 1 Mbp was manually curated, and only those containing genes are represented. Features 1-5 highlight major SVs: 1) Multiple centromere movement events; 2) A major inversion previously hypothesized based on suppressed recombination; 3) A large deletion in the Ms71 inbred; 4) Knob polymorphism; 5) Reciprocal translocation between chromosomes 9 and 10 in the Oh7B inbred (both segments placed in their standard positions for display).
Figure 4.
UMR variation across the NAM founders. A) Annotation of the Miniature seed1 gene in the Mo17W inbred. An image from the MaizeGDB browser shows gene, TE, and UMR tracks. TE tracks are color-coded by superfamily: green/grey = LTR, red = TIR, blue = LINE. The grey vertical lines show 2.5 kbp intervals. B) Annotation and underlying methylation data for Miniature seed1 in the B73 inbred. The insertion of a Gypsy element moved part of the proximal UMR to a position 14 kbp upstream from the transcription start site (TSS). Methylation tracks indicate base-pair level methylation values from 0 to 100%. Asterisks indicate gaps in coverage, which are visible in separate tracks (Fig. S28). C) Relationship between methylation and gene expression. UMRs were mapped to B73 to identify UMRs that overlap with TSS. The Y axis indicates the ratio of transcripts per million (TPM, compared to B73) when the region is methylated (red) or unmethylated (teal).
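The comparison in panel C can be illustrated with a small Python sketch that groups expression ratios by methylation state. It is only an illustration under stated assumptions: the record layout, the function name, the pseudocount (added to avoid division by zero), and the use of the median as a summary are all hypothetical and are not taken from the caption.

```python
from collections import defaultdict
from statistics import median


def tpm_ratios_by_state(records, pseudocount=1.0):
    """Median TPM ratio (line versus B73) per methylation state.

    records: iterable of (state, tpm_line, tpm_b73) tuples, where state is
    "methylated" or "unmethylated" for the TSS-overlapping region.
    """
    groups = defaultdict(list)
    for state, tpm_line, tpm_b73 in records:
        ratio = (tpm_line + pseudocount) / (tpm_b73 + pseudocount)
        groups[state].append(ratio)
    return {state: median(ratios) for state, ratios in groups.items()}


# Toy records: two methylated regions and one unmethylated region.
toy_records = [
    ("methylated", 0.0, 9.0),
    ("methylated", 1.0, 3.0),
    ("unmethylated", 9.0, 9.0),
]
summary = tpm_ratios_by_state(toy_records)
```

In this toy input the methylated group has a lower ratio than the unmethylated group; the numbers are invented and carry no biological meaning.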

