Plant Physiol. 2005 Dec;139(4):1612-24.
doi: 10.1104/pp.105.068718.

Structure and architecture of the maize genome

Georg Haberer et al. Plant Physiol. 2005 Dec.

Abstract

Maize (Zea mays or corn) plays many varied and important roles in society. It is not only an important experimental model plant, but also a major livestock feed crop and a significant source of industrial products such as sweeteners and ethanol. In this study we report the systematic analysis of contiguous sequences of the maize genome. We selected 100 random regions averaging 144 kb in size, representing about 0.6% of the genome, and generated a high-quality dataset for sequence analysis. This sampling contains 330 annotated genes, 91% of which are supported by expressed sequence tag data from maize and other cereal species. Genes averaged 4 kb in size with five exons, although the largest was over 59 kb with 31 exons. Gene density varied over a wide range from 0.5 to 10.7 genes per 100 kb and genes did not appear to cluster significantly. The total repetitive element content we observed (66%) was slightly higher than previous whole-genome estimates (58%-63%) and consisted almost exclusively of retroelements. The vast majority of genes can be aligned to at least one sequence read derived from gene-enrichment procedures, but only about 30% are fully covered. Our results indicate that much of the increase in genome size of maize relative to rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana) is attributable to an increase in number of both repetitive elements and genes.
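The sampling figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check using the rounded numbers from the abstract (100 regions, 144 kb average, about 0.6% of the genome, 330 genes); the variable names are illustrative, and the derived values are not figures reported by the authors.

```python
# Rounded values taken from the abstract.
N_REGIONS = 100          # randomly selected regions
MEAN_REGION_KB = 144     # average region size in kb
GENOME_FRACTION = 0.006  # "about 0.6% of the genome"
N_GENES = 330            # annotated genes in the sample

# Total sampled sequence in kb.
sampled_kb = N_REGIONS * MEAN_REGION_KB

# Mean gene density over the whole sample, in genes per 100 kb.
mean_density_per_100kb = N_GENES / sampled_kb * 100

# Genome size implied by the sampled fraction, in Mb.
implied_genome_mb = sampled_kb / GENOME_FRACTION / 1000

print(f"sampled sequence: {sampled_kb / 1000:.1f} Mb")
print(f"mean gene density: {mean_density_per_100kb:.2f} genes per 100 kb")
print(f"implied genome size: {implied_genome_mb:.0f} Mb")
```

Under these rounded inputs the mean density comes out near 2.3 genes per 100 kb, which sits inside the reported per-region range of 0.5 to 10.7 genes per 100 kb.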

Figures

Figure 1.
Gene characteristics in the 100 random regions. Graphs have been plotted to show the number of exons per gene (A), the number of genes per BAC clone (B), and the gene density expressed as number of genes per 100 kb (C).
Figure 2.
Comparison of genes to species-specific EST collections. Proteins derived from the gene models were compared to the EST assemblies using TBLASTN. Homologous sequences were binned into four classes: gene models with highly significant EST matches (E value less than 1e−30), with significant homologies (E value between 1e−30 and 1e−20), with weak homologies (E value between 1e−20 and 1e−10), and those exhibiting no or only very weak homologies (E value higher than 1e−10).
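The four E-value classes in the Figure 2 legend amount to a simple threshold lookup. The following is a minimal sketch of that binning, not the authors' code: the function name is hypothetical, and because the legend does not say which class a value exactly on a boundary belongs to, the handling of boundary values here is an assumption.

```python
def classify_est_match(e_value: float) -> str:
    """Assign a TBLASTN E value to one of the four classes in the legend.

    Assumption: a value exactly on a threshold falls into the class
    below that threshold (the legend leaves this unspecified).
    """
    if e_value < 1e-30:
        return "highly significant"
    if e_value <= 1e-20:
        return "significant"
    if e_value <= 1e-10:
        return "weak"
    return "none or very weak"


# Example: tally classes for a few made-up E values.
example_e_values = [1e-50, 3e-25, 2e-15, 0.01]
counts = {}
for e in example_e_values:
    label = classify_est_match(e)
    counts[label] = counts.get(label, 0) + 1
print(counts)
```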
Figure 3.
Graphic representation of a sample of annotated BAC clones. Ten out of 100 annotated BAC clones are arranged as bars depicting genes (blue) and regions containing repeat sequences (red). A straight gray line represents intergenic regions with no predicted gene models. To determine the coverage of our annotations by the collection of methyl- and C0t-filtered sequence reads, we compared the BAC sequences against the respective collections obtained from TIGR (http://www.tigr.org/tdb/tgi/maize/). All filtered sequence reads were mapped to the 100 BAC sequences by BLASTN sequence comparison and subsequent quality parsing. To anchor a clone to a genomic location, a minimal sequence identity of 98% over the complete alignment length and an alignment length equal to or greater than 90% of the clone length were required. Sequence matches from methyl-filtered and C0t-filtered sequence reads are depicted in dark green and light green, respectively. Specific features are highlighted with consecutive numbers: (1) examples of low gene coverage by filtered sequences, (2) example of tandem gene copies representing highly similar hydrolases for which GSS tags could be unequivocally mapped, and (3) nonrepeat intergenic region well covered by filtered sequences.
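The anchoring rule stated in this legend (at least 98% identity over the alignment, and an alignment spanning at least 90% of the clone length) can be written as a small predicate. This is a sketch under stated assumptions rather than the authors' quality-parsing step: the function and parameter names are hypothetical, and the inputs are assumed to have been read from BLASTN output beforehand.

```python
MIN_IDENTITY_PCT = 98.0   # minimal identity over the alignment, from the legend
MIN_LENGTH_PCT = 90       # minimal alignment length as % of clone length


def is_anchored(identity_pct: float, alignment_length: int, clone_length: int) -> bool:
    """Return True if a filtered read meets both anchoring thresholds."""
    if clone_length <= 0:
        raise ValueError("clone_length must be positive")
    long_enough = alignment_length * 100 >= MIN_LENGTH_PCT * clone_length
    return identity_pct >= MIN_IDENTITY_PCT and long_enough


# Made-up hits: (identity %, alignment length, clone length).
hits = [(99.2, 950, 1000), (97.5, 1000, 1000), (99.9, 850, 1000)]
anchored = [h for h in hits if is_anchored(*h)]
print(anchored)
```

The length test uses integer arithmetic so that a read aligning over exactly 90% of its length is accepted without floating-point rounding concerns.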
Figure 4.
Three examples of genes containing repetitive sequences within their introns. CG:temp172:AC145728.7 represents an ATPase II-like protein, CG:temp394:AC147505.5 an unknown protein containing a conserved PER1 domain, and CG:temp390:AC147505.5 a protein containing two cyclin K domains. Exons are shown as striped bars, introns as black lines, and repetitive sequences as triangles. DNA transposons are represented by black and retroelements by gray triangles. CG:temp172:AC145728.7 contains a retroelement, CG:temp394:AC147505.5 three DNA transposons (tourist-, Castaway-, and MITE-adh-like elements) and one retroelement, and CG:temp390:AC147505.5 two copies of Ty/copia elements and SINEs, respectively, within their introns.
Figure 5.
Coverage of exonic, intronic, and genic sequences by methyl- and C0t-filtered sequence reads. Coverage was determined as described in Figure 3, and results were sorted into bins of size 10% of fractional coverage. Fractional coverage for exons, introns, and complete genes by methyl-filtered, C0t-filtered, and combined sequences are shown. A to C depict the values obtained for exonic, intronic, and genic coverage. Bars in medium blue show values obtained for methyl-filtered sequence reads, bars in light blue values for C0t-filtered clones, and dark blue bars depict cumulative values.
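The fractional-coverage binning described in this legend can be illustrated as follows. This is a hedged sketch, not the procedure used in the paper: it assumes half-open coordinates `[start, end)`, merges overlapping read matches before counting covered bases, and places 100% coverage in the top bin. All function names are hypothetical.

```python
def covered_bases(feature, hits):
    """Count bases of `feature` covered by at least one hit.

    `feature` and each hit are (start, end) pairs, half-open (assumed).
    """
    f_start, f_end = feature
    # Clip hits to the feature and drop those that do not overlap it.
    clipped = sorted(
        (max(s, f_start), min(e, f_end))
        for s, e in hits
        if min(e, f_end) > max(s, f_start)
    )
    total = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Close the previous merged interval and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent hit: extend the current interval.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def coverage_bin(feature, hits):
    """Return the 10% bin index (0 to 9) for the feature's fractional coverage."""
    length = feature[1] - feature[0]
    if length <= 0:
        raise ValueError("feature must have positive length")
    # Integer arithmetic avoids float rounding at bin edges; 100% goes to bin 9.
    return min(covered_bases(feature, hits) * 10 // length, 9)


# Made-up exon of 100 bases with three read matches.
exon = (100, 200)
reads = [(90, 130), (120, 150), (180, 250)]
print(covered_bases(exon, reads), coverage_bin(exon, reads))
```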
