Computational finishing of large sequence contigs reveals interspersed nested repeats and gene islands in the rf1-associated region of maize
- PMID: 19675151
- PMCID: PMC2754626
- DOI: 10.1104/pp.109.143370
Abstract
The architecture of grass genomes varies on multiple levels. Large long terminal repeat retrotransposon clusters occupy significant portions of the intergenic regions, and islands of protein-encoding genes are interspersed among the repeat clusters. Hence, advanced assembly techniques are required to obtain completely finished genomes as well as to investigate gene and transposable element distributions. To characterize the organization and distribution of repeat clusters and gene islands across large grass genomes, we present 961- and 594-kb contiguous sequence contigs associated with the rf1 (for restorer of fertility1) locus in the near-centromeric region of maize (Zea mays) chromosome 3. We present two methods for computational finishing of highly repetitive bacterial artificial chromosome clones; both succeeded in closing all sequence gaps caused by transposable element insertions. Sixteen repeat clusters were observed, ranging in length from 23 to 155 kb. These repeat clusters consist almost exclusively of long terminal repeat retrotransposons, whose paleontology of insertion varies throughout the cluster. Gene islands contain from one to four predicted genes, resulting in a gene density of one gene per 16 kb in gene islands and one gene per 111 kb over the entire sequenced region. The two sequence contigs, when compared with the rice (Oryza sativa) and sorghum (Sorghum bicolor) genomes, retain gene colinearity of 50% and 71%, respectively, rising to 70% and 100% for high-confidence gene models. Colinear genes on single gene islands show that while most expansion of the maize genome has occurred in the repeat clusters, gene islands are not immune and have experienced growth in both intragene and intergene locations.
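The density figures in the abstract can be cross-checked with a little arithmetic. The sketch below is a back-of-envelope estimate only: the contig lengths and the two densities are taken from the abstract, while the gene count and the gene-island footprint are inferred from those rounded values and are not numbers reported here. The function name `implied_gene_count` is illustrative.

```python
# Figures quoted in the abstract (kb = kilobases).
CONTIG_LENGTHS_KB = (961, 594)
KB_PER_GENE_OVERALL = 111      # one gene per 111 kb over the whole region
KB_PER_GENE_IN_ISLANDS = 16    # one gene per 16 kb inside gene islands


def implied_gene_count(total_kb: float, kb_per_gene: float) -> int:
    """Estimate a gene count from a total length and a rounded density."""
    return round(total_kb / kb_per_gene)


total_kb = sum(CONTIG_LENGTHS_KB)

# ASSUMPTION: the gene count is inferred from the rounded density,
# not read from the paper.
genes = implied_gene_count(total_kb, KB_PER_GENE_OVERALL)

# ASSUMPTION: the island footprint is inferred as genes x island density.
island_kb = genes * KB_PER_GENE_IN_ISLANDS
island_fraction = island_kb / total_kb

print(f"total sequenced length: {total_kb} kb")
print(f"implied gene count: about {genes}")
print(f"implied gene-island footprint: about {island_kb} kb "
      f"({island_fraction:.0%} of the region)")
```

Under these assumptions, gene islands would cover only a small share of the sequenced region, which is consistent with the abstract's description of genes interspersed among large repeat clusters.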
Similar articles
- Fertility restorer locus Rf1 [corrected] of sorghum (Sorghum bicolor L.) encodes a pentatricopeptide repeat protein not present in the colinear region of rice chromosome 12. Theor Appl Genet. 2005 Oct;111(6):994-1012. doi: 10.1007/s00122-005-2011-y. Epub 2005 Aug 3. PMID: 16078015
- Evolution of DNA sequence nonhomologies among maize inbreds. Plant Cell. 2005 Feb;17(2):343-60. doi: 10.1105/tpc.104.025627. Epub 2005 Jan 19. PMID: 15659640 Free PMC article.
- Orthologous comparisons of the Hd1 region across genera reveal Hd1 gene lability within diploid Oryza species and disruptions to microsynteny in Sorghum. Mol Biol Evol. 2010 Nov;27(11):2487-506. doi: 10.1093/molbev/msq133. Epub 2010 Jun 3. PMID: 20522726
- Maize as a model for the evolution of plant nuclear genomes. Proc Natl Acad Sci U S A. 2000 Jun 20;97(13):7008-15. doi: 10.1073/pnas.97.13.7008. PMID: 10860964 Free PMC article. Review.
- Assembling genomes using short-read sequencing technology. Genome Biol. 2010 Jan 28;11(1):202. doi: 10.1186/gb-2010-11-1-202. PMID: 20128932 Free PMC article. Review.
Cited by
- A single molecule scaffold for the maize genome. PLoS Genet. 2009 Nov;5(11):e1000711. doi: 10.1371/journal.pgen.1000711. Epub 2009 Nov 20. PMID: 19936062 Free PMC article.
- Important biological information uncovered in previously unaligned reads from chromatin immunoprecipitation experiments (ChIP-Seq). Sci Rep. 2015 Mar 2;5:8635. doi: 10.1038/srep08635. PMID: 25727450 Free PMC article.
- Nested insertions and accumulation of indels are negatively correlated with abundance of mutator-like transposable elements in maize and rice. PLoS One. 2014 Jan 27;9(1):e87069. doi: 10.1371/journal.pone.0087069. eCollection 2014. PMID: 24475224 Free PMC article.
- A high-contiguity Brassica nigra genome localizes active centromeres and defines the ancestral Brassica genome. Nat Plants. 2020 Aug;6(8):929-941. doi: 10.1038/s41477-020-0735-y. Epub 2020 Aug 10. PMID: 32782408 Free PMC article.
- Megabase level sequencing reveals contrasted organization and evolution patterns of the wheat gene and transposable element spaces. Plant Cell. 2010 Jun;22(6):1686-701. doi: 10.1105/tpc.110.074187. Epub 2010 Jun 25. PMID: 20581307 Free PMC article.