PLoS Genet. 2009 Nov;5(11):e1000740.
doi: 10.1371/journal.pgen.1000740. Epub 2009 Nov 20.

Sequencing, mapping, and analysis of 27,455 maize full-length cDNAs

Carol Soderlund et al. PLoS Genet. 2009 Nov.

Abstract

Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5' and 3' UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Homolog analysis with rice, sorghum, Arabidopsis, and poplar.
The 27k FLcDNAs were searched against annotated protein gene models of each species, and the overlapping matches between species are displayed in the Venn diagram. Two overlaps (25 overlapping hits between rice and poplar, and two overlapping hits between Arabidopsis and sorghum) are not listed.
Figure 2. FLcDNA density heat-map displayed on the maize chromosomes.
The 24,354 FLcDNAs that mapped to a single locus were counted in 1 Mb bins (number of FLcDNAs/Mb), color-coded, and plotted on the maize chromosomes. Yellow indicates average density (∼12 cDNAs/Mb), red is higher than average, and blue is lower. The brown bars next to each chromosome mark regions where FLcDNA density exceeds the average plus 2 standard deviations (32 FLcDNAs/Mb).
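The binning and thresholding described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0-based coordinates, the toy positions, and the use of a population standard deviation are all assumptions made for illustration, since the caption states only the 1 Mb bin size and the mean + 2 SD cutoff.

```python
from statistics import mean, pstdev

BIN_SIZE = 1_000_000  # 1 Mb bins, as stated in the caption


def bin_counts(positions, chrom_length, bin_size=BIN_SIZE):
    """Count mapped positions per fixed-size bin on one chromosome.

    Assumes 0-based positions smaller than chrom_length (hypothetical input).
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        counts[pos // bin_size] += 1
    return counts


def dense_bins(counts, n_sd=2.0):
    """Return indices of bins whose count exceeds mean + n_sd * SD.

    The population SD is an assumption; the caption does not say which
    estimator was used.
    """
    threshold = mean(counts) + n_sd * pstdev(counts)
    return [i for i, c in enumerate(counts) if c > threshold]


# Toy data: one cDNA in each of the first nine 1 Mb bins, twenty in the tenth.
toy_positions = [i * BIN_SIZE + 10 for i in range(9)] + [9 * BIN_SIZE + k for k in range(20)]
toy_counts = bin_counts(toy_positions, chrom_length=10 * BIN_SIZE)
toy_dense = dense_bins(toy_counts)
```

With real data, the per-bin counts would feed the color scale and the flagged bin indices would correspond to the brown bars.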
Figure 3. Detection of homeologous genes in the maize genome.
Potential homeologous genes for the 24,354 single locus-mapped FLcDNAs were computed by aligning them to the maize genome with relaxed mapping parameters. Approximately 44% of the SL-FLcDNAs had homeologous regions (ID, identity; AL, alignment length; SL, single locus; ML, multiple loci (>4); 2–4L, 2–4 mapped loci).
Figure 4. A contig in the 69k loose assembly with multiple SNPs and GPs.
(A) Red vertical lines indicate a mismatch with the consensus sequence, and green vertical lines indicate a gap relative to the consensus sequence. The clone prefix indicates the project (BC – Yu, W – Wang, M – Messing, F – Feldman). (B) Base view of the 5′ ends of 7 FLcDNAs with alternative start sites relative to the consensus sequence. This is a close-up of bases 97–164 from the alignment shown in (A). Red bases do not agree with the consensus.
