Genome Biol. 2010;11(2):R12.
doi: 10.1186/gb-2010-11-2-r12. Epub 2010 Feb 3.

Genomic and small RNA sequencing of Miscanthus x giganteus shows the utility of sorghum as a reference genome sequence for Andropogoneae grasses

Kankshita Swaminathan et al. Genome Biol. 2010.

Abstract

Background: Miscanthus x giganteus (Mxg) is a perennial grass that produces superior biomass yields in temperate environments. The essentially uncharacterized triploid genome (3n = 57, x = 19) of Mxg is likely critical for the rapid growth of this vegetatively propagated interspecific hybrid.

Results: A survey of the complex Mxg genome was conducted using 454 pyrosequencing of genomic DNA and Illumina sequencing-by-synthesis of small RNA. We found that the coding fraction of the Mxg genome has a high level of sequence identity to that of other grasses. Highly repetitive sequences representing the great majority of the Mxg genome were predicted using non-cognate assembly for de novo repeat detection. Twelve abundant families of repeat were observed, with those related to either transposons or centromeric repeats likely to comprise over 95% of the genome. Comparisons of abundant repeat sequences to a small RNA survey of three Mxg organs (leaf, rhizome, inflorescence) revealed that the majority of observed 24-nucleotide small RNAs are derived from these repetitive sequences. We show that high-copy-number repeats match more of the small RNA, even when the amount of the repeat sequence in the genome is accounted for.

Conclusions: We show that major repeats are present within the triploid Mxg genome and are actively producing small RNAs. We also confirm the hypothesized origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome is likely to be of utility in the assembly of a gene-space sequence of Mxg.

Figures

Figure 1
Similarity of the Miscanthus × giganteus (Mxg) and other monocotyledon genomes. A sequence survey of Mxg was compared to the sorghum whole-genome sequence (red line), the rice whole-genome sequence (blue line) and the maize whole-genome sequence (green line). In addition to the whole-genome sequences, the survey was also compared, using nucleotide BLAST, to the predicted sorghum coding regions (CDS) unfiltered (orange line) and to the sorghum coding regions with known transposon-related sequences removed (yellow line). In all cases, the percentage nucleotide identity of the match (x-axis) is plotted against the percentage of the total reads from the survey with a given percentage identity to the relevant dataset (y-axis). No matches were observed with nucleotide identity below 75% at the E-value cutoff used (10^-10).
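The curves described in this caption amount to a histogram of best-hit identities, normalized by the total number of survey reads. The sketch below is an illustration only, not the authors' pipeline: the function name `identity_profile`, the input layout (one best-hit percentage identity per read) and the whole-percent binning are all assumptions.

```python
from collections import Counter


def identity_profile(best_hits, total_reads, min_identity=75.0):
    """Return {identity bin (%): percentage of all survey reads}.

    best_hits maps a read ID to the percentage identity of its best
    BLAST hit (hypothetical input layout). Reads without a hit are
    simply absent, so they count only toward total_reads.
    """
    # Bin identities to whole percentages (an assumed bin width).
    counts = Counter(
        int(identity)
        for identity in best_hits.values()
        if identity >= min_identity
    )
    return {
        bin_start: 100.0 * n / total_reads
        for bin_start, n in sorted(counts.items())
    }


# Toy data: 10 survey reads, 4 of which have a hit.
hits = {"r1": 98.4, "r2": 91.0, "r3": 98.9, "r4": 80.2}
profile = identity_profile(hits, total_reads=10)
print(profile)  # {80: 10.0, 91: 10.0, 98: 20.0}
```

Dividing by all survey reads, rather than by reads with a hit, matches the y-axis as the caption defines it ("percentage of the total reads from the survey").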
Figure 2
Classification of repeats detected in Miscanthus × giganteus (Mxg) and sorghum by sequence comparison to the Plant Repeat Database. Sequence surveys of Mxg and sorghum were matched to the Plant Repeat Database by nucleotide BLAST search. The proportion of repeats in each class for these two genomes was estimated by comparing the percentage of reads matching repeats of different classes in the database. (a) Proportion of repeats from surveys of the two species matching general classes of plant repetitive sequence. In both Mxg and sorghum, retrotransposons are the predominant class of repeats. Transposons are class II (DNA) transposons according to the designations in the Plant Repeat Database. (b) Further classification into repeat subfamilies, showing differing levels of miniature inverted repeat transposable element (MITE) and transposable element families in the two species. LINE, long interspersed nuclear element; SINE, short interspersed nuclear element.
Figure 3
Estimated copy number, in sorghum and Miscanthus × giganteus (Mxg), of sequences present in three sorghum genomic regions. Shown are three completed sorghum BAC sequences, one centromeric and two euchromatic. The sorghum copy number was estimated by matching to a dataset of whole-genome sorghum shotgun sequences (red), and the Mxg copy number by comparison to the 454 survey reads (blue), using a BLASTZ alignment within a 1,000-bp sliding window. The estimated genomic copy number based on the number of reads matching each window (y-axis) is plotted against the position of the window on the BAC (x-axis). The nucleotide identity cutoff for this analysis was 90%. The regions of greatest copy number on BACs AC169372 and AC169376 predominantly match miniature inverted repeat transposable elements (MITEs), transposons and retrotransposons, which are significantly more abundant in sorghum, while AC169373 contains highly abundant centromeric repeats, for which the Mxg and sorghum copy numbers agree closely.
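The caption does not say how read matches per window were converted to a copy number. One common approach, assumed here purely for illustration, is to divide the observed read depth in each window by the depth expected for a single-copy sequence, which equals the survey's genome coverage. The function name `window_copy_number`, the interval representation and the `coverage` parameter are hypothetical.

```python
def window_copy_number(alignments, bac_length, coverage, window=1000, step=1000):
    """Estimate copy number in sliding windows along a reference BAC.

    alignments: list of (start, end) half-open intervals on the BAC,
        one per aligned read segment passing the identity cutoff.
    coverage: survey bases divided by genome size, i.e. the depth
        expected for a single-copy sequence (an assumed normalization).
    Returns a list of (window_start, estimated_copy_number).
    """
    estimates = []
    last_start = max(bac_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        # Total aligned bases that fall inside this window.
        aligned = sum(
            max(0, min(a_end, end) - max(a_start, start))
            for a_start, a_end in alignments
        )
        depth = aligned / window
        estimates.append((start, depth / coverage))
    return estimates


# Toy example: a 2,000-bp BAC, survey coverage of 0.5x, windows every 500 bp.
reads = [(0, 500), (250, 750), (1000, 2000)]
estimates = window_copy_number(reads, bac_length=2000, coverage=0.5, step=500)
print(estimates)  # [(0, 2.0), (500, 1.5), (1000, 2.0)]
```

With a low-coverage survey the per-window estimate is noisy for single-copy regions, so a plot of this kind is most informative for the high-copy peaks the caption highlights.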
Figure 4
Much of the small RNA transcriptome of Miscanthus × giganteus (Mxg) matches high-copy-number genomic repeats. (a) Number of Mxg repeats, or gene space sequence reads, matching a small RNA (sRNA), as determined by matching repeat sequences to sRNA signatures produced by sequencing sRNA from three Mxg tissues. Repeats are annotated by broad category where known. Unclassified repeats match a sequence in the database without an assigned category; unannotated repeats do not have a database match. Gene space reads are genome survey reads that match sorghum filtered coding sequences (Figure 1). (b) Percentages of small RNA produced by different repeat classes. Normalized abundance of small RNA signatures was calculated in transcripts per quarter million reads (TPQ). In addition to the data shown, telomeres and telomere-associated repeats together produced 0.09% of the total amount of sRNA (a percentage too small to display effectively in the chart).
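The TPQ normalization mentioned in this caption can be read as rescaling each library to a quarter of a million reads so that tissues sequenced to different depths are comparable. The sketch below assumes that reading; the function name `to_tpq` and the example counts are hypothetical.

```python
def to_tpq(raw_counts, library_size, scale=250_000):
    """Rescale raw sRNA signature counts to transcripts per quarter million.

    raw_counts: {signature: number of reads observed in one library}.
    library_size: total number of reads sequenced in that library.
    scale: 250,000, the assumed meaning of "quarter million".
    """
    return {
        signature: count * scale / library_size
        for signature, count in raw_counts.items()
    }


# Toy library of one million reads.
leaf_counts = {"sigA": 40, "sigB": 4}
leaf_tpq = to_tpq(leaf_counts, library_size=1_000_000)
print(leaf_tpq)  # {'sigA': 10.0, 'sigB': 1.0}
```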
Figure 5
Correlation between repeat copy number and amount of small RNA per kilobase of matching repeat sequences. Repeats were binned according to their estimated copy number in the Mxg genome and then divided into categories as in Figure 4a. The number of small RNA signatures matching the repeat sequence in each category and copy number class divided by the estimated total genomic size of the repeat class in kilobases is plotted on the y-axis. MITE, miniature inverted repeat transposable element.
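The y-axis quantity described here is a ratio: sRNA signatures matching a group of repeats, divided by the total genomic size of that group in kilobases. A minimal sketch follows, assuming genomic size is copy number times repeat length and using made-up copy-number bin edges; the record layout and the function name `srna_density` are also hypothetical.

```python
from collections import defaultdict


def srna_density(repeats, bin_edges=(10, 100, 1000)):
    """Return {(category, copy-number bin): sRNA matches per genomic kb}.

    repeats: iterable of dicts with keys "category", "copy_number",
        "length_bp" and "srna_matches" (hypothetical layout).
    bin_edges: illustrative lower bounds of copy-number bins; a repeat
        falls in bin k if it meets or exceeds exactly k of the edges.
    """
    totals = defaultdict(lambda: [0, 0.0])  # key -> [matches, genomic kb]
    for rep in repeats:
        bin_index = sum(rep["copy_number"] >= edge for edge in bin_edges)
        key = (rep["category"], bin_index)
        totals[key][0] += rep["srna_matches"]
        # Assumed genomic footprint: copies x element length.
        totals[key][1] += rep["copy_number"] * rep["length_bp"] / 1000.0
    return {key: m / kb for key, (m, kb) in totals.items() if kb > 0}


repeats = [
    {"category": "retrotransposon", "copy_number": 200,
     "length_bp": 5000, "srna_matches": 3000},
    {"category": "retrotransposon", "copy_number": 300,
     "length_bp": 2000, "srna_matches": 1800},
    {"category": "MITE", "copy_number": 20,
     "length_bp": 500, "srna_matches": 5},
]
density = srna_density(repeats)
print(density)  # {('retrotransposon', 2): 3.0, ('MITE', 1): 0.5}
```

Normalizing by genomic footprint is what lets the comparison separate "more copies, therefore more sRNA" from a genuinely higher sRNA output per unit of repeat sequence.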
Figure 6
Phylogenetic analysis of Miscanthus × giganteus (Mxg) based on nuclear ribosomal DNA. (a) Sites of variation in a nucleotide alignment of the ITS1, 5.8S and ITS2 regions of the rDNA from various Miscanthus species and Mxg survey reads. Reads from the Mxg genome survey that matched the internal transcribed spacers (ITS1 and ITS2) and the 5.8S rRNA were manually aligned, using Sequencher and MacClade, to Miscanthus, maize, sorghum and Saccharum sequences from GenBank, and variable residues were identified. (b) Phylogenetic tree of Mxg survey reads together with related species. A Bayesian phylogenetic analysis of the residues from a 150-bp region of ITS2, spanned by several complete reads from Mxg and shown in bold in (a), was performed using the general time reversible (GTR) model of substitution and a gamma distribution of substitution rates. Parameters estimated from the last 5,000 trees were used to calculate a posterior probability at each node and to draw a 50% majority rule consensus tree. The numbers at the nodes indicate the percentage confidence in the branches as assessed using the posterior probabilities. The diamonds represent the individual Mxg survey reads from this region.
