Genome Biol. 2010;11(2):R12.

doi: 10.1186/gb-2010-11-2-r12. Epub 2010 Feb 3.

Genomic and small RNA sequencing of Miscanthus x giganteus shows the utility of sorghum as a reference genome sequence for Andropogoneae grasses

Kankshita Swaminathan¹, Magdy S Alabady, Kranthi Varala, Emanuele De Paoli, Isaac Ho, Dan S Rokhsar, Aru K Arumuganathan, Ray Ming, Pamela J Green, Blake C Meyers, Stephen P Moose, Matthew E Hudson

PMID: 20128909
PMCID: PMC2872872
DOI: 10.1186/gb-2010-11-2-r12

Kankshita Swaminathan et al. Genome Biol. 2010.

Affiliation

¹ Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.


Abstract

Background: Miscanthus x giganteus (Mxg) is a perennial grass that produces superior biomass yields in temperate environments. The essentially uncharacterized triploid genome (3n = 57, x = 19) of Mxg is likely critical for the rapid growth of this vegetatively propagated interspecific hybrid.

Results: A survey of the complex Mxg genome was conducted using 454 pyrosequencing of genomic DNA and Illumina sequencing-by-synthesis of small RNA. We found that the coding fraction of the Mxg genome has a high level of sequence identity to that of other grasses. Highly repetitive sequences representing the great majority of the Mxg genome were predicted using non-cognate assembly for de novo repeat detection. Twelve abundant families of repeat were observed, with those related to either transposons or centromeric repeats likely to comprise over 95% of the genome. Comparisons of abundant repeat sequences to a small RNA survey of three Mxg organs (leaf, rhizome, inflorescence) revealed that the majority of observed 24-nucleotide small RNAs are derived from these repetitive sequences. We show that high-copy-number repeats match more of the small RNA, even when the amount of the repeat sequence in the genome is accounted for.

Conclusions: We show that major repeats are present within the triploid Mxg genome and are actively producing small RNAs. We also confirm the hypothesized origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome is likely to be of utility in the assembly of a gene-space sequence of Mxg.

Figures

**Figure 1**
**Similarity of the *Miscanthus × giganteus* (*Mxg*) and other monocotyledon genomes**. A sequence survey of *Mxg* was compared to the sorghum whole-genome sequence (red line), the rice whole-genome sequence (blue line) and the maize whole-genome sequence (green line). In addition to the whole-genome sequences the survey was also compared to the predicted sorghum coding regions (CDS) unfiltered (orange line), and the sorghum coding regions with known transposon-related sequences removed (yellow line) using nucleotide BLAST. In all cases, the percentage nucleotide identity of the match (x-axis) is plotted against the percentage of the total reads from the survey with a given percentage identity to the relevant dataset (y-axis). No matches were observed with nucleotide identity below 75% at the e value cutoff used (10^-10).
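The identity profile described in this caption can be approximated from tabular BLAST output. The following is a minimal sketch, not the authors' pipeline: it assumes the standard 12-column tabular format (`-outfmt 6`), keeps the highest-scoring hit per read, applies the 10^-10 E-value cutoff stated in the caption, and bins identity to whole percent. The function name `identity_profile` and the choice of best hit by bit score are illustrative assumptions.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the figure caption


def identity_profile(blast_lines, total_reads):
    """Return {whole-percent identity: % of all survey reads}.

    blast_lines: iterable of tab-separated BLAST rows (12 standard columns).
    total_reads: number of reads in the survey, matched or not.
    """
    best = {}  # read id -> (bit score, percent identity) of its best hit
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or comment lines
        read_id = fields[0]
        pident = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > E_VALUE_CUTOFF:
            continue
        # Assumption: the best hit is the one with the highest bit score.
        if read_id not in best or bitscore > best[read_id][0]:
            best[read_id] = (bitscore, pident)
    bins = Counter(int(pident) for _, pident in best.values())
    return {b: 100.0 * n / total_reads for b, n in sorted(bins.items())}
```

Dividing by the total number of survey reads, rather than by the number of matched reads, mirrors the y-axis described in the caption.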

**Figure 2**
**Classification of repeats detected in *Miscanthus × giganteus* (*Mxg*) and sorghum by sequence comparison to the Plant Repeat Database**. Sequence surveys of *Mxg* and sorghum were matched to the Plant Repeat Database by nucleotide BLAST search. The proportion of repeats in each class for these two genomes was estimated by comparing the percentage of reads matching repeats of different classes in the database. **(a)** Proportion of repeats from surveys of the two species matching general classes of plant repetitive sequence. In both *Mxg* and sorghum, retrotransposons are the predominant class of repeats. Transposons are class II (DNA) transposons according to the designations in the Plant Repeat Database. **(b)** Further classification into repeat subfamilies, showing differing levels of miniature inverted repeat transposable element (MITE) and transposable element families in the two species. LINE, long interspersed nuclear element; SINE, short interspersed nuclear element.

**Figure 3**
**An estimation of copy number of sequences present in three sorghum genomic sequences in sorghum and *Miscanthus × giganteus* (*Mxg*)**. Copy number was estimated for regions of the sorghum genome in both sorghum and *Mxg*. Shown are three completed sorghum BAC sequences, one centromeric and two euchromatic. Sorghum copy number was estimated by matching to a sequence dataset of whole-genome sorghum shotgun sequences (red) and the *Mxg* copy number estimated by comparison to the 454 survey reads (blue) using a blastZ alignment within a 1,000-bp sliding window. The estimated genomic copy number based on the number of reads matching each window (y-axis) is plotted against the position of the window on the BAC (x-axis). The nucleotide identity cutoff for this analysis was 90%. The regions of greatest copy number on BACs AC169372 and AC169376 predominantly match miniature inverted repeat transposable elements (MITEs), transposons and retrotransposons, which are significantly more abundant in sorghum, while AC169373 contains highly abundant centromeric repeats, for which the *Mxg* and sorghum copy numbers agree closely.
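The sliding-window estimate in this caption can be illustrated with a short sketch. It is not the authors' implementation: it assumes that alignment start positions on the BAC are already available (for example, from hits passing the 90% identity cutoff), and that copy number is the observed hit count divided by the count expected for a single-copy sequence at the survey's genome coverage. The names `window_copy_numbers`, `coverage`, and `step` are hypothetical; the caption states only the 1,000-bp window.

```python
def window_copy_numbers(hit_starts, bac_length, read_length, coverage,
                        window=1000, step=500):
    """Estimate copy number per sliding window along a BAC.

    hit_starts:  0-based start positions of read alignments on the BAC.
    read_length: mean read length in bp (assumed known for the survey).
    coverage:    fraction of the genome sampled by the survey, e.g. 0.02.
    Returns a list of (window_start, estimated_copy_number).
    """
    if coverage <= 0 or read_length <= 0:
        raise ValueError("coverage and read_length must be positive")
    # Reads expected to start in one window if the sequence were single copy.
    expected = window * coverage / read_length
    results = []
    last_start = max(bac_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        hits = sum(1 for pos in hit_starts if start <= pos < end)
        results.append((start, hits / expected))
    return results
```

Running the same function once with sorghum shotgun hits and once with *Mxg* survey hits would give the two curves that the caption compares, provided each run uses its own coverage value.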

**Figure 4**
**Much of the small RNA transcriptome of *Miscanthus × giganteus* (*Mxg*) matches high copy number genomic repeats**. **(a)** Number of *Mxg* repeats, or gene space sequence reads, matching a small RNA (sRNA), as determined by matching repeat sequences to sRNA signatures produced by sequencing sRNA from three *Mxg* tissues. Repeats are annotated by broad category where known. Unclassified repeats match a sequence in the database without an assigned category; unannotated repeats do not have a database match. Gene space reads are genome survey reads that match sorghum filtered coding sequences (Figure 1). **(b)** Percentages of small RNA produced by different repeat classes. Normalized abundance of small RNA signatures was calculated in transcripts per quarter million (TPQ) reads. In addition to the data shown, telomeres and telomere-associated repeats together produced 0.09% of the total amount of sRNA (a percentage too small to display effectively in the chart).
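The normalization mentioned in panel (b) can be sketched as follows. This assumes that "per quarter million" means scaling each signature's raw count to a library of 250,000 reads; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
QUARTER_MILLION = 250_000


def to_tpq(raw_count, library_total):
    """Scale a raw sRNA signature count to a 250,000-read library."""
    if library_total <= 0:
        raise ValueError("library_total must be positive")
    return raw_count * QUARTER_MILLION / library_total


def class_percentages(tpq_by_class):
    """Convert summed normalized abundance per repeat class to percentages."""
    total = sum(tpq_by_class.values())
    if total == 0:
        return {name: 0.0 for name in tpq_by_class}
    return {name: 100.0 * value / total for name, value in tpq_by_class.items()}
```

For example, a signature sequenced 40 times in a library of 1,000,000 reads would have a normalized abundance of 10 under this assumption. Normalizing before summing lets libraries of different depth, such as leaf, rhizome, and inflorescence, be compared on one scale.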

**Figure 5**
**Correlation between repeat copy number and amount of small RNA per kilobase of matching repeat sequences**. Repeats were binned according to their estimated copy number in the *Mxg* genome and then divided into categories as in Figure 4a. The number of small RNA signatures matching the repeat sequence in each category and copy number class divided by the estimated total genomic size of the repeat class in kilobases is plotted on the y-axis. MITE, miniature inverted repeat transposable element.
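The quantity plotted on the y-axis can be sketched as a grouped ratio. This is an illustration under stated assumptions, not the published analysis: the total genomic size of a repeat is taken as copy number times repeat length, and the bin edges in `BIN_EDGES` are hypothetical, since the caption does not list the copy number classes.

```python
from bisect import bisect_right
from collections import defaultdict

BIN_EDGES = [10, 100, 1000]  # hypothetical copy number class boundaries


def copy_number_bin(copy_number):
    """Index of the copy number class (0 = lowest) for one repeat."""
    return bisect_right(BIN_EDGES, copy_number)


def srna_per_kb(repeats):
    """Return {(category, bin): sRNA signatures per kb of genomic repeat}.

    repeats: iterable of dicts with keys 'category', 'copy_number',
             'length_bp' and 'srna_matches'.
    """
    matches = defaultdict(int)
    total_kb = defaultdict(float)
    for rep in repeats:
        key = (rep["category"], copy_number_bin(rep["copy_number"]))
        matches[key] += rep["srna_matches"]
        # Assumed genomic footprint: copies x repeat length, in kilobases.
        total_kb[key] += rep["copy_number"] * rep["length_bp"] / 1000
    return {k: matches[k] / total_kb[k] for k in matches if total_kb[k] > 0}
```

Dividing by the genomic footprint is what allows the comparison to account for the amount of repeat sequence in the genome, as the abstract describes.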

**Figure 6**
**Phylogenetic analysis of *Miscanthus × giganteus* (*Mxg*) based on nuclear ribosomal DNA**. **(a)** Sites of variation in a nucleotide alignment of the ITS1, 5.8S and the ITS2 regions of the rDNA from various *Miscanthus* species and *Mxg* survey reads. Reads from the *Mxg* genome survey that matched the internal transcribed spacers (ITS1 and ITS2) and the 5.8S rRNA were manually aligned using Sequencher and MacClade to *Miscanthus*, maize, sorghum and *Saccharum* sequences from GenBank and variable residues identified. **(b)** Phylogenetic tree of *Mxg* survey reads together with related species. A Bayesian phylogenetic analysis of the residues from a 150-bp region of ITS2 spanned by several complete reads from *Mxg* and shown in bold in (a) was performed using the general time reversible (GTR) model of substitution and a gamma distribution of the rates of substitutions. Parameters estimated from the last 5,000 trees were used to calculate a posterior probability at each node and draw a 50% majority rule consensus tree. The numbers at the nodes indicate the percentage confidence in the branches as assessed using the posterior probabilities. The diamonds represent the individual *Mxg* survey reads from this region.
