BMC Genomics. 2013 Oct 6;14:686.
doi: 10.1186/1471-2164-14-686.

The repetitive component of the sunflower genome as shown by different procedures for assembling next generation sequencing reads

Lucia Natali et al. BMC Genomics.

Abstract

Background: Next generation sequencing provides a powerful tool to study genome structure in species whose genomes are far from being completely sequenced. In this work we describe and compare different computational approaches to evaluate the repetitive component of the sunflower genome, using medium- to low-coverage Illumina or 454 libraries.

Results: By varying sequencing technology (Illumina or 454), coverage (0.55x to 1.25x), assemblers and assembly procedures, six different genomic databases were produced. The annotation of these databases showed that they were composed of different proportions of repetitive DNA families. The final assembly of the sequences belonging to the six databases produced a whole genome set of 283,800 contigs. The redundancy of each contig was estimated by mapping a large Illumina read set to the whole genome set and counting the Illumina reads that matched each contig. The repetitive component amounted to 81% of the sunflower genome, which is composed mainly of numerous families of Gypsy and Copia retrotransposons. Many families of non-autonomous retrotransposons and DNA transposons (especially of the Helitron superfamily) were also identified.
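The redundancy estimate described above can be illustrated with a minimal sketch. It assumes coverage is computed as mapped bases divided by contig length, and it borrows the cutoff stated in the Figure 3 caption (five-fold the mean coverage of five putatively unique gene sequences). The function names, read length, and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def coverage(mapped_reads: int, read_length: int, contig_length: int) -> float:
    """Average depth of a contig: mapped bases divided by contig length (assumed formula)."""
    return mapped_reads * read_length / contig_length


def classify_contigs(contigs, reference_coverages, read_length=100, fold=5.0):
    """Label each contig 'redundant' or 'unique'.

    contigs: dict mapping contig name -> (mapped_reads, contig_length)
    reference_coverages: coverages of putatively unique gene sequences
    fold: multiple of the mean reference coverage used as the cutoff
    """
    threshold = fold * mean(reference_coverages)
    labels = {}
    for name, (mapped_reads, contig_length) in contigs.items():
        depth = coverage(mapped_reads, read_length, contig_length)
        labels[name] = "redundant" if depth > threshold else "unique"
    return labels


# Hypothetical inputs, for illustration only.
reference = [1.0, 1.5, 0.5, 1.0, 1.0]  # five putatively unique genes
example_contigs = {
    "contig_a": (600, 2000),  # 600 reads on a 2,000 bp contig
    "contig_b": (20, 1000),   # 20 reads on a 1,000 bp contig
}
labels = classify_contigs(example_contigs, reference)
```

With these made-up values the cutoff is 5.0, so `contig_a` (depth 30) is labelled redundant and `contig_b` (depth 2) is labelled unique.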

Conclusions: The results substantially matched those previously obtained by using a Sanger-sequenced shotgun library and a standard 454 whole-genome-shotgun approach, indicating that the proposed procedures are reliable for other species as well. The repetitive sequences were collected to produce a database, SUNREP, that will be useful for annotating the sunflower genome sequence and for studying genome evolution in dicotyledons.

Figures

Figure 1
Distributions of mapped Illumina reads to the six sequence sets obtained by assembling original Illumina or 454 reads.
Figure 2
Functional composition of the assembled sequence sets obtained by assembling original Illumina or 454 read sets (first row, unsplit), by assembling the same read sets after a preliminary splitting into subpackages of reads (second row, split), by assembling the two assembled sequence sets previously obtained from Illumina, 454 large and 454 small sets of reads (third row, total), and by assembling the three assembled sequence sets described in the third row (fourth row, WGSAS).
Figure 3
Distribution of mapped Illumina reads in the WGSAS. Sequences were subdivided into redundant and unique (low redundant), based on an arbitrary value corresponding to five-fold the mean coverage of five putatively unique gene sequences.
Figure 4
Size distribution of Gypsy, Copia, and unknown LTR REs, of non-LTR REs, and of DNA transposon families obtained by performing an all-by-all BLAST analysis. For each superfamily, the histograms depict the number of families (Y-axis) containing a specified number of contigs. The total numbers of families and singletons (i.e. families represented by one contig) are also reported.
Figure 5
Number of sequences composing the 30 most numerous families of LTR-REs (above) and DNA transposons (below).
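The family-size histograms described for Figure 4 can be sketched as follows. This assumes that contigs are grouped into families by single-linkage clustering of pairwise hits from an all-by-all comparison; the actual grouping criteria are not given in this text, and the function names and sample hits are hypothetical.

```python
from collections import Counter


def cluster_families(contigs, hits):
    """Group contigs into families with union-find over pairwise hits (assumed single linkage)."""
    parent = {c: c for c in contigs}

    def find(x):
        # Follow parent links to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for c in contigs:
        families.setdefault(find(c), []).append(c)
    return list(families.values())


def family_size_histogram(families):
    """Map family size (number of contigs) -> number of families of that size."""
    return Counter(len(family) for family in families)


# Hypothetical contigs and hits, for illustration only.
example_contigs = ["c1", "c2", "c3", "c4", "c5"]
example_hits = [("c1", "c2"), ("c2", "c3")]
families = cluster_families(example_contigs, example_hits)
histogram = family_size_histogram(families)
singletons = histogram[1]  # families represented by a single contig
```

Here `c1`, `c2` and `c3` form one family of three contigs, while `c4` and `c5` remain singletons.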
