Nucleic Acids Res. 2008 Nov;36(19):e122.
doi: 10.1093/nar/gkn502. Epub 2008 Aug 27.

Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology

Richard Cronn et al. Nucleic Acids Res. 2008 Nov.

Abstract

Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified plastomes of approximately 120 kb from eight species (seven Pinus, one Picea) in 35 reactions. Pooled products were ligated to modified adapters that included 3 bp indexing tags, and samples were multiplexed at four genomes per lane. Tagged microreads were assembled by de novo and reference-guided assembly methods, using previously published Pinus plastomes as surrogate references. Assemblies for these eight genomes are estimated at 88–94% complete, with an average sequence depth of 55× to 186×. Mononucleotide repeats interrupt contig assembly with increasing repeat length, and we estimate that the limit for their assembly is 16 bp. Comparisons to 37 kb of Sanger sequence show a validated error rate of 0.056%, and conspicuous errors are evident from the assembly process. This efficient sequencing approach yields high-quality draft genomes and should have immediate applicability to genomes with comparable complexity.
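The tag-based multiplexing described in the abstract can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each microread begins with its 3 bp indexing tag and that only exact tag matches are accepted; the function name, sample names and toy reads are hypothetical, and this is not the authors' pipeline.

```python
from collections import defaultdict

def demultiplex(reads, tags, tag_len=3):
    """Bin reads by their leading index tag (assumed to be the first tag_len bases).

    reads: iterable of read strings
    tags:  dict mapping tag sequence -> sample name (hypothetical names)
    Reads whose prefix matches no tag exactly are kept under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        if tag in tags:
            # Strip the tag so only genomic sequence is passed to assembly.
            bins[tags[tag]].append(insert)
        else:
            bins["unassigned"].append(read)
    return dict(bins)

# Toy data: CCT and GGT are two tags named in Figure 1; reads are invented.
tags = {"CCT": "sample_A", "GGT": "sample_B"}
reads = ["CCTACGTACGT", "GGTTTGACCA", "CATACGTACGT", "CCTGGGAAAT"]
bins = demultiplex(reads, tags)
```

Exact matching discards any read with a miscalled tag base, which is a conservative choice given that tag errors can be concentrated at particular positions.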


Figures

Figure 1.
Relative frequencies of barcode error by barcode tag (CCT, GGT), experiment (S1, S6) and nucleotide position (1, 2, 3). Observed frequencies of erroneous, nontag nucleotides are indicated by position 1 (salmon), 2 (blue) and 3 (green); first and second position errors were far more common than third position errors. Slices within a position are scaled proportionately to the number of base calls for that nucleotide; if errors were present at equal frequencies within a base position, each slice would be of equal size and would not extend beyond the perimeter of the circle. In all experiments, errors involving substitutions to ‘A’ were more frequent than expected for positions 1 and 3, whereas errors involving substitutions to ‘T’ were more frequent than expected for position 2.
Figure 2.
Plots showing sequencing depth by position for eight chloroplast genomes sequenced by multiplex sequencing-by-synthesis. Microreads per position (y-axis) are plotted in gray relative to the position in the assembly (x-axis, in kb). The median number of reads across each PCR amplicon is indicated by black lines.
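The per-amplicon medians shown as black lines in Figure 2 could be computed along the following lines. This is a minimal sketch that assumes a per-position depth list and half-open amplicon coordinates; the amplicon names, coordinates and depth values are invented for illustration.

```python
from statistics import median

def median_depth_per_amplicon(depth, amplicons):
    """Return the median read depth across each amplicon.

    depth:     list of reads per assembly position (0-based)
    amplicons: dict of name -> (start, end), half-open interval [start, end)
    """
    return {
        name: median(depth[start:end])
        for name, (start, end) in amplicons.items()
    }

# Toy depth profile with three hypothetical amplicons.
depth = [10, 12, 11, 50, 55, 60, 58, 5, 7]
amplicons = {"amp1": (0, 3), "amp2": (3, 7), "amp3": (7, 9)}
medians = median_depth_per_amplicon(depth, amplicons)
```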
Figure 3.
Frequency spectrum of mononucleotide repeats observed in reference and microread assemblies of Pinus chloroplast genomes. The number of repeats per length class (6–24 bp) is plotted for P. thunbergii (THUN; salmon) and P. koraiensis (KORA; blue). The average and 95% confidence interval for eight microread assemblies (seven Pinus, one Picea, white circles) are also shown. Inset: relationship between the proportions of repeats terminating contigs and the length of each repeat class for the eight microread assemblies. The least squares regression line is indicated.
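A frequency spectrum of this kind can be tallied from any assembled sequence with a short Python sketch. It assumes a plain A/C/G/T sequence string and uses the 6 bp lower bound from the caption as the default; the function name and the toy sequence are hypothetical.

```python
import re
from collections import Counter

def repeat_length_spectrum(seq, min_len=6):
    """Count mononucleotide repeats (runs of one base) by run length.

    Only runs of at least min_len bases are counted.
    Returns a Counter mapping run length -> number of runs.
    """
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return Counter(len(m.group(0)) for m in pattern.finditer(seq.upper()))

# Toy sequence containing two 6 bp A-runs and one 8 bp T-run.
seq = "ACGT" + "A" * 6 + "C" + "T" * 8 + "GG" + "A" * 6
spectrum = repeat_length_spectrum(seq)
```

Because the quantifier is greedy, each run is matched once at its full length, so a 16 bp run counts in the 16 bp class only.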
Figure 4.
Simulations of higher multiplex levels. Random subsets of microreads from the P. gerardiana data set were sampled to simulate multiplex levels ranging from 4× (1.37 million microreads) to 16× (0.34 million microreads). Triplicate random subsets were assembled with Velvet de novo assembly, and assemblies were evaluated for sequencing depth (A), the number of contigs (B) and the summed contig lengths (C). Solid lines show the best fit line from least squares regression and shaded regions show the 95% confidence interval of the best fit line. The curved line (C) shows the best fit with a smoothing spline (λ = 5 × 10¹⁵; r² = 0.973).
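The read counts quoted for the simulated multiplex levels follow from scaling the 4× data set inversely with the multiplex level (1.37 million × 4/16 ≈ 0.34 million). The Python sketch below shows that arithmetic and one way to draw triplicate random subsets; the function names, the seed and the toy read list are assumptions for illustration, and the sampling scheme is not taken from the paper.

```python
import random

def reads_per_genome(level, base_reads=1_370_000, base_level=4):
    """Reads available per genome at a given multiplex level,
    scaled from the 4x data set (1.37 million microreads)."""
    return round(base_reads * base_level / level)

def triplicate_subsets(reads, level, base_level=4, replicates=3, seed=0):
    """Draw random subsets (without replacement) that simulate `level`x
    multiplexing from reads collected at `base_level`x."""
    n = round(len(reads) * base_level / level)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.sample(reads, n) for _ in range(replicates)]

# Toy stand-in for a 4x read set: 160 placeholder read identifiers.
toy_reads = [f"read_{i}" for i in range(160)]
subsets = triplicate_subsets(toy_reads, level=16)
```

Each subset would then be assembled separately, so that variation among replicates reflects sampling noise at that multiplex level.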

