Comparative Study
Mol Ecol Resour. 2014 Sep;14(5):892-901.
doi: 10.1111/1755-0998.12236. Epub 2014 Feb 19.

Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens

Free PMC article
Shadi Shokralla et al. Mol Ecol Resour. 2014 Sep.

Abstract

DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for a relatively high yield of the target amplicon, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity in the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% of the capacity of a 454 sequencing run (i.e. two lanes of a 16-lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.

Keywords: COI; DNA; Lepidoptera; Wolbachia; biodiversity; genomics; heteroplasmy; taxonomy.
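
The tagging step described in the abstract implies a demultiplexing step after sequencing: each read is assigned back to its specimen by the 10-mer tag at its start. The sketch below illustrates that idea only. It assumes exact tag matching at the 5' end of each read, and the tag sequences, specimen names, reads and the function name `demultiplex` are all made up for illustration; it is not the authors' pipeline.

```python
from collections import defaultdict

TAG_LENGTH = 10  # the abstract specifies 10-mer tags


def demultiplex(reads, tag_to_specimen, tag_length=TAG_LENGTH):
    """Group reads by specimen using the leading tag, then strip the tag.

    Assumption: the tag sits at the very start of the read and must match
    exactly. Reads with an unknown tag are returned separately.
    """
    by_specimen = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            unassigned.append(read)
        else:
            by_specimen[specimen].append(insert)
    return dict(by_specimen), unassigned


# Hypothetical tags and reads, for illustration only.
tags = {
    "ACGTACGTAC": "specimen_001",
    "TGCATGCATG": "specimen_002",
}
reads = [
    "ACGTACGTAC" + "GGTCAACAAATCATAAAGATATTGG",
    "TGCATGCATG" + "GGTCAACAAATCATAAAGATATTGG",
    "ACGTACGTAC" + "GGTCAACAAATCATAAAGATATTGA",
    "NNNNNNNNNN" + "GGTCAACAAATCATAAAGATATTGG",  # unreadable tag
]

assigned, unassigned = demultiplex(reads, tags)
read_counts = {specimen: len(rs) for specimen, rs in assigned.items()}
print(read_counts)
print(len(unassigned), "read(s) could not be assigned")
```

Per-specimen read counts like `read_counts` are the quantity behind the reported average of 143 reads per specimen; across 190 specimens that average corresponds to roughly 27,000 assigned reads (190 × 143 = 27,170). A real pipeline would likely also tolerate sequencing errors in the tag and trim the PCR primer, both of which this sketch leaves out.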


Figures

Figure 1
Schematic diagram of parallel barcode recovery using multiple identifier (MID) tagging and next-generation sequencing (NGS) protocol.
Figure 2
Comparison of DNA sequence data recovered by Sanger sequencing and 454 pyrosequencing. A) Green bars represent number of full-length COI barcode sequences. Yellow bar represents number of partial COI barcode sequences. Red bars represent failed target barcode attempts. Orange bar represents number of heteroplasmic COI sequences. Purple bar represents number of coamplified nontarget COI sequences (i.e. ‘contaminants’). Light green bar represents number of Wolbachia sequences. B) Number of organisms recovering single or multiple sequence clusters during 454 pyrosequencing.
Figure 3
Neighbour-joining diagram of 352 DNA sequences recovered by 454 pyrosequencing and Sanger sequencing. Short sequences (<600 bp) have not been included. Distance measurement is calculated in number of base substitutions per site based on the Kimura 2-parameter method. The tree backbone represents the 454 pyrosequences and green triangles represent sequences produced by Sanger sequencing (>600 bp). Red circles represent sequences determined to be heteroplasmic. Blue squares represent individual specimens that also recovered a Wolbachia sequence.
Figure 4
Portion of a sequence electropherogram as produced by Sanger sequencing and composite sequence clusters as recovered by 454 pyrosequencing of a single specimen. Highlighted bases represent differences from the Sanger sequence. Arrows indicate the presence of peaks in the electropherogram corresponding to alternate sequences.
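
The Figure 3 caption reports distances in base substitutions per site under the Kimura 2-parameter model. For two aligned sequences with a proportion P of transitional differences and Q of transversional differences, the standard estimate is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below computes that value for a pair of pre-aligned sequences. The function name `k2p_distance`, the rule of skipping sites with gaps or ambiguity codes, and the example sequences are assumptions made for illustration; the excerpt does not say which software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption: sites where either sequence has a gap or an ambiguity
    code are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Made-up 10-bp sequences differing by a single A<->G transition.
seq_x = "ACGTACGTAC"
seq_y = "GCGTACGTAC"
print(round(k2p_distance(seq_x, seq_y), 4))
```

A neighbour-joining tree such as the one in Figure 3 would be built from the matrix of these pairwise distances.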
