Proc Natl Acad Sci U S A. 2008 Sep 9;105(36):13486-91.
doi: 10.1073/pnas.0803076105. Epub 2008 Aug 29.

Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified

Hojun Song et al. Proc Natl Acad Sci U S A.

Abstract

Nuclear mitochondrial pseudogenes (numts) are nonfunctional copies of mtDNA in the nucleus that have been found in major clades of eukaryotic organisms. They can be easily coamplified with orthologous mtDNA by using conserved universal primers; however, this is especially problematic for DNA barcoding, which attempts to characterize all living organisms by using a short fragment of the mitochondrial cytochrome c oxidase I (COI) gene. Here, we study the effect of numts on DNA barcoding based on phylogenetic and barcoding analyses of numt and mtDNA sequences in two divergent lineages of arthropods: grasshoppers and crayfish. Single individuals from both organisms have numts of the COI gene, many of which are highly divergent from orthologous mtDNA sequences, and DNA barcoding analysis incorrectly overestimates the number of unique species based on the standard metric of 3% sequence divergence. Removal of numts based on a careful examination of sequence characteristics, including indels, in-frame stop codons, and nucleotide composition, drastically reduces the incorrect inferences of the number of unique species, but even such rigorous quality control measures fail to identify certain numts. We also show that the distribution of numts is lineage-specific and the presence of numts cannot be known a priori. Whereas DNA barcoding strives for rapid and inexpensive generation of molecular species tags, we demonstrate that the presence of COI numts makes this goal difficult to achieve when numts are prevalent and can introduce serious ambiguity into DNA barcoding.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Phylogenetic and barcoding analyses based on orthologous mtDNA COI and paralogous numt haplotypes from grasshoppers and crayfish. (A) Grasshoppers: the cladogram on the left is a strict consensus of 41 most parsimonious trees (MPTs) (L = 1002; CI = 0.54; RI = 0.85). Dots above branches indicate nodes with a bootstrap value of >75 and a posterior probability of >95%. Orthologous mtDNA is indicated in bold and putative numts are indicated as red terminals. Numbers in parentheses represent the number of identical copies of a particular haplotype (h), and asterisks indicate haplotypes with in-frame stop codons. When DNA barcoding analysis (NJ analysis based on K2P distances) is performed on the complete dataset, the number of unique species inferred based on 3% sequence divergence (colored numbers next to the vertical bars) is overestimated (barcoding with numts). After the removal of the haplotypes with indels and in-frame stop codons (barcoding after quality control), the number of unique species inferred under DNA barcoding is drastically reduced. Purple, Schistocerca americana (Sa); blue, Calliptamus italicus (Ci); green, Acrida willemsei (Aw); orange, Locusta migratoria (Lm); and gray, outgroups. (B) Crayfish: the circular cladogram on top is the strict consensus of 94 MPTs (L = 1064; CI = 0.39; RI = 0.91). Terminals are colored to indicate species. Purple, Orconectes australis; orange, O. barri; green, O. incomptus; blue, O. packardi; and gray, outgroups. All numt haplotypes are indicated as red terminals. As in grasshoppers, DNA barcoding overestimates the number of unique species when numts are included, and the removal of numts reduces the inferred number of species. Note that even after rigorous quality control, the inferred number of unique species still exceeds the actual number of species, suggesting that some numts are difficult to identify.
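
The K2P distance and the 3% divergence threshold named in the caption above can be illustrated with a minimal Python sketch. It assumes pre-aligned sequences of equal length, and it replaces the NJ analysis used in the paper with a simple single-linkage grouping, so it is an approximation for illustration rather than the authors' pipeline. The function names `k2p_distance` and `count_putative_species` are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


def count_putative_species(sequences, threshold=0.03):
    """Count groups of sequences linked by pairwise K2P distance < threshold.

    Assumption: single-linkage grouping stands in for the NJ-based
    analysis described in the caption.
    """
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(sequences)):
        for j in range(i + 1, len(sequences)):
            if k2p_distance(sequences[i], sequences[j]) < threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(sequences))})
```

Under this sketch, a divergent numt haplotype amplified from the same individual as the orthologous COI sequence would fall into its own group and be counted as an additional putative species, which is the overestimation the figure describes.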
Fig. 2.
Suggested steps to help avoid and identify numts in DNA barcoding analysis. Whereas these steps will help reduce the chance of sequencing numts instead of the target COI, they are not guaranteed to remove all numts. Each resulting sequence must be examined as part of quality control protocols. If numts are rampant, then the isolation of COI sequences becomes difficult and it may be best to use other genes. When interpreting the results from DNA barcoding analysis, it is important to survey congruence with other molecular markers, morphology, ecology, and behavior.
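
Two of the sequence characteristics used for quality control, indels and in-frame stop codons, can be screened with a short Python sketch. It assumes the query is already aligned to a trusted orthologous COI reference, that the caller knows the reading frame, and that the invertebrate mitochondrial genetic code (NCBI translation table 5, in which TAA and TAG are the stop codons) applies. All function names are hypothetical, and the nucleotide composition check is omitted.

```python
# Stop codons under NCBI translation table 5 (invertebrate mitochondrial).
# TGA codes for tryptophan in this table, so it is not listed.
STOP_CODONS = {"TAA", "TAG"}


def has_in_frame_stop(seq, frame=0):
    """True if the ungapped sequence contains a stop codon in the given frame.

    Assumption: `frame` is the offset (0, 1, or 2) of the first complete
    codon in the ungapped query.
    """
    seq = seq.upper().replace("-", "")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


def has_internal_indel(aligned_query, aligned_reference):
    """True if the alignment has a gap inside the region covered by the query."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore leading and trailing gaps, which only reflect a shorter read.
    start = len(aligned_query) - len(aligned_query.lstrip("-"))
    end = len(aligned_query.rstrip("-"))
    return "-" in aligned_query[start:end] or "-" in aligned_reference[start:end]


def screen_for_numt(aligned_query, aligned_reference, frame=0):
    """Return the list of numt warning flags raised by a query sequence."""
    flags = []
    if has_internal_indel(aligned_query, aligned_reference):
        flags.append("indel")
    if has_in_frame_stop(aligned_query, frame):
        flags.append("in-frame stop codon")
    return flags
```

An empty flag list does not establish that a sequence is orthologous mtDNA: as the abstract notes, even rigorous quality control fails to identify certain numts.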
