Cladistics. 2007 Feb;23(1):1-21.
doi: 10.1111/j.1096-0031.2006.00126.x.

A comparison of algorithms for the identification of specimens using DNA barcodes: examples from gymnosperms

Free article

Damon P Little et al. Cladistics. 2007 Feb.

Abstract

To use DNA sequences for specimen identification (e.g., barcoding, fingerprinting), an algorithm is needed to compare query sequences with a reference database. Precision and accuracy of query sequence identification were estimated for hierarchical clustering (parsimony and neighbor joining), similarity methods (BLAST, BLAT and megaBLAST), combined clustering/similarity methods (BLAST/parsimony and BLAST/neighbor joining), diagnostic methods (DNA-BAR and DOME ID), and a new method (ATIM). We offer two novel alignment-free algorithmic solutions (DOME ID and ATIM) to identify query sequences for the purposes of DNA barcoding. Publicly available gymnosperm nrITS 2 and plastid matK sequences were used as test data sets. On the test data sets, almost all of the methods were able to accurately identify sequences to genus; however, no method was able to accurately identify query sequences to species at a frequency that would be considered useful for routine specimen identification (42-71% unambiguously correct). Clustering methods performed the worst (perhaps due to alignment issues). Similarity methods, ATIM, DNA-BAR, and DOME ID all performed at approximately the same level. Given the relative precision of the algorithms (median = 67% unambiguous), the low accuracy of species-level identification observed could be ascribed to the lack of correspondence between patterns of allelic similarity and species delimitations. Application of DNA barcoding to sequences of CITES-listed cycads (Cycadopsida) provides an example of how DNA barcoding could be applied to the enforcement of conservation laws.
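
To make the general task concrete, the following is a minimal sketch of comparing a query sequence against a reference database and reporting whether the result is unambiguous. It is not an implementation of ATIM, DOME ID, or any other method evaluated in the article: it assumes a simple alignment-free score (Jaccard similarity of k-mer sets), and the function names, species labels, and toy sequences are all hypothetical.

```python
from __future__ import annotations


def kmers(seq: str, k: int = 4) -> set[str]:
    """Return the set of overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two k-mer sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def identify(query: str, references: list[tuple[str, str]], k: int = 4):
    """Return (set of best-matching species, best score).

    One species in the set means an unambiguous identification,
    several mean an ambiguous one, and an empty set means no
    reference shared any k-mer with the query.
    """
    q = kmers(query, k)
    best = 0.0
    hits: set[str] = set()
    for species, seq in references:
        score = jaccard(q, kmers(seq, k))
        if score > best:
            best, hits = score, {species}
        elif score == best and score > 0.0:
            hits.add(species)
    return hits, best


# Hypothetical toy reference database: (species label, sequence).
REFERENCES = [
    ("Species A", "ACGTACGTTT"),
    ("Species B", "ACGTACGTTT"),  # shares its sequence with Species A
    ("Species C", "GGGGCCCCAA"),
]

unambiguous = identify("GGGGCCCCAA", REFERENCES)
ambiguous = identify("ACGTACGTTT", REFERENCES)
no_match = identify("TTTTTTTTTT", REFERENCES)
```

In this toy database two species share an identical sequence, so a query matching it can only be identified ambiguously. That mirrors the abstract's point that species-level accuracy is limited when patterns of sequence similarity do not correspond to species delimitations.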
