2022 Jul 6:2022:baac055.
doi: 10.1093/database/baac055.

Curation of a reference database of COI sequences for insect identification through DNA metabarcoding: COins

Giulia Magoga et al. Database (Oxford).

Abstract

DNA metabarcoding is a widespread approach for the molecular identification of organisms. While the associated wet-lab and data processing procedures are well established and highly efficient, the reference databases used for taxonomic assignment can still be improved to increase the accuracy of identifications. Insects are among the organisms for which DNA-based identification is most commonly used; yet a DNA-metabarcoding reference database specifically curated for insect species identification, and usable by software that requires local databases, is lacking. Here, we present COins, a database of 5' region cytochrome c oxidase subunit I sequences (COI-5P) of insects that includes over 532 000 representative sequences of >106 000 species, specifically formatted for the QIIME2 software platform. Through a combination of automated and manually curated steps, we developed this database starting from all COI sequences available in the Barcode of Life Data System for insects, retaining sequences that comply with several standards, including a species-level identification. COins was validated on previously published DNA-metabarcoding sequence data (bulk samples from Malaise traps), and its performance was compared with that of other publicly available reference databases (not specific for insects). COins can increase species-level identifications by up to 30% and can thus be a valuable resource for the taxonomic assignment of insect DNA-metabarcoding data, especially when species-level identification is needed. Database URL: https://doi.org/10.6084/m9.figshare.19130465.v1
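The curation idea summarized in the abstract (keep only records with a species-level identification, then format the result for QIIME2) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Record` class, the rule used to recognize a species-level name, the per-species deduplication and the rank-prefixed lineage string are all assumptions made for illustration. The only format detail relied on is the tab-separated taxonomy layout with `Feature ID` and `Taxon` columns that QIIME2 accepts for reference taxonomies.

```python
from dataclasses import dataclass

# Assumed rank prefixes (Greengenes-style); the real COins lineage format may differ.
RANK_PREFIXES = ("k__", "p__", "c__", "o__", "f__", "g__", "s__")


@dataclass(frozen=True)
class Record:
    """Hypothetical in-memory form of one barcode record."""
    record_id: str
    lineage: tuple  # seven ranks, kingdom to species; "" when a rank is missing
    sequence: str


def has_species_level_id(record):
    """Assumed rule: a two-word name with no 'sp.' placeholder."""
    species = record.lineage[-1].strip()
    return len(species.split()) == 2 and "sp." not in species


def curate(records):
    """Keep species-level records and drop identical sequences within a species."""
    seen = set()
    kept = []
    for rec in records:
        if not has_species_level_id(rec):
            continue
        key = (rec.lineage[-1], rec.sequence.upper())
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


def to_fasta(records):
    """Reference sequences as FASTA text, one record per two lines."""
    return "".join(f">{r.record_id}\n{r.sequence.upper()}\n" for r in records)


def to_taxonomy_tsv(records):
    """Taxonomy table with the 'Feature ID' and 'Taxon' header columns."""
    lines = ["Feature ID\tTaxon"]
    for r in records:
        taxon = "; ".join(p + name for p, name in zip(RANK_PREFIXES, r.lineage))
        lines.append(f"{r.record_id}\t{taxon}")
    return "\n".join(lines) + "\n"
```

The two strings returned by `to_fasta` and `to_taxonomy_tsv` would be written to disk and imported into QIIME2 as reference sequences and reference taxonomy; the actual COins files should be taken from the database URL given in the abstract rather than regenerated this way.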


Figures

Figure 1. COins database development steps.

Figure 2. Number of ASVs identified by the two taxonomic assignment algorithms adopted in this study, i.e. the machine learning-based algorithm fit-classifier sklearn (SKL) and the BLAST+ (BL+) algorithm, using each database: (a) MIDORI database, (b) COins database and (c) ResBO database. The number of identifications shared by the two algorithms is also given as a percentage.

Figure 3. Number of ASVs assigned to the different taxonomic levels (from order to species) when using ResBO, COins and MIDORI as reference. The numbers of assignments obtained with the BLAST+ (BL+) and fit-classifier sklearn (SKL) algorithms are also shown.

Figure 4. Number of species identified using each database (MIDORI, COins and ResBO). (a) Number of species identified with the BLAST+ algorithm (BL+). (b) Number of species identified with the fit-classifier sklearn algorithm (SKL). All values are also reported as percentages.

