Mol Ecol Resour. 2021 Aug;21(6):1772-1787.
doi: 10.1111/1755-0998.13337. Epub 2021 Feb 24.

Validated removal of nuclear pseudogenes and sequencing artefacts from mitochondrial metabarcode data

Carmelo Andújar et al. Mol Ecol Resour. 2021 Aug.

Abstract

Metabarcoding of Metazoa using mitochondrial genes may be confounded by both the accumulation of PCR and sequencing artefacts and the co-amplification of nuclear mitochondrial pseudogenes (NUMTs). The application of read abundance thresholds and denoising methods is efficient in reducing noise accompanying authentic mitochondrial amplicon sequence variants (ASVs). However, these procedures do not fully account for the complex nature of concomitant sequences and the highly variable DNA contribution of specimens in a metabarcoding sample. We propose, as a complement to denoising, the metabarcoding Multidimensional Abundance Threshold Evaluation (metaMATE) framework, a novel approach that allows comprehensive examination of multiple dimensions of abundance filtering and the evaluation of the prevalence of unwanted concomitant sequences in denoised metabarcoding datasets. metaMATE requires a denoised set of ASVs as input, and designates a subset of ASVs as being either authentic (mitochondrial DNA haplotypes) or nonauthentic ASVs (NUMTs and erroneous sequences) by comparison to external reference data and by analysing nucleotide substitution patterns. metaMATE (i) facilitates the application of read abundance filtering strategies, which are structured with regard to sequence library and phylogeny and applied for a range of increasing abundance threshold values, and (ii) evaluates their performance by quantifying the prevalence of nonauthentic ASVs and the collateral effects on the removal of authentic ASVs. The output from metaMATE facilitates decision-making about required filtering stringency and can be used to improve the reliability of intraspecific genetic information derived from metabarcode data. The framework is implemented in the metaMATE software (available at https://github.com/tjcreedy/metamate).

Keywords: HTS; Metazoa; NGS; NUMT; denoising; intraspecific variation; pseudogene; spurious sequences; taxonomic inflation.
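
The threshold evaluation described in the abstract can be illustrated with a minimal sketch. This is not the metaMATE implementation: the read counts, ASV names, and the functions `retained` and `evaluate` are hypothetical, and only one filtering dimension is shown (the proportion of reads within a library), with the phylogenetic structuring omitted. For each threshold, the sketch counts how many designated nonauthentic ASVs survive filtering and how many designated authentic ASVs are lost.

```python
from typing import Dict, List, Set

# Hypothetical toy data (not from the paper): reads per ASV in each library.
COUNTS: Dict[str, Dict[str, int]] = {
    "asv1": {"libA": 500, "libB": 300},
    "asv2": {"libA": 40, "libB": 0},
    "asv3": {"libA": 3, "libB": 2},
    "asv4": {"libA": 0, "libB": 12},
    "asv5": {"libA": 1, "libB": 0},
}
# Assumed designations, e.g. from reference matches and substitution patterns.
AUTHENTIC: Set[str] = {"asv1", "asv2"}
NONAUTHENTIC: Set[str] = {"asv3", "asv5"}


def retained(counts: Dict[str, Dict[str, int]], min_prop: float) -> Set[str]:
    """Keep an ASV if it reaches min_prop of the reads in at least one library."""
    libs = {lib for per_lib in counts.values() for lib in per_lib}
    totals = {lib: sum(c.get(lib, 0) for c in counts.values()) for lib in libs}
    keep = set()
    for asv, per_lib in counts.items():
        if any(totals[lib] > 0 and n / totals[lib] >= min_prop
               for lib, n in per_lib.items()):
            keep.add(asv)
    return keep


def evaluate(counts: Dict[str, Dict[str, int]], authentic: Set[str],
             nonauthentic: Set[str], thresholds: List[float]) -> List[dict]:
    """Summarise, per threshold, what the filter keeps and what it costs."""
    rows = []
    for t in thresholds:
        keep = retained(counts, t)
        rows.append({
            "threshold": t,
            "retained": len(keep),
            "nonauthentic_retained": len(keep & nonauthentic),
            "authentic_lost": len(authentic - keep),
        })
    return rows


if __name__ == "__main__":
    for row in evaluate(COUNTS, AUTHENTIC, NONAUTHENTIC, [0.001, 0.005, 0.01, 0.1]):
        print(row)
```

On this toy data, a threshold of 0.01 removes both designated nonauthentic ASVs while keeping both authentic ones, whereas 0.1 also discards the authentic `asv2`. This is the kind of trade-off the framework is meant to expose across thresholds.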


