Bioinformatics. 2011 Mar 15;27(6):863-4.
doi: 10.1093/bioinformatics/btr026. Epub 2011 Jan 28.

Quality control and preprocessing of metagenomic datasets

Robert Schmieder et al. Bioinformatics.

Abstract

Summary: Here, we present PRINSEQ for easy and rapid quality control and data preprocessing of genomic and metagenomic datasets. Summary statistics of FASTA (and QUAL) or FASTQ files are generated in tabular and graphical form and sequences can be filtered, reformatted and trimmed by a variety of options to improve downstream analysis.
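The kind of processing described above (summary statistics, trimming and filtering of FASTQ reads) can be illustrated with a minimal Python sketch. This is not PRINSEQ's code, which is written in Perl, and it does not reproduce its options. The function names, the Phred+33 quality encoding, the unwrapped four-line record layout and the thresholds `min_q=20` and `min_len=4` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Iterator, List


@dataclass
class Read:
    name: str
    seq: str
    qual: List[int]  # Phred quality score per base


def parse_fastq(lines: Iterable[str], offset: int = 33) -> Iterator[Read]:
    """Yield reads from 4-line FASTQ records (assumes no line wrapping)."""
    it = (ln.rstrip("\n") for ln in lines)
    for header in it:
        if not header:
            continue  # skip blank lines between records
        seq = next(it)
        next(it)  # '+' separator line
        qual = [ord(c) - offset for c in next(it)]  # assumed Phred+33
        yield Read(header[1:], seq, qual)


def trim_right(read: Read, min_q: int) -> Read:
    """Drop trailing bases whose quality score is below min_q."""
    end = len(read.seq)
    while end > 0 and read.qual[end - 1] < min_q:
        end -= 1
    return Read(read.name, read.seq[:end], read.qual[:end])


def preprocess(reads: Iterable[Read], min_q: int = 20, min_len: int = 4) -> List[Read]:
    """Trim the 3' end of each read, then discard reads shorter than min_len."""
    trimmed = (trim_right(r, min_q) for r in reads)
    return [r for r in trimmed if len(r.seq) >= min_len]


def summarize(reads: List[Read]) -> dict:
    """Tabular-style summary: read count, length range, mean length, GC fraction."""
    lengths = [len(r.seq) for r in reads]
    bases = "".join(r.seq for r in reads).upper()
    gc = (bases.count("G") + bases.count("C")) / len(bases) if bases else 0.0
    return {
        "count": len(reads),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
        "mean_len": mean(lengths) if lengths else 0.0,
        "gc": gc,
    }


# Hypothetical two-read input used only to exercise the functions above.
EXAMPLE = """@r1
ACGTACGT
+
IIIIII##
@r2
GGCC
+
##!!
"""

if __name__ == "__main__":
    raw = list(parse_fastq(EXAMPLE.splitlines()))
    print("before:", summarize(raw))
    print("after: ", summarize(preprocess(raw)))
```

Comparing the summary before and after preprocessing, as the last two lines do, is a simple way to see what a given set of thresholds removes from a dataset.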

Availability and implementation: This open-source application was implemented in Perl and can be used as a standalone version or accessed online through a user-friendly web interface. The source code, user help and additional information are available at http://prinseq.sourceforge.net/.
