fastp: an ultra-fast all-in-one FASTQ preprocessor
- PMID: 30423086
- PMCID: PMC6129281
- DOI: 10.1093/bioinformatics/bty560
Abstract
Motivation: Quality control and preprocessing of FASTQ files are essential to providing clean data for downstream analysis. Traditionally, a different tool is used for each operation, such as quality control, adapter trimming and quality filtering. These tools are often insufficiently fast as most are developed using high-level programming languages (e.g. Python and Java) and provide limited multi-threading support. Reading and loading data multiple times also renders preprocessing slow and I/O inefficient.
Results: We developed fastp as an ultra-fast FASTQ preprocessor with useful quality control and data-filtering features. It can perform quality control, adapter trimming, quality filtering, per-read quality pruning and many other operations with a single scan of the FASTQ data. This tool is developed in C++ and has multi-threading support. Based on our evaluation, fastp is 2–5 times faster than other FASTQ preprocessing tools such as Trimmomatic or Cutadapt, despite performing far more operations than similar tools.
Availability and implementation: The open-source code and corresponding instructions are available at https://github.com/OpenGene/fastp.
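The single-scan idea described in the Results can be illustrated with a small sketch. This is not fastp's implementation (fastp is written in C++ and is multi-threaded); it is a minimal Python illustration, assuming Phred+33 quality encoding and strict 4-line FASTQ records. The names `read_fastq`, `trim_tail` and `single_pass`, and the thresholds `min_q` and `min_len`, are hypothetical and chosen only for this example. The tail trimming shown is a simplified stand-in for per-read quality pruning.

```python
import io
from typing import Iterator, List, NamedTuple


class Record(NamedTuple):
    name: str
    seq: str
    qual: str


def read_fastq(handle) -> Iterator[Record]:
    """Yield records from a text handle, assuming strict 4-line FASTQ."""
    while True:
        name = handle.readline().rstrip("\n")
        if not name:  # end of input (or a blank line)
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield Record(name, seq, qual)


def phred(qual: str, offset: int = 33) -> List[int]:
    """Decode a quality string into integer scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


def trim_tail(rec: Record, min_q: int) -> Record:
    """Drop trailing bases whose quality is below min_q."""
    scores = phred(rec.qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return Record(rec.name, rec.seq[:end], rec.qual[:end])


def single_pass(handle, min_q: int = 20, min_len: int = 5):
    """Collect QC statistics, trim and filter reads in one scan of the input."""
    stats = {"reads_in": 0, "reads_out": 0, "bases_in": 0, "bases_out": 0}
    kept = []
    for rec in read_fastq(handle):
        # Pre-filtering statistics are gathered on the same pass as filtering.
        stats["reads_in"] += 1
        stats["bases_in"] += len(rec.seq)
        trimmed = trim_tail(rec, min_q)
        if len(trimmed.seq) < min_len:
            continue  # length filter: read is too short after trimming
        stats["reads_out"] += 1
        stats["bases_out"] += len(trimmed.seq)
        kept.append(trimmed)
    return kept, stats


# Two toy reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
DEMO = "@r1\nACGTACGT\n+\nIIIIII##\n@r2\nACGT\n+\n####\n"
kept, stats = single_pass(io.StringIO(DEMO))
print(kept)
print(stats)
```

Because the statistics, trimming and filtering all happen inside one loop over the records, the input is read only once, which is the property the abstract contrasts with running a separate tool per operation.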