Bioinform Adv. 2025 Jan 29;5(1):vbaf010.
doi: 10.1093/bioadv/vbaf010. eCollection 2025.

Sequali: efficient and comprehensive quality control of short- and long-read sequencing data

Ruben H P Vorderman. Bioinform Adv. 2025.

Abstract

Motivation: Quality control of sequencing data is the first step in many sequencing workflows. Short- and long-read sequencing technologies have many commonalities with regard to quality control. Several quality control programs exist; however, none possess a feature set that is adequate for both technologies. Quality control programs aimed at Oxford Nanopore Technologies sequencing lack vital features, such as adapter searching, overrepresented sequence analysis, and duplication analysis.

Results: Sequali was developed to provide sequencing quality control for both short- and long-read sequencing technologies. It features adapter search, overrepresented sequence analysis, and duplication analysis and supports FASTQ and uBAM inputs. It is significantly faster than comparable sequencing quality control programs for both short- and long-read sequencing technologies.

Availability and implementation: Sequali is an open-source Python application using C extensions and is freely available under the AGPL-3.0 license at https://github.com/rhpvorderman/sequali. The source code for each release is archived at Zenodo: https://zenodo.org/doi/10.5281/zenodo.10822485.
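
To make two of the quality control features named in the Results concrete, the following is a minimal Python sketch of per-read error-rate estimation and overrepresented sequence counting on FASTQ input. It is an illustration only and not Sequali's implementation or API: the function names `read_fastq`, `mean_error_rate`, and `overrepresented`, the Phred+33 offset, the exact-match counting, and the 0.5 threshold are all assumptions chosen for the example.

```python
import io
from collections import Counter


def read_fastq(handle):
    """Yield (name, sequence, qualities) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or a blank line, which this sketch treats as the end)
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qualities = handle.readline().rstrip("\n")
        yield header[1:], sequence, qualities


def mean_error_rate(qualities, offset=33):
    """Average per-base error probability of one non-empty read.

    Assumes Phred+33 encoding: Q = ord(char) - 33 and P(error) = 10 ** (-Q / 10).
    """
    probabilities = [10 ** (-(ord(char) - offset) / 10) for char in qualities]
    return sum(probabilities) / len(probabilities)


def overrepresented(sequences, threshold=0.5):
    """Return {sequence: fraction} for sequences at or above `threshold` of all reads.

    Exact-match counting with a fixed threshold is a simplification made for
    this example.
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items() if n / total >= threshold}


# Hypothetical three-read input: two identical high-quality reads, one low-quality read.
SAMPLE = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\n!!!!\n"

records = list(read_fastq(io.StringIO(SAMPLE)))
for name, sequence, qualities in records:
    print(name, sequence, mean_error_rate(qualities))
print(overrepresented(seq for _, seq, _ in records))
```

Averaging error probabilities rather than raw Phred scores keeps a few very poor bases from being hidden by many good ones; a production tool would also need to handle malformed records and inputs too large to count exactly in memory.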

Conflict of interest statement

None declared.

Figures

Figure 1.
Flowchart representation of Sequali’s internal workflow.

