Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2024 Jul 3;22(2):qzae025.
doi: 10.1093/gpbjnl/qzae025.

BSAlign: A Library for Nucleotide Sequence Alignment

Affiliations

BSAlign: A Library for Nucleotide Sequence Alignment

Haojing Shao et al. Genomics Proteomics Bioinformatics. .

Abstract

Increasing the accuracy of the nucleotide sequence alignment is an essential issue in genomics research. Although classic dynamic programming (DP) algorithms (e.g., Smith-Waterman and Needleman-Wunsch) guarantee to produce the optimal result, their time complexity hinders the application of large-scale sequence alignment. Many optimization efforts that aim to accelerate the alignment process generally come from three perspectives: redesigning data structures [e.g., diagonal or striped Single Instruction Multiple Data (SIMD) implementations], increasing the number of parallelisms in SIMD operations (e.g., difference recurrence relation), or reducing search space (e.g., banded DP). However, no methods combine all these three aspects to build an ultra-fast algorithm. In this study, we developed a Banded Striped Aligner (BSAlign) library that delivers accurate alignment results at an ultra-fast speed by knitting a series of novel methods together to take advantage of all of the aforementioned three perspectives with highlights such as active F-loop in striped vectorization and striped move in banded DP. We applied our new acceleration design on both regular and edit distance pairwise alignment. BSAlign achieved 2-fold speed-up than other SIMD-based implementations for regular pairwise alignment, and 1.5-fold to 4-fold speed-up in edit distance-based implementations for long reads. BSAlign is implemented in C programing language and is available at https://github.com/ruanjue/bsalign.

Keywords: Banded dynamic programming; Edit distance; F evaluation; Pairwise alignment; Striped vectorization.

PubMed Disclaimer

Conflict of interest statement

Both authors have declared no competing interests.

Figures

Figure 1
Figure 1
The striped move algorithm for each row A. Global visualization for the band along the diagonal. B. Detail example for row iteration in normal order. C. Detail example for row iteration in striped order (striped move). Assuming the band width, the number of divided segments, and the number of cells in a register are 16, 4, and 4, respectively. In normal order, the cells are in the same color for the same register. Only the offset is numbered inside the cell. SIMD, Single Instruction Multiple Data.
Figure 2
Figure 2
The lazy and active F loop algorithms inside a row Assuming the band width, the number of divided segments, and the number of cells in a register are 16, 4, and 4, respectively. A. and B. The workflow for the lazy (A) and active (B) F loops, respectively. C. An example of long horizontal gap (f7,j to f13,j). The dash line indicates the optimal path. D. A diagram showing how the lazy F loop solves the aforementioned example for each cell. Initial loop: all the cells in the first register are negative infinity, and the algorithm calculates all the cells by standard SIMD calculation. Lazy F loop: the algorithm keeps correcting all the registers one by one until none of them is updated, and this panel shows a situation that it takes 3 loops to guarantee that all the cells are corrected. E. A diagram showing how the active F loop solves the aforementioned example for each cell. Unlike the lazy F loop, there is no difference between “updated” and “no update”. All the values are updated one times only. Initial loop: the same as the lazy F loop. Correction: each cell in the first segment is checked by Equation 10 and updated by the correct value. This extended register is also the first register (after the right shift) in the striped format. Final loop: when the first register is totally correct, the remaining segments are corrected by SIMD calculation. Black box (waiting for updates) indicates that the value is influenced by F and needs to be corrected. Gray box (updated) indicates that the value is updated recently. White box (no update) indicates that the value is the same as expected and correct. The arrows above digits indicate the first time being updated as the correct value.
Figure 3
Figure 3
The average computation time for pairwise alignment and edit distance The length of the query sequence pair is from 1000 bp to 100,000 bp. Each pair is run 100 times. A. Six implementations: BSAlign, ksw2 [15], SSW [11], parasail [19], WFA [17], and BA [16] are compared. Option band width is set at 128 bp for BSAlign and ksw2. B. Three implementations: BSAlign, Edlib [21], and Myers [20] are compared. The mode “whole” and “limit” mean that the maximum edit distance is set at the whole query length and the true edit distance in simulation, respectively. BSAlign, Banded Striped Aligner; SSW, Striped Smith–Waterman; WFA, wavefront alignment; BA, block aligner.

References

    1. Needleman SB, Wunsch CD.. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 1970;48:443–53. - PubMed
    1. Smith T, Waterman M.. Identification of common molecular subsequences. J Mol Biol 1981;147:195–7. - PubMed
    1. Wozniak A. Using video-oriented instructions to speed up sequence comparison. Comput Appl Biosci 1997;13:145–50. - PubMed
    1. Rognes T, Seeberg E.. Six-fold speed-up of Smith–Waterman sequence database searches using parallel processing on common microprocessors. Bioinformatics 2000;16:699–706. - PubMed
    1. Zhang J, Lan H, Chan Y, Shang Y, Schmidt B, Liu W.. BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures. Bioinformatics 2019;35:2306–8. - PubMed

LinkOut - more resources