GPU-accelerated homology search with MMseqs2

Felix Kallenborn^#¹, Alejandro Chacon^#², Christian Hundt², Hassan Sirelkhatim², Kieran Didi^{2,3}, Sooyoung Cha^{4,5}, Christian Dallago^{6,7,8}, Milot Mirdita⁹, Bertil Schmidt¹⁰, Martin Steinegger^{11,12,13,14}

Affiliations

¹ Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany.
² NVIDIA, Santa Clara, CA, USA.
³ Department of Computer Science, University of Oxford, Oxford, UK.
⁴ School of Biological Sciences, Seoul National University, Seoul, Republic of Korea.
⁵ Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea.
⁶ NVIDIA, Santa Clara, CA, USA. cdallago@nvidia.com.
⁷ Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA. cdallago@nvidia.com.
⁸ Department of Cell Biology, Duke University, Durham, NC, USA. cdallago@nvidia.com.
⁹ School of Biological Sciences, Seoul National University, Seoul, Republic of Korea. mmirdit@snu.ac.kr.
¹⁰ Department of Computer Science, Johannes Gutenberg University Mainz, Mainz, Germany. bertil.schmidt@uni-mainz.de.
¹¹ School of Biological Sciences, Seoul National University, Seoul, Republic of Korea. martin.steinegger@snu.ac.kr.
¹² Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea. martin.steinegger@snu.ac.kr.
¹³ Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Republic of Korea. martin.steinegger@snu.ac.kr.
¹⁴ Artificial Intelligence Institute, Seoul National University, Seoul, Republic of Korea. martin.steinegger@snu.ac.kr.

^# Contributed equally.

PMID: 40968302
PMCID: PMC12510879
DOI: 10.1038/s41592-025-02819-8

Nat Methods. 2025 Oct;22(10):2024-2027. doi: 10.1038/s41592-025-02819-8. Epub 2025 Sep 18.

Abstract

Rapidly growing protein databases demand faster sensitive search tools. Here, the graphics processing unit (GPU)-accelerated MMseqs2 delivers 6× faster single-protein searches than CPU methods on 2 × 64 cores, speeds previously requiring large protein batches. For larger query batches, it is the most cost-effective solution, outperforming the fastest alternative method by 2.4-fold with eight GPUs. It accelerates protein structure prediction with ColabFold 31.8× over the standard AlphaFold2 pipeline and protein structure search with Foldseek by 4-27×. MMseqs2-GPU is available under an open-source license at https://mmseqs.com/.

Conflict of interest statement

Competing interests: C.D., A.C., C.H., H.S. and K.D. are employed by NVIDIA. M.S. declares an outside interest in Stylus Medicine. The other authors declare no competing interests.

Figures

**Fig. 1. MMseqs2-GPU workflow and gapless alignment performance.**
a, Gapless alignment scans reference sequences against a query, ranking and filtering them by alignment score. b, Sequences above a threshold proceed to gapped Smith–Waterman–Gotoh alignment. c, GPU-optimized gapless alignment splits the query profile into segments (up to 2,048 residues), loading them into fast shared memory for efficient access by GPU threads; warp shuffles allow efficient cross-thread data sharing for diagonal computations. d, GPU speedups (1, 2, 4 and 8 L40S GPUs) relative to a 2 × 64-core CPU for random sequence pairs (lengths 32–2,048). e, GPU speedups (1 and 8 GPUs) versus a 2 × 64-core CPU for 6,370 queries searched against reference databases of 1×, 4× and 16× the size of a 30-million-protein database. The 16× set exceeds GPU memory and requires database streaming, which runs at 7.575 TCUPS versus 11.676 TCUPS in memory (≈64.9% of in-memory performance).
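
The prefilter in panels a and b can be illustrated with a minimal CPU sketch in Python. It is not the MMseqs2-GPU kernel: it uses a toy match/mismatch score instead of a query profile and ignores shared-memory tiling and warp shuffles. The names `gapless_score` and `prefilter`, the scoring values and the threshold are assumptions made for illustration only.

```python
def gapless_score(query, target, match=2, mismatch=-1):
    """Best ungapped local alignment score over all diagonals.

    Toy scoring (assumed): +match for identical residues, mismatch otherwise.
    """
    n, m = len(query), len(target)
    best = 0
    # Each diagonal is identified by its offset d = j - i.
    for d in range(-(n - 1), m):
        run = 0
        for i in range(max(0, -d), min(n, m - d)):
            j = i + d
            s = match if query[i] == target[j] else mismatch
            # Restart the segment whenever the running score drops below zero.
            run = max(0, run + s)
            best = max(best, run)
    return best


def prefilter(query, targets, threshold):
    """Score every target, keep those at or above threshold, rank by score."""
    scored = [(name, gapless_score(query, seq)) for name, seq in targets.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    # Highest score first; ties broken by name for a stable order.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


targets = {"t1": "WWACDEWW", "t2": "CCCC", "t3": "ACWDE"}
hits = prefilter("ACDE", targets, threshold=4)
```

Only the sequences returned by `prefilter` would then be passed to a gapped alignment stage, which is omitted here.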

**Fig. 2. MMseqs2-GPU runtimes for homology search.**
a, In single-batch processing of 6,370 queries against a 30-million-sequence database, MMseqs2-GPU on one L40S GPU (dark green; baseline in bold, horizontal) is ~16× faster than BLAST (dark blue) and ~178× faster than JackHMMER (purple; measured on 10% of queries). MMseqs2-GPU achieves further speedups (up to ~5×) by splitting databases across multiple GPUs (bright versus dark green). MMseqs2-GPU on a single L40S provided the lowest AWS cost for all batch sizes; MMseqs2-CPU k-mer was faster at a batch size of 6,370, but 1.6× more costly (bottom). b,c, MMseqs2-GPU accelerates structure prediction without compromising accuracy (0.70 ± 0.05 TM-score). On 20 CASP14 targets, ColabFold MMseqs2-GPU (green) was 1.65× faster than ColabFold-CPU k-mer (orange) and 31.8× faster than AlphaFold2 (JackHMMER+HHblits, violet). MMseqs2 searched 238 million cluster representatives and expanded to 1 billion members; JackHMMER searched 426 million sequences, and HHblits searched 81 million profiles containing 2.1 billion members. d, Foldseek-GPU on one L40S (dark green, baseline in bold, horizontal) is 4× faster than Foldseek-CPU k-mer (orange) at large batch sizes (6,370 queries). Eight L40S GPUs accelerate searches by 7× compared to one GPU, and 27× compared to Foldseek-CPU.
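
The multi-GPU figures in panel d can be checked with a short calculation. The sketch below uses only the rounded factors quoted in the caption; `parallel_efficiency` is a hypothetical helper name, and the usual definition of efficiency as speedup divided by device count is assumed.

```python
def parallel_efficiency(speedup, n_devices):
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup / n_devices


# Foldseek-GPU: eight L40S GPUs are reported as 7x faster than one GPU.
eff = parallel_efficiency(7.0, 8)

# One GPU is reported as 4x faster than Foldseek-CPU k-mer, so the rounded
# factors multiply to 28x, close to the reported 27x for eight GPUs.
combined = 4.0 * 7.0

print(f"efficiency on 8 GPUs: {eff:.1%}")
print(f"product of rounded speedups: {combined:.0f}x")
```

The small gap between 28× and the reported 27× is what one would expect from multiplying rounded values.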

**Extended Data Fig. 1. Combined gapless and gapped alignment TCUPS.**
TCUPS of 1-GPU and 8-GPU executions of the combined MMseqs2-GPU gapless and gapped alignment workflow for 6,370 queries against target sets of 1, 2, 4, 8 and 16 times a 30-million-protein database (Methods, 'Sensitivity'). The 8× and 16× sets exceed GPU RAM and are processed with database streaming. The 16× set is processed at 7.3 TCUPS versus 11.6 TCUPS in memory (≈63% of in-memory processing speed).
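
The streaming percentages quoted in Fig. 1e and Extended Data Fig. 1 follow directly from the two TCUPS values in each caption. The sketch below reproduces them and shows how a TCUPS value is obtained in general, taking TCUPS as tera (10¹²) dynamic-programming cell updates per second. The function names and the example workload are assumptions for illustration, not measurements from the paper.

```python
def tcups(query_len, target_residues, seconds):
    """Tera cell updates per second for one query against a target set.

    The number of cells is query length times total target residues.
    """
    return query_len * target_residues / seconds / 1e12


def streaming_fraction(streamed_tcups, in_memory_tcups):
    """Throughput with database streaming relative to in-memory throughput."""
    return streamed_tcups / in_memory_tcups


# Values taken from the captions.
gapless = streaming_fraction(7.575, 11.676)   # Fig. 1e
combined = streaming_fraction(7.3, 11.6)      # Extended Data Fig. 1

# Hypothetical workload: a 1,000-residue query against 10^10 residues in 10 s.
example = tcups(1000, 10**10, 10.0)

print(f"gapless streaming:  {gapless:.1%}")
print(f"combined streaming: {combined:.0%}")
```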
