Finding Maximal Exact Matches Using the r-Index
- PMID: 35041518
- PMCID: PMC8902461
- DOI: 10.1089/cmb.2021.0445
Finding Maximal Exact Matches Using the r-Index
Abstract
Efficiently finding maximal exact matches (MEMs) between a sequence read and a database of genomes is a key first step in read alignment. But until recently, it was unknown how to build a data structure in [Formula: see text] space that supports efficient MEM finding, where r is the number of runs in the Burrows-Wheeler Transform. In 2021, Rossi et al. showed how to build a small auxiliary data structure called thresholds in addition to the r-index in [Formula: see text] space. This addition enables efficient MEM finding using the r-index. In this article, we present the tool that implements this solution, which we call MONI. Namely, we give a high-level view of the main components of the data structure and show how the source code can be downloaded, compiled, and used to find MEMs between a set of sequence reads and a set of genomes.
Keywords: MEM finding; r-index; run-length-encoded Burrows–Wheeler transform; thresholds.
Conflict of interest statement
The authors declare they have no competing financial interests.
References
-
- Bannai, H., Gagie, T., and Tomohiro, I.. 2020. Refining the r-index. Theor. Comput. Sci. 812, 96–108.
-
- Ferragina, P., and Manzini, G.. 2005. Indexing compressed text. J. ACM 52, 552–581.
-
- Gagie, T., Navarro, G., and Prezza N.. 2020a. Fully functional suffix trees and optimal text searching in BWT-Runs Bounded Space. J. ACM 67, 2:1–2:54.
Publication types
MeSH terms
Grants and funding
LinkOut - more resources
Full Text Sources
Miscellaneous
