Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2022 Feb;29(2):188-194.
doi: 10.1089/cmb.2021.0445. Epub 2022 Jan 17.

Finding Maximal Exact Matches Using the r-Index

Affiliations

Finding Maximal Exact Matches Using the r-Index

Massimiliano Rossi et al. J Comput Biol. 2022 Feb.

Abstract

Efficiently finding maximal exact matches (MEMs) between a sequence read and a database of genomes is a key first step in read alignment. But until recently, it was unknown how to build a data structure in [Formula: see text] space that supports efficient MEM finding, where r is the number of runs in the Burrows-Wheeler Transform. In 2021, Rossi et al. showed how to build a small auxiliary data structure called thresholds in addition to the r-index in [Formula: see text] space. This addition enables efficient MEM finding using the r-index. In this article, we present the tool that implements this solution, which we call MONI. Namely, we give a high-level view of the main components of the data structure and show how the source code can be downloaded, compiled, and used to find MEMs between a set of sequence reads and a set of genomes.

Keywords: MEM finding; r-index; run-length-encoded Burrows–Wheeler transform; thresholds.

PubMed Disclaimer

Conflict of interest statement

The authors declare they have no competing financial interests.

References

    1. Au, K.F., Underwood, J.G., Lee, L., et al. . 2012. Improving PacBio long read accuracy by short read alignment. PLoS ONE 7, e46679. - PMC - PubMed
    1. Bannai, H., Gagie, T., and Tomohiro, I.. 2020. Refining the r-index. Theor. Comput. Sci. 812, 96–108.
    1. Boucher, C., Gagie, T., Kunhle, A., et al. . 2019. Prefix-free parsing for building big BWTs. Algorithms Mol. Biol. 14, 13:1–13:15. - PMC - PubMed
    1. Ferragina, P., and Manzini, G.. 2005. Indexing compressed text. J. ACM 52, 552–581.
    1. Gagie, T., Navarro, G., and Prezza N.. 2020a. Fully functional suffix trees and optimal text searching in BWT-Runs Bounded Space. J. ACM 67, 2:1–2:54.

Publication types

MeSH terms

LinkOut - more resources