Salta al contenuto principale
Passa alla visualizzazione normale.

MARINELLA SCIORTINO

r-indexing the eBWT

  • Autori: Boucher, Christina; Cenzato, Davide; Lipták, Zsuzsanna; Rossi, Massimiliano; Sciortino, Marinella
  • Anno di pubblicazione: 2024
  • Tipologia: Articolo in rivista
  • OA Link: http://hdl.handle.net/10447/635773

Abstract

The extended Burrows-Wheeler Transform (eBWT) [Mantaci et al. TCS 2007] is a variant of the BWT, introduced for collections of strings. In this paper, we present the extended r-index, an analogous data structure to the r-index [Gagie et al. JACM 2020]. It occupies O(r) words, with r the number of runs of the eBWT, and offers the same functionalities as the r-index. We also show how to efficiently support finding maximal exact matches (MEMs). We implemented the extended r-index and tested it on circular bacterial genomes and plasmids, comparing it to five state-of-the-art compressed text indexes. While our data structure maintains similar time and memory requirements for answering pattern matching queries as the original r-index, it is the only index in the literature that can naturally be used for both circular and linear input collections. This is an extended version of [Boucher et al., r-indexing the eBWT, SPIRE 2021].