New string attractor-based complexities for infinite words
- Authors: Cassaigne, Julien; Gheeraert, France; Restivo, Antonio; Romana, Giuseppe; Sciortino, Marinella; Stipulanti, Manon
- Publication year: 2024
- Type: Articolo in rivista
- OA Link: http://hdl.handle.net/10447/647293
Abstract
A string attractor is a set of positions in a word such that each distinct factor has an occurrence crossing a position from the set. This definition comes from the data compression field, where the size $\gamma^⁎$ of a smallest string attractor represents a lower bound for the output size of a large family of string compressors exploiting repetitions in words, including BWT-based and LZ-based compressors. For finite words, the combinatorial properties of string attractors have been studied in 2021 by Mantaci et al.. Later, Schaeffer and Shallit introduced the string attractor profile function, a complexity function that evaluates for each $n>0$, the size $\gamma^*$ of the length-n prefix of a one-sided infinite word. A natural development of the research on the topic is to link string attractors with other classical notions of repetitiveness in combinatorics on words. Our contribution in this sense is threefold. First, we explore the relation between the string attractor profile function and other well-known combinatorial complexity functions in the context of infinite words, such as the factor complexity and the property of recurrence. Moreover, we study its asymptotic growth in the case of purely morphic words and obtain a complete description in the binary case. Second, we introduce two new string attractor-based complexity functions, in which the structure and the distribution of positions in a string attractor are taken into account, and we study their combinatorial properties. We also show that these measures provide a finer classification of some infinite families of words, namely the Sturmian and quasi-Sturmian words. Third, we explicitly give the three complexities for some specific morphic words called k-bonacci words. A preliminary version of some results presented in this paper can be found in [Restivo, Romana, Sciortino, String Attractors and Infinite Words, LATIN 2022].