Abstract
In this paper we extend to two-dimensional data two recently introduced one-dimensional compressibility measures: the \(\gamma \) measure, defined in terms of the smallest string attractor, and the \(\delta \) measure, defined in terms of the number of distinct substrings of the input string. Concretely, we introduce the two-dimensional measures \(\gamma _{2D}\) and \(\delta _{2D}\) as natural generalizations of \(\gamma \) and \(\delta \) and study some of their properties. Among other things, we prove that \(\delta _{2D}\) is monotone and can be computed in linear time, and we show that, although it is still true that \(\delta _{2D}\le \gamma _{2D}\), the gap between the two measures can be \(\varOmega (\sqrt{n})\) for families of \(n\times n\) matrices and is therefore asymptotically larger than the gap in one dimension. Finally, we use the measures \(\gamma _{2D}\) and \(\delta _{2D}\) to provide the first analysis of the space usage of the two-dimensional block tree introduced in [Brisaboa et al., Two-dimensional block trees, The Computer Journal, 2023].
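To make the substring-counting idea concrete, the following is a minimal brute-force sketch, not the linear-time method mentioned above. The function `delta_1d` follows the standard one-dimensional definition, \(\delta (S)=\max _k d_k(S)/k\), where \(d_k(S)\) is the number of distinct length-\(k\) substrings of \(S\). The function `delta_2d_square` is only an assumed analogue: the abstract does not state the formal definition of \(\delta _{2D}\), so the sketch counts distinct \(k\times k\) square submatrices and normalizes by \(k^2\). Both function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def delta_1d(s):
    """Brute-force delta: max over k of (distinct length-k substrings) / k."""
    best = Fraction(0)
    for k in range(1, len(s) + 1):
        # Collect every length-k substring; the set keeps only distinct ones.
        distinct = {s[i:i + k] for i in range(len(s) - k + 1)}
        best = max(best, Fraction(len(distinct), k))
    return best


def delta_2d_square(matrix):
    """ASSUMED 2D analogue: max over k of (distinct k-by-k submatrices) / k^2.

    `matrix` is a non-empty list of equal-length rows (strings or lists).
    The exact definition used in the paper is not given in the abstract.
    """
    rows, cols = len(matrix), len(matrix[0])
    best = Fraction(0)
    for k in range(1, min(rows, cols) + 1):
        # Each submatrix is stored as a tuple of row tuples so it is hashable.
        distinct = {
            tuple(tuple(matrix[r + i][c:c + k]) for i in range(k))
            for r in range(rows - k + 1)
            for c in range(cols - k + 1)
        }
        best = max(best, Fraction(len(distinct), k * k))
    return best


if __name__ == "__main__":
    print(delta_1d("abab"))               # 2 distinct symbols dominate
    print(delta_1d("0001011100"))         # all 8 binary triples occur: 8/3
    print(delta_2d_square(["ab", "ba"]))  # 2 distinct 1x1 submatrices
```

Exact fractions are used so that the maximum is not affected by floating-point rounding. The sketch takes time polynomial in the input size and is meant only to illustrate what the measure counts on small inputs.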
References
Belazzougui, D., et al.: Block trees. J. Comput. Syst. Sci. 117, 1–22 (2021)
Brisaboa, N., Gagie, T., Gómez-Brandón, A., Navarro, G.: Two-dimensional block trees. Comput. J. (2023, to appear)
Christiansen, A.R., Ettienne, M.B., Kociumaka, T., Navarro, G., Prezza, N.: Optimal-time dictionary-compressed indexes. ACM Trans. Algorithms 17(1), 8:1–8:39 (2021)
Giancarlo, R.: A generalization of the suffix tree to square matrices, with applications. SIAM J. Comput. 24(3), 520–562 (1995)
Giancarlo, R., Grossi, R.: On the construction of classes of suffix trees for square matrices: algorithms and applications. Inf. Comput. 130(2), 151–182 (1996)
Kempa, D., Prezza, N.: At the roots of dictionary compression: string attractors. In: Diakonikolas, I., Kempe, D., Henzinger, M. (eds.) Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, 25–29 June 2018, pp. 827–840. ACM (2018)
Kim, D.K., Na, J.C., Sim, J.S., Park, K.: Linear-time construction of two-dimensional suffix trees. Algorithmica 59(2), 269–297 (2011)
Kociumaka, T., Navarro, G., Prezza, N.: Toward a definitive compressibility measure for repetitive sequences. IEEE Trans. Inf. Theory 69(4), 2074–2092 (2023)
Mantaci, S., Restivo, A., Romana, G., Rosone, G., Sciortino, M.: A combinatorial view on string attractors. Theor. Comput. Sci. 850, 236–248 (2021)
Navarro, G.: Indexing highly repetitive string collections, part I: repetitiveness measures. ACM Comput. Surv. 54(2), article 29 (2021)
Navarro, G.: Indexing highly repetitive string collections, part II: compressed indexes. ACM Comput. Surv. 54(2), article 26 (2021)
Navarro, G., Prezza, N.: Universal compressed text indexing. Theor. Comput. Sci. 762, 41–50 (2019)
Raskhodnikova, S., Ron, D., Rubinfeld, R., Smith, A.D.: Sublinear algorithms for approximating string compressibility. Algorithmica 65(3), 685–709 (2013)
Funding
This research was partially supported by MIUR-PRIN project “Multicriteria Data Structures and Algorithms: from compressed to learned indexes, and beyond” grant n. 2017WR7SHH, and by the PNRR ECS00000017 Tuscany Health Ecosystem, Spoke 6 “Precision medicine & personalized healthcare”, CUP I53C22000780001, funded by the European Commission under the NextGeneration EU programme.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Carfagna, L., Manzini, G. (2023). Compressibility Measures for Two-Dimensional Data. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_9
DOI: https://doi.org/10.1007/978-3-031-43980-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43979-7
Online ISBN: 978-3-031-43980-3
eBook Packages: Computer Science, Computer Science (R0)