Abstract
Multidimensional data are widely used in real-life applications. Intel’s new brand of SSDs, called 3D XPoint, is one example of three-dimensional data. Motivated by a structural analysis of multidimensional data, we introduce the multidimensional period recovery problem, defined as follows. The input is a d-dimensional text array, with dimensions \(n_1 \times n_2 \times \dots \times n_d\), that contains corruptions, while the original text without the corruptions is periodic. The goal is to report the period of the original text. We show that, if the number of corruptions is at most \(\left\lfloor \frac{1}{2 + \epsilon }\left\lfloor \frac{n_1}{p_1}\right\rfloor \cdots \left\lfloor \frac{n_d}{p_d}\right\rfloor \right\rfloor \), where \(\epsilon > 0\) and \(p_1 \times \cdots \times p_d\) are the period’s dimensions, then the number of possible period candidates is \(O(\log N)\), where \(N = \prod _{i=1}^{d}n_i\). The independence of this bound from the number of dimensions is a surprising key contribution of this paper. We present an \(O(\prod _{i=1}^{d} n_i \prod _{i=1}^{d} \log n_i)\)-time algorithm, for any constant dimension d (linear time up to logarithmic factors), that reports these candidates. The bound on the number of errors that guarantees a small candidate set is tight: we show that if the number of errors equals \(\left\lfloor \frac{1}{2}\left\lfloor \frac{n_1}{p_1}\right\rfloor \cdots \left\lfloor \frac{n_d}{p_d}\right\rfloor \right\rfloor \), a family of texts with \(\varTheta (N)\) period candidates can be constructed for any dimension \(d \ge 2\).
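To make the problem statement concrete, the following is a minimal brute-force sketch in Python; it is not the near-linear algorithm of the paper. It rests on several assumptions that the abstract does not spell out: errors are measured as Hamming distance, a text is periodic with dimensions \(p_1 \times \cdots \times p_d\) when cells whose indices agree modulo \((p_1, \dots, p_d)\) hold the same symbol, and primitivity of the period is ignored. The function names `error_bound`, `mismatches_for_period`, and `naive_candidates` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product
from math import floor, prod


def error_bound(text_dims, period_dims, eps):
    """Corruption bound from the abstract:
    floor( (1 / (2 + eps)) * prod_i floor(n_i / p_i) )."""
    repetitions = prod(n // p for n, p in zip(text_dims, period_dims))
    return floor(repetitions / (2 + eps))


def mismatches_for_period(text, text_dims, period_dims):
    """Hamming distance from `text` to the nearest array with this period size.

    `text` maps index tuples to symbols. Assumption: in a periodic array,
    cells whose indices agree modulo `period_dims` are equal, so the
    cheapest repair keeps the majority symbol of each residue class.
    """
    classes = {}
    for idx in product(*(range(n) for n in text_dims)):
        key = tuple(i % p for i, p in zip(idx, period_dims))
        classes.setdefault(key, Counter())[text[idx]] += 1
    return sum(sum(c.values()) - max(c.values()) for c in classes.values())


def naive_candidates(text, text_dims, eps):
    """Try every period size; illustrative only, far slower than the
    bound stated in the abstract, and with no primitivity check."""
    found = []
    for period_dims in product(*(range(1, n + 1) for n in text_dims)):
        needed = mismatches_for_period(text, text_dims, period_dims)
        if needed <= error_bound(text_dims, period_dims, eps):
            found.append(period_dims)
    return found


# Usage: a 6 x 6 text tiled by a 2 x 2 period, with one corrupted cell.
dims = (6, 6)
tile = [["a", "b"], ["c", "d"]]
text = {(i, j): tile[i % 2][j % 2] for i in range(6) for j in range(6)}
text[(3, 4)] = "x"  # a single corruption
print(error_bound(dims, (2, 2), 0.5))
print(naive_candidates(text, dims, 0.5))
```

Under these assumptions, a period size is kept as a candidate whenever the text can be repaired to that period within the corruption bound for that size. The sketch enumerates all \(N\) possible period sizes and rescans the text for each, which only serves to illustrate the definition.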
Notes
This notion should not be confused with other notions of primitivity in stringology, such as in covers. The difference in the definition of primitivity for covers stems from the fact that the string must end with a complete occurrence of a cover, which is not the case for a period.
Additional information
A partial version of this paper appeared in the proceedings of SPIRE 2020.
Amihood Amir: Partly supported by ISF Grant 1475/18 and BSF Grant 2018141. Dina Sokol: Partly supported by BSF Grant 2018141.
About this article
Cite this article
Amir, A., Butman, A., Kondratovsky, E. et al. Multidimensional Period Recovery. Algorithmica 84, 1490–1510 (2022). https://doi.org/10.1007/s00453-022-00926-y
DOI: https://doi.org/10.1007/s00453-022-00926-y