The eXtended Burrows-Wheeler Transform (XBWT) is a data transformation introduced in [Ferragina et al., FOCS 2005] to compactly represent a labeled tree and simultaneously support navigation and path-search operations over its label structure.
A natural application of the XBWT is to store a dictionary of strings. A recent extensive experimental study [Martínez-Prieto et al., Information Systems, 2016] shows that, among the available string dictionary implementations, the XBWT is attractive because of its good tradeoff between small space usage, speed, and support for substring searches. In this paper we further investigate the use of the XBWT for storing a string dictionary. Our first contribution is to show how to add suffix links (aka failure links) to an XBWT string dictionary. For an XBWT dictionary with n internal nodes, our suffix links can be traversed in constant time and take only \(2n + o(n)\) bits of space.
Our second contribution is a set of practical construction algorithms for the XBWT, including the additional data structure supporting the traversal of suffix links. Our algorithms build on the many well-engineered algorithms for Suffix Array and BWT construction and offer different tradeoffs between running time and working space.
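To make the notion of suffix links (failure links) concrete, the following minimal sketch computes them on a plain pointer-based trie, using the classical breadth-first construction familiar from Aho-Corasick automata. It is an illustration only: it is not the succinct \(2n + o(n)\)-bit structure described above, and the function names `build_trie`, `suffix_links`, and `find` are hypothetical helpers introduced here. The sketch assumes that the link of a node points to the node spelling the longest proper suffix of its string that is also present in the trie; the paper's exact definition may differ in details.

```python
from collections import deque


def build_trie(words):
    """Build a trie as a list of child dictionaries; node 0 is the root."""
    children = [{}]
    for word in words:
        node = 0
        for ch in word:
            if ch not in children[node]:
                children.append({})
                children[node][ch] = len(children) - 1
            node = children[node][ch]
    return children


def suffix_links(children):
    """Return link[], where link[v] is the node spelling the longest
    proper suffix of v's string that also occurs as a path from the root
    (0, the root, if there is no such non-empty suffix)."""
    link = [0] * len(children)
    # Nodes at depth 1 link to the root; start the BFS from them.
    queue = deque(children[0].values())
    while queue:
        u = queue.popleft()
        for ch, v in children[u].items():
            # Follow the links of the parent until a node with a
            # ch-labeled child is found, or the root is reached.
            f = link[u]
            while f and ch not in children[f]:
                f = link[f]
            link[v] = children[f].get(ch, 0)
            queue.append(v)
    return link


def find(children, s):
    """Return the node reached by spelling s from the root, or None."""
    node = 0
    for ch in s:
        node = children[node].get(ch)
        if node is None:
            return None
    return node


trie = build_trie(["abc", "bc", "c"])
link = suffix_links(trie)
```

In this small dictionary, the link of the node for "abc" leads to the node for "bc", whose link in turn leads to the node for "c". The pointer-based version stores one machine word per node, which is the cost a succinct representation is meant to avoid.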
Keywords: Internal Node, Construction Algorithm, Empty String, Suffix Array, Common Prefix
- 6. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Structuring labeled trees for optimal succinctness, and beyond. In: Proceedings of the 46th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 184–193 (2005)
- 7. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and searching XML data via two zips. In: Proceedings of the 15th International World Wide Web Conference (WWW), pp. 751–760 (2006)
- 8. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and indexing labeled trees, with applications. J. ACM 57 (2009)
- 10. Holt, J., McMillan, L.: Constructing Burrows-Wheeler transforms of large string collections via merging. In: BCB, pp. 464–471. ACM (2014)
- 12. Kärkkäinen, J., Kempa, D.: Engineering a lightweight external memory suffix array construction algorithm. In: Proceedings of CEUR Workshop, ICABD, vol. 1146, pp. 53–60 (2014). http://CEUR-WS.org
- 15. Knuth, D.E.: Sorting and Searching. The Art of Computer Programming, 2nd edn. Addison-Wesley, Reading (1998)
- 19. Navarro, G., Sadakane, K.: Fully-functional static and dynamic succinct trees. ACM Trans. Algorithms 10 (2014). Article 16