Abstract
We present an efficient data structure for finding the longest prefix of a query string q in a dynamic database of strings. When the database strings are prefixes of IP addresses, this is the IP-lookup problem. Our data structure is I/O efficient: it supports a query with a string q using \(O(\log_{B}(n)+\frac{|q|}{B})\) I/O operations, where B is the size of a disk block. It also supports insertion and deletion of a string q with the same number of I/Os. The data structure requires \(O(n/B)\) blocks, and the running time of each operation is \(O(B\log_{B}(n)+|q|)\).
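To make the three operations concrete, here is a minimal in-memory sketch of longest-prefix search over a dynamic set of strings. It is not the I/O-efficient structure of the paper: it uses a plain uncompressed trie, takes O(|q|) time per operation, and ignores disk blocks entirely. The class name `PrefixTrie` and its method names are hypothetical, chosen only for illustration.

```python
from typing import Optional


class PrefixTrie:
    """Uncompressed in-memory trie; a toy stand-in, not the paper's structure."""

    def __init__(self) -> None:
        self.children: dict[str, "PrefixTrie"] = {}
        self.is_key = False  # True if the string ending here is in the database

    def insert(self, s: str) -> None:
        node = self
        for ch in s:
            node = node.children.setdefault(ch, PrefixTrie())
        node.is_key = True

    def delete(self, s: str) -> None:
        # Lazy deletion: only unmark the key, leaving the nodes in place.
        node = self
        for ch in s:
            nxt = node.children.get(ch)
            if nxt is None:
                return  # s was never inserted
            node = nxt
        node.is_key = False

    def longest_prefix(self, q: str) -> Optional[str]:
        """Return the longest stored string that is a prefix of q, or None."""
        node = self
        best = "" if node.is_key else None
        for i, ch in enumerate(q):
            nxt = node.children.get(ch)
            if nxt is None:
                break
            node = nxt
            if node.is_key:
                best = q[: i + 1]
        return best
```

With the bit-string prefixes `"10"`, `"1011"`, and `"0"` stored, a query for `"101101"` returns `"1011"`, and after deleting `"1011"` the same query falls back to `"10"`.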
Notes
Line rate is the data transmission speed of a communication line or network. As of 2011, line rates are about 40 Gbps or more.
A segment contains a node if it contains all points in its subtree.
Each bit in this memory has 3 possible states: 0, 1, and “don’t care” (this is the reason for the word ternary). To fit into a memory cell a prefix is padded with “don’t cares”. The memory is “content associative” since we give it a specific content and we get back the location of this content.
This is only a conceptual mapping; we do not store each interval in the node to which it is mapped.
The root may also represent a prefix of all the keys. In such a case we assume that there is an edge entering the root which corresponds to this prefix.
Note that when all the strings in the trie share a prefix, the string depth of the root is equal to the length of this prefix.
The extra factor of B in the query time arises because the search in each Patricia trie along the path could take O(B) time if the trie is highly unbalanced. The same happens with the string B-tree [10].
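The ternary memory described in the notes above can be imitated in software as a rough illustration. The sketch below pads each prefix with "don't care" symbols (written `*`) to a fixed cell width and returns the location of the first matching entry. Two things are assumptions rather than statements from the article: the cell width `WIDTH = 8`, and the convention of storing entries in decreasing prefix-length order so that the first match is the longest one. All function names are hypothetical, and real hardware compares every entry in parallel rather than scanning.

```python
from typing import Optional

WIDTH = 8  # assumed cell width in bits, for illustration only


def pad(prefix: str, width: int = WIDTH) -> str:
    """Pad a bit-string prefix with '*' (don't care) up to the cell width."""
    return prefix + "*" * (width - len(prefix))


def matches(entry: str, key: str) -> bool:
    """A ternary entry matches a key if every non-'*' bit agrees."""
    return all(e == "*" or e == k for e, k in zip(entry, key))


def tcam_lookup(entries: list[str], key: str) -> Optional[int]:
    """Return the location of the first matching entry, or None."""
    for location, entry in enumerate(entries):
        if matches(entry, key):
            return location
    return None


# Assumed convention: longer prefixes first, so the first hit is the longest.
prefixes = sorted(["10", "1011", "0"], key=len, reverse=True)
table = [pad(p) for p in prefixes]
```

Here `table[0]` is `"1011****"`, so looking up the key `"10110000"` yields location 0, while `"10010000"` only matches the padded prefix `"10"`.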
References
Agarwal, P.K., Arge, L., Yi, K.: An optimal dynamic interval stabbing-max data structure. In: Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 803–812 (2005)
Bayer, R., McCreight, E.M.: Organization and maintenance of large ordered indexes. Acta Inform. 1(3), 173–189 (1972)
Bender, M.A., Demaine, E., Farach-Colton, M.: Cache-oblivious B-trees. SIAM J. Comput. 35(2), 341–358 (2005)
Bender, M.A., Farach-Colton, M., Kuszmaul, B.C.: Cache-oblivious string B-trees. In: Proceedings of the 25th ACM Symposium on Principles of Database Systems (PODS), pp. 233–242 (2006)
Brodal, G.S., Fagerberg, R.: Cache-oblivious string dictionaries. In: Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 581–590 (2006)
de Berg, M., van Kreveld, M., Overmars, M., Schwarzkopf, O.: Computational Geometry: Algorithms and Applications. Springer, Berlin (2000)
Demaine, E.D., Iacono, J., Langerman, S.: Worst-case optimal tree layout in a memory hierarchy. CoRR, cs.DS/0410048 (2004)
Eatherton, W., Dittia, Z., Varghese, G.: Tree bitmap: hardware/software IP lookups with incremental updates. ACM SIGCOMM Comput. Commun. Rev. 34(2), 97–122 (2004)
Feldmann, A., Muthukrishnan, S.: Tradeoffs for packet classification. In: IEEE International Conference on Computer Communications (INFOCOM), pp. 1193–1202 (2000)
Ferragina, P., Grossi, R.: The string B-tree: a new data structure for string search in external memory and its applications. J. ACM 46(2), 236–280 (1999)
Frigo, M., Leiserson, C.E., Prokop, H., Ramachandran, S.: Cache-oblivious algorithms. In: IEEE Symposium on Foundations of Computer Science (FOCS), pp. 285–297 (1999)
Hasan, J., Cadambi, S., Jakkula, V., Chakradhar, S.: Chisel: a storage-efficient, collision-free hash-based network processing architecture. In: International Symposium on Computer Architecture (ISCA), pp. 203–215 (2006)
Kaplan, H., Molad, E., Tarjan, R.E.: Dynamic rectangular intersection with priorities. In: ACM Symposium on Theory of Computing (STOC), pp. 639–648 (2003)
Ko, P., Aluru, S.: Obtaining provably good performance from suffix trees in secondary storage. In: Combinatorial Pattern Matching (CPM). LNCS, vol. 4009, pp. 72–83. Springer, Berlin (2006)
Lampson, B.W., Srinivasan, V., Varghese, G.: IP lookups using multiway and multicolumn search. IEEE/ACM Trans. Netw. 7(3), 324–334 (1999)
Lu, H., Sahni, S.: A B-tree dynamic router-table design. IEEE Trans. Comput. 54(7), 813–824 (2005)
Morrison, D.R.: Patricia: practical algorithm to retrieve information coded in alphanumeric. J. ACM 15(4), 514–534 (1968)
Sahni, S., Kim, K.: O(log n) dynamic packet routing. In: IEEE Symposium on Computers and Communications (ISCC), pp. 443–448 (2002)
Sleator, D., Tarjan, R.E.: A data structure for dynamic trees. J. Comput. Syst. Sci. 26(3), 362–391 (1983)
Sleator, D., Tarjan, R.E.: Self-adjusting binary search trees. J. ACM 32, 652–686 (1985)
Suri, S., Varghese, G., Warkhede, P.: Multiway range trees: scalable IP lookup with fast updates. In: IEEE Global Communications Conference (GLOBECOM), pp. 1610–1614 (2001)
Tarjan, R.E.: A class of algorithms which require nonlinear time to maintain disjoint sets. J. Comput. Syst. Sci. 18, 110–127 (1979)
Varghese, G.: Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices. The Morgan Kaufmann Series in Networking. Morgan Kaufmann, San Francisco (2004)
Vitter, J.S.: External memory algorithms and data structures: dealing with massive data. ACM Comput. Surv. 33(2), 209–271 (2001)
Warkhede, P.R., Suri, S., Varghese, G.: Multiway range trees: scalable IP lookup with fast updates. Comput. Netw. 44(3), 289–303 (2004)
Zane, F., Narlikar, G., Basu, A.: Coolcams: power-efficient TCAMs for forwarding engines. In: IEEE International Conference on Computer Communications (INFOCOM), pp. 42–52 (2003)
Additional information
This work is partially supported by the United States-Israel Binational Science Foundation, project number 2006204.
About this article
Cite this article
Hershcovitch, M., Kaplan, H. I/O Efficient Dynamic Data Structures for Longest Prefix Queries. Algorithmica 65, 371–390 (2013). https://doi.org/10.1007/s00453-011-9594-2
DOI: https://doi.org/10.1007/s00453-011-9594-2