ICALP 2007: Automata, Languages and Programming, pp. 435–446

# Balanced Families of Perfect Hash Functions and Their Applications

• Noga Alon
• Shai Gutner
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4596)

## Abstract

The construction of perfect hash functions is a well-studied topic. In this paper, this concept is generalized with the following definition. We say that a family of functions from [n] to [k] is a δ-balanced (n,k)-family of perfect hash functions if for every S ⊆ [n], |S| = k, the number of functions that are 1-1 on S is between T/δ and δT for some constant T > 0. The standard definition of a family of perfect hash functions requires that there will be at least one function that is 1-1 on S, for each S of size k. In the new notion of balanced families, we require the number of 1-1 functions to be almost the same (taking δ to be close to 1) for every such S. Our main result is that for any constant δ > 1, a δ-balanced (n,k)-family of perfect hash functions of size $$2^{O(k \log \log k)} \log n$$ can be constructed in time $$2^{O(k \log \log k)} n \log n$$. Using the technique of color-coding we can apply our explicit constructions to devise approximation algorithms for various counting problems in graphs. In particular, we exhibit a deterministic polynomial time algorithm for approximating both the number of simple paths of length k and the number of simple cycles of size k for any $$k \leq O(\frac{\log n}{\log \log \log n})$$ in a graph with n vertices. The approximation is up to any fixed desirable relative error.
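The definition of a δ-balanced family can be checked by brute force on small instances. The following is a minimal sketch, not the construction from the paper: it assumes each function is stored as a tuple `f` of length n with values in `range(k)` (0-indexed, unlike the [n] and [k] of the abstract), and the names `injective_counts` and `min_balance_ratio` are hypothetical helpers introduced only for illustration. It relies on the observation that a suitable T exists exactly when δ² is at least the ratio between the largest and smallest count of 1-1 functions over all k-subsets.

```python
from itertools import combinations, product
from math import sqrt


def injective_counts(family, n, k):
    """For every k-subset S of range(n), count the functions in `family`
    that are 1-1 on S. Each function is a tuple of length n (assumption)."""
    counts = {}
    for S in combinations(range(n), k):
        counts[S] = sum(1 for f in family if len({f[x] for x in S}) == k)
    return counts


def min_balance_ratio(family, n, k):
    """Return the smallest delta for which `family` is delta-balanced,
    or None if some k-subset has no 1-1 function (not even perfect).

    Counts must lie in [T/delta, delta*T]; such a T exists iff
    delta**2 >= max_count / min_count, with T = sqrt(max_count * min_count).
    """
    counts = injective_counts(family, n, k).values()
    lo, hi = min(counts), max(counts)
    if lo == 0:
        return None
    return sqrt(hi / lo)


# All 2^4 functions from a 4-element set to 2 colors: every pair is
# separated by the same number of functions, so the family is 1-balanced.
all_functions = list(product(range(2), repeat=4))
ratio_all = min_balance_ratio(all_functions, 4, 2)

# A two-function family that is perfect for n = 3, k = 2 but not
# perfectly balanced: the pair {0, 1} is separated twice, the others once.
small_family = [(0, 1, 0), (0, 1, 1)]
ratio_small = min_balance_ratio(small_family, 3, 2)

# A single constant function separates no pair at all.
ratio_constant = min_balance_ratio([(0, 0, 0)], 3, 2)
```

Here `ratio_all` is 1.0, `ratio_small` is √2, and `ratio_constant` is `None`. The check enumerates all k-subsets, so it is only feasible for very small n and k; it illustrates the definition rather than the explicit constructions the paper describes.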

### Keywords

approximate counting of subgraphs, color-coding, perfect hashing
