Hyper-distance oracles in hypergraphs

  • Regular Paper
  • The VLDB Journal

Abstract

We study point-to-point distance estimation in hypergraphs, where each query is parameterized by a positive integer s that defines the minimum overlap required for two hyperedges to be considered adjacent. To answer s-distance queries, we first explore an oracle based on the line graph of the given hypergraph and discuss its limitations: the line graph is typically orders of magnitude larger than the original hypergraph. We then introduce HypED, a landmark-based oracle with a predefined size, built directly on the hypergraph, thus avoiding the materialization of the line graph. Our framework approximately answers vertex-to-vertex, vertex-to-hyperedge, and hyperedge-to-hyperedge s-distance queries for any value of s. A key observation underlying our framework is that, as s increases, the hypergraph becomes more fragmented. We show how this can be exploited to improve the placement of landmarks, by identifying the s-connected components of the hypergraph. For this latter task, we devise an efficient algorithm based on the union-find technique and a dynamic inverted index. We experimentally evaluate HypED on several real-world hypergraphs and demonstrate its versatility in answering s-distance queries for different values of s. Our framework answers such queries in fractions of a millisecond, while allowing fine-grained control of the trade-off between index size and approximation error at creation time. Finally, we demonstrate the usefulness of the s-distance oracle in two applications, namely hypergraph-based recommendation and the approximation of the s-closeness centrality of vertices and hyperedges in the context of protein-protein interactions.
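The fragmentation observation can be illustrated with a small Python sketch. It groups hyperedges into s-connected components using union-find and a static vertex-to-hyperedge inverted index. This is a simplified illustration, not the algorithm of the paper, which relies on a dynamic inverted index. The function name `s_connected_components`, the dictionary representation, and the toy hypergraph are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def s_connected_components(hyperedges, s):
    """Group hyperedge ids into s-connected components.

    hyperedges: dict mapping a hyperedge id to its set of vertices
    (hypothetical representation chosen for this sketch).
    Two hyperedges are s-adjacent if they share at least s vertices.
    """
    parent = {e: e for e in hyperedges}

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Static inverted index: vertex -> hyperedges containing it.
    index = defaultdict(set)
    for e, vertices in hyperedges.items():
        for v in vertices:
            index[v].add(e)

    # Count shared vertices for every pair of co-occurring hyperedges.
    # This can be quadratic in the number of hyperedges per vertex.
    overlap = defaultdict(int)
    for edges in index.values():
        for a, b in combinations(sorted(edges), 2):
            overlap[(a, b)] += 1

    # Merge the pairs whose overlap reaches the threshold s.
    for (a, b), shared in overlap.items():
        if shared >= s:
            union(a, b)

    components = defaultdict(set)
    for e in hyperedges:
        components[find(e)].add(e)
    return list(components.values())


# Toy hypergraph: e1 and e2 share two vertices, e2 and e3 share one.
H = {"e1": {1, 2, 3}, "e2": {2, 3, 4}, "e3": {4, 5}}
for s in (1, 2, 3):
    print(s, sorted(sorted(c) for c in s_connected_components(H, s)))
```

On this toy input the number of components grows from one (s = 1) to two (s = 2) to three (s = 3), which is the kind of fragmentation a landmark placement strategy can take into account.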

Notes

  1. The s-distance is often defined as the length of the shortest s-path. We subtract 1 to align with graph theory conventions where the distance between edges is the number of vertices in a shortest path between them, making adjacent edges connected by a path of length 2 but at distance 1.

  2. An alternative definition of vertex-to-vertex s-distance that is a metric could be considered. Let the dual hypergraph be the one obtained by swapping the roles of vertices and hyperedges: hyperedges become vertices, and each vertex of the original hypergraph becomes a hyperedge that connects all the vertices of the dual corresponding to the original hyperedges that contain it. We can compute the hyperedge-to-hyperedge s-distance in this dual hypergraph and obtain a metric vertex-to-vertex s-distance in the original hypergraph.

    However, note that the vertex-to-vertex s-distance as in Definition 3 and the hyperedge-to-hyperedge s-distance in the dual hypergraph yield distinct results. For instance, s-connected vertices in the hypergraph may be at infinite distance in the dual. In fact, an s-path in the hypergraph is a sequence of hyperedges such that consecutive hyperedges share at least s common vertices, whereas an s-path in the dual is a sequence of vertices such that consecutive vertices belong to at least s common hyperedges.

    Both definitions are valid, and either could be adopted depending on the application at hand. Our framework can also handle this alternative definition of vertex-to-vertex s-distance, simply by applying it to the dual hypergraph. Of course, this leads to a separate oracle that cannot be used to answer hyperedge-to-hyperedge s-distance queries in the original hypergraph. Therefore, henceforth we focus on the vertex-to-vertex s-distance as per Definition 3, which allows answering all three types of queries with a single oracle.

  3. http://www.sociopatterns.org/datasets.

  4. http://konect.cc/networks.

  5. https://snap.stanford.edu/biodata.

  6. In contrast to hyperedges, for \(s > 1\), a vertex v may belong to different s-connected components, as it may be in hyperedges that overlap only in v.
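The distinction drawn in Note 2 can be checked on a small example. The Python sketch below builds the dual hypergraph and computes exact hyperedge-to-hyperedge s-distances with a brute-force breadth-first search, counting hops so that s-adjacent hyperedges are at distance 1, in line with Note 1. It is not the HypED oracle and does not implement Definition 3. The names `dual` and `s_distance`, the dictionary representation, and the choice to return 0 when source and target coincide are assumptions made for this illustration.

```python
from collections import deque
import math


def dual(hyperedges):
    """Swap roles: each original vertex becomes a hyperedge made of the
    ids of the original hyperedges that contain it."""
    result = {}
    for e, vertices in hyperedges.items():
        for v in vertices:
            result.setdefault(v, set()).add(e)
    return result


def s_distance(hyperedges, src, dst, s):
    """Exact hyperedge-to-hyperedge s-distance by brute-force BFS.

    Each hop joins two hyperedges sharing at least s vertices, so the
    result is the number of hyperedges on a shortest s-path minus 1.
    Returns math.inf if no s-path exists (0 if src == dst, by assumption).
    """
    if src == dst:
        return 0
    dist = {src: 0}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        for other, vertices in hyperedges.items():
            if other not in dist and len(hyperedges[cur] & vertices) >= s:
                dist[other] = dist[cur] + 1
                if other == dst:
                    return dist[other]
                queue.append(other)
    return math.inf


# Toy hypergraph: e1 and e2 share the two vertices b and c.
H = {"e1": {"a", "b", "c"}, "e2": {"b", "c", "d"}}
D = dual(H)  # a -> {e1}, b -> {e1, e2}, c -> {e1, e2}, d -> {e2}

print(s_distance(H, "e1", "e2", 2))  # e1 and e2 are 2-adjacent
print(s_distance(D, "a", "d", 2))    # a and d belong to one hyperedge each
print(s_distance(D, "b", "c", 2))    # b and c co-occur in e1 and e2
```

Here the vertices a and d lie in hyperedges that are 2-adjacent in the original hypergraph, yet in the dual they are at infinite 2-distance, because each of them belongs to a single hyperedge and so cannot share two hyperedges with any other vertex.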

Author information

Corresponding author

Correspondence to Francesco Bonchi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Preti, G., De Francisci Morales, G. & Bonchi, F. Hyper-distance oracles in hypergraphs. The VLDB Journal (2024). https://doi.org/10.1007/s00778-024-00851-2

  • DOI: https://doi.org/10.1007/s00778-024-00851-2
