Skip to main content
Log in

Answering reachability queries with ordered label constraints over labeled graphs

  • Research Article
  • Published:
Frontiers of Computer Science Aims and scope Submit manuscript

Abstract

Reachability query plays a vital role in many graph analysis tasks. Previous researches proposed many methods to efficiently answer reachability queries between vertex pairs. Since many real graphs are labeled graph, it highly demands Label-Constrained Reachability (LCR) query in which constraint includes a set of labels besides vertex pairs. Recent researches proposed several methods for answering some LCR queries which require appearance of some labels specified in constraints in the path. Besides that constraint may be a label set, query constraint may be ordered labels, namely OLCR (Ordered-Label-Constrained Reachability) queries which retrieve paths matching a sequence of labels. Currently, no solutions are available for OLCR. Here, we propose DHL, a novel bloom filter based indexing technique for answering OLCR queries. DHL can be used to check reachability between vertex pairs. If the answers are not no, then constrained DFS is performed. So, we employ DHL followed by performing constrained DFS to answer OLCR queries. We show that DHL has a bounded false positive rate, and it’s powerful in saving indexing time and space. Extensive experiments on 10 real-life graphs and 12 synthetic graphs demonstrate that DHL achieves about 4.8–22.5 times smaller index space and 4.6–114 times less index construction time than two state-of-art techniques for LCR queries, while achieving comparable query response time. The results also show that our algorithm can answer OLCR queries effectively.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Jin R, Hong H, Wang H, Ruan N, Xiang Y. Computing label-constraint reachability in graph databases. In: Proceedings of 2010 ACM SIGMOD International Conference on Management of Data. 2010, 123–134

  2. Zhu J, Nie Z, Liu X, Zhang B, Wen J. Statsnowball: a statistical approach to extracting entity relationships. In: Proceedings of the 18th International Conference on World Wide Web. 2009, 101–110

  3. Yang C, Liu G, Yan C, Jiang C. A clustering-based flexible weighting method in AdaBoost and its application to transaction fraud detection. Science China Information Sciences, 2021, 64(12): 222101

    Article  Google Scholar 

  4. Cheng J, Huang S, Wu H, Fu A W C. Tf-label: a topological-folding labeling scheme for reachability querying in a large graph. In: Proceedings of 2013 ACM SIGMOD International Conference on Management of Data. 2013, 193–204

  5. Cohen E, Halperin E, Kaplan H, Zwick U. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 2003, 32(5): 1338–1355

    Article  MathSciNet  MATH  Google Scholar 

  6. Jin R, Ning R, Dey S, Xu J Y. SCARAB: scaling reachability computation on large graphs. In: Proceedings of 2012 ACM SIGMOD International Conference on Management of Data. 2012, 169–180

  7. Jin R, Xiang Y, Ruan N, Wang H. Efficiently answering reachability queries on very large directed graphs. In: Proceedings of 2008 ACM SIGMOD International Conference on Management of Data. 2008, 595–608

  8. Li L, Hua W, Zhou X. HD-GDD: high dimensional graph dominance drawing approach for reachability query. World Wide Web, 2017, 20(4): 677–696

    Article  Google Scholar 

  9. Su J, Zhu Q, Wei H, Yu J X. Reachability querying: can it be even faster? IEEE Transactions on Knowledge and Data Engineering, 2017, 29(3): 683–697

    Article  Google Scholar 

  10. Wei H, Yu J X, Lu C, Jin R. Reachability querying: an independent permutation labeling approach. Proceedings of the VLDB Endowment, 2014, 7(12): 1191–1202

    Article  Google Scholar 

  11. Wei H, Yu J X, Lu C, Jin R. Reachability querying: an independent permutation labeling approach. The VLDB Journal, 2018, 27(1): 1–26

    Article  Google Scholar 

  12. Yuan P, You Y, Zhou S, Jin H, Liu L. Providing fast reachability query services with MGTag: a multi-dimensional graph labeling method. IEEE Transactions on Services Computing, 2022, 15(2): 1000–1011

    Article  Google Scholar 

  13. Holme P, Saramäki J. Temporal networks. Physics Reports, 2012, 519(3): 97–125

    Article  Google Scholar 

  14. Shi C, Li Y, Zhang J, Sun Y, Yu P S. A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(1): 17–37

    Article  Google Scholar 

  15. Peng Y, Zhang Y, Lin X, Qin L, Zhang W. Answering billion-scale label-constrained reachability queries within microsecond. Proceedings of the VLDB Endowment, 2020, 13(6): 812–825

    Article  Google Scholar 

  16. Zou L, Xu K, Yu J X, Chen L, Xiao Y, Zhao D. Efficient processing of label-constraint reachability queries in large graphs. Information Systems, 2014, 40: 47–66

    Article  Google Scholar 

  17. Valstar L D J, Fletcher G H L, Yoshida Y. Landmark indexing for evaluation of label-constrained reachability queries. In: Proceedings of 2017 ACM International Conference on Management of Data. 2017, 345–358

  18. Jin R, Ruan N, Xiang Y, Wang H. Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Transactions on Database Systems, 2011, 36(1): 7

    Article  Google Scholar 

  19. Chen Y, Chen Y. An efficient algorithm for answering graph reachability queries. In: Proceedings of the 24th IEEE International Conference on Data Engineering. 2008, 893–902

  20. Van Schaik S J, De Moor O. A memory efficient reachability data structure through bit vector compression. In: Proceedings of 2011 ACM SIGMOD International Conference on Management of Data. 2011, 913–924

  21. Wang H, He H, Yang J, Yu P S, Yu J X. Dual labeling: answering graph reachability queries in constant time. In: Proceedings of the 22nd International Conference on Data Engineering (ICDE’06). 2006, 75

  22. Chen Y. General spanning trees and reachability query evaluation. In: Proceedings of the 2nd Canadian Conference on Computer Science and Software Engineering. 2009, 243–252

  23. Jin R, Xiang Y, Ruan N, Fuhry D. 3-HOP: a high-compression indexing scheme for reachability query. In: Proceedings of 2009 ACM SIGMOD International Conference on Management of Data. 2009, 813–826

  24. Jin R, Wang G. Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment, 2013, 6(14): 1978–1989

    Article  Google Scholar 

  25. Schenkel R, Theobald A, Weikum G. HOPI: an efficient connection index for complex xml document collections. In: Proceedings of the 9th International Conference on Extending Database Technology. 2004, 237–255

  26. Cheng J, Yu J X, Lin X, Wang H, Philip S Y. Fast computation of reachability labeling for large graphs. In: Proceedings of the 10th International Conference on Extending Database Technology. 2006, 961–979

  27. Cai J, Poon C K. Path-hop: efficiently indexing large graphs for reachability queries. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management. 2010, 119–128

  28. Zhu A D, Lin W, Wang S, Xiao X. Reachability queries on large dynamic graphs: a total order approach. In: Proceedings of 2014 ACM SIGMOD International Conference on Management of Data. 2014, 1323–1334

  29. TrßI S, Leser U. Fast and practical indexing and querying of very large graphs. In: Proceedings of 2007 ACM SIGMOD International Conference on Management of Data. 2007, 845–856

  30. Yildirim H, Chaoji V, Zaki M J. Grail: scalable reachability index for large graphs. Proceedings of the VLDB Endowment, 2010, 3(1–2): 276–284

    Article  Google Scholar 

  31. Yildirim H, Chaoji V, Zaki M J. Grail: a scalable index for reachability queries in very large graphs. The VLDB Journal, 2012, 21(4): 509–534

    Article  Google Scholar 

  32. Seufert S, Anand A, Bedathur S, Weikum G. FERRARI: flexible and efficient reachability range assignment for graph indexing. In: Proceedings of the 29th IEEE International Conference on Data Engineering (ICDE). 2013, 1009–1020

  33. Battista G D, Eades P, Tamassia R, Tollis I G. Graph Drawing: Algorithms for the Visualization of Graphs. New Jersey: Prentice Hall, 1999

    MATH  Google Scholar 

  34. Veloso R R, Cerf L, Meira W Jr, Zaki M J. Reachability queries in very large graphs: a fast refined online search approach. In: Proceedings of the 17th International Conference on Extending Database Technology. 2014, 511–522

  35. Sengupta N, Bagchi A, Ramanath M, Bedathur S. ARROW: approximating reachability using random walks over web-scale graphs. In: Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE). 2019, 470–481

  36. Wadhwa S, Prasad A, Ranu S, Bagchi A, Bedathur S. Efficiently answering regular simple path queries on large labeled networks. In: Proceedings of 2019 International Conference on Management of Data. 2019, 1463–1480

  37. Bloom B H. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 1970, 13(7): 422–426

    Article  MATH  Google Scholar 

  38. Luo L, Guo D, Ma R T B, Rottenstreich O, Luo X. Optimizing bloom filter: challenges, solutions, and comparisons. IEEE Communications Surveys & Tutorials, 2019, 21(2): 1912–1949

    Article  Google Scholar 

Download references

Acknowledgements

The research was supported by the National Natural Science Foundation of China (Grant Nos. 61932004 and 62072205).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Pingpeng Yuan.

Additional information

Daoliang He received the BS degree in Electronic and Information Engineering from Northwestern Polytechnical University, China in 2015. He is currently working toward the PhD degree in computer science and technology with Huazhong University of Science and Technology, China. His research interests include big data analysis and processing, graph mining and graph representation learning.

Pingpeng Yuan received the PhD degree in computer science from Zhejiang University, China. Currently, he is a Professor with the School of Computer Science and Technology, Huazhong University of Science and Technology, China. His research interests include databases, knowledge representation, and reasoning and information retrieval, with a focus on high performance computing. During exploring his research, he implements systems and innovative applications in addition to investigating theoretical solutions and algorithmic design. Thus, he is the Principle Developer in multiple system prototypes, including TripleBit, PathGraph, and SemreX.

Hai Jin received the PhD degree in computer engineering from Huazhong University of Science and Technology, China in 1994. He is a Chair Professor of Computer Science and Engineering with Huazhong University of Science and Technology, China. In 1996, he was awarded a German Academic Exchange Service Fellowship to visit the Technical University of Chemnitz in Germany. He is the Chief Scientist of ChinaGrid, the largest grid computing project in China, and also the Chief Scientists of National 973 Basic Research Program Project of Virtualization Technology of Computing System and Cloud Security. Dr. Jin is a fellow of IEEE, a Fellow of CCF, and a member of the ACM. He has coauthored 22 books and published over 700 research papers. His research interests include computer architecture, virtualization technology, cluster computing and cloud computing, peer-to-peer computing, network storage, and network security.

Electronic Supplementary Material

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

He, D., Yuan, P. & Jin, H. Answering reachability queries with ordered label constraints over labeled graphs. Front. Comput. Sci. 18, 181601 (2024). https://doi.org/10.1007/s11704-022-2368-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s11704-022-2368-y

Keywords

Navigation