Abstract
Regular Path Queries (RPQs), which are essentially regular expressions to be matched against the labels of paths in labeled graphs, are at the core of graph database query languages like SPARQL. One way to solve RPQs is to translate them into a sequence of operations on the adjacency matrices of the individual labels. We design and implement a Boolean algebra on sparse matrix representations and, as an application, use it to handle RPQs. Our baseline representation uses the same space as the previously most compact index for RPQs and excels at handling the hardest types of queries. Our more succinct structure, based on \(k^2\)-trees, is 4 times smaller and still solves complex RPQs in reasonable time.
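The translation of an RPQ into matrix operations can be pictured with the following minimal sketch. It is not the authors' implementation: each label's Boolean adjacency matrix is stored as a plain Python set of (row, column) pairs rather than in a compressed structure, and the function names (`bool_sum`, `bool_product`, `star`) and the toy graph are assumptions made purely for illustration. Alternation maps to Boolean sum, concatenation to Boolean product, and Kleene star to reflexive-transitive closure.

```python
from collections import defaultdict


def bool_sum(A, B):
    """Boolean matrix sum: cell (i, j) is set if it is set in A or in B."""
    return A | B


def bool_product(A, B):
    """Boolean matrix product: (i, j) is set if A[i][k] and B[k][j] for some k."""
    rows_of_b = defaultdict(set)
    for k, j in B:
        rows_of_b[k].add(j)
    return {(i, j) for i, k in A for j in rows_of_b.get(k, ())}


def star(A, n):
    """Reflexive-transitive closure of A over nodes 0..n-1 (Kleene star)."""
    result = {(i, i) for i in range(n)}
    frontier = set(result)
    while frontier:
        # Extend only the pairs discovered in the previous round.
        frontier = bool_product(frontier, A) - result
        result |= frontier
    return result


# Toy labeled graph (hypothetical): 0 -a-> 1 -b-> 2 -b-> 3
n = 4
a = {(0, 1)}
b = {(1, 2), (2, 3)}

# RPQ a/b*: one a-edge followed by zero or more b-edges.
answer = bool_product(a, star(b, n))
print(sorted(answer))
```

Each pair (s, t) in `answer` says that some path from s to t spells a word matched by the expression. A sparse or compressed representation would replace the sets while keeping the same three operations.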
Supported by ANID - Millennium Science Initiative Program – Code ICN17_002, and Fondecyt Grant 1-230755, Fondecyt Grant 1221926; CITIC is funded by Xunta de Galicia and CIGUS; GAIN/Xunta de Galicia Grant ED431C 2021/53 (GRC); Xunta de Galicia/FEDER-UE Grant IN852D 2021/3; MCIN/AEI and NextGenerationEU/PRTR Grants [PID2020-114635RB-I00, TED2021-129245B-C21].
Notes
1. If v is not a power of 2 we round it up to the next power, leaving the extended cells empty. This imposes almost no extra overhead on the \(k^2\)-tree representation.
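As a small illustration of the rounding described in this note, the sketch below computes the padded size for a given v. The function name `padded_side` is hypothetical, and v is assumed here to be a positive integer giving the matrix side.

```python
def padded_side(v: int) -> int:
    """Smallest power of 2 that is >= v (v itself if already a power of 2)."""
    if v < 1:
        raise ValueError("v must be a positive integer")
    return 1 << (v - 1).bit_length()


for v in (1, 5, 8, 1000):
    print(v, padded_side(v))
```

The cells between v and the padded side are simply never set, so they appear as empty regions in the matrix.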
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Arroyuelo, D., Gómez-Brandón, A., Navarro, G. (2023). Evaluating Regular Path Queries on Compressed Adjacency Matrices. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_4
DOI: https://doi.org/10.1007/978-3-031-43980-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43979-7
Online ISBN: 978-3-031-43980-3
eBook Packages: Computer Science, Computer Science (R0)