Inserting Keys into the Robust Content-and-Structure (RCAS) Index

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12843)

Abstract

Semi-structured data is prevalent and typically stored in formats like XML and JSON. The most common type of query on such data is the Content-and-Structure (CAS) query, and a number of CAS indexes have been developed to speed up these queries. The state of the art is the RCAS index, which properly interleaves content and structure but does not support insertions of single keys. We propose several insertion techniques that explore the trade-off between insertion and query performance. Our exhaustive experimental evaluation shows that the techniques are efficient and preserve RCAS’s good query performance.

Notes

  1. We measure the cache misses with the perf command on Linux, which relies on the Performance Monitoring Unit (PMU) in modern processors to record hardware events like cache accesses and misses in the CPU.

Corresponding author

Correspondence to Sven Helmer.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Wellenzohn, K., Popovic, L., Böhlen, M., Helmer, S. (2021). Inserting Keys into the Robust Content-and-Structure (RCAS) Index. In: Bellatreche, L., Dumas, M., Karras, P., Matulevičius, R. (eds) Advances in Databases and Information Systems. ADBIS 2021. Lecture Notes in Computer Science, vol 12843. Springer, Cham. https://doi.org/10.1007/978-3-030-82472-3_10

  • DOI: https://doi.org/10.1007/978-3-030-82472-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-82471-6

  • Online ISBN: 978-3-030-82472-3

  • eBook Packages: Computer Science (R0)
