Nested Schema Mappings for Integrating JSON

  • Rihan Hai
  • Christoph Quix
  • David Kensche
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11157)

Abstract

JSON has become one of the most popular data formats, yet studies on JSON data integration (DI) are scarce. In this work, we study one of the key DI tasks, nested mapping generation, in the context of integrating heterogeneous JSON-based data sources. We propose a novel mapping representation, bucket forest mappings, that models nested mappings in an efficient and native manner. We show experimentally the practicality of our approach on six real-world data sets. Moreover, through extensive experiments on synthetic scenarios, we demonstrate that our approach scales well with the increasing metadata complexity of DI scenarios.

Notes

Acknowledgements

This work has been partially funded by the German Federal Ministry of Education and Research (BMBF) (project HUMIT, http://humit.de/, grant no. 01IS14007A) and the German Research Foundation (DFG) within the Cluster of Excellence “Integrative Production Technology for High Wage Countries” (EXC 128).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. RWTH Aachen University, Aachen, Germany
  2. Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, Germany
  3. SAP Innovation Center Network, Potsdam, Germany
