Abstract
NOSQL database management systems adopt semi-structured data models, such as JSON, to easily accommodate schema evolution and to avoid the overhead of transforming internal structures into tabular data (i.e., impedance mismatch). There are multiple equivalent ways to physically represent semi-structured data, but there is little evidence about their impact on space and query performance. In this paper, we quantify that impact, specifically for document stores. We empirically compare multiple ways of representing semi-structured data, which allows us to derive a set of guidelines for efficient physical database design that considers both JSON and relational options in the same palette.
Partly funded by the European Commission through the programme “EM IT4BI-DC”. We thank Braulio Blanco for assisting on the first version of the experiments.
Notes
- Source code and all graphs available at https://github.com/dtim-upc/MongoDBTests.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hewasinghage, M., Nadal, S., Abelló, A. (2020). On the Performance Impact of Using JSON, Beyond Impedance Mismatch. In: Darmont, J., Novikov, B., Wrembel, R. (eds) New Trends in Databases and Information Systems. ADBIS 2020. Communications in Computer and Information Science, vol 1259. Springer, Cham. https://doi.org/10.1007/978-3-030-54623-6_7
DOI: https://doi.org/10.1007/978-3-030-54623-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54622-9
Online ISBN: 978-3-030-54623-6
eBook Packages: Computer Science (R0)