Abstract
While the concept of a database schema plays a central role in relational database systems, most NoSQL systems are schemaless: databases are created without having to formally define their schema. Instead, the schema is implicit in the stored data. This lack of a schema definition offers greater flexibility; more specifically, schemaless databases ease both the recording of non-uniform data and data evolution. However, this comes at the cost of losing some of the benefits provided by schemas. In this article, an MDE-based reverse engineering approach for inferring the schema of aggregate-oriented NoSQL databases is presented. We show how the obtained schemas can be used to build database utilities that tackle some of the problems encountered when using implicit schemas: a schema diagram viewer and a data validator generator are presented.
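To make the idea of a schema that is "implicit in the stored data" concrete, the following Python sketch groups JSON-like documents by their structure and treats each distinct structure of an entity as one version. It is only an illustration under stated assumptions, not the MDE-based approach of the paper: the `type` field used to name the entity and the helper names `type_signature` and `infer_versions` are hypothetical.

```python
from collections import defaultdict


def type_signature(value):
    """Return a hashable description of the structure of a JSON-like value."""
    if isinstance(value, dict):
        # Field names paired with the signature of each field, in name order.
        return tuple(sorted((k, type_signature(v)) for k, v in value.items()))
    if isinstance(value, list):
        # A list is described by the set of element structures it contains.
        return ("list", frozenset(type_signature(v) for v in value))
    return type(value).__name__


def infer_versions(documents, type_field="type"):
    """Map each entity name to the set of distinct structures found for it.

    Assumption: every document names its entity in `type_field`; documents
    without that field are grouped under "unknown".
    """
    versions = defaultdict(set)
    for doc in documents:
        entity = doc.get(type_field, "unknown")
        fields = {k: v for k, v in doc.items() if k != type_field}
        versions[entity].add(type_signature(fields))
    return versions


# Hypothetical sample data: the third movie carries an extra field.
docs = [
    {"type": "movie", "title": "A", "year": 1999},
    {"type": "movie", "title": "B", "year": 2001},
    {"type": "movie", "title": "C", "year": 2005, "genres": ["drama"]},
    {"type": "director", "name": "X"},
]

for entity, signatures in infer_versions(docs).items():
    print(entity, "has", len(signatures), "version(s)")
```

In this sketch two documents belong to the same version only if they have exactly the same field names and value types, which is the simplest possible criterion; a real tool would likely also record references between entities and optional fields.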
Work partially supported by the Cátedra SAES of the University of Murcia (http://www.catedrasaes.org), a research lab sponsored by the SAES company (http://www.electronica-submarina.com/).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sevilla Ruiz, D., Morales, S.F., García Molina, J. (2015). Inferring Versioned Schemas from NoSQL Databases and Its Applications. In: Johannesson, P., Lee, M., Liddle, S., Opdahl, A., Pastor López, Ó. (eds) Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9381. Springer, Cham. https://doi.org/10.1007/978-3-319-25264-3_35
DOI: https://doi.org/10.1007/978-3-319-25264-3_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25263-6
Online ISBN: 978-3-319-25264-3
eBook Packages: Computer Science, Computer Science (R0)