Abstract
This article analyzes approaches to data warehouse construction based on relational and NoSQL solutions and lists the limitations of the relational approach for data mining. It reveals a contradiction between how data appear in the real subject domain and how they are modeled in the relational and NoSQL approaches. The contradiction stems from the temporality of the values of individual data attributes and from the variability of the composition of these attributes and of the structure of connections between them. A new logical model of a data warehouse with dynamic structure is proposed. The model is based on the concept of an object as a container for storing properties. Each property of an object comprises the property name and two property values, one without a reference and one with a reference, that are relevant at a given time. The reference value points to an object whose name is interpreted as the value of the property at that time. A formal description of the model is given, identifying the functionality needed to manipulate objects and their properties (selectors, predicates, constructors) and introducing the necessary control structures. The proposed model, called the OP-model, is substantiated by its correspondence to the logical ER data model: it is proved that any ER data model can be implemented in the OP-model. The advantages of the OP-model are also indicated; they stem from the ability to change connections between entities by changing a reference value at a particular time. The potential for scaling the data warehouse, owing to the unique identification of each object, is noted.
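The object-and-property structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' formal definition: the class names `Obj` and `Version`, the methods `set_prop`, `has`, and `get`, the integer timestamps, and the counter-based identifiers are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)  # hypothetical source of unique object identifiers


@dataclass
class Version:
    """One state of a property, relevant from time `since` onward."""
    since: int
    value: Optional[str] = None   # value without reference
    ref: Optional[Obj] = None     # value with reference (points to an object)


@dataclass(eq=False)
class Obj:
    """An object acting as a container for named, time-dependent properties."""
    name: str
    oid: int = field(default_factory=lambda: next(_ids))
    props: dict[str, list[Version]] = field(default_factory=dict)

    # Constructor-style operation: record a property state starting at `since`.
    def set_prop(self, prop: str, since: int,
                 value: Optional[str] = None, ref: Optional[Obj] = None) -> None:
        versions = self.props.setdefault(prop, [])
        versions.append(Version(since, value, ref))
        versions.sort(key=lambda v: v.since)

    def _current(self, prop: str, at: int) -> Optional[Version]:
        # Versions are sorted by time, so the last match is the latest one.
        current = None
        for v in self.props.get(prop, []):
            if v.since <= at:
                current = v
        return current

    # Predicate: does the property have a relevant state at time `at`?
    def has(self, prop: str, at: int) -> bool:
        return self._current(prop, at) is not None

    # Selector: the property value at time `at`. For a reference value,
    # the name of the referenced object is interpreted as the value.
    def get(self, prop: str, at: int) -> Optional[str]:
        v = self._current(prop, at)
        if v is None:
            return None
        return v.ref.name if v.ref is not None else v.value


# Usage: a connection between entities changes by changing the reference.
sales, research = Obj("Sales"), Obj("Research")
alice = Obj("Alice")
alice.set_prop("salary", since=0, value="1000")
alice.set_prop("department", since=0, ref=sales)
alice.set_prop("department", since=10, ref=research)
```

In this sketch, moving `alice` from one department to another requires no schema change: a new reference value is recorded for a later time, and earlier states remain queryable.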
Funding
This work was supported by the state task of the Ministry of Education and Science of the Russian Federation under project no. 2.87.2016/HM.
Ethics declarations
The authors declare that they have no conflicts of interest.