Abstract
Modern analytics environments are characterized by a data infrastructure that comprises a great variety of datasets, data formats, and data management and processing systems. Such environments are dynamic, and data analysis needs to be performed in a flexible and agile manner via data virtualization techniques. To this end, we have proposed the Data Virtual Machine (DVM), a graph-based conceptual model built on entities and attributes. The basic idea of the DVM is that the relationships between entities and attributes are derived from, and expressed as, the output of data processing tasks. In this paper we discuss the notion of data virtualization and propose a set of goals for relevant techniques in terms of modeling capabilities, query formulation, and schema flexibility. We also position DVMs with respect to these goals.
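To make the basic idea concrete, the following is a minimal illustrative sketch in Python and not the authors' implementation. It assumes that each edge of the graph is backed by a data processing task that emits (key, value) pairs. The class names `Edge` and `DataVirtualMachine`, the method `evaluate`, the two task functions, and the sample data are all hypothetical and introduced here only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Set, Tuple

Pair = Tuple[Any, Any]


@dataclass
class Edge:
    """Hypothetical edge: relates two nodes via the output of a task."""
    source: str
    target: str
    task: Callable[[], Iterable[Pair]]  # assumed to yield (source, target) pairs

    def evaluate(self) -> Dict[Any, List[Any]]:
        # Run the task and group its output pairs by the source value.
        mapping: Dict[Any, List[Any]] = defaultdict(list)
        for key, value in self.task():
            mapping[key].append(value)
        return dict(mapping)


@dataclass
class DataVirtualMachine:
    """Hypothetical container for the graph of nodes and task-backed edges."""
    nodes: Set[str] = field(default_factory=set)
    edges: List[Edge] = field(default_factory=list)

    def add_edge(self, source: str, target: str,
                 task: Callable[[], Iterable[Pair]]) -> Edge:
        # Registering an edge also registers its two endpoint nodes.
        self.nodes.update({source, target})
        edge = Edge(source, target, task)
        self.edges.append(edge)
        return edge


def customer_ages() -> List[Pair]:
    # Stand-in for, e.g., a query against a relational database.
    return [(1, 34), (2, 51)]


def customer_orders() -> List[Pair]:
    # Stand-in for, e.g., a script that scans a flat file.
    return [(1, "o-10"), (1, "o-11"), (2, "o-12")]


dvm = DataVirtualMachine()
age_edge = dvm.add_edge("customer_id", "age", customer_ages)
order_edge = dvm.add_edge("customer_id", "order_id", customer_orders)
```

In this sketch the two tasks could run on entirely different systems; the graph only records that each task relates `customer_id` to another node, which is the sense in which relationships are expressed as task output.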
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chatziantoniou, D., Kantere, V. (2021). Data Virtual Machines: Enabling Data Virtualization. In: Rezig, E.K., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2021, Poly 2021. Lecture Notes in Computer Science, vol 12921. Springer, Cham. https://doi.org/10.1007/978-3-030-93663-1_1
DOI: https://doi.org/10.1007/978-3-030-93663-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93662-4
Online ISBN: 978-3-030-93663-1
eBook Packages: Computer Science, Computer Science (R0)