Toward integrating grid and cloud-based concepts for an enhanced deployment of spatial data warehouses in cyber-physical system applications

  • Boubaker Boulekrouche
  • Nafaâ Jabeur
  • Zaia Alimazighi
Original Research

Abstract

Thanks to their spatially distributed sensors, cyber-physical system (CPS) applications now collect large amounts of heterogeneous data. To let several decision-makers plan their actions collaboratively, these applications need appropriate tools for efficient storage, analysis, and visualization of the available data. Spatial data warehouses (SDWs) have proven efficient at carrying out these operations. However, because of the increasing volumes of data, the commonly used spatial extract-transform-load (SETL) process generally fails to update the SDW within acceptable timeframes. To solve this problem, we propose to perform the SETL tasks in a distributed, parallel manner by means of a grid of computing resources. In addition to being the only solution that uses grid computing for the SETL process of SDWs, our solution makes use of cloud computing techniques to shorten the spatial data processing time and reduce resource consumption. To meet these goals, we propose a multi-agent-based solution that schedules and balances the processing activities over the grid while allowing joint use of real-time and archived data for the personalized reporting and visualization services intended for the decision-makers who use the same CPS application.

Keywords

Spatial ETL; Spatial data warehouse; Cyber-physical systems; Multi-agent systems; Cloud computing; Grid computing

Notes

Compliance with ethical standards

Conflict of interest

The authors, Boubaker Boulekrouche, Nafaâ Jabeur, and Zaia Alimazighi, declare that there is no conflict of interest regarding the publication of this paper.

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Boubaker Boulekrouche (1)
  • Nafaâ Jabeur (2)
  • Zaia Alimazighi (1)
  1. LSI, University of Sciences and Technologies Houari Boumediene, Algiers, Algeria
  2. German University of Technology in Oman (GUtech), Muscat, Sultanate of Oman
