Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

  • Sana Hamdi
  • Emna Bouazizi
  • Sami Faiz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11334)

Abstract

In recent years, real-time spatial applications such as location-aware services and traffic monitoring have become increasingly important. Such applications create dynamic environments in which both data and queries are continuously moving. As a result, a tremendous amount of real-time spatial data is generated every day, and the growth of the data volume seems to outpace the advance of our computing infrastructure. In real-time spatial Big Data, users expect to receive the result of each query within a short time period, regardless of the load on the system. With such a huge amount of real-time spatial data, however, system performance degrades rapidly, especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase system performance and simplify data management, but they remain insufficient for real-time spatial Big Data because they cannot handle real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach over a sliding window in real-time spatial Big Data, named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big Data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. First, we find the optimal attribute sequence using the Matching algorithm. Then, we propose a new cost model for database partitioning that keeps the amount of data in each partition balanced and guarantees parallel execution for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and handles stream data. It improves query execution performance by maximizing the degree of parallel execution, which in turn improves QoS (Quality of Service) in real-time spatial Big Data, especially with a huge volume of stream data.
The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable and that it outperforms comparable algorithms.
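
As a rough illustration of the ingredients named above (a sliding window over stream queries, attribute usage, and the Hamming distance listed in the keywords), the following minimal sketch tracks which attributes the most recent queries touched and compares attributes by the Hamming distance between their usage vectors. It is not the VPA-RTSBD algorithm or its cost model, which the abstract does not specify. The class `SlidingAttributeUsage`, the helper functions, the attribute names, and the sample queries are all hypothetical.

```python
from collections import deque
from itertools import combinations


class SlidingAttributeUsage:
    """Hypothetical helper: remembers which attributes each of the
    last `window_size` queries accessed (one 0/1 row per query)."""

    def __init__(self, attributes, window_size):
        self.attributes = list(attributes)
        # deque with maxlen drops the oldest query automatically.
        self.window = deque(maxlen=window_size)

    def observe(self, query_attributes):
        """Record one incoming query as a 0/1 row over all attributes."""
        used = set(query_attributes)
        self.window.append(tuple(1 if a in used else 0 for a in self.attributes))

    def usage_vector(self, attribute):
        """Column of the usage matrix: one bit per query in the window."""
        i = self.attributes.index(attribute)
        return [row[i] for row in self.window]


def hamming(u, v):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(u, v))


def pairwise_distances(usage):
    """Hamming distance between the usage vectors of every attribute pair."""
    return {
        (a, b): hamming(usage.usage_vector(a), usage.usage_vector(b))
        for a, b in combinations(usage.attributes, 2)
    }


# Illustrative data only (assumed attribute names and queries).
usage = SlidingAttributeUsage(["id", "x", "y", "speed"], window_size=3)
for q in (["id", "x", "y"], ["x", "y"], ["id", "speed"], ["x", "y", "speed"]):
    usage.observe(q)

distances = pairwise_distances(usage)
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, d)
```

Under this assumed reading, a small distance means two attributes are accessed by the same queries in the current window, so they are candidates for the same vertical fragment. Once the window is full, each new query evicts the oldest one, so the distances follow the current workload.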

Keywords

Real-time spatial Big Data; Vertical partitioning; Horizontal partitioning; Matching algorithm; Hamming distance; Stream query

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Tunisia Polytechnic School, University of Carthage, La Marsa, Tunisia
  2. MIRACL Laboratory, University of Sfax, Sfax, Tunisia
  3. LTSIRS Laboratory, Tunis, Tunisia