A Review on Recent Trends in Query Processing and Optimization in Big Data

Wireless Personal Communications

Abstract

The new generation of information technology exchanges huge volumes of data, and these data must be processed for analysis and visualization. Most of our daily activities generate data: mobile devices, for example, maintain call records, visited-location records, and exchanged messages. These data are growing drastically, and maintaining them has become a challenging task. From the beginning of human civilization until 2000, humankind produced five exabytes of data, whereas we now produce five exabytes every day. These data are useless unless query operations are performed on them. Analyzing these data makes a business more responsive and stronger and helps it overcome business challenges. Query-processing systems therefore play a decisive part in big data. This paper distinguishes between the different query optimization processes, and the algorithms they use, in big data in order to avoid issues in query optimization. By surveying different query processing techniques, this review helps researchers find different ways of processing data and of enhancing query processing and optimization in various applications.

Acknowledgements

We thank the anonymous referees for their useful suggestions.

Funding

This work received no funding.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by DK and VKJ. The first draft of the manuscript was written by DK, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Deepak Kumar.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Data Availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, D., Jha, V.K. A Review on Recent Trends in Query Processing and Optimization in Big Data. Wireless Pers Commun 124, 633–654 (2022). https://doi.org/10.1007/s11277-021-09375-2

  • DOI: https://doi.org/10.1007/s11277-021-09375-2
