Abstract
Modern information technology exchanges huge volumes of data, and these data must be processed for analysis and visualization. Most daily activities generate data: mobile devices, for example, keep call records, location histories, and exchanged messages. These data are growing so rapidly that maintaining them has become a challenge. From the beginning of human civilization until 2000, humankind produced five exabytes of data; today, five exabytes are produced every day. Such data are useless unless they can be queried. Analyzing these data makes a business stronger and more responsive and helps it overcome business challenges. Query processing systems therefore play a decisive role in big data. This paper compares different query optimization processes and the algorithms they use in big data, with the aim of avoiding common issues in query optimization. By surveying different query processing techniques, this review helps researchers find alternative ways to process data and to improve query processing and optimization in various applications.
Acknowledgements
We thank the anonymous referees for their useful suggestions.
Funding
This work received no funding.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by DK and VKJ. The first draft of the manuscript was written by DK, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Data Availability
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
About this article
Cite this article
Kumar, D., Jha, V.K. A Review on Recent Trends in Query Processing and Optimization in Big Data. Wireless Pers Commun 124, 633–654 (2022). https://doi.org/10.1007/s11277-021-09375-2
DOI: https://doi.org/10.1007/s11277-021-09375-2