Abstract
Data processing in the Socialist Republic of Vietnam (hereafter, Vietnam) is at an early stage, and a variety of problems remain to be solved. In the Vietnamese banking and financial sectors, where the management and storage of customer data and transaction histories are emphasized as never before, the volume of data to be secured daily is growing explosively with rapid economic development, and the relevant authorities are seeking an efficient and reliable way to manage it. The B+-tree, a widely known variant of the B-tree, is considered one of the most suitable tree-type data structures for bulk data. Nevertheless, because constructing a B+-tree over massive data is quite time-consuming, the authors propose a Hadoop framework-based parallel B+-tree system to address the problem. The system works in three phases. First, the data are partitioned and distributed evenly so that each partition holds almost the same volume of data. Second, local B+-trees are constructed in parallel. Finally, these small-scale B+-trees are integrated into a complete B+-tree that covers the entire data set. The authors expect the proposed system to offer efficient index construction while reducing data processing time.
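The three phases can be illustrated with a minimal single-machine sketch. This is not the authors' Hadoop implementation: the names `ORDER`, `Node`, `partition`, `build_local`, `integrate`, `search`, and `build_index` are hypothetical, a thread pool stands in for Hadoop workers, keys are assumed to be unique and the input non-empty, and the split points are taken from a full sort where a real job would more likely sort a sample.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor

ORDER = 4  # hypothetical node capacity, kept tiny for readability


class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # record keys (leaf) or separator keys (internal)
        self.children = children  # child nodes; None marks a leaf
        self.values = values      # payloads; only leaves carry them
        self.next = None          # right-sibling link between leaves


def edge_leaf(node, side):
    """Walk down to the leftmost (side=0) or rightmost (side=-1) leaf."""
    while node.children is not None:
        node = node.children[side]
    return node


def stack_levels(nodes, height=0):
    """Group nodes under parents, ORDER at a time, until one root remains."""
    while len(nodes) > 1:
        groups = [nodes[i:i + ORDER] for i in range(0, len(nodes), ORDER)]
        nodes = [Node([edge_leaf(c, 0).keys[0] for c in g[1:]], children=g)
                 for g in groups]
        height += 1
    return nodes[0], height


def partition(records, n_parts):
    """Phase 1: range-partition on quantile split points for near-equal sizes."""
    keys = sorted(k for k, _ in records)  # assumption: a real job would sample
    splits = [keys[len(keys) * i // n_parts] for i in range(1, n_parts)]
    parts = [[] for _ in range(n_parts)]
    for rec in records:
        parts[bisect_right(splits, rec[0])].append(rec)
    return parts


def build_local(part):
    """Phase 2: bulk-load one partition bottom-up; returns (root, height)."""
    part = sorted(part, key=lambda r: r[0])
    leaves = [Node([k for k, _ in part[i:i + ORDER]],
                   values=[v for _, v in part[i:i + ORDER]])
              for i in range(0, len(part), ORDER)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return stack_levels(leaves)


def integrate(local_trees):
    """Phase 3: link neighbouring leaf chains, then join the local roots."""
    top = max(h for _, h in local_trees)
    roots = []
    for root, h in local_trees:
        for _ in range(top - h):  # pad shorter trees to a common height
            root = Node([], children=[root])
        roots.append(root)
    for left, right in zip(roots, roots[1:]):
        edge_leaf(left, -1).next = edge_leaf(right, 0)
    return stack_levels(roots, top)[0]


def search(root, key):
    """Point lookup: follow separators down to a leaf, then match the key."""
    node = root
    while node.children is not None:
        node = node.children[bisect_right(node.keys, key)]
    return node.values[node.keys.index(key)] if key in node.keys else None


def build_index(records, n_parts=4):
    parts = [p for p in partition(records, n_parts) if p]
    with ThreadPoolExecutor() as pool:  # stand-in for Hadoop workers
        local_trees = list(pool.map(build_local, parts))
    return integrate(local_trees)
```

Calling `build_index` on a list of `(key, value)` pairs returns a root that `search` can query. Because the partitions cover disjoint, ordered key ranges, the integration step only has to link the leaf chains and stack new levels over the local roots; no record is moved or re-sorted. The bulk-loaded nodes in this sketch may fall below the usual minimum occupancy, which a production index would need to rebalance.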
References
1. Comer, D.: The ubiquitous B-tree. ACM Comput. Surv. 11(2), 121–137 (1979)
2. Cong, V.N.H., et al.: Improving the quality of an R-tree using the Map-Reduce framework. Advanced Multimedia and Ubiquitous Engineering (CUTE 2016), vol. 448, pp. 164–170. Springer, Singapore (2017)
3. Cong, V.N.H.: Enhanced R-tree bulk loading scheme using Map-Reduce framework. M.S. Thesis of Department of IT Convergence and Application Engineering, pp. 4–22. The Graduate School, Pukyong National University, Republic of Korea (2017)
4. Leutenegger, S.T., Edgington, J.M., Lopez, M.A.: STR: a simple and efficient algorithm for R-tree packing. In: IEEE 13th International Conference on Data Engineering, pp. 497–506 (1997)
5. Kajioka, S., Mori, T., Uchiya, T., Takumi, I., Matsuo, H.: Experiment of indoor position presumption based on RSSI of Bluetooth LE beacon. In: 2014 IEEE 3rd Global Conference on Consumer Electronics (GCCE), pp. 337–339. IEEE (2014)
6. Huh, J.-H., Je, S.-M., Seo, K.: Design and configuration of avoidance technique for worst situation in zigbee communications using OPNET. Information Science and Applications (ICISA). LNEE, vol. 376, pp. 331–336. Springer, Heidelberg (2016)
7. Birkenmeier, G.F., Park, J.-K., Rizvi, S.T.: Principally quasi-Baer ring hulls. In: Van Huynh, D., López-Permouth, S.R. (eds.) Advances in Ring Theory. Trends in Mathematics, pp. 47–61. Springer, Basel (2010)
8. Birkenmeier, G.F., Park, J.-K., Rizvi, S.T.: Ring hulls of semiprime homomorphic images. In: Brzeziński, T., Gómez Pardo, J.L., Shestakov, I., Smith, P.F. (eds.) Modules and Comodules. Trends in Mathematics, pp. 101–111. Springer, Basel (2008)
9. Apache Hadoop: http://hadoop.apache.org
10. Prasad, S.K., McDermott, M., He, X.: GPGPU-based parallel R-tree construction and querying. In: 2015 IEEE International Conference (IPDPSW), pp. 619–627 (2015)
11. Sung, Y., Jeong, Y.-S., Park, J.-H.: Beacon-based active media control interface in indoor ubiquitous computing environment. Cluster Comput. 19(1), 547–556 (2016)
12. Huh, J.-H., Otgonchimeg, S., Seo, K.: Advanced metering infrastructure design and test bed experiment using intelligent agents: focusing on the PLC network base technology for Smart Grid system. J. Supercomput. 72(5), 1862–1877 (2016)
13. Cheong, H., Eun, J., Kim, H., Kim, K.: Belief propagation decoding assisted on-the-fly Gaussian elimination for short LT codes. Cluster Comput. 19(1), 309–314 (2016)
14. Huynh, C.V., Kim, J., Huh, J.-H.: Improving the B+-tree construction for transaction log data in bank system using Hadoop. International Conference on Information Science and Applications (ICISA 2017). LNEE, vol. 424, pp. 519–525. Springer, Singapore (2017)
15. Zhou, W., Lu, J., Luan, Z., Wang, S., Xue, G., Yao, S.: SNB-index: a SkipNet and B+ tree based auxiliary cloud index. Cluster Comput. 17(2), 453–462 (2014)
16. Viglas, S.D.: Adapting the B+-tree for asymmetric I/O. In: East European Conference on Advances in Databases and Information Systems, pp. 399–412. Springer, Berlin, Heidelberg (2012)
17. Abdullahi, A.U., Ahmad, R., Zakaria, M.N.: Experimental performance analysis of B+-trees with Big Data indexing potentials. In: International Conference of Reliable Information and Communication Technology, pp. 20–29. Springer, New York (2017)
Acknowledgements
Part of this paper [14] was presented at the International Conference on Information Science and Applications (ICISA 2017), March 20–23, in Macau. I am grateful to the two anonymous commentators whose valuable suggestions at the Conference helped improve the completeness of the paper.
Cite this article
Ngu, H.C.V., Huh, JH. B+-tree construction on massive data with Hadoop. Cluster Comput 22 (Suppl 1), 1011–1021 (2019). https://doi.org/10.1007/s10586-017-1183-y
DOI: https://doi.org/10.1007/s10586-017-1183-y