B+-tree construction on massive data with Hadoop

Cluster Computing

Abstract

Data processing in the Socialist Republic of Vietnam (hereafter, Vietnam) is at an early stage, and a variety of problems remain to be solved. In the Vietnamese banking and financial sectors, where the management and storage of customer data and transaction histories are being emphasized as never before, the volume of data to be secured daily is increasing explosively due to rapid economic development, so the relevant authorities are seeking an efficient and reliable way to manage it. The B+-tree, a widely known variant of the B-tree, is considered one of the most suitable tree-type data structures for bulk data. Nevertheless, because constructing a B+-tree for massive data is quite time-consuming, the authors propose a Hadoop-based parallel B+-tree construction system to deal with the problem. The system is divided into three phases. First, the data are partitioned and distributed evenly so that each partition holds almost the same volume of data. Second, local B+-trees are constructed in parallel. Finally, small-scale B+-trees are constructed and integrated into a complete B+-tree that covers the entire data set. The authors expect that the proposed system will offer efficient index construction while reducing data processing time.
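The three phases can be illustrated with a minimal single-machine sketch. It is not the authors' Hadoop implementation: the names `split_points`, `partition`, `bulk_load`, `build_upper`, `build_index`, and `contains`, the `Node` layout, and the `FANOUT` value are all hypothetical, a thread pool stands in for parallel map tasks, and the split points are taken from a full sort where a real job would more likely sample the input. The input is assumed to be non-empty.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

FANOUT = 4  # assumed node capacity; real trees use page-sized nodes


@dataclass
class Node:
    min_key: int
    keys: list = field(default_factory=list)      # set on leaves only
    children: list = field(default_factory=list)  # set on inner nodes only


def split_points(keys, parts):
    """Phase 1a: choose parts-1 boundaries that give near-equal partitions."""
    ordered = sorted(keys)
    return [ordered[len(ordered) * i // parts] for i in range(1, parts)]


def partition(keys, bounds):
    """Phase 1b: route each key to the range partition it falls into."""
    buckets = [[] for _ in range(len(bounds) + 1)]
    for k in keys:
        buckets[bisect_right(bounds, k)].append(k)
    return buckets


def build_upper(level):
    """Group nodes FANOUT at a time until a single root remains."""
    while len(level) > 1:
        level = [Node(level[i].min_key, children=level[i:i + FANOUT])
                 for i in range(0, len(level), FANOUT)]
    return level[0]


def bulk_load(keys):
    """Phase 2: build one local tree bottom-up from a partition's sorted keys."""
    ordered = sorted(keys)
    leaves = [Node(ordered[i], keys=ordered[i:i + FANOUT])
              for i in range(0, len(ordered), FANOUT)]
    return build_upper(leaves)


def build_index(keys, parts):
    """Run all three phases and return the root of the combined tree."""
    buckets = [b for b in partition(keys, split_points(keys, parts)) if b]
    with ThreadPoolExecutor() as pool:  # stand-in for parallel map tasks
        roots = list(pool.map(bulk_load, buckets))
    # Phase 3: the local roots are already in key order, so index over them.
    return build_upper(roots)


def contains(node, key):
    """Descend from the root to a leaf and test membership."""
    while node.children:
        idx = bisect_right([c.min_key for c in node.children], key) - 1
        if idx < 0:
            return False
        node = node.children[idx]
    return key in node.keys
```

Because the partitions cover disjoint, ordered key ranges, the final phase only has to build a small index over the local roots. One simplification to note: the sketch does not rebalance, so if the local trees end up with different heights the combined tree is still searchable but its leaves are not all at the same depth, which a strict B+-tree requires.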

References

  1. Comer, D.: The ubiquitous B-tree. ACM Comput. Surv. 11(2), 121–137 (1979)

  2. Cong, V.N.H., et al.: Improving the quality of an R-tree using the Map-Reduce framework. Advanced Multimedia and Ubiquitous Engineering (CUTE 2016), vol. 448, pp. 164–170. Springer, Singapore (2017)

  3. Cong, V.N.H.: Enhanced R-tree bulk loading scheme using Map-Reduce framework. M.S. Thesis of Department of IT Convergence and Application Engineering, pp. 4–22. The Graduate School, Pukyong National University, Republic of Korea (2017)

  4. Leutenegger, S.T., Edgington, J.M., Lopez, M.A.: STR: a simple and efficient algorithm for R-tree packing. In: IEEE 13th International Conference on Data Engineering, pp. 497–506 (1997)

  5. Kajioka, S., Mori, T., Uchiya, T., Takumi, I., Matsuo, H.: Experiment of indoor position presumption based on RSSI of Bluetooth LE beacon. In: 2014 IEEE 3rd Global Conference on Consumer Electronics (GCCE), pp. 337–339. IEEE (2014)

  6. Huh, J.-H., Je, S.-M., Seo, K.: Design and configuration of avoidance technique for worst situation in zigbee communications using OPNET. Information Science and Applications (ICISA). LNEE, vol. 376, pp. 331–336. Springer, Heidelberg (2016)

  7. Birkenmeier, G.F., Park, J.-K., Rizvi, S.T.: Principally quasi-Baer ring hulls. In: Van Huynh, D., López-Permouth, S.R. (eds.) Advances in Ring Theory. Trends in Mathematics, pp. 47–61. Springer, Basel (2010)

  8. Birkenmeier, G.F., Park, J.-K., Rizvi, S.T.: Ring hulls of semiprime homomorphic images. In: Brzeziński, T., Gómez Pardo, J.L., Shestakov, I., Smith, P.F. (eds.) Modules and Comodules. Trends in Mathematics, pp. 101–111. Springer, Basel (2008)

  9. Apache Hadoop: http://hadoop.apache.org

  10. Prasad, S.K., McDermott, M., He, X.: GPGPU-based parallel R-tree construction and querying. In: 2015 IEEE International Conference (IPDPSW), pp. 619–627 (2015)

  11. Sung, Y., Jeong, Y.-S., Park, J.-H.: Beacon-based active media control interface in indoor ubiquitous computing environment. Cluster Comput. 19(1), 547–556 (2016)

  12. Huh, J.-H., Otgonchimeg, S., Seo, K.: Advanced metering infrastructure design and test bed experiment using intelligent agents: focusing on the PLC network base technology for Smart Grid system. J. Supercomput. 72(5), 1862–1877 (2016)

  13. Cheong, H., Eun, J., Kim, H., Kim, K.: Belief propagation decoding assisted on-the-fly Gaussian elimination for short LT codes. Cluster Comput. 19(1), 309–314 (2016)

  14. Huynh, C.V., Kim, J., Huh, J.H.: Improving the B+-tree construction for transaction log data in bank system using Hadoop. International Conference on Information Science and Applications (ICISA 2017). LNEE, vol. 424, pp. 519–525. Springer, Singapore (2017)

  15. Zhou, W., Lu, J., Luan, Z., Wang, S., Xue, G., Yao, S.: SNB-index: a SkipNet and B+ tree based auxiliary cloud index. Cluster Comput. 17(2), 453–462 (2014)

  16. Viglas, S.D.: Adapting the B+-tree for asymmetric I/O. In: East European Conference on Advances in Databases and Information Systems, pp. 399–412. Springer, Berlin, Heidelberg (2012)

  17. Abdullahi, A.U., Ahmad, R., Zakaria, M.N.: Experimental performance analysis of B+-trees with Big Data indexing potentials. In: International Conference of Reliable Information and Communication Technology, pp. 20–29. Springer, New York (2017)

Acknowledgements

Part of this paper [14] was presented at the International Conference on Information Science and Applications (ICISA 2017), March 20–23, in Macau. I am grateful to the two anonymous commentators at the conference whose valuable suggestions helped improve the completeness of the paper.

Author information

Corresponding author

Correspondence to Jun-Ho Huh.

About this article

Cite this article

Ngu, H.C.V., Huh, JH. B+-tree construction on massive data with Hadoop. Cluster Comput 22 (Suppl 1), 1011–1021 (2019). https://doi.org/10.1007/s10586-017-1183-y

  • DOI: https://doi.org/10.1007/s10586-017-1183-y
