A Fast GPU Based Implementation of Optimal Binary Search Tree Using Dynamic Programming

Wani, Mohsin Altaf; Ahmad, Manzoor

doi:10.1007/978-981-10-6544-6_26

Part of the book series: Communications in Computer and Information Science (CCIS, volume 750)

Included in the following conference series:

International Conference on Information, Communication and Computing Technology

Abstract

Modern GPUs (graphics processing units) can perform computation at a much higher rate than CPUs; as a result, they are increasingly used for general-purpose parallel computation. Parallel algorithms can be developed for GPUs using computing platforms such as CUDA (Compute Unified Device Architecture) and OpenCL (Open Computing Language). Determining the optimal binary search tree is an optimization problem: find the arrangement of nodes in a binary search tree that minimizes the average search time. A dynamic programming algorithm solves this problem in O(n³) time using O(n²) workspace. We have developed a fast parallel implementation of this O(n³)-time algorithm on a GPU. Achieving this requires data structures suited to parallel computation of the algorithm, efficient use of the available cache memory, and minimal thread divergence. Our implementation runs in 114.4 s for an instance containing 16384 keys on an NVidia GTX 570, whereas a conventional CPU-based implementation takes 48166 s. This amounts to a speed-up factor of 422 over the conventional CPU-based implementation.
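
For orientation, the O(n³)-time, O(n²)-space dynamic program mentioned in the abstract can be sketched on the CPU as follows. This is a minimal illustration, not the authors' GPU implementation: it assumes the simplified textbook formulation with successful-search frequencies only (no dummy keys), and the function name `optimal_bst_cost` and the half-open table layout are choices made here for the example. The table is filled diagonal by diagonal; each cell depends only on cells from shorter diagonals, which is the independence a parallel implementation could plausibly exploit.

```python
def optimal_bst_cost(freq):
    """Minimum weighted search cost of a BST over sorted keys.

    freq[k] is the access frequency of the k-th smallest key.
    Assumption for this sketch: only successful searches are counted.
    """
    n = len(freq)
    if n == 0:
        return 0

    # prefix[k] = freq[0] + ... + freq[k-1], so a range sum costs O(1).
    prefix = [0] * (n + 1)
    for k, f in enumerate(freq):
        prefix[k + 1] = prefix[k] + f

    # cost[i][j] = minimum cost for keys i..j-1 (half-open range).
    # Empty ranges (i == j) cost 0. The table uses O(n^2) space.
    cost = [[0] * (n + 1) for _ in range(n + 1)]

    for length in range(1, n + 1):        # one diagonal per range length
        for i in range(n - length + 1):   # cells on a diagonal are independent
            j = i + length
            weight = prefix[j] - prefix[i]
            # Try every key r in i..j-1 as the root of this subtree.
            best = min(cost[i][r] + cost[r + 1][j] for r in range(i, j))
            cost[i][j] = best + weight

    return cost[0][n]
```

Calling `optimal_bst_cost([34, 8, 50])` returns 142 for three keys with those frequencies. The three nested loops (diagonal, cell, candidate root) give the cubic running time, and the inner loop over `i` is the natural candidate for one-thread-per-cell parallelization.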

Author information

Authors and Affiliations

Faculty of the Department of Computer Science, University of Kashmir, South Campus, Srinagar, J&K, India
Mohsin Altaf Wani
Department of Computer Science, University of Kashmir, Srinagar, J&K, India
Manzoor Ahmad

Corresponding author

Correspondence to Mohsin Altaf Wani.

Editor information

Editors and Affiliations

Indian Institute of Technology Delhi, New Delhi, India
Saroj Kaushik
Delhi Technological University, Delhi, India
Daya Gupta
Jagan Institute of Management Studies, Delhi, India
Latika Kharb
Jagan Institute of Management Studies, Delhi, India
Deepak Chahal

Cite this paper

Wani, M.A., Ahmad, M. (2017). A Fast GPU Based Implementation of Optimal Binary Search Tree Using Dynamic Programming. In: Kaushik, S., Gupta, D., Kharb, L., Chahal, D. (eds) Information, Communication and Computing Technology. ICICCT 2017. Communications in Computer and Information Science, vol 750. Springer, Singapore. https://doi.org/10.1007/978-981-10-6544-6_26

DOI: https://doi.org/10.1007/978-981-10-6544-6_26
Published: 11 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6543-9
Online ISBN: 978-981-10-6544-6
eBook Packages: Computer Science, Computer Science (R0)
