Abstract
The kd-tree is one of the most commonly used spatial data structures for a variety of graphics applications because of its reliably high acceleration performance. Several years ago, Zhou et al. devised an effective kd-tree construction algorithm that runs entirely on a GPU. In this chapter, we present improved GPU programming techniques for implementing the algorithm more efficiently on current GPUs. One of the major ideas is to reduce the number of necessary kernel functions by replacing the essential segmented-scan and reduction computations with simpler per-block atomic operations, thereby alleviating the overhead of multiple synchronous kernel calls. Combined with an efficient implementation of intrablock scan and reduction that uses recently introduced intrinsic functions, these changes markedly improve the performance of the kd-tree construction process. Through an example of real-time ray tracing for dynamic scenes of nontrivial complexity, we demonstrate that the proposed GPU techniques can be exploited effectively for various real-time applications.
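The idea of replacing a global scan with an intrablock binary prefix sum plus one atomic operation per block can be illustrated with a small sketch. The code below is not the chapter's implementation: it is a sequential C++ emulation on the CPU, and all names (`ballot`, `popc`, `compactFlagged`, `kWarpSize`) are hypothetical stand-ins for what a GPU kernel would do with warp-vote and population-count intrinsics. It assumes a warp width of 32 and a simple stream-compaction task, such as collecting the indices of flagged triangles.

```cpp
#include <algorithm>
#include <atomic>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kWarpSize = 32;  // assumed warp width

// Population count of a 32-bit mask (stand-in for a GPU popcount intrinsic).
inline int popc(std::uint32_t x) {
    return static_cast<int>(std::bitset<32>(x).count());
}

// Stand-in for a warp vote: bit `lane` is set when that lane's flag is nonzero.
inline std::uint32_t ballot(const std::vector<int>& flags,
                            std::size_t warpBegin, std::size_t blockEnd) {
    std::uint32_t mask = 0;
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        std::size_t i = warpBegin + lane;
        if (i < blockEnd && flags[i] != 0) mask |= (1u << lane);
    }
    return mask;
}

// Returns the indices of all flagged elements, processed block by block.
std::vector<int> compactFlagged(const std::vector<int>& flags,
                                std::size_t blockSize) {
    std::vector<int> out(flags.size());
    std::atomic<int> globalCount{0};  // shared output cursor

    for (std::size_t blockBegin = 0; blockBegin < flags.size();
         blockBegin += blockSize) {
        std::size_t blockEnd = std::min(blockBegin + blockSize, flags.size());

        // Intrablock step: one vote mask per warp, then the block total.
        std::vector<std::uint32_t> masks;
        int blockTotal = 0;
        for (std::size_t w = blockBegin; w < blockEnd; w += kWarpSize) {
            masks.push_back(ballot(flags, w, blockEnd));
            blockTotal += popc(masks.back());
        }

        // One atomic add per block reserves this block's output range.
        int base = globalCount.fetch_add(blockTotal);

        // Each flagged lane finds its slot by counting the set bits below it.
        int warpOffset = 0;
        for (std::size_t w = 0; w < masks.size(); ++w) {
            for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
                if ((masks[w] >> lane) & 1u) {
                    std::uint32_t lower = masks[w] & ((1u << lane) - 1u);
                    out[base + warpOffset + popc(lower)] =
                        static_cast<int>(blockBegin + w * kWarpSize + lane);
                }
            }
            warpOffset += popc(masks[w]);
        }
    }
    out.resize(static_cast<std::size_t>(globalCount.load()));
    return out;
}
```

Because this emulation visits blocks in order, the output comes out sorted. On a GPU the order in which blocks reach the atomic add is not fixed, so only the relative order within a block would be preserved.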
References
Wald, I., Havran, V.: On building fast kd-trees for ray tracing, and on doing that in O(N log N). In: Proceedings of the IEEE Symposium on Interactive Ray Tracing, pp. 61–69 (2006)
Shevtsov, M., Soupikov, A.: Highly parallel fast kd-tree construction for interactive ray tracing of dynamic scenes. Comput. Graph. Forum (Proceedings of Eurographics) 26, 395–404 (2007)
Zhou, K., Hou, Q., Wang, R., Guo, B.: Real-time KD-tree construction on graphics hardware. ACM Trans. Graph. 27, 1–11 (2008)
Hou, Q., Sun, X., Zhou, K., Lauterbach, C., Manocha, D.: Memory-scalable GPU spatial hierarchy construction. IEEE Trans. Vis. Comput. Graph. 17, 466–474 (2011)
Choi, B., Komuravelli, R., Lu, V., Sung, H., Bocchino, R., Adve, S., Hart, J.: Parallel SAH k-D tree construction. In: Proceedings of High-Performance Graphics (HPG’10), pp. 77–86 (2010)
Wu, Z., Zhao, F., Liu, X.: SAH KD-tree construction on GPU. In: Proceedings of High Performance Graphics (HPG’11), pp. 71–78 (2011)
CUDPP Google Group: CUDA data parallel primitives library release 2.0. http://code.google.com/p/cudpp/ (2011). Accessed 1 June 2013
Sengupta, S., Harris, M., Garland, M., Owens, J.: Efficient parallel scan algorithms for many-core GPUs. In: Scientific Computing with Multicore and Accelerators, Taylor & Francis, pp. 413–442 (2011)
NVIDIA: CUDA C programming guide: design guide (PG-02829-001 v5.0) (2012)
Skjellum, A., Whittaker, D., Bangalore, P.: Ballot counting for optimal binary prefix sum. Presented at the GPU Technology Conference 2010 (2010)
Manku, G.: Fast bit counting routines. http://cpptruths.googlecode.com/svn/trunk/c/bitcount.c (2002). Accessed 1 June 2013
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MOE) (No. 2012R1A1A2008958).
Appendix: A Single-Kernel Implementation for the Triangle-Sorting Process (Sect. 13.2.1.2)
Copyright information
© 2015 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Chang, B., Seo, W., Ihm, I. (2015). On the Efficient Implementation of a Real-Time Kd-Tree Construction Algorithm. In: Cai, Y., See, S. (eds) GPU Computing and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-287-134-3_13
DOI: https://doi.org/10.1007/978-981-287-134-3_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-133-6
Online ISBN: 978-981-287-134-3
eBook Packages: Engineering, Engineering (R0)