Heterogeneous CPU–GPU tracking–learning–detection (H-TLD) for real-time object tracking
Abstract
The recently proposed tracking–learning–detection (TLD) method has become a popular visual tracking algorithm, as it has been shown to provide promising long-term tracking results. However, its high computational cost prevents it from being used at higher resolutions and frame rates. In this paper, we describe the design and implementation of a heterogeneous CPU–GPU TLD (H-TLD) solution using OpenMP and CUDA. Leveraging the advantages of the heterogeneous architecture, serial parts run asynchronously on the CPU, while the most computationally costly parts are parallelized and run on the GPU. The design keeps data transfers between the CPU and GPU to a minimum and, whenever such transfers are necessary, applies stream compaction and overlaps data transfer with computation. The workload is balanced for a uniform work distribution across the GPU multiprocessors. Results show that a 10.25 times speed-up is achieved at 1920 \(\times\) 1080 resolution compared to the baseline TLD. The source code is publicly available for download at http://gpuresearch.ii.metu.edu.tr/codes/.
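To make the term concrete, stream compaction packs the elements that pass a test into a dense array, so that later stages (and any CPU–GPU transfer) touch only the survivors. The sketch below is a serial Python illustration of the usual flag, exclusive-scan, and scatter formulation; it is not the H-TLD CUDA code, and the function names, window ids, variance values, and threshold are all hypothetical.

```python
def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    out, running = [], 0
    for v in values:
        out.append(running)
        running += v
    return out


def compact(items, keep):
    """Pack the items for which keep(item) is true, preserving their order."""
    # Step 1: evaluate the predicate for every element (0/1 flags).
    flags = [1 if keep(x) else 0 for x in items]
    # Step 2: the exclusive scan of the flags gives each survivor its output slot.
    offsets = exclusive_scan(flags)
    total = offsets[-1] + flags[-1] if flags else 0
    # Step 3: scatter. The iterations are independent, which is what lets a
    # GPU assign one thread per element in a parallel version.
    out = [None] * total
    for x, f, o in zip(items, flags, offsets):
        if f:
            out[o] = x
    return out


# Hypothetical candidate windows as (window_id, patch_variance) pairs.
windows = [(0, 12.0), (1, 3.5), (2, 40.2), (3, 0.9), (4, 25.0)]
survivors = compact(windows, keep=lambda w: w[1] >= 10.0)
print(survivors)
```

Only the compacted `survivors` list would need to be processed further or copied between devices, which is the motivation for applying compaction before a transfer.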
Keywords
- Object tracking
- Heterogeneous CPU–GPU implementations
- Real time
- CUDA