Abstract
Many robotic applications require detecting and tracking moving or static objects while the robot itself moves, so that it can interact with them. High-quality detection methods need considerable computation time when many objects must be detected or when operating in dynamic, real-world environments. As a result, by the time a detection result becomes available, it refers to a previous frame rather than the current one. This article introduces a method for obtaining delay-free detections. It projects a delayed detection onto the current frame using a set of feature tracks generated by the KLT (Kanade-Lucas-Tomasi) tracker. The proposed method is shown to improve detection accuracy when the tracked object moves with respect to the camera. In addition, it detects and manages false detections and occlusions using statistical classifiers (Support Vector Machines) and the Viterbi algorithm (Viterbi, IEEE Trans. Inf. Theory 13(2), 260–269, 1967). The method is validated in a person-following task, where it is compared against a part-based HOG person detector and four high-performing tracking methods (Meanshift, Compressive Tracking, Tracking-by-detection with Kernels, and Kernelized Correlation Filter). It is further validated in two additional tasks: face tracking and car tracking. In all reported experiments, the proposed method obtains the best performance among the compared methods.
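The projection step described in the abstract can be illustrated with a minimal sketch. It assumes each feature track is already available as a pair of positions (in the delayed frame and in the current frame), as a KLT implementation such as OpenCV's `calcOpticalFlowPyrLK` could provide. The function name `project_box` and the median-based translation-plus-scale estimate are illustrative assumptions, not the estimator used by the authors.

```python
from itertools import combinations
from math import hypot
from statistics import median


def project_box(box, tracks):
    """Project a delayed bounding box onto the current frame.

    box    : (x, y, w, h) detected in the delayed frame.
    tracks : list of ((x0, y0), (x1, y1)) pairs giving one feature's position
             in the delayed frame and in the current frame.
    Returns the projected (x, y, w, h), or None if fewer than two tracks
    start inside the box. Hypothetical helper, for illustration only.
    """
    x, y, w, h = box
    # Keep only the tracks whose starting point lies inside the detection.
    inside = [(p, q) for p, q in tracks
              if x <= p[0] <= x + w and y <= p[1] <= y + h]
    if len(inside) < 2:
        return None

    # Scale: median ratio of pairwise distances after / before.
    ratios = []
    for (p1, q1), (p2, q2) in combinations(inside, 2):
        d0 = hypot(p1[0] - p2[0], p1[1] - p2[1])
        d1 = hypot(q1[0] - q2[0], q1[1] - q2[1])
        if d0 > 1e-6:
            ratios.append(d1 / d0)
    s = median(ratios) if ratios else 1.0

    # Each track votes for the new box centre; the median resists outliers.
    cx, cy = x + w / 2.0, y + h / 2.0
    new_cx = median(q[0] + s * (cx - p[0]) for p, q in inside)
    new_cy = median(q[1] + s * (cy - p[1]) for p, q in inside)

    new_w, new_h = s * w, s * h
    return (new_cx - new_w / 2.0, new_cy - new_h / 2.0, new_w, new_h)


if __name__ == "__main__":
    delayed_box = (10, 10, 20, 20)
    # Three features on the object moved by (+5, -3); one stray track is
    # outside the box and is ignored.
    tracks = [((12, 12), (17, 9)),
              ((25, 15), (30, 12)),
              ((20, 28), (25, 25)),
              ((100, 100), (0, 0))]
    print(project_box(delayed_box, tracks))
```

The median is used here only because it tolerates a few bad tracks without extra machinery. A robust model-fitting scheme such as RANSAC, which appears in the reference list, would be a natural alternative.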
References
Felzenszwalb, P., McAllester, D., Ramanan, D.: A Discriminatively Trained, Multiscale, Deformable Part Model. In: IEEE Conference on Computer Vision and Pattern Recognition, pp 1–8. CVPR 2008 (2008)
Tomasi, C., Kanade, T.: Detection and Tracking of Point Features, School of Computer Science. Carnegie Mellon University, Pittsburgh (1991)
Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
Zhang, K., Zhang, L., Yang, M.H.: Real-Time Compressive Tracking. In: Computer Vision–ECCV 2012, pp. 864–877 (2012)
Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: Exploiting the Circulant Structure of Tracking-By-Detection with Kernels. In: Computer Vision–ECCV 2012, pp. 702–715 (2012)
Comaniciu, D., Ramesh, V., Meer, P.: Real-time tracking of non-rigid objects using mean shift. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. vol. 2, pp. 142–149 (2000)
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: An efficient alternative to SIFT or SURF. In: Proceedings of the 2011 International Conference on Computer Vision, pp. 2564–2571 (2011)
Alahi, A., Ortiz, O., Vandergheynst, P.: Freak: Fast Retina Keypoint. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 510–517 (2012)
Ballard, D.H.: Generalizing the Hough transform to detect arbitrary shapes. Pattern Recogn. 13(2), 111–122 (1981)
Ruiz-Del-Solar, J., Loncomilla, P.: Robot head pose detection and gaze direction determination using local invariant features. Adv. Robot. 23(3), 305–328 (2009)
Loncomilla, P., Ruiz-del-Solar, J., Martínez, L.: Object recognition using local invariant features for robotic applications: A survey. Pattern Recogn. 60, 499–514 (2016)
Tuytelaars, T., Mikolajczyk, K.: Local invariant feature detectors: a survey. Found. Trends Comput. Graph. Vis. 3(3), 177–280 (2008)
Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10), 1615–1630 (2005)
Guo, Z., Zhang, D.: A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Process. 19(6), 1657–1663 (2010)
Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, Vol. 1, pp. 886–893 (2005)
Chu, C.T., Hwang, J.N., Pai, H.I., Lan, K.M.: Tracking human under occlusion based on adaptive multiple kernels with projected gradients. IEEE Trans. Multimed. 15(7), 1602–1615 (2013)
Zhou, X., Li, Y., He, B.: Tracking Humans in Mutual Occlusion based on Game Theory (2013)
Jeong, J.M., Yoon, T.S., Park, J.B.: Kalman filter based multiple objects detection-tracking algorithm robust to occlusion. In: 2014 Proceedings of the SICE Annual Conference (SICE), pp. 941–946 (2014)
Rahmatian, S., Safabakhsh, R.: Online Multiple People Tracking-By-Detection in Crowded Scenes. In: 2014 7th International Symposium on Telecommunications (IST), pp. 337–342 (2014)
Suresh, S., Chitra, K., Deepack, P.: Patch Based Frame Work for Occlusion Detection in Multi Human Tracking. In: Circuits, Power and Computing Technologies (ICCPCT), pp. 1194–1196 (2013)
Li, Z., Tang, Q.L., Sang, N.: Improved mean shift algorithm for occlusion pedestrian tracking. Electron. Lett. 44(10), 622–623 (2008)
Yan, J., Ling, Q., Zhang, Y., Li, F., Zhao, F.: A Novel Occlusion-Adaptive Multi-Object Tracking Method for Road Surveillance Applications. In: 2013 32nd Chinese Control Conference (CCC), pp. 3547–3551 (2013)
Tang, S., Andriluka, M., Milan, A., Schindler, K., Roth, S., Schiele, B.: Learning People Detectors for Tracking in Crowded Scenes. In: 2013 IEEE International Conference on Computer Vision (ICCV), pp. 1049–1056 (2013)
Tang, S., Andriluka, M., Schiele, B.: Detection and tracking of occluded people. Int. J. Comput. Vis. 110(1), 58–69 (2014)
Guan, Y., Chen, X., Yang, D., Wu, Y.: Multi-Person Tracking-By-Detection with Local Particle Filtering and Global Occlusion Handling. In: 2014 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2014)
Viterbi, A.J.: Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inf. Theory 13(2), 260–269 (1967)
Pairo, W., Ruiz-del-Solar, J., Verschae, R., Correa, M., Loncomilla, P.: Person Following by Mobile Robots: Analysis of Visual and Range Tracking Methods and Technologies. In: Robocup 2013: Robot World Cup XVII, pp. 231–243 (2014)
Ruiz-del-Solar, J., Correa, M., Verschae, R., Bernuy, F., Loncomilla, P., Mascaró, M., Riquelme, R., Smith, F.: Bender – A general-purpose social robot with human-robot interaction abilities. J. Hum.–Robot Interact. 1(2), 54–75 (2012)
Uijlings, R., van de Sande, A., Gevers, T., Smeulders, M.: Selective search for object recognition. Int. J. Comput. Vis. 104(2), 154–171 (2013)
Zitnick, C.L., Dollár, P.: Edge Boxes: Locating object proposals from edges. In: ECCV 2014, Lecture Notes in Computer Science of Computer Vision, vol. 8639, pp. 391–405 (2014)
Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Visual object detection with deformable part models. Commun. ACM 56(9), 97–105 (2013)
Dollár, P., Belongie, S.J., Perona, P.: The fastest pedestrian detector in the West. BMVC 2(3), 68.1–68.11 (2010)
Dollár, P., Appel, R., Belongie, S., Perona, P.: Fast feature pyramids for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 36(8), 1532–1545 (2014)
Kalal, Z., Mikolajczyk, K., Matas, J.: Tracking-learning-detection. IEEE Trans. Pattern Anal. Mach. Intell. 34(7), 1409–1422 (2012)
Pernici, F., Del Bimbo, A.: Object tracking by oversampling local features. IEEE Trans. Pattern Anal. Mach. Intell. 36(12), 2538–2551 (2014)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet Classification with Deep Convolutional Neural Networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Girshick, R.: Fast R-CNN. arXiv:1504.08083 [cs.CV] (2015)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In: Advances in Neural Information Processing Systems (NIPS), Vol. 28 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Chen, L., Zhou, F., Shen, Y., Tian, X., Ling, H., Chen, Y.: Illumination insensitive efficient Second-Order minimization for planar object tracking. ICRA (2017)
Tan, D.J., Ilic, S.: Multi-forest tracker: a chameleon in tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1202–1209 (2014)
Tan, D.J., Tombari, F., Ilic, S., Navab, N.: A versatile learning-based 3d temporal tracker: Scalable, robust, online. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 693–701 (2015)
Tan, D.J., Navab, N., Tombari, F.: Looking beyond the Simple Scenarios: Combining Learners and Optimizers in 3D Temporal Tracking. In: IEEE Transactions on Visualization and Computer Graphics (2017)
Redmon, J., Farhadi, A.: YOLO9000: Better, Faster, Stronger. arXiv:1612.08242 (2016)
Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 23(10), 1499–1503 (2016)
Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-speed tracking with Kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37, 583–596 (2015)
Hare, S., Golodetz, S., Saffari, A., Vineet, V., Cheng, M.M., Hicks, S.L., Torr, P.H.S.: Struck: Structured output tracking with Kernels. IEEE Trans. Pattern Anal. Mach. Intell. 38(10), 2096–2109 (2016)
Acknowledgments
This work was partially funded by FONDECYT Projects 1130153 and 1161500.
About this article
Cite this article
Pairo, W., Loncomilla, P. & del Solar, J.R. A Delay-Free and Robust Object Tracking Approach for Robotics Applications. J Intell Robot Syst 95, 99–117 (2019). https://doi.org/10.1007/s10846-018-0840-6
DOI: https://doi.org/10.1007/s10846-018-0840-6