Directional coherence-based spatiotemporal descriptor for object detection in static and dynamic scenes

Kim, Wonjun; Han, Jae-Joon

doi:10.1007/s00138-016-0801-7

Original Paper
Published: 23 July 2016

Machine Vision and Applications, Volume 28, pages 49–59 (2017)

Wonjun Kim¹ &
Jae-Joon Han²

Abstract

This paper presents a simple yet powerful local descriptor called the histograms of space–time dominant orientations (HiSTDO). HiSTDO is composed of two main components: the dominant orientation and its coherence, which measures how strongly the gradients in a local region are concentrated along the space–time dominant orientation. Incorporating both components into a histogram yields the HiSTDO descriptor. In contrast to previous methods, which are vulnerable to background clutter and camera noise, HiSTDO effectively encodes the space–time shape of underlying structures even under such challenging conditions, and it can thus be applied efficiently to various applications (e.g., object and action detection). Experimental results on diverse datasets demonstrate that the proposed descriptor is effective for human action detection as well as object detection.
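
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that the dominant orientation of a space–time cell can be taken as the principal eigenvector of the 3×3 gradient structure tensor, and that coherence can be measured by the normalized gap between the two largest eigenvalues. The function names, the coherence formula, the cell size, and the azimuth/elevation binning are all illustrative assumptions.

```python
import numpy as np


def dominant_orientation(cell, eps=1e-8):
    """Return (unit vector ordered (t, y, x), coherence in [0, 1]) for one cell.

    Assumption: the dominant orientation is the principal eigenvector of the
    structure tensor built from the space-time gradients of the cell.
    """
    gt, gy, gx = np.gradient(cell.astype(np.float64))  # axes: t, y, x
    G = np.stack([gt.ravel(), gy.ravel(), gx.ravel()], axis=1)  # N x 3
    evals, evecs = np.linalg.eigh(G.T @ G)  # eigenvalues in ascending order
    l1, l2 = evals[2], evals[1]
    # Hypothetical coherence: close to 1 when gradients align with one direction.
    coherence = (l1 - l2) / (l1 + l2 + eps)
    return evecs[:, 2], float(coherence)


def histdo_like(block, cell=4, n_az=8, n_el=2):
    """Coherence-weighted histogram of dominant orientations over a block."""
    hist = np.zeros(n_az * n_el)
    T, H, W = block.shape
    for t in range(0, T - cell + 1, cell):
        for y in range(0, H - cell + 1, cell):
            for x in range(0, W - cell + 1, cell):
                sub = block[t:t + cell, y:y + cell, x:x + cell]
                v, c = dominant_orientation(sub)
                if v[0] < 0:  # v and -v describe the same orientation
                    v = -v
                az = np.arctan2(v[1], v[2]) % (2 * np.pi)  # spatial angle
                el = np.arcsin(np.clip(v[0], 0.0, 1.0))    # temporal tilt
                a = min(int(az / (2 * np.pi) * n_az), n_az - 1)
                e = min(int(el / (np.pi / 2) * n_el), n_el - 1)
                hist[e * n_az + a] += c  # each cell votes with its coherence
    return hist / (np.linalg.norm(hist) + 1e-8)


# Example: a volume whose intensity increases only along x.
ramp = np.tile(np.arange(8, dtype=float), (8, 8, 1))
vec, coh = dominant_orientation(ramp)
descriptor = histdo_like(ramp)
```

Weighting each vote by coherence means that cells with scattered gradients, such as noise or clutter, contribute little to the histogram, which is one plausible reading of how the two components are combined.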

Author information

Authors and Affiliations

Electronics Engineering, Konkuk University, 120 Neungdong-ro, Gwangjin-gu, Seoul, 05029, Korea
Wonjun Kim
Samsung Advanced Institute of Technology (SAIT), 130 Samsung-ro, Suwon-si, Gyeonggi-do, 16678, Korea
Jae-Joon Han

Corresponding author

Correspondence to Wonjun Kim.

About this article

Cite this article

Kim, W., Han, JJ. Directional coherence-based spatiotemporal descriptor for object detection in static and dynamic scenes. Machine Vision and Applications 28, 49–59 (2017). https://doi.org/10.1007/s00138-016-0801-7

Received: 15 December 2015
Revised: 18 April 2016
Accepted: 10 July 2016
Published: 23 July 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s00138-016-0801-7
