Detection of engineering vehicles in high-resolution monitoring images

Liu, Xun; Zhang, Yin; Zhang, San-yuan; Wang, Ying; Liang, Zhong-yan; Ye, Xiu-zi

doi:10.1631/FITEE.1500026

Published: 13 May 2015

Volume 16, pages 346–357, (2015)

Frontiers of Information Technology & Electronic Engineering

Xun Liu¹, Yin Zhang¹, San-yuan Zhang¹, Ying Wang¹, Zhong-yan Liang¹ & Xiu-zi Ye²

Abstract

This paper presents a novel formulation for detecting objects with articulated rigid bodies, particularly engineering vehicles, in high-resolution monitoring images. Such images contain many pixels, most of which represent the background. Our method first detects object patches in the monitoring images using a coarse detection process. In this phase, we build a descriptor based on histograms of oriented gradients (HOG) that contains color frequency information. We then use a linear support vector machine (SVM) to rapidly detect many image patches that may contain object parts, with a low false negative rate and a high false positive rate. In the second phase, we apply a refinement classification to determine which patches actually contain objects. In this stage, we use models of the object parts to enlarge the image patches so that they include the complete object. An accelerated and improved salient mask is then used to improve the performance of the dense scale-invariant feature transform (SIFT) descriptor. The detection process returns the absolute position of positive objects in the original images. We have applied our method to three datasets to demonstrate its effectiveness.
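
The two-phase structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor is a plain gradient-orientation histogram without the color frequency information, the refinement stage takes an arbitrary caller-supplied check in place of the salient-mask dense SIFT classifier, and all function names, window sizes, and thresholds are hypothetical. It only shows how a permissive linear first stage can feed enlarged candidates to a stricter second stage.

```python
import math

def orientation_histogram(patch, bins=9):
    """Unsigned gradient-orientation histogram of a 2-D grayscale patch
    (a list of rows). A simplified stand-in for a HOG-style descriptor."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # avoid division by zero
    return [v / norm for v in hist]

def linear_score(features, weights, bias):
    """Decision value of a linear classifier (the form a linear SVM uses)."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def coarse_detect(image, weights, bias, size=8, stride=4, threshold=0.0):
    """Phase 1 (assumed form): slide a window over the image and keep every
    patch whose linear score exceeds the threshold. Lowering the threshold
    trades more false positives for fewer false negatives."""
    hits = []
    for y in range(0, len(image) - size + 1, stride):
        for x in range(0, len(image[0]) - size + 1, stride):
            patch = [row[x:x + size] for row in image[y:y + size]]
            score = linear_score(orientation_histogram(patch), weights, bias)
            if score > threshold:
                hits.append((x, y))
    return hits

def refine(image, hits, accept, size=8, margin=4):
    """Phase 2 (assumed form): enlarge each candidate by a fixed margin,
    clipped to the image, and keep it only if the stricter check `accept`
    agrees. Returns boxes (x0, y0, x1, y1) in original image coordinates."""
    height, width = len(image), len(image[0])
    kept = []
    for x, y in hits:
        x0, y0 = max(0, x - margin), max(0, y - margin)
        x1, y1 = min(width, x + size + margin), min(height, y + size + margin)
        region = [row[x0:x1] for row in image[y0:y1]]
        if accept(region):
            kept.append((x0, y0, x1, y1))
    return kept
```

In this sketch `coarse_detect` would be called with weights learned elsewhere, and `refine` with any callable that inspects the enlarged region. Because the boxes are expressed in the coordinates of the input image, the output corresponds to the absolute positions mentioned in the abstract. The fixed margin is a simplification: the abstract describes enlarging patches using models of the object parts.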

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China
Xun Liu, Yin Zhang, San-yuan Zhang, Ying Wang & Zhong-yan Liang
College of Mathematics and Information Science, Wenzhou University, Wenzhou, 325035, China
Xiu-zi Ye

Corresponding author

Correspondence to Yin Zhang.

Additional information

Project supported by the China Knowledge Centre for Engineering Sciences and Technology (No. CKCEST-2014-1-2), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020027), and the National Natural Science Foundation of China (No. 61272304).

ORCID: Xun LIU, http://orcid.org/0000-0002-3045-2943

About this article

Cite this article

Liu, X., Zhang, Y., Zhang, Sy. et al. Detection of engineering vehicles in high-resolution monitoring images. Frontiers Inf Technol Electronic Eng 16, 346–357 (2015). https://doi.org/10.1631/FITEE.1500026

Received: 20 January 2015
Revised: 23 March 2015
Published: 13 May 2015
Issue Date: May 2015
DOI: https://doi.org/10.1631/FITEE.1500026
