International Journal of Computer Vision, Volume 106, Issue 3, pp 342–364

Rotation-Invariant HOG Descriptors Using Fourier Analysis in Polar and Spherical Coordinates

  • Kun Liu
  • Henrik Skibbe
  • Thorsten Schmidt
  • Thomas Blein
  • Klaus Palme
  • Thomas Brox
  • Olaf Ronneberger

Abstract

The histogram of oriented gradients (HOG) is widely used for image description and has proven very effective. In many vision problems, rotation-invariant analysis is necessary or preferred. Popular solutions are mainly based on pose normalization or learning, neglecting some intrinsic properties of rotations. This paper presents a method to build rotation-invariant HOG descriptors using Fourier analysis in polar/spherical coordinates, which are closely related to the irreducible representations of the 2D/3D rotation groups. This is achieved by considering a gradient histogram as a continuous angular signal which can be well represented by the Fourier basis (2D) or spherical harmonics (3D). As rotation-invariance is established in an analytical way, we can avoid discretization artifacts and create a continuous mapping from the image to the feature space. In the experiments, we first show that our method outperforms the state of the art on a public dataset for a car detection task in aerial images. We further use the Princeton Shape Benchmark and the SHREC 2009 Generic Shape Benchmark to demonstrate the high performance of our method for similarity measures of 3D shapes. Finally, we show an application on microscopic volumetric data.
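The 2D idea of treating a gradient histogram as a continuous angular signal can be illustrated with a small NumPy sketch. This is a toy illustration of the underlying principle only, not the descriptor proposed in the paper: it pools all gradients of a patch into one orientation distribution, and the function names `orientation_fourier_coeffs` and `rotate_vectors`, the parameter `max_order`, and the random test data are assumptions made for this example. It shows that rotating every gradient by an angle alpha multiplies the m-th Fourier coefficient by exp(-i m alpha), so the coefficient magnitudes are rotation-invariant without any orientation binning.

```python
import numpy as np

def orientation_fourier_coeffs(gx, gy, max_order=4):
    """Fourier coefficients c_m (m = 0..max_order) of the magnitude-weighted
    gradient orientation distribution, computed without discrete bins."""
    magnitude = np.hypot(gx, gy).ravel()
    theta = np.arctan2(gy, gx).ravel()
    orders = np.arange(max_order + 1)
    # c_m = sum_k |g_k| * exp(-i * m * theta_k)
    basis = np.exp(-1j * orders[:, None] * theta[None, :])
    return (magnitude[None, :] * basis).sum(axis=1)

def rotate_vectors(gx, gy, alpha):
    """Rotate every gradient vector counter-clockwise by alpha (radians)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return c * gx - s * gy, s * gx + c * gy

# Hypothetical gradient samples standing in for one image patch.
rng = np.random.default_rng(0)
gx = rng.standard_normal(200)
gy = rng.standard_normal(200)
alpha = 0.7

coeffs = orientation_fourier_coeffs(gx, gy)
gx_rot, gy_rot = rotate_vectors(gx, gy, alpha)
coeffs_rot = orientation_fourier_coeffs(gx_rot, gy_rot)

orders = np.arange(coeffs.size)
# Rotation acts on each coefficient as a pure phase factor ...
phase_shift_ok = np.allclose(coeffs_rot, coeffs * np.exp(-1j * orders * alpha))
# ... so the magnitudes |c_m| do not change under rotation.
magnitudes_ok = np.allclose(np.abs(coeffs_rot), np.abs(coeffs))
print(phase_shift_ok, magnitudes_ok)
```

Taking magnitudes is only the simplest way to cancel the phase factor and discards the relative phase between orders; it is used here purely to make the invariance property visible.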

Keywords

Rotation-invariance, Image descriptor, Fourier analysis, Spherical harmonics, Histogram of oriented gradients, Feature design, Volumetric data

Notes

Acknowledgments

This study was supported by the Excellence Initiative of the German Federal and State Governments: BIOSS Centre for Biological Signalling Studies (EXC 294) and the Bundesministerium für Bildung und Forschung (German Federal Ministry of Education and Research) Project: New Methods in Systems Biology (SYSTEC, 0101-31P5914) – Quantitative 3D and 4D cell analysis in living organisms.

Henrik Skibbe is indebted to the Baden-Württemberg Stiftung for financial support through the Elite Program for Post-docs. Dr. Thomas Blein was supported by a long-term post-doctoral fellowship from the European Molecular Biology Organization (EMBO, ALTF250-2009). Dr. Thomas Blein and Prof. Klaus Palme are also supported by the Deutsches Zentrum für Luft- und Raumfahrt (DLR 50WB1022) and the European Union Framework 6 Program (AUTOSCREEN, LSHG-CT-2007-037897).

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Kun Liu (1)
  • Henrik Skibbe (1, 2)
  • Thorsten Schmidt (1)
  • Thomas Blein (3, 4)
  • Klaus Palme (3)
  • Thomas Brox (1)
  • Olaf Ronneberger (1)

  1. Department of Computer Science, University of Freiburg, Freiburg, Germany
  2. Integrated Systems Biology Lab., Department of Systems Science, Kyoto University, Kyoto, Japan
  3. Institute of Biology II (Botany), University of Freiburg, Freiburg, Germany
  4. Institut Jean-Pierre Bourgin, INRA Centre de Versailles-Grignon, Versailles, France