Artificial Intelligence Review, Volume 49, Issue 4, pp 549–580

Patch Autocorrelation Features: a translation and rotation invariant approach for image classification

  • Radu Tudor Ionescu (corresponding author)
  • Andreea Lavinia Ionescu
  • Josiane Mothe
  • Dan Popescu
Abstract

Autocorrelation is often used in signal processing as a tool for finding repeating patterns in a signal. In image processing, various image analysis techniques use the autocorrelation of an image in a broad range of applications, from texture analysis to grain density estimation. This paper provides an extensive review of two recently introduced and related frameworks for image representation based on autocorrelation, namely Patch Autocorrelation Features (PAF) and Translation and Rotation Invariant Patch Autocorrelation Features (TRIPAF). The PAF approach stores a set of features obtained by comparing pairs of patches from an image. More precisely, each feature is the Euclidean distance between a particular pair of patches. The PAF approach is successfully evaluated in a series of handwritten digit recognition experiments on the popular MNIST data set. However, the PAF approach has limited applications, because it is not invariant to affine transformations. More recently, the PAF approach was extended to become invariant to image transformations, including (but not limited to) translation and rotation changes. In the TRIPAF framework, several features are extracted from each image patch. Based on these features, a vector of similarity values is computed between each pair of patches. Then, the similarity vectors are clustered together such that the spatial offset between the patches of each pair is roughly the same. Finally, the mean and the standard deviation of each similarity value are computed for each group of similarity vectors. These statistics are concatenated to obtain the TRIPAF feature vector. The TRIPAF vector essentially records information about the repeating patterns within an image at various spatial offsets. After the two approaches are presented, they are evaluated in several optical character recognition and texture classification experiments. Results are reported on the MNIST (98.93%), Brodatz (96.51%), and UIUCTex (98.31%) data sets. Both PAF and TRIPAF are fast to compute and produce compact representations in practice, while reaching accuracy levels similar to other state-of-the-art methods.
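
The two pipelines summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names (`extract_patches`, `paf`, `tripaf_like`), the regular patch grid, the two toy per-patch features (mean and standard deviation of intensity), the use of absolute feature differences in place of similarity values, and the grouping of patch pairs by the rounded length of their spatial offset.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def extract_patches(image, size, stride):
    """Return (row, col, flattened patch) for patches on a regular grid."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            flat = [image[r + i][c + j] for i in range(size) for j in range(size)]
            patches.append((r, c, flat))
    return patches


def paf(image, size=2, stride=2):
    """PAF-style vector: one Euclidean distance per pair of patches."""
    patches = extract_patches(image, size, stride)
    return [math.dist(p, q) for (_, _, p), (_, _, q) in combinations(patches, 2)]


def tripaf_like(image, size=2, stride=2):
    """TRIPAF-style vector: per-offset mean and std of patch comparisons.

    Assumptions (not from the paper): two toy features per patch, absolute
    differences as the comparison values, and pairs grouped by the rounded
    Euclidean length of the offset between patch positions.
    """
    patches = extract_patches(image, size, stride)
    feats = [(r, c, (mean(p), pstdev(p))) for r, c, p in patches]
    groups = {}
    for (r1, c1, f1), (r2, c2, f2) in combinations(feats, 2):
        offset = round(math.hypot(r1 - r2, c1 - c2))
        diffs = [abs(a - b) for a, b in zip(f1, f2)]
        groups.setdefault(offset, []).append(diffs)
    vector = []
    for offset in sorted(groups):
        # One (mean, std) pair per comparison value within this offset group.
        for column in zip(*groups[offset]):
            vector += [mean(column), pstdev(column)]
    return vector


# A 4x4 checkerboard of 2x2 blocks, used only as a toy input.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
paf_vector = paf(image)             # 4 patches -> 6 pairwise distances
tripaf_vector = tripaf_like(image)  # 2 offset groups x 2 features x 2 statistics
```

Because the PAF-style vector keeps one entry per specific pair of patch positions, shifting or rotating the image reorders its entries. The TRIPAF-style vector only keeps statistics per offset group, which is why, in this sketch, it depends on how far apart the patches are rather than on where they sit in the image.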

Keywords

  • Patch Autocorrelation Features
  • Image autocorrelation
  • Patch-based method
  • Optical character recognition
  • Texture classification
  • Rotation invariant method
  • Translation invariant method
  • MNIST
  • Brodatz
  • UIUCTex

Notes

Acknowledgements

The authors thank the reviewers for their helpful comments. Andreea Lavinia Ionescu has been funded by the Sectoral Operational Programme Human Resources Development 2007–2013 of the Ministry of European Funds through the Financial Agreement POSDRU/159/1.5/S/134398.

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Radu Tudor Ionescu (1, corresponding author)
  • Andreea Lavinia Ionescu (2)
  • Josiane Mothe (3)
  • Dan Popescu (2)

  1. University of Bucharest, Bucharest, Romania
  2. Politehnica University of Bucharest, Bucharest, Romania
  3. École Supérieure du Professorat et de l’Éducation, IRIT, UMR 55005 CNRS, Université de Toulouse, Toulouse, France