Scene Classification Using Multi-Resolution WAHOLB Features and Neural Network Classifier

Neural Processing Letters

Abstract

This article approaches the scene classification problem by proposing an enhanced bag of features (BoF) model and a modified radial basis function neural network (RBFNN) classifier. The proposed BoF model integrates image features extracted by histogram of oriented gradients, local binary pattern, and wavelet coefficients. The features are extracted in a hierarchical multi-resolution manner, so the approach captures multi-level (pixel-, patch-, and image-level) features. The feature histograms constructed by the BoF model are then used to train the modified RBFNN classifier. As the modification, we propose a new variant of particle swarm optimization, in which the parameters are updated adaptively, for determining the centers of the Gaussian functions in the RBFNN. Experimental results demonstrate that the proposed approach significantly outperforms state-of-the-art methods for scene classification on the OT, FP, and LSP benchmark datasets.
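The histogram step of a bag of features model can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`nearest_codeword`, `bof_histogram`, `image_signature`), the tiny 2-D descriptors, and the two-word codebooks are all hypothetical stand-ins chosen for illustration, and real HOG, LBP, or wavelet descriptors would be much higher-dimensional. The sketch assumes each feature type (and each resolution level) has its own codebook and that the per-channel histograms are simply concatenated.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to `descriptor` (Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(descriptor, codebook[k]))


def bof_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook and return an
    L1-normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def image_signature(channels):
    """Concatenate one histogram per (descriptors, codebook) pair,
    e.g. one pair per feature type and resolution level (an assumption)."""
    signature = []
    for descriptors, codebook in channels:
        signature.extend(bof_histogram(descriptors, codebook))
    return signature


# Toy data: 2-D stand-ins for real HOG-like and LBP-like descriptors.
hog_like = ([(0.1, 0.0), (0.9, 1.0), (1.0, 0.8)], [(0.0, 0.0), (1.0, 1.0)])
lbp_like = ([(0.2, 0.1), (0.0, 0.3)], [(0.0, 0.0), (1.0, 1.0)])

signature = image_signature([hog_like, lbp_like])
print(signature)
```

In a pipeline of this kind, the concatenated signature would serve as the input vector to a classifier; how the histograms are weighted or combined across resolution levels is a design choice that this sketch leaves out.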

Acknowledgements

The authors are grateful to the anonymous reviewers for their insightful comments and constructive suggestions. Part of this research has been funded by the Iranian Research Institute for Information Science and Technology (IranDoc) (No. TMU92-03-44).

Author information

Corresponding author

Correspondence to Gholam Ali Montazer.

About this article

Cite this article

Montazer, G.A., Giveki, D. Scene Classification Using Multi-Resolution WAHOLB Features and Neural Network Classifier. Neural Process Lett 46, 681–704 (2017). https://doi.org/10.1007/s11063-017-9614-6
