Multimedia Tools and Applications, Volume 76, Issue 3, pp 4471–4489

Cauchy Estimator Discriminant Learning for RGB-D Sensor-based Scene Classification

  • Dapeng Tao
  • Xipeng Yang
  • Weifeng Liu
  • Shuifa Sun
  • Yanan Guo
  • Ying Yu
  • Jianxin Pang


Depth information has proven effective for scene classification, so RGB-D sensor-based scene classification has received wide attention. However, when images are corrupted by noise during transmission, the recognition rate declines significantly. Furthermore, once feature representation schemes are applied, the concatenated features extracted from each RGB and depth image pair are very high-dimensional. This paper therefore presents a new dimensionality reduction algorithm called Cauchy estimator discriminant learning (CEDL). CEDL addresses two goals simultaneously: (1) mitigating, to some extent, the negative influence of noise in the input samples, and (2) preserving the local and global geometric structure of the input samples. Experiments on the widely used NYU Depth V1 dataset suggest that CEDL is effective compared with other state-of-the-art scene classification methods.
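To illustrate why a Cauchy estimator can dampen the effect of noisy samples, the following minimal sketch compares the standard Cauchy (Lorentzian) M-estimator loss with the ordinary squared loss. It is not the CEDL objective, which this abstract does not state; the scale parameter `c`, the function names, and the sample residuals are all assumptions chosen for illustration.

```python
import math


def squared_loss(r):
    """Least-squares loss: grows quadratically with the residual r."""
    return 0.5 * r * r


def cauchy_loss(r, c=1.0):
    """Standard Cauchy M-estimator loss; c is an assumed scale parameter."""
    return 0.5 * c * c * math.log(1.0 + (r / c) ** 2)


def cauchy_weight(r, c=1.0):
    """Weight a residual receives under iteratively reweighted least squares."""
    return 1.0 / (1.0 + (r / c) ** 2)


# Illustrative residuals; the last one plays the role of a noisy sample.
residuals = [0.1, 1.0, 10.0]
for r in residuals:
    print(
        f"r={r:5.1f}  squared={squared_loss(r):8.3f}  "
        f"cauchy={cauchy_loss(r):6.3f}  weight={cauchy_weight(r):.3f}"
    )
```

Because the Cauchy loss grows only logarithmically, a large residual contributes far less to the objective than it would under the squared loss, and its weight shrinks toward zero. This is the general robustness property that motivates using such an estimator on noisy inputs.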


RGB-D sensors; Scene classification; Dimensional reduction; Patch alignment framework; Cauchy estimator



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Dapeng Tao (1, 2)
  • Xipeng Yang (1, 2)
  • Weifeng Liu (3)
  • Shuifa Sun (2)
  • Yanan Guo (1)
  • Ying Yu (1)
  • Jianxin Pang (4)
  1. College of Information, Yunnan University, Kunming, China
  2. Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang, China
  3. China University of Petroleum (East China), Qingdao, China
  4. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
