Applied Intelligence, Volume 40, Issue 2, pp 358–375

A machine learning based intelligent vision system for autonomous object detection and recognition

  • Dominik Maximilián Ramík
  • Christophe Sabourin (corresponding author)
  • Ramon Moreno
  • Kurosh Madani


Existing object recognition techniques often rely on human-labeled data, which severely limits the design of fully autonomous machine vision systems. In this work, we present an intelligent machine vision system able to autonomously learn individual objects present in a real environment. The system relies on salient object detection, and its design is inspired by the early processing stages of the human visual system. In this context, we propose a novel, fast algorithm for visually salient object detection that is robust to real-world illumination conditions. We then use it to extract salient objects, which can be efficiently used to train the machine learning-based object detection and recognition unit of the proposed system. We evaluate our salient object detection algorithm on the MSRA Salient Object Database benchmark and compare its quality with other state-of-the-art approaches. The proposed system has been implemented on a humanoid robot, increasing its autonomy in learning and in interaction with humans. We report and discuss the obtained results, which validate the proposed concepts.
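
The abstract describes a pipeline in which salient objects are detected, extracted, and then used as training samples. The sketch below illustrates that general idea only; it is not the algorithm proposed in the paper. It assumes a simple global colour-contrast measure (the distance of each pixel from the mean image colour), and the function names saliency_map, salient_bounding_box and crop, as well as the 0.5 threshold, are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[float, float, float]          # (R, G, B)
Image = List[List[Pixel]]                   # rows of pixels
Box = Tuple[int, int, int, int]             # (top, left, bottom, right), inclusive


def saliency_map(image: Image) -> List[List[float]]:
    """Assumed measure: Euclidean distance of each pixel from the mean colour,
    normalised to [0, 1]. A uniform image yields an all-zero map."""
    n = len(image) * len(image[0])
    mean = tuple(sum(px[c] for row in image for px in row) / n for c in range(3))
    sal = [[sum((px[c] - mean[c]) ** 2 for c in range(3)) ** 0.5 for px in row]
           for row in image]
    peak = max(max(row) for row in sal)
    if peak == 0:
        return sal
    return [[v / peak for v in row] for row in sal]


def salient_bounding_box(sal: List[List[float]],
                         threshold: float = 0.5) -> Optional[Box]:
    """Bounding box of all pixels whose saliency reaches the threshold."""
    coords = [(y, x) for y, row in enumerate(sal)
              for x, v in enumerate(row) if v >= threshold]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


def crop(image: Image, box: Box) -> Image:
    """Cut the salient region out of the image, e.g. as a training sample."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


if __name__ == "__main__":
    # Toy scene: grey background with a small red object.
    scene = [[(100.0, 100.0, 100.0)] * 6 for _ in range(6)]
    for y in (2, 3):
        for x in (3, 4):
            scene[y][x] = (200.0, 20.0, 20.0)
    box = salient_bounding_box(saliency_map(scene))
    print(box, len(crop(scene, box)))
```

In a system of the kind described, crops obtained this way would be passed, without human labeling, to the learning-based recognition unit. A raw colour distance such as this one is sensitive to lighting changes, which is the weakness the illumination robustness mentioned in the abstract addresses.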


Intelligent machine vision; Visual saliency; Unsupervised learning; Object recognition



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Dominik Maximilián Ramík (1)
  • Christophe Sabourin (1, corresponding author)
  • Ramon Moreno (2)
  • Kurosh Madani (1)

  1. LISSI EA 3956, Senart-FB Institute of Technology, University Paris Est-Creteil (UPEC), Lieusaint Cedex, France
  2. Facultad de Informática, Grupo de Inteligencia Computacional de la Universidad del Pais Vasco, San Sebastian, Spain