Abstract
This paper describes an algorithm for building 3D maps of objects detected in a visual scene acquired in an indoor environment. One feature of the algorithm is that it works with a standard webcam equipped with a simple device that automatically estimates the camera orientation and its distance from the floor. Another feature is its low computational complexity. The proposed algorithm first extracts from the acquired images the regions of interest (ROIs) that may contain an object. The 3D position of each ROI is then estimated and a map of the environment is generated. ROI extraction is realized with a Haar-like approach. ROIs are represented with edge-based features. The edge representation is filtered with a novel fuzzy-based technique that removes edges introduced by noise. Object classification is performed with a pseudo-2D HMM algorithm. We demonstrate the reliability of our method by discussing some critical applications in the context of human–robot interaction and robot–robot interaction. Finally, we complete our contributions by describing a case study in the robotic field and providing comprehensive experimental results that show the benefits of our approach.
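To illustrate why knowing the camera orientation and its distance from the floor helps position estimation, the following is a minimal sketch of back-projecting an image pixel onto the floor plane with a pinhole camera model. It is not the method of the paper, which the abstract does not detail. The function name `ground_point_from_pixel`, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the assumptions that the point of interest lies on the floor and that the camera has pitch only (no roll) are all hypothetical choices made for the illustration.

```python
import math


def ground_point_from_pixel(u, v, cam_height, pitch, fx, fy, cx, cy):
    """Back-project pixel (u, v) onto the floor plane (illustrative assumption).

    cam_height: camera height above the floor (e.g. in metres).
    pitch: downward tilt of the optical axis in radians (0 = horizontal).
    fx, fy, cx, cy: assumed pinhole intrinsics (focal lengths, principal point).
    Returns (lateral, forward) floor coordinates relative to the point directly
    below the camera, or None if the viewing ray does not hit the floor.
    """
    # Viewing ray in the camera frame: x right, y down, z along the optical axis.
    x = (u - cx) / fx
    y = (v - cy) / fy
    z = 1.0

    # Rotate the ray about the x axis by the pitch angle to express it in a
    # frame whose "down" axis is vertical and whose "forward" axis is horizontal.
    cos_p, sin_p = math.cos(pitch), math.sin(pitch)
    down = y * cos_p + z * sin_p
    forward = -y * sin_p + z * cos_p

    # A ray at or above the horizon never reaches the floor.
    if down <= 1e-9:
        return None

    # Scale the ray so that its vertical drop equals the camera height.
    scale = cam_height / down
    return (scale * x, scale * forward)
```

For example, with the camera one metre above the floor and tilted 45 degrees downward, the principal point maps to a floor point one metre ahead of the camera. Objects that do not touch the floor would need additional cues, which this sketch does not cover.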
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Cuzzocrea, A., Mumolo, E. & Grasso, G.M. Advanced pattern recognition from complex environments: a classification-based approach. Soft Comput 22, 4763–4778 (2018). https://doi.org/10.1007/s00500-017-2661-0
DOI: https://doi.org/10.1007/s00500-017-2661-0