International Journal of Computer Vision, Volume 118, Issue 2, pp 217–239

Multi-modal RGB–Depth–Thermal Human Body Segmentation

  • Cristina Palmero
  • Albert Clapés
  • Chris Bahnsen
  • Andreas Møgelmose
  • Thomas B. Moeslund
  • Sergio Escalera

Abstract

This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, partitions the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractors, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. Using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, the baseline outperforms other state-of-the-art methods, obtaining an overlap above 75% on the novel dataset when compared to the manually annotated ground truth of human segmentations.
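As an illustration of the evaluation criterion mentioned above, the following minimal sketch computes an overlap score between a predicted segmentation mask and a ground-truth mask. It assumes that overlap means intersection over union (the Jaccard index), which the abstract does not state explicitly. The function name `overlap`, the `Mask` alias, and the two toy 4x4 masks are hypothetical and serve only as an example; they are not taken from the dataset or the authors' code.

```python
from typing import Sequence

# Binary mask stored as rows of 0/1 values (assumed representation).
Mask = Sequence[Sequence[int]]


def overlap(predicted: Mask, ground_truth: Mask) -> float:
    """Intersection over union between two binary masks of equal size."""
    if len(predicted) != len(ground_truth):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        if len(pred_row) != len(gt_row):
            raise ValueError("mask rows must have the same length")
        for p, g in zip(pred_row, gt_row):
            p_on, g_on = bool(p), bool(g)
            intersection += p_on and g_on  # pixel is foreground in both masks
            union += p_on or g_on          # pixel is foreground in either mask
    # Two empty masks agree perfectly; treat that case as full overlap.
    return intersection / union if union else 1.0


# Hypothetical masks, invented for illustration only.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
score = overlap(predicted, ground_truth)
```

In this toy case the prediction misses the bottom row of the annotated silhouette, so 6 of the 8 foreground pixels in the union are shared and the score is 0.75. How the reported figure is aggregated over frames and persons is not specified in the abstract, so the sketch scores a single mask pair only.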

Keywords

Human body segmentation · RGB · Depth · Thermal

Supplementary material

Supplementary material 1: 11263_2016_901_MOESM1_ESM.mp4 (MP4, 27017 KB)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Dept. Matemàtica Aplicada i Anàlisi, UB, Barcelona, Spain
  2. Computer Vision Center, Cerdanyola del Vallès, Spain
  3. Aalborg University, Aalborg SV, Denmark
