International Journal of Computer Vision, Volume 118, Issue 2, pp 217–239

Multi-modal RGB–Depth–Thermal Human Body Segmentation

  • Cristina Palmero
  • Albert Clapés
  • Chris Bahnsen
  • Andreas Møgelmose
  • Thomas B. Moeslund
  • Sergio Escalera

DOI: 10.1007/s11263-016-0901-x

Cite this article as:
Palmero, C., Clapés, A., Bahnsen, C. et al. Int J Comput Vis (2016) 118: 217. doi:10.1007/s11263-016-0901-x


This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest via background subtraction, partitions the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extraction methods, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and a Random Forest for the stacked learning, outperforms other state-of-the-art methods, obtaining an overlap above 75% with the manually annotated ground-truth human segmentations on the novel dataset.
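The fusion scheme in the abstract (per-cell descriptors, class-conditional Gaussian mixture models, and a Random Forest over the stacked likelihoods) can be sketched as follows. This is an illustrative reconstruction using scikit-learn, not the authors' implementation: the cell count, descriptor dimensionality, number of mixture components, and the toy data are all assumptions.

```python
# Hypothetical sketch of a stacked GMM + Random Forest fusion, as described
# in the abstract. All sizes and parameters are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_cells, n_feat = 8, 16  # assumed grid size and descriptor length

# Toy training descriptors per cell for "human" and "background" samples.
X_human = rng.normal(0.5, 1.0, size=(200, n_cells, n_feat))
X_bg = rng.normal(-0.5, 1.0, size=(200, n_cells, n_feat))

# One GMM per cell and per class models the descriptor distribution there.
gmms = {}
for c in range(n_cells):
    for label, X in (("human", X_human), ("bg", X_bg)):
        gmms[(c, label)] = GaussianMixture(
            n_components=2, random_state=0).fit(X[:, c, :])

def stacked_likelihoods(X):
    """Stack per-cell log-likelihoods under both class GMMs into one vector."""
    feats = [gmms[(c, label)].score_samples(X[:, c, :])
             for c in range(X.shape[1]) for label in ("human", "bg")]
    return np.stack(feats, axis=1)  # shape: (n_samples, n_cells * 2)

# A Random Forest fuses the stacked likelihood vectors into a final decision.
S = np.vstack([stacked_likelihoods(X_human), stacked_likelihoods(X_bg)])
y = np.array([1] * len(X_human) + [0] * len(X_bg))
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(S, y)
```

The key idea is that the forest learns how much to trust each cell's likelihoods, so unreliable cells (e.g. noisy depth or thermal regions) can be down-weighted at fusion time.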


Keywords: Human body segmentation · RGB · Depth · Thermal

Supplementary material

Supplementary material 1: 11263_2016_901_MOESM1_ESM.mp4 (26.4 MB)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Dept. Matemàtica Aplicada i Anàlisi, UB, Barcelona, Spain
  2. Computer Vision Center, Cerdanyola del Vallès, Spain
  3. Aalborg University, Aalborg SV, Denmark
