Robust Workflow Recognition Using Holistic Features and Outlier-Tolerant Fused Hidden Markov Models

  • Athanasios Voulodimos
  • Helmut Grabner
  • Dimitrios Kosmopoulos
  • Luc Van Gool
  • Theodora Varvarigou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6352)


Monitoring real-world environments such as industrial scenes is challenging because of heavy occlusions, the resemblance of different processes, and frequent illumination changes. We propose a robust framework for recognizing workflows in such complex environments, with a threefold contribution. First, we employ a novel holistic scene descriptor to model complex scenes efficiently and robustly, thus bypassing the very challenging tasks of target recognition and tracking. Second, we handle limited visibility and occlusions by exploiting redundancy, merging information from multiple cameras. Finally, we use the multivariate Student-t distribution as the observation likelihood of the employed Hidden Markov Models to further enhance robustness. We evaluate the examined approaches in real-life visual behavior understanding scenarios, and we compare and discuss the obtained results.
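The abstract does not give formulas, so the following is only a minimal sketch of why a Student-t observation likelihood can be more tolerant of outliers than a Gaussian one. It uses the standard textbook log-densities; the function names (`gaussian_logpdf`, `student_t_logpdf`) and the parameters `scale` and `dof` are illustrative assumptions, not the authors' implementation.

```python
import math
import numpy as np


def _mahalanobis_sq(x, mean, mat):
    """Squared Mahalanobis distance of x from mean under matrix mat."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(mat, diff))


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian (baseline for comparison)."""
    d = len(mean)
    _, logdet = np.linalg.slogdet(cov)
    delta = _mahalanobis_sq(x, mean, cov)
    return -0.5 * (d * math.log(2.0 * math.pi) + logdet + delta)


def student_t_logpdf(x, mean, scale, dof):
    """Log-density of a multivariate Student-t with `dof` degrees of freedom.

    `scale` is the scale matrix (hypothetical parameter name); smaller `dof`
    gives heavier tails, so distant observations are penalised less.
    """
    d = len(mean)
    _, logdet = np.linalg.slogdet(scale)
    delta = _mahalanobis_sq(x, mean, scale)
    return (
        math.lgamma((dof + d) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * d * math.log(dof * math.pi)
        - 0.5 * logdet
        - 0.5 * (dof + d) * math.log1p(delta / dof)
    )


if __name__ == "__main__":
    mean = np.zeros(2)
    eye = np.eye(2)
    outlier = np.array([10.0, 10.0])  # far from the state mean
    print("Gaussian  log-likelihood:", gaussian_logpdf(outlier, mean, eye))
    print("Student-t log-likelihood:", student_t_logpdf(outlier, mean, eye, dof=3.0))
```

In an HMM, such a log-density would play the role of the per-state observation likelihood. Because the Student-t penalty grows logarithmically rather than quadratically with distance, a single corrupted frame should pull the decoded state sequence less strongly than it would under a Gaussian.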


Hidden Markov Model, Multiple Camera, Holistic Feature, Motion History Image, Scene Representation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Athanasios Voulodimos (1)
  • Helmut Grabner (2)
  • Dimitrios Kosmopoulos (3)
  • Luc Van Gool (2, 4)
  • Theodora Varvarigou (1)
  1. School of Electrical & Computer Engineering, National Technical University of Athens, Greece
  2. Computer Vision Laboratory, ETH Zurich, Switzerland
  3. Institute of Informatics and Telecommunications, N.C.S.R. Demokritos, Greece
  4. ESAT-PSI/IBBT, K.U. Leuven, Belgium
