Abstract
This paper presents a novel approach for real-time egocentric activity recognition in which component atomic events are characterised in terms of binary relationships between parts of the body and manipulated objects. The key contribution is to summarise, within a histogram, the relationships that hold over a fixed time interval. This histogram is then classified into one of a number of atomic events. The relationships encode both the types of body parts and objects involved (e.g. wrist, hammer) together with a quantised representation of their distance apart and the normalised rate of change in this distance. The quantisation and classifier are both configured in a prior learning phase from training data. An activity is represented by a Markov model over atomic events. We show the application of the method in the prediction of the next atomic event within a manual procedure (e.g. assembling a simple device) and the detection of deviations from an expected procedure. This could be used, for example, in training operators in the use or servicing of a piece of equipment, or the assembly of a device from components. We evaluate our approach (‘Bag-of-Relations’) on two datasets: ‘labelling and packaging bottles’ and ‘hammering nails and driving screws’, and show superior performance to existing Bag-of-Features methods that work with histograms derived from image features [1]. Finally, we show that the combination of data from vision and inertial (IMU) sensors outperforms either modality alone.
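To make the abstract's representation concrete, the following is a minimal sketch of the Bag-of-Relations idea, not the authors' implementation: pairwise distances between tracked body parts and objects are quantised, together with their normalised rate of change, and accumulated into a histogram over a fixed time window. The part and object lists, bin edges and window handling below are illustrative assumptions; in the paper the quantisation and the atomic-event classifier are learned from training data.

    # Illustrative sketch of a 'Bag-of-Relations' histogram (assumptions, not the paper's code).
    import numpy as np

    PARTS = ["left_wrist", "right_wrist"]          # assumed tracked body parts
    OBJECTS = ["hammer", "screwdriver", "bottle"]  # assumed manipulated objects
    DIST_BINS = np.array([0.05, 0.15, 0.40])       # metres; the paper learns this quantisation
    RATE_BINS = np.array([-0.01, 0.01])            # approaching / static / receding

    def bag_of_relations(tracks, window):
        """Histogram of quantised (part, object, distance, rate) relations.

        tracks: dict mapping (part, object) -> 1-D array of per-frame distances.
        window: slice selecting the fixed time interval to summarise.
        Returns a flat, normalised histogram vector for an atomic-event classifier.
        """
        n_d, n_r = len(DIST_BINS) + 1, len(RATE_BINS) + 1
        hist = np.zeros((len(PARTS), len(OBJECTS), n_d, n_r))
        for (part, obj), dist in tracks.items():
            d = np.asarray(dist, dtype=float)[window]
            rate = np.gradient(d) / (np.abs(d) + 1e-6)   # normalised rate of change of distance
            for di, ri in zip(np.digitize(d, DIST_BINS), np.digitize(rate, RATE_BINS)):
                hist[PARTS.index(part), OBJECTS.index(obj), di, ri] += 1
        return hist.ravel() / max(hist.sum(), 1.0)       # normalise over the window

Each per-window histogram would then be classified into an atomic event (e.g. with an SVM, as in [39, 40]), and a Markov model over the sequence of atomic-event labels supports prediction of the next event and detection of deviations from the expected procedure.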
References
Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR (2008)
Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding 104, 90–126 (2006)
Turaga, P.K., Chellappa, R., Subrahmanian, V.S., Udrea, O.: Machine recognition of human activities: A survey. IEEE Trans. Circuits Syst. Video Techn. 18, 1473–1488 (2008)
Aggarwal, J.K., Ryoo, M.S.: Human activity analysis: A review. ACM Comput. Surv. 43, 1–16 (2011)
Schüldt, C., Laptev, I., Caputo, B.: Recognizing human actions: A local SVM approach. In: ICPR, pp. 32–36 (2004)
Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. In: ICCV, pp. 1395–1402 (2005)
Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: HMDB: A large video database for human motion recognition. In: ICCV, pp. 2556–2563 (2011)
Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos “in the wild”. In: CVPR, pp. 1996–2003 (2009)
Gupta, A., Davis, L.S.: Objects in action: An approach for combining action understanding and object perception. In: CVPR (2007)
Fathi, A., Ren, X., Rehg, J.M.: Learning to recognize objects in egocentric activities. In: CVPR, pp. 3281–3288 (2011)
Kitani, K.M., Okabe, T., Sato, Y., Sugimoto, A.: Fast unsupervised ego-action learning for first-person sports videos. In: CVPR, pp. 3241–3248 (2011)
Fathi, A., Farhadi, A., Rehg, J.M.: Understanding egocentric activities. In: ICCV, pp. 407–414 (2011)
Aghazadeh, O., Sullivan, J., Carlsson, S.: Novelty detection from an ego-centric perspective. In: CVPR, pp. 3297–3304 (2011)
Wanstall, B.: HUD on the Head for Combat Pilots. Interavia 44, 334–338 (1989)
Damen, D., Bunnun, P., Calway, A., Mayol-Cuevas, W.: Real-time learning and detection of 3D texture-less objects: A scalable approach. In: BMVC (2012)
Pinhanez, C., Bobick, A.: Human action detection using PNF propagation of temporal constraints. In: CVPR (1998)
Ryoo, M.S., Aggarwal, J.K.: Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities. In: ICCV, pp. 1593–1600 (2009)
Sridhar, M., Cohn, A.G., Hogg, D.C.: Unsupervised learning of event classes from video. In: AAAI (2010)
Bleser, G., Hendeby, G., Miezal, M.: Using egocentric vision to achieve robust inertial body tracking under magnetic disturbances. In: ISMAR, pp. 103–109 (2011)
Reiss, A., Hendeby, G., Bleser, G., Stricker, D.: Activity Recognition Using Biomechanical Model Based Pose Estimation. In: Lukowicz, P., Kunze, K., Kortuem, G. (eds.) EuroSSC 2010. LNCS, vol. 6446, pp. 42–55. Springer, Heidelberg (2010)
Bobick, A.F., Davis, J.W.: The recognition of human movement using temporal templates. IEEE Trans. Pattern Anal. Mach. Intell. 23, 257–267 (2001)
Efros, A.A., Berg, A.C., Mori, G., Malik, J.: Recognizing action at a distance. In: ICCV, pp. 726–733 (2003)
Ryoo, M.S.: Human activity prediction: Early recognition of ongoing activities from streaming videos. In: ICCV, pp. 1036–1043 (2011)
Lan, T., Wang, Y., Yang, W., Mori, G.: Beyond actions: Discriminative models for contextual group activities. In: NIPS, pp. 1216–1224 (2010)
Shi, Y., Huang, Y., Minnen, D., Bobick, A., Essa, I.: Propagation networks for recognition of partially ordered sequential action. In: CVPR, pp. 862–869 (2004)
Veres, G., Grabner, H., Middleton, L., Van Gool, L.: Automatic Workflow Monitoring in Industrial Environments. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010, Part I. LNCS, vol. 6492, pp. 200–213. Springer, Heidelberg (2011)
Behera, A., Cohn, A.G., Hogg, D.C.: Workflow Activity Monitoring Using Dynamics of Pair-Wise Qualitative Spatial Relations. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, C.-W., Andreopoulos, Y., Breiteneder, C. (eds.) MMM 2012. LNCS, vol. 7131, pp. 196–209. Springer, Heidelberg (2012)
Worgan, S.F., Behera, A., Cohn, A.G., Hogg, D.C.: Exploiting Petri-net structure for activity classification and user instruction within an industrial setting. In: ICMI, pp. 113–120 (2011)
Starner, T., Pentland, A.: Real-time American sign language recognition from video using hidden Markov models. In: Proc. of Int’l Symposium on Computer Vision, pp. 265–270 (1995)
Ward, J.A., Lukowicz, P., Tröster, G., Starner, T.E.: Activity recognition of assembly tasks using body-worn microphones and accelerometers. IEEE Trans. PAMI 28, 1553–1567 (2006)
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001)
Vedaldi, A., Zisserman, A.: Efficient additive kernels via explicit feature maps. In: CVPR, pp. 3539–3546 (2010)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Behera, A., Hogg, D.C., Cohn, A.G. (2013). Egocentric Activity Monitoring and Recovery. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37431-9_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37430-2
Online ISBN: 978-3-642-37431-9