Automatic Annotation of Daily Activity from Smartphone-Based Multisensory Streams

  • Jihun Hamm
  • Benjamin Stone
  • Mikhail Belkin
  • Simon Dennis
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 110)


We present a system for automatic annotation of daily experience from multisensory streams on smartphones. Using smartphones as a platform facilitates the collection of naturalistic daily activity, which is difficult to collect with multiple on-body sensors or arrays of sensors affixed to indoor locations. However, recognizing daily activities in unconstrained settings is more challenging than in controlled environments: 1) the multiple heterogeneous sensors in smartphones are noisier and asynchronous, vary in sampling rate, and can have missing data; 2) unconstrained daily activities are continuous, can occur concurrently, and have fuzzy onset and offset boundaries; 3) ground-truth labels obtained from the user’s self-report can be erroneous and are accurate only on a coarse time scale. To handle these problems, we present a flexible framework for incorporating heterogeneous sensory modalities, combined with state-of-the-art classifiers for sequence labeling. We evaluate the system on real-life data containing 11721 minutes of multisensory recordings, and demonstrate the accuracy and efficiency of the proposed system for practical lifelogging applications.
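To make the first challenge concrete, the following is a minimal sketch of one common way to put asynchronous streams with different sampling rates onto a shared time grid before sequence labeling. It is an illustration under stated assumptions, not the preprocessing used in the paper: the function name `bin_streams`, the one-minute window, the per-window mean, and the sensor names in the usage example are all hypothetical. Windows in which a sensor produced no samples are marked `None`, so missing data stays explicit instead of being silently filled.

```python
from collections import defaultdict
from statistics import mean


def bin_streams(streams, window_s=60.0):
    """Align asynchronous sensor streams on fixed-length windows.

    streams: dict mapping a (hypothetical) sensor name to a list of
             (timestamp_in_seconds, value) samples, in any order.
    Returns a list with one dict per window; each dict maps the sensor
    name to the mean value in that window, or None if no sample fell
    into it (missing data).
    """
    all_times = [t for samples in streams.values() for t, _ in samples]
    if not all_times:
        return []

    start = min(all_times)
    n_windows = int((max(all_times) - start) // window_s) + 1

    # Start with every sensor marked as missing in every window.
    rows = [{name: None for name in streams} for _ in range(n_windows)]

    for name, samples in streams.items():
        buckets = defaultdict(list)
        for t, v in samples:
            buckets[int((t - start) // window_s)].append(v)
        for idx, values in buckets.items():
            rows[idx][name] = mean(values)

    return rows


# Usage: two streams with different rates and a gap in each.
example = {
    "accel": [(0.0, 1.0), (30.0, 3.0), (61.0, 5.0)],
    "gps_speed": [(125.0, 2.0)],
}
rows = bin_streams(example)
```

In this sketch each window becomes one element of the sequence handed to a sequence labeler; a real system would likely compute richer per-window features than a mean and would need an explicit policy for the `None` entries.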


Keywords: mobile computing, lifelogging, activity recognition, automatic annotation





Copyright information

© ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2013

Authors and Affiliations

  • Jihun Hamm (1)
  • Benjamin Stone (2)
  • Mikhail Belkin (1)
  • Simon Dennis (2)
  1. Dept. Computer Science and Engineering, The Ohio State University, Columbus, USA
  2. Dept. Psychology, The Ohio State University, Columbus, USA
