Interactive Learning-Based Retrieval Technique for Visual Lifelogging

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11696)


A wide range of wearable video devices can now easily collect data from users' daily lives, which has promoted the development of lifelogging applications for security, healthcare, and leisure. However, retrieving events that are not predefined remains a challenge, because it is impossible to build fully annotated databases covering every possible event. This work proposes an interactive, weakly supervised learning approach that can retrieve any kind of event using general, weakly annotated databases. The proposed system was evaluated on the database provided by the Lifelog Moment Retrieval (LMRT) challenge of ImageCLEF (Lifelog2018), where it reached first position in the final ranking.


Keywords: Lifelogging, Deep learning, Interactive, Weakly annotated, Event detection



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece
  2. Grupo de Tratamiento de Imágenes, Information Processing and Telecommunications Center (IPTC) and ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain
