Place Recognition via 3D Modeling for Personal Activity Lifelog Using Wearable Camera

  • Hazem Wannous
  • Vladislavs Dovgalecs
  • Rémi Mégret
  • Mohamed Daoudi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7131)


In this paper, a method for location recognition in a visual lifelog is presented. Its motivation is the detection of activity-related places within an indoor environment to facilitate navigation in the lifelog. It takes advantage of a camera mounted on the shoulder, which is primarily designed for the behavioral analysis of Instrumental Activities of Daily Living (IADL). The proposed approach provides automatic indexing of the content stream, based on presence in specific 3D places related to instrumental activities. It relies on 3D models of the places of interest, which are built using a lightweight semi-supervised approach. Performance evaluation on real data shows the potential of this approach compared to 2D-only recognition.


Structure From Motion, Place Recognition, Location Recognition, Wearable Camera, Narrow Angle Camera
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Hazem Wannous (1, 2)
  • Vladislavs Dovgalecs (1)
  • Rémi Mégret (1)
  • Mohamed Daoudi (2)
  1. IMS, UMR 5218 CNRS, University of Bordeaux, Talence, France
  2. LIFL, UMR 8022, University of Lille, Villeneuve d’Ascq, France