A hybrid egocentric video summarization method to improve the healthcare for Alzheimer patients

Abstract

Alzheimer's patients have difficulty remembering the identity of people and performing daily life activities. This paper presents a hybrid method that generates an egocentric video summary of important people, objects, and medicines to help Alzheimer's patients recall their lost memories. Lifelogging video data analysis is used to support human memory recall; however, the massive amount of lifelogging data makes it challenging to select the content most relevant for educating the Alzheimer's patient. To address this challenge, a static video summarization approach is applied to select the key-frames that are most relevant for recalling the patient's lost memories. The proposed system consists of three main modules: face, object, and medicine recognition. Histogram of oriented gradients (HOG) features are used to train a multi-class SVM for face recognition. SURF descriptors are extracted from the input video frames and used to find corresponding points between objects in the input video and reference objects stored in a database. Morphological operators are applied, followed by optical character recognition (OCR), to recognize and tag the patient's medicines. The performance of the proposed system is evaluated on 18 real-world homemade videos. Experimental results indicate that the proposed system is effective at providing the most relevant content to enhance the memory of Alzheimer's patients.
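The abstract does not state how the outputs of the three recognition modules are combined into a static summary, so the following Python sketch is only one plausible reading, not the authors' method. It assumes each frame has already been tagged by the face, object, and medicine recognizers, and it keeps a frame as a key-frame only when it shows at least one label not yet present in the summary. The `FrameTags` container, the `select_key_frames` function, and the example labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class FrameTags:
    """Labels recognized in one video frame (hypothetical container)."""
    index: int
    faces: FrozenSet[str] = frozenset()      # e.g. output of the face module
    objects: FrozenSet[str] = frozenset()    # e.g. output of the object module
    medicines: FrozenSet[str] = frozenset()  # e.g. output of the medicine module

    def labels(self) -> FrozenSet[str]:
        """All labels recognized in this frame, across the three modules."""
        return self.faces | self.objects | self.medicines


def select_key_frames(frames: List[FrameTags]) -> List[FrameTags]:
    """Keep a frame only if it contributes a label not seen so far.

    Assumption: this novelty rule stands in for the paper's selection
    criterion, which the abstract does not describe.
    """
    seen: Set[str] = set()
    summary: List[FrameTags] = []
    for frame in frames:
        new_labels = frame.labels() - seen
        if new_labels:
            summary.append(frame)
            seen |= new_labels
    return summary


# Illustrative input: made-up labels, not data from the paper.
frames = [
    FrameTags(0),
    FrameTags(1, faces=frozenset({"daughter"})),
    FrameTags(2, faces=frozenset({"daughter"})),
    FrameTags(3, faces=frozenset({"daughter"}), objects=frozenset({"keys"})),
    FrameTags(4, medicines=frozenset({"tablet_a"})),
]
summary = select_key_frames(frames)
print([f.index for f in summary])  # prints [1, 3, 4]
```

Frame 0 is dropped because nothing was recognized in it, and frame 2 is dropped because it repeats a face already in the summary. A real system would likely also weigh recognition confidence and frame quality, which this sketch ignores.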

Author information

Corresponding author

Correspondence to Ali Javed.

About this article

Cite this article

Sultan, S., Javed, A., Irtaza, A. et al. A hybrid egocentric video summarization method to improve the healthcare for Alzheimer patients. J Ambient Intell Human Comput 10, 4197–4206 (2019). https://doi.org/10.1007/s12652-019-01444-6

  • DOI: https://doi.org/10.1007/s12652-019-01444-6

Keywords

  • Alzheimer
  • Education
  • Egocentric data
  • Healthcare
  • Video summarization