R-Clustering for Egocentric Video Segmentation

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9117)

Abstract

In this paper, we present a new method for temporal segmentation of egocentric video that integrates a statistical mean change detector and agglomerative clustering (AC) within an energy-minimization framework. Because most AC methods tend to oversegment video sequences when clustering their frames, we combine the clustering with a concept drift detection technique (ADWIN) that has rigorous performance guarantees. ADWIN serves as a statistical upper bound for the clustering-based video segmentation. We integrate both techniques in an energy-minimization framework that disambiguates their decisions and completes the segmentation, taking into account the temporal continuity of the video frame descriptors. We present experiments on egocentric sets of more than 13,000 images acquired with different wearable cameras, showing that our method outperforms state-of-the-art clustering methods.
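To make the mean change detection idea concrete, the following is a minimal sketch of an ADWIN-style test on a one-dimensional signal. It is not the authors' implementation: the function names, the `min_side` parameter, the default `delta`, and the choice to cut the window at the first violating split are all assumptions made for illustration, and the input values are assumed to lie in [0, 1] so that the Hoeffding-based threshold applies.

```python
import math


def hoeffding_cut(n0, n1, delta):
    """Hoeffding-based threshold for two sub-windows of sizes n0 and n1.

    Assumes every observation lies in [0, 1].
    """
    m = 1.0 / (1.0 / n0 + 1.0 / n1)      # harmonic-mean style size term
    delta_prime = delta / (n0 + n1)      # confidence shared across the window
    return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))


def detect_mean_changes(values, delta=0.05, min_side=5):
    """Return the indices at which the sliding window was cut.

    Simplified illustration: the window grows one sample at a time and is
    cut at the first split whose two sub-window means differ by more than
    the threshold. A full ADWIN implementation instead drops the oldest
    samples repeatedly and uses a compressed bucket structure.
    """
    changes = []
    start = 0
    for t in range(1, len(values) + 1):
        window = values[start:t]
        n = len(window)
        total = sum(window)
        head = 0.0
        for split in range(1, n):
            head += window[split - 1]
            n0, n1 = split, n - split
            if n0 < min_side or n1 < min_side:
                continue                  # skip splits with too few samples
            mean0 = head / n0
            mean1 = (total - head) / n1
            if abs(mean0 - mean1) > hoeffding_cut(n0, n1, delta):
                start += split            # discard the older sub-window
                changes.append(start)
                break
    return changes


if __name__ == "__main__":
    # Hypothetical per-frame feature with an abrupt mean shift at index 50.
    signal = [0.1] * 50 + [0.9] * 50
    print(detect_mean_changes(signal))
```

In a segmentation setting, the detected cut points would act as candidate boundaries that a clustering step could refine; how the two are combined in the paper's energy-minimization framework is not specified in this abstract and is not modelled here.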

Acknowledgments

This work was partially funded by TIN2012-38187-C03-01 and SGR 1219.

Author information

Corresponding author

Correspondence to Estefania Talavera.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., Radeva, P. (2015). R-Clustering for Egocentric Video Segmentation. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_37

  • DOI: https://doi.org/10.1007/978-3-319-19390-8_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19389-2

  • Online ISBN: 978-3-319-19390-8

  • eBook Packages: Computer Science, Computer Science (R0)
