R-Clustering for Egocentric Video Segmentation

Talavera, Estefania; Dimiccoli, Mariella; Bolaños, Marc; Aghaei, Maedeh; Radeva, Petia

doi:10.1007/978-3-319-19390-8_37

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9117)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

In this paper, we present a new method for temporal segmentation of egocentric video that integrates a statistical mean change detector and agglomerative clustering (AC) within an energy-minimization framework. Because most AC methods tend to oversegment video sequences when clustering their frames, we combine the clustering with a concept drift detection technique (ADWIN) that has rigorous performance guarantees. ADWIN serves as a statistical upper bound for the clustering-based video segmentation. We integrate both techniques in an energy-minimization framework that disambiguates their decisions and completes the segmentation, taking into account the temporal continuity of the video frame descriptors. We present experiments on egocentric sets of more than 13,000 images acquired with different wearable cameras, showing that our method outperforms state-of-the-art clustering methods.
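The mean change detection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single scalar feature per frame with values in [0, 1], tests every split of the window naively instead of using ADWIN's bucketed window, and uses the Hoeffding-based threshold of adaptive windowing. The function names `find_cut` and `detect_changes` and the default `delta` are hypothetical. The agglomerative clustering and energy-minimization stages are not covered.

```python
import math


def find_cut(window, delta=0.05):
    """Return the split index with the largest significant mean gap, or None.

    Assumption: values lie in [0, 1]. A split into n0 older and n1 newer
    samples is significant when |mean0 - mean1| exceeds
    eps = sqrt(ln(4 * n / delta) / (2 * m)), with m = 1 / (1/n0 + 1/n1).
    """
    n = len(window)
    total = sum(window)
    head = 0.0
    best_index, best_gap = None, 0.0
    for i in range(1, n):
        head += window[i - 1]
        n0, n1 = i, n - i
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        eps = math.sqrt(math.log(4.0 * n / delta) / (2.0 * m))
        gap = abs(head / n0 - (total - head) / n1)
        if gap > eps and gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


def detect_changes(values, delta=0.05):
    """Scan a sequence and return the indices where the mean is judged to change."""
    window = []   # samples observed since the last detected change
    start = 0     # index in `values` of the first sample in `window`
    changes = []
    for x in values:
        window.append(x)
        cut = find_cut(window, delta)
        while cut is not None:
            # Drop the older part of the window and record the boundary.
            start += cut
            changes.append(start)
            window = window[cut:]
            cut = find_cut(window, delta)
    return changes


if __name__ == "__main__":
    signal = [0.1] * 50 + [0.9] * 50
    print(detect_changes(signal))
```

Each reported index is a candidate segment boundary. In the setting the abstract describes, such boundaries would be combined with the clustering output rather than used on their own.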

Acknowledgments

This work was partially funded by TIN2012-38187-C03-01 and SGR 1219.

Author information

Authors and Affiliations

Universitat de Barcelona, Barcelona, Spain
Estefania Talavera, Mariella Dimiccoli, Marc Bolaños, Maedeh Aghaei & Petia Radeva
University of Groningen, Groningen, The Netherlands
Estefania Talavera
Computer Vision Center, Barcelona, Bellaterra, Spain
Mariella Dimiccoli & Petia Radeva

Corresponding author

Correspondence to Estefania Talavera.

Editor information

Editors and Affiliations

Universitat Politècnica de València, València, Spain
Roberto Paredes
Universidade do Porto, Porto, Portugal
Jaime S. Cardoso
Universidade de Santiago de Compostela, Santiago de Compostela, Spain
Xosé M. Pardo

Cite this paper

Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., Radeva, P. (2015). R-Clustering for Egocentric Video Segmentation. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_37

DOI: https://doi.org/10.1007/978-3-319-19390-8_37
Published: 09 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19389-2
Online ISBN: 978-3-319-19390-8
eBook Packages: Computer Science, Computer Science (R0)
