
Group-based spatio-temporal video analysis and abstraction using wavelet parameters

  • Original Paper
Signal, Image and Video Processing

Abstract

We present a spatio-temporal, event-based approach to video signal analysis and abstraction that uses wavelet transform features. The video signal is modeled as a sequence of independent visual components called events: temporally overlapping, compact functions that describe the temporal evolution of a given set of spatial parameters of the video. We use an event-based temporal decomposition technique to resolve this overlapping arrangement, which is known to be one of the main difficulties for conventional frame-based video analysis schemes. In our method, a set of spatial parameters extracted from the video is expressed, through an optimization process, as a linear combination of these event functions. First, to reduce computational complexity, the video sequence is divided into overlapping groups. Next, generalized Gaussian density (GGD) parameters extracted from the 2D wavelet transform subbands serve as the spatial parameters. Temporal decomposition is then applied to the GGD parameters, structured as a frame-based matrix of GGD vectors, to compute the event functions and the associated orthogonal GGD parameters. Frames located at event centroids, which are far fewer than the frames in the original video, are taken as keyframe candidates, and the final keyframes are selected from them using a distance criterion in the feature space. Our contribution is that, unlike current methods, this still-image video abstraction scheme requires no shot or cluster boundary detection. Experimental results confirm the efficiency and accuracy of our approach.
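
Two steps of the pipeline described above can be illustrated with a minimal sketch: fitting GGD parameters to the coefficients of one wavelet subband, and filtering keyframe candidates with a distance criterion. The abstract does not say which estimator or which criterion the authors use, so everything here is an assumption made for illustration: the GGD is taken in its usual zero-mean form with scale `alpha` and shape `beta`, the fit uses moment matching, and the candidate filter is a greedy Euclidean-distance threshold. The names `ggd_moment_ratio`, `estimate_ggd`, `select_keyframes`, and `threshold` are hypothetical.

```python
import math


def ggd_moment_ratio(beta: float) -> float:
    """Return (E|x|)^2 / E[x^2] for a zero-mean GGD with shape parameter beta."""
    # Log-gamma keeps the computation stable for small beta values.
    log_r = (2.0 * math.lgamma(2.0 / beta)
             - math.lgamma(1.0 / beta)
             - math.lgamma(3.0 / beta))
    return math.exp(log_r)


def estimate_ggd(coeffs, lo=0.1, hi=10.0, iters=60):
    """Moment-matching estimate of (alpha, beta) for one subband (assumed estimator)."""
    n = len(coeffs)
    m1 = sum(abs(c) for c in coeffs) / n   # sample mean of |x|
    m2 = sum(c * c for c in coeffs) / n    # sample second moment
    if m2 == 0.0:
        raise ValueError("all coefficients are zero; GGD fit is undefined")
    target = m1 * m1 / m2
    # The ratio grows monotonically with beta, so bisection is enough.
    # Targets outside the bracket simply converge to the nearest bound.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    # E|x| = alpha * Gamma(2/beta) / Gamma(1/beta), solved for alpha.
    alpha = m1 * math.exp(math.lgamma(1.0 / beta) - math.lgamma(2.0 / beta))
    return alpha, beta


def select_keyframes(candidates, threshold):
    """Greedy filter over (frame_index, feature_vector) candidates.

    A candidate is kept only if its Euclidean distance to every keyframe
    kept so far exceeds `threshold` (an assumed criterion, not the paper's).
    """
    kept = []
    for frame_idx, feat in candidates:
        if all(math.dist(feat, kf_feat) > threshold for _, kf_feat in kept):
            kept.append((frame_idx, feat))
    return [idx for idx, _ in kept]
```

In this sketch, a frame's feature vector would be the concatenation of the `(alpha, beta)` pairs from all of its subbands, and `candidates` would hold the frames at the event centroids. The temporal decomposition that produces those centroids is not shown, because the abstract does not give enough detail to reproduce it.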

Author information

Corresponding author

Correspondence to M. Omidyeganeh.

About this article

Cite this article

Omidyeganeh, M., Ghaemmaghami, S. & Shirmohammadi, S. Group-based spatio-temporal video analysis and abstraction using wavelet parameters. SIViP 7, 787–798 (2013). https://doi.org/10.1007/s11760-011-0268-y

  • DOI: https://doi.org/10.1007/s11760-011-0268-y
