
Adaptive Video Fast Forward

Multimedia Tools and Applications

Abstract

We derive a statistical graphical model of video scenes with multiple, possibly occluded objects that can be used efficiently for tasks related to video search, browsing and retrieval. The model is trained on a query (target) clip selected by the user. The shot retrieval process is based on the likelihood of a video frame under the generative model. Instead of using a combination of weighted Euclidean distances as a shot similarity measure, the likelihood model automatically separates and balances the various causes of variability in video, including occlusion, appearance change and motion. Thus, we avoid the tedious and complex user interventions required in previous studies. We use the model in an adaptive video fast forward application that adapts the video playback speed to the likelihood of the data. The similarity of each candidate clip to the target clip defines the playback speed. Given a query, the video is played at a higher speed as long as the video content has low likelihood, and when frames similar to the query clip start to come in, the playback rate drops. A set of experiments on typical home videos demonstrates the performance, ease of use and utility of our application.
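As a rough illustration of the idea that playback speed follows frame likelihood, the sketch below maps per-frame log-likelihoods to playback speeds. It is not the authors' method: the function name `playback_speeds`, the linear mapping, and the `min_speed` and `max_speed` parameters are all assumptions made for this example, and the log-likelihoods are taken as given rather than computed from the generative model.

```python
def playback_speeds(log_likelihoods, min_speed=1.0, max_speed=8.0):
    """Map per-frame log-likelihoods to playback speeds.

    Hypothetical linear rule (an assumption, not taken from the paper):
    the most query-like frame plays at min_speed, the least query-like
    frame plays at max_speed, and everything else is interpolated.
    """
    if not log_likelihoods:
        return []

    lowest = min(log_likelihoods)
    highest = max(log_likelihoods)
    span = highest - lowest

    speeds = []
    for ll in log_likelihoods:
        # Similarity in [0, 1]; 1 means "most similar to the query clip".
        # If all frames score the same, fall back to normal speed.
        similarity = (ll - lowest) / span if span > 0 else 1.0
        speeds.append(max_speed - (max_speed - min_speed) * similarity)
    return speeds


# Frames become more query-like over time, so playback slows down.
print(playback_speeds([-100.0, -50.0, 0.0]))
```

A practical player would probably smooth these speeds over time so that playback does not jump abruptly between neighbouring frames; that step is omitted here.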



Author information

Corresponding author

Correspondence to Nemanja Petrovic.

Additional information

Nemanja Petrovic received his Ph.D. from the University of Illinois in 2004. He is currently a member of the research staff at Siemens Corporate Research in Princeton, New Jersey. His professional interests are computer vision and machine learning. Dr. Petrovic has published 20 papers in the areas of video understanding, data clustering and enhancement.

Nebojsa Jojic received his Ph.D. from the University of Illinois in 2001. He is currently a researcher at Microsoft Research in Redmond, Washington. His professional interests include computer vision and machine learning. Dr. Jojic has published over 40 papers in the areas of computer vision, bioinformatics and graphical models.

Thomas Huang received his Sc.D. from MIT in 1963. He is the William L. Everitt Distinguished Professor in the Department of Electrical and Computer Engineering at the University of Illinois and a full-time faculty member in the Beckman Institute Image Formation and Processing and Artificial Intelligence groups. Professor Huang has published over 600 papers in the areas of computer vision, image compression and enhancement, pattern recognition, and multimodal signal processing. He is a Member of the National Academy of Engineering, a Foreign Member of the Chinese Academy of Engineering and the Chinese Academy of Sciences, and a recipient of the IEEE Jack S. Kilby Signal Processing Medal.


About this article

Cite this article

Petrovic, N., Jojic, N. & Huang, T.S. Adaptive Video Fast Forward. Multimed Tools Appl 26, 327–344 (2005). https://doi.org/10.1007/s11042-005-0895-9


  • DOI: https://doi.org/10.1007/s11042-005-0895-9
