Abstract
High-spatial-resolution videos make it possible to view an arbitrary region of interest (RoI) interactively: the user can pan, tilt, and zoom while watching the video. This chapter presents spatial-random-access-enabled video compression, which encodes the content such that arbitrary RoIs corresponding to different zoom factors can be extracted from the compressed bit-stream. The chapter also covers RoI trajectory prediction, which allows relevant content to be pre-fetched in a streaming scenario. The more accurate the prediction, the lower the percentage of missing pixels. RoI prediction techniques can perform better by adapting to the video content in addition to simply extrapolating previous moves of the input device. Finally, the chapter presents a streaming system that employs application-layer peer-to-peer (P2P) multicast while still allowing users to freely choose individual RoIs. The P2P overlay adapts on the fly to exploit the commonalities in the peers’ RoIs. This enables peers to relay data to each other in real time, thus drastically reducing the bandwidth required from dedicated servers.
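The link between prediction accuracy and missing pixels can be illustrated with a minimal sketch. It is not the chapter's method: it assumes an axis-aligned rectangular RoI, a plain constant-velocity extrapolation of the last two RoI positions (the "simply extrapolating previous moves" baseline), and a pre-fetched region that exactly equals the predicted RoI. The names `Rect`, `extrapolate_roi`, and `missing_pixel_percentage` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned RoI in pixel coordinates (hypothetical helper type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def extrapolate_roi(prev: Rect, curr: Rect, steps: int = 1) -> Rect:
    """Predict a future RoI by constant-velocity extrapolation.

    Assumption: only the top-left corner moves; the size (zoom factor)
    is taken from the most recent RoI.
    """
    dx = curr.x - prev.x
    dy = curr.y - prev.y
    return Rect(curr.x + steps * dx, curr.y + steps * dy, curr.w, curr.h)


def missing_pixel_percentage(prefetched: Rect, actual: Rect) -> float:
    """Percentage of the actual RoI's pixels not covered by the pre-fetched region."""
    overlap_w = max(0.0, min(prefetched.x + prefetched.w, actual.x + actual.w)
                    - max(prefetched.x, actual.x))
    overlap_h = max(0.0, min(prefetched.y + prefetched.h, actual.y + actual.h)
                    - max(prefetched.y, actual.y))
    area = actual.w * actual.h
    covered = overlap_w * overlap_h
    return 100.0 * (area - covered) / area


if __name__ == "__main__":
    prev = Rect(100, 50, 480, 240)
    curr = Rect(110, 50, 480, 240)      # user panned 10 pixels to the right
    predicted = extrapolate_roi(prev, curr)
    actual = Rect(132, 50, 480, 240)    # user actually accelerated the pan
    print(predicted, missing_pixel_percentage(predicted, actual))
```

In this toy setting, a prediction that matches the user's actual RoI yields zero missing pixels, and the percentage grows as the predicted and actual rectangles drift apart. A content-adaptive predictor of the kind the abstract mentions would replace `extrapolate_roi` while the missing-pixel measure stays the same.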
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mavlankar, A., Girod, B. (2010). Video Streaming with Interactive Pan/Tilt/Zoom. In: Mrak, M., Grgic, M., Kunt, M. (eds) High-Quality Visual Experience. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12802-8_19
DOI: https://doi.org/10.1007/978-3-642-12802-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12801-1
Online ISBN: 978-3-642-12802-8
eBook Packages: Engineering, Engineering (R0)