Multimedia Tools and Applications, Volume 75, Issue 20, pp 12431–12461

Adaptive delivery of immersive 3D multi-view video over the Internet

  • Cagri Ozcinar
  • Erhan Ekmekcioglu
  • Janko Ćalić
  • Ahmet Kondoz


Increasing Internet bandwidth and advances in 3D video technology have paved the way for the delivery of 3D Multi-View Video (MVV) over the Internet. However, the large volume of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D video experience may well be degraded unless content-aware precautionary mechanisms and adaptation methods are deployed. This work introduces a novel adaptive MVV streaming method that targets future-generation immersive 3D MVV experiences on multi-view displays. When the user experiences network congestion and adaptation becomes necessary, the rate-distortion optimum set of views, pre-determined by the server, is truncated from the delivered MVV streams. To maintain a high Quality of Experience (QoE) during frequent network congestion, the proposed method calculates low-overhead additional metadata that is delivered to the client. The proposed adaptive 3D MVV streaming solution is tested using the MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard. Extensive objective and subjective evaluations are presented, showing that the proposed method provides significant quality enhancement under adverse network conditions.
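The view-truncation idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a simplified model in which each view has a known bitrate and a hypothetical, additive distortion penalty incurred when that view is discarded, and it searches exhaustively for the set of views to drop so that the remaining streams fit a bandwidth budget. The function name `choose_views_to_drop` and all numbers are illustrative assumptions.

```python
from itertools import combinations


def choose_views_to_drop(bitrates_kbps, drop_penalty, budget_kbps):
    """Pick the views to discard so that the kept views fit the budget.

    bitrates_kbps: dict view_id -> bitrate of that view's stream
    drop_penalty:  dict view_id -> hypothetical distortion added when the
                   view is discarded (assumed additive across views)
    Returns (dropped_views, total_penalty), or None if no choice that
    keeps at least one view fits the budget.
    """
    views = sorted(bitrates_kbps)
    best = None
    # Try dropping 0, 1, 2, ... views; always keep at least one view.
    for k in range(len(views)):
        for dropped in combinations(views, k):
            kept_rate = sum(bitrates_kbps[v] for v in views if v not in dropped)
            if kept_rate > budget_kbps:
                continue  # remaining streams still exceed the budget
            penalty = sum(drop_penalty[v] for v in dropped)
            # Keep the feasible candidate with the lowest total penalty.
            if best is None or penalty < best[1]:
                best = (dropped, penalty)
    return best


# Illustrative values only (not taken from the paper).
bitrates = {0: 1000, 1: 800, 2: 900, 3: 1000}
penalty = {0: 5.0, 1: 1.0, 2: 1.5, 3: 5.0}
result = choose_views_to_drop(bitrates, penalty, budget_kbps=3000)
```

Because the search depends only on stream bitrates and per-view penalties, a server could in principle run it offline for several budget levels and ship the outcome as side information; whether the paper's metadata takes this form is not stated in the abstract.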


Keywords: 3D, Multi-view video, Multi-view video coding, Video coding, Video streaming, Video adaptation, Adaptive streaming, MPEG-DASH



This work was supported by the ROMEO project (grant number: 287896), which was funded by the EC FP7 ICT collaborative research programme. This paper is an extended version of the original paper [49] which appeared in the Proceedings of the 2013 ACM International Workshop on Immersive Media Experiences [7]. Special thanks to the anonymous reviewers and program chairs in the workshop and the journal for their constructive comments and suggestions that assisted in enhancing the paper.


  1. Alioscopy - Glasses-free (2016) Alioscopy glasses-free 3D displays.
  2. Apple Developer (2016) Apple HTTP live streaming.
  3. Benzie P, Watson J, Surman P, Rakkolainen I, Hopf K, Urey H, Sainov V, Kopylow C (2007) A survey of 3DTV displays: Techniques and technologies. IEEE Trans Circuits Syst Video Technol 17(11):1647–1658
  4. Bjøntegaard G (2001) Calculation of average PSNR differences between RD-curves (VCEG-M33). Tech. Rep. M16090, VCEG Meeting (ITU-T SG16 Q.6), Austin, Texas, USA
  5. Carballeira P, Tech G, Cabrera J, Müller K, Jaureguizar F, Wiegand T, Garcia N (2010) Block based Rate-Distortion analysis for quality improvement of synthesized views. In: 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2010, Tampere, pp 1–4
  6. Chakareski J (2013) Adaptive multiview video streaming: challenges and opportunities. IEEE Commun Mag 51(5):94–100
  7. Chambel T, Bove V, Strover S, Viana P, Thomas G (2013) ImmersiveMe ’13: Proceedings of the 2013 ACM International Workshop on Immersive Media Experiences. In: Proceedings of the 2013 ACM International Workshop on Immersive Media Experiences, ACM, Barcelona, Spain, p 433137
  8. Cheng C M, Lin S J, Lai S H (2011) Spatio-Temporally Consistent Novel View Synthesis Algorithm From Video-Plus-Depth Sequences for Autostereoscopic Displays. IEEE Trans Broadcast 57(2):523–532
  9. Cheung G, Velisavljevic V, Ortega A (2011) On dependent bit allocation for multiview image coding with depth-image-based rendering. IEEE Trans Image Process 20(11):3179–3194
  10. Cheung NM, Tian D, Vetro A, Sun H (2012) On modeling the rendering error in 3D video. In: Image Processing (ICIP), 2012 19th IEEE International Conference on, pp 3021–3024
  11. Christensen E, Curbera F, Meredith G, Weerawarana S (2016) Web services description language (WSDL) 1.1
  12. Cohen B (2003) Incentives build robustness in BitTorrent. In: Workshop on Economics of Peer-to-Peer systems, vol 6, pp 68–72
  13. Dempsey B, Liebeherr J, Weaver A (1996) On Retransmission-based Error Control for Continuous Media Traffic in Packet-switching Networks. Computer Networks and ISDN Systems 28(5):719–736
  14. Detti A, Ricci B, Blefari-Melazzi N (2015) Mobile Peer-to-peer Video Streaming over Information-centric Networks. Comput Netw 81(C):272–288
  15. Dimenco - No-glasses 3D (2016) Dimenco displays.
  16. Domański M, Stankiewicz O, Wegner K, Kurc M, Konieczny J, Siast J, Stankowski J, Ratajczak R, Grajek T (2013) High efficiency 3D video coding using new tools based on view synthesis. IEEE Trans Image Process 22(9):3517–3527
  17. Dufaux F, Pesquet-Popescu B, Cagnazzo M (2013) Emerging technologies for 3D video: creation, coding, transmission and rendering. John Wiley & Sons
  18. Fehn C (2004) Depth-image-based rendering (DIBR), compression, and transmission for a new approach on 3D-TV. Proc SPIE 5291:93–104
  19. Feldmann I, Mueller M, Zilly F, Tanger R, Mueller K, Smolic A, Kauff P, Wiegand T (2008) HHI test material for 3D video. Tech. Rep. MPEG2008/M15413 ISO/IEC JTC1/SC29/WG11. Archamps, France
  20. GIST Electronics and Telecommunications Research Institute and Gwangju Institute of Science and Technology (2016) 3DV Sequences of ETRI DS, Gwangju, Korea.
  21. Gürler C, Bagci K, Tekalp A (2010) Adaptive stereoscopic 3D video streaming. In: Image Processing (ICIP), 2010 17th IEEE International Conference on, Hong Kong, pp 2409–2412
  22. Helle P, Oudin S, Bross B, Marpe D, Bici M, Ugur K, Jung J, Clare G, Wiegand T (2012) Block merging for quadtree-based partitioning in HEVC. IEEE Trans Circuits Syst Video Technol 22(12):1720–1731
  23. HHI Fraunhofer (2016) 3DV Sequences of HHI Fraunhofer Heinrich Hertz Institute, Berlin, Germany.
  24. Ho Y S, Lee E K, Lee C (2008) Multiview video test sequence and camera parameters. Tech. Rep. MPEG2008/M15419 ISO/IEC JTC1/SC29/WG11. Archamps, France
  25. Hur J H, Cho S, Lee Y L (2007) Adaptive local illumination change compensation method for H.264/AVC-based multiview video coding. IEEE Trans Circuits Syst Video Technol 17(11):1496–1505
  26. ISO/IEC JTC1/SC29/WG11 (2005) Report of the subjective quality evaluation for MVC call for evidence. Tech. Rep. MPEG2005/N6999, Hong Kong, China
  27. ISO/IEC JTC1/SC29/WG11 (2011) Call for proposals on 3D video coding technology. Tech. Rep. MPEG2011/N12036, Geneva, Switzerland
  28. ISO/IEC JTC1/SC29/WG11 (2015) Call for Evidence on Free-Viewpoint Television: Super-Multiview and Free Navigation. Tech. Rep. MPEG2015/N15733, Geneva, Switzerland
  29. ISO/IEC JTC1/SC29/WG11 (2016) MPEG 3-DV View Synthesis Reference Software.
  30. ITU-R Recommendation (2012) ITU-R BT.500-13, Methodology for the subjective assessment of the quality of television pictures
  31. Jacobson V (1988) Congestion avoidance and control. ACM SIGCOMM Computer Communication Review 1:314–329
  32. Jacobson V, Smetters D, Thornton J, Plass M, Briggs N, Braynard R (2009) Networking named content. In: Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, ACM, New York, NY, USA, CoNEXT ’09, pp 1–12
  33. Kang Y S, Lee E K, Jung J I, Lee J H, Shin I Y (2009) 3D video test sequence and camera parameters. Tech. Rep. MPEG2009/M16949, ISO/IEC JTC1/SC29/WG11. Xian, China
  34. Kanungo T, Mount D, Netanyahu N, Piatko C, Silverman R, Wu A (2002) An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans Pattern Anal Mach Intell 24(7):881–892
  35. Kim WS, Ortega A, Lai P, Tian D, Gomila C (2009) Depth map distortion analysis for view rendering and depth coding. In: Image Processing (ICIP), 16th IEEE International Conference on, pp 721–724
  36. Kondoz A (2004) Digital Speech: coding for low bitrate communication systems. Wiley Online Library
  37. Kondoz A, Dagiuklas T (2014) 3D Future Internet Media. Springer
  38. Köppel M, Ndjiki-Nya P, Doshkov D, Lakshman H, Merkle P, Müller K, Wiegand T (2010) Temporally consistent handling of disocclusions with texture synthesis for depth-image-based rendering. In: Image Processing (ICIP), 2010 17th IEEE International Conference on, pp 1809–1812
  39. Lederer S, Müller C, Timmerer C (2012) Dynamic adaptive streaming over HTTP dataset. In: Proceedings of the 3rd Multimedia Systems Conference, ACM, New York, NY, USA, MMSys ’12, pp 89–94
  40. Lewandowski F, Paluszkiewicz M, Grajek T, Wegner K (2012) Subjective quality assessment methodology for 3D video compression technology. In: Signals and Electronic Systems (ICSES), 2012 International Conference on, pp 1–5
  41. Lightstone M, Mitra S (1997) Quadtree optimization for image and video coding. Journal of VLSI signal processing systems for signal, image and video technology 17(2–3):215–224
  42. Matusik W, Pfister H (2004) 3D TV: A scalable system for real-time acquisition, transmission, and autostereoscopic display of dynamic scenes. ACM Trans Graph 23(3):814–824
  43. Miller K, Quacchio E, Gennari G, Wolisz A (2012) Adaptation algorithm for adaptive streaming over HTTP. In: 2012 19th International Packet Video Workshop (PV), pp 173–178
  44. Müller C, Lederer S, Timmerer C (2012) An evaluation of dynamic adaptive streaming over HTTP in vehicular environments. In: Proceedings of the 4th Workshop on Mobile Video, ACM, New York, NY, USA, MoVid ’12, pp 37–42
  45. Müller K, Schwarz H, Marpe D, Bartnik C, Bosse S, Brust H, Hinz T, Lakshman H, Merkle P, Rhee F, Tech G, Winken M, Wiegand T (2013) 3D high-efficiency video coding for multi-view video and depth data. IEEE Trans Image Process 22(9):3366–3378
  46. Nagoya University (2016) 3DV Sequences of Nagoya University, Japan.
  47. Oh KJ, Yea S, Ho YS (2009) Hole filling method using depth based in-painting for view synthesis in free viewpoint television and 3-D video. In: Picture Coding Symposium, 2009. PCS 2009, Chicago, IL, pp 1–4
  48. Onural L (2007) Television in 3D: What are the prospects? Proc IEEE 95(6):1143–1145
  49. Ozcinar C, Ekmekcioglu E, Kondoz A (2013) Dynamic adaptive 3D multi-view video streaming over the Internet. In: Proceedings of the 2013 ACM International Workshop on Immersive Media Experiences, ACM, Barcelona, Spain, ImmersiveMe ’13, pp 51–56
  50. Ozcinar C, Ekmekcioglu E, Kondoz A (2014) Adaptive 3D multi-view video streaming over P2P networks. In: Image Processing (ICIP), 2014 IEEE International Conference on, Paris, pp 2462–2466
  51. Oztas B, Pourazad M, Nasiopoulos P, Sodagar I, Leung V (2014) A rate adaptation approach for streaming multiview plus depth content. In: Computing, Networking and Communications (ICNC), 2014 International Conference on, Honolulu, HI, pp 1006–1010
  52. P.910 ITU-T Recommendation (1999) Subjective video quality assessment methods for multimedia applications
  53. Pourebrahimi B, Bertels K, Vassiliadis S (2005) A survey of peer-to-peer networks. In: Proceedings of the 16th Annual Workshop on Circuits, Systems and Signal Processing
  54. Pulipaka A, Seeling P, Reisslein M, Karam L (2013) Traffic and Statistical Multiplexing Characterization of 3-D Video Representation Formats. IEEE Trans Broadcast 59(2):382–389
  55. Rizzo L (1997) Dummynet: A simple approach to the evaluation of network protocols. ACM SIGCOMM Computer Communication Review 27(1):31–41
  56. Savas S, Gurler C, Tekalp A (2012) Evaluation of adaptation methods for multi-view video. In: Image Processing (ICIP), 2012 19th IEEE International Conference on, Orlando, FL, pp 2273–2276
  57. Savas S, Gurler C, Tekalp A, Ekmekcioglu E, Worrall S, Kondoz A (2012) Adaptive streaming of multi-view video over P2P networks. Signal Process Image Commun 27(5):522–531
  58. Seeling P, Reisslein M (2005) The rate variability-distortion (VD) curve of encoded video and its impact on statistical multiplexing. IEEE Trans Broadcast 51(4):473–492
  59. Seema A, Schwoebel L, Shah T, Morgan J, Reisslein M (2015) WVSNP-DASH Name-Based Segmented Video Streaming. IEEE Trans Broadcast 61(3):346–355
  60. Smolic A, Mueller K, Stefanoski N, Ostermann J, Gotchev A, Akar G, Triantafyllidis G, Koz A (2007) Coding algorithms for 3DTV - a survey. IEEE Trans Circuits Syst Video Technol 17(11):1606–1621
  61. Sodagar I (2011) The MPEG-DASH Standard for Multimedia Streaming Over the Internet. IEEE MultiMedia 18(4):62–67
  62. Sripanidkulchai K, Maggs B, Zhang H (2004) An Analysis of Live Streaming Workloads on the Internet. In: Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, ACM, New York, NY, USA, IMC ’04, pp 41–54
  63. Stockhammer T (2011) Dynamic Adaptive Streaming over HTTP: Standards and Design Principles. In: Proceedings of the Second Annual ACM Conference on Multimedia Systems, ACM, New York, NY, USA, MMSys ’11, pp 133–144
  64. Sugiyama Y (1986) An algorithm for solving discrete-time Wiener-Hopf equations based upon Euclid’s algorithm. IEEE Trans Inf Theory 32(3):394–409
  65. Sullivan G, Baker R (1994) Efficient quadtree coding of images and video. IEEE Trans Image Process 3(3):327–331
  66. Sullivan G, Wiegand T (1998) Rate-distortion optimization for video compression. IEEE Signal Process Mag 15(6):74–90
  67. Sullivan G, Ohm J, Han W J, Wiegand T (2012) Overview of the high efficiency video coding (HEVC) standard. IEEE Trans Circuits Syst Video Technol 22(12):1649–1668
  68. Sullivan G, Boyce J, Chen Y, Ohm J R, Segall C, Vetro A (2013) Standardized Extensions of High Efficiency Video Coding (HEVC). IEEE J Sel Top Sign Proces 7(6):1001–1016
  69. Sun W, Cheung G, Chou P, Florencio D, Zhang C, Au O (2013) Rate-distortion optimized 3D reconstruction from noise-corrupted multiview depth videos. In: Multimedia and Expo (ICME), 2013 IEEE International Conference on, San Jose, CA, pp 1–6
  70. Tanimoto M (2009) Overview of FTV (free-viewpoint television). In: Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on, New York, NY, pp 1552–1553
  71. Tanimoto M, Fujii T, Suzuki K, Fukushima N, Mori Y (2008) Reference softwares for depth estimation and view synthesis. Tech. Rep. MPEG2008/M15377, ISO/IEC JTC1/SC29/WG11, Archamps
  72. Tech G, Schwarz H, Müller K, Wiegand T (2012) 3D video coding using the synthesized view distortion change. In: Picture Coding Symposium (PCS), 2012, Krakow, pp 25–28
  73. Thang T, Ho Q D, Kang J, Pham A (2012) Adaptive streaming of audiovisual content using MPEG DASH. IEEE Trans Consum Electron 58(1):78–85
  74. The official Microsoft IIS site (2016) Microsoft Smooth-Streaming.
  75. Vetro A, Sodagar I (2011) Industry and Standards The MPEG-DASH Standard for Multimedia Streaming Over the Internet. IEEE MultiMedia 18(4):62–67
  76. Vetro A, Tourapis A, Müller K, Tao C (2011) 3D-TV content storage and transmission. IEEE Trans Broadcast 57(2):384–394
  77. Vetro A, Wiegand T, Sullivan G (2011) Overview of the stereo and multiview video coding extensions of the H.264/MPEG-4 AVC standard. Proc IEEE 99(4):626–642
  78. Wang Z, Bovik A, Sheikh H, Simoncelli E (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
  79. Wegner K, Stankiewicz O, Klimaszewski K, Domański M (2010) Comparison of multiview compression performance using MPEG-4 MVC and prospective HVC technology. Tech. Rep. MPEG M17913 ISO/IEC JTC1/SC29/WG11. Geneva, Switzerland
  80. Wiegand T, Sullivan G, Bjontegaard G, Luthra A (2003) Overview of the H.264/AVC video coding standard. IEEE Trans Circuits Syst Video Technol 13(7):560–576
  81. Yaning L, Geurts J, Point JC, Lederer S, Rainer B, Müller C, Timmerer C, Hellwagner H (2013) Dynamic adaptive streaming over CCN: A caching and overhead analysis. In: Communications (ICC), 2013 IEEE International Conference on, pp 3629–3633
  82. Zhang C, Yin Z, Florencio D (2009) Improving depth perception with motion parallax and its application in teleconferencing. In: Multimedia Signal Processing, MMSP ’09. IEEE International Workshop on, Rio De Janeiro, pp 1–6
  83. Zhang Q, Tian L, Huang L, Wang X, Zhu H (2014) Rendering distortion estimation model for 3D high efficiency depth coding. Math Probl Eng 2014(940737):7
  84. Zhao Y, Yu L (2010) A perceptual metric for evaluating quality of synthesized sequences in 3DV system. In: Proceedings of SPIE, vol 7744, p 77440X-1

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Cagri Ozcinar (1)
  • Erhan Ekmekcioglu (3)
  • Janko Ćalić (2)
  • Ahmet Kondoz (3)

  1. LTCI, CNRS, Télécom ParisTech, Université Paris-Saclay, Paris, France
  2. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
  3. Institute for Digital Technologies, Loughborough University London, London, UK
