
2D-to-3D conversion using optical flow based depth generation and cross-scale hole filling algorithm

Published in: Multimedia Tools and Applications

Abstract

3D displays have become an inevitable trend in display technology, and converting traditional 2D videos to 3D is an important and effective way to ease the shortage of 3D content. The core problems in 2D-to-3D video conversion are extracting depth information from the 2D video and synthesizing a new image at a virtual viewpoint. We propose a depth extraction method based on dense edge-preserving optical flow, which reduces matching errors in textureless regions. Moreover, we use Gaussian and Laplacian pyramids across scales to fill the holes that appear in the new-viewpoint image after 3D warping. Experiments show that our results outperform state-of-the-art methods both visually and statistically.
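To make the pipeline concrete, here is a minimal numpy sketch of the three stages the abstract describes: flow magnitude as a crude depth proxy, horizontal-shift warping in the spirit of depth-image-based rendering, and coarse-to-fine pyramid hole filling. All function names, the simple block-average pyramid, and the `max_disp` parameter are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def flow_to_depth(flow_mag, eps=1e-6):
    """Normalize optical-flow magnitude into [0, 1] as a crude depth proxy:
    larger motion parallax is assumed to mean a closer object."""
    return flow_mag / (flow_mag.max() + eps)

def dibr_warp(image, depth, max_disp=8):
    """Toy depth-image-based rendering: shift each pixel horizontally in
    proportion to its depth.  NaNs in the output mark disocclusion holes."""
    h, w = image.shape
    out = np.full((h, w), np.nan)
    disp = np.round(depth * max_disp).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + disp[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out

def fill_holes_cross_scale(img):
    """Coarse-to-fine hole filling: build a coarser level by 2x2 block
    averaging that ignores NaNs (a crude Gaussian-pyramid step), fill its
    holes recursively, then copy the upsampled coarse values into the
    fine-level holes."""
    img = img.copy()
    mask = np.isnan(img)
    if not mask.any():
        return img
    h, w = img.shape
    if h < 2 or w < 2:
        img[mask] = 0.0 if mask.all() else np.nanmean(img)
        return img
    h2, w2 = h // 2, w // 2
    coarse = np.nanmean(img[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2),
                        axis=(1, 3))
    coarse = fill_holes_cross_scale(coarse)
    up = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)
    up = np.pad(up, ((0, h - up.shape[0]), (0, w - up.shape[1])),
                mode='edge')
    img[mask] = up[mask]
    return img
```

The key idea the sketch illustrates is that holes too large to fill at the finest scale shrink at coarser pyramid levels, where neighboring valid pixels become adjacent; filling coarse-to-fine therefore always terminates with a plausible (if blurred) estimate inside every disocclusion.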



Acknowledgements

This work is supported by Industrial Prospective Project of Jiangsu Technology Department under Grant No. BE2017081.

Author information


Corresponding author

Correspondence to Li Yao.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yao, L., Liu, Z. & Wang, B. 2D-to-3D conversion using optical flow based depth generation and cross-scale hole filling algorithm. Multimed Tools Appl 78, 10543–10564 (2019). https://doi.org/10.1007/s11042-018-6583-3

