International Journal of Computer Vision, Volume 97, Issue 1, pp 104–121

Automatic Real-Time Video Matting Using Time-of-Flight Camera and Multichannel Poisson Equations

  • Liang Wang
  • Minglun Gong
  • Chenxi Zhang
  • Ruigang Yang
  • Cha Zhang
  • Yee-Hong Yang


This paper presents an automatic real-time video matting system. The proposed system consists of two novel components. In order to automatically generate trimaps for live videos, we advocate a Time-of-Flight (TOF) camera-based approach to video bilayer segmentation. Our algorithm combines color and depth cues in a probabilistic fusion framework. The scene depth information returned by the TOF camera is less sensitive to environment changes, which makes our method robust to illumination variation, dynamic background and camera motion. For the second step, we perform alpha matting based on the segmentation result. Our matting algorithm uses a set of novel Poisson equations that are derived for handling multichannel color vectors, as well as the depth information captured. Real-time processing speed is achieved through optimizing the algorithm for parallel processing on graphics hardware. We demonstrate the effectiveness of our matting system through an extensive set of experiments.
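The abstract does not say how the trimap is obtained from the bilayer segmentation. As a rough illustration only, one common approach is to mark every pixel within a small band of the foreground/background boundary as "unknown" and leave the rest as definite foreground or background. The sketch below assumes a binary mask given as nested lists; the function name `make_trimap`, the `band` parameter, and the 0/128/255 label values are hypothetical choices for this example and are not taken from the paper.

```python
def make_trimap(mask, band=1):
    """Turn a binary segmentation into a trimap (illustrative sketch only).

    mask: list of rows, each a list of 0 (background) or 1 (foreground).
    band: half-width, in pixels, of the unknown region around the boundary.
    Returns rows of 0 (background), 128 (unknown), 255 (foreground).
    """
    h = len(mask)
    w = len(mask[0]) if h else 0
    trimap = []
    for y in range(h):
        row = []
        for x in range(w):
            label = mask[y][x]
            # A pixel is "unknown" if any pixel inside the square window of
            # half-width `band` around it carries the opposite label.
            near_boundary = False
            for ny in range(max(0, y - band), min(h, y + band + 1)):
                for nx in range(max(0, x - band), min(w, x + band + 1)):
                    if mask[ny][nx] != label:
                        near_boundary = True
                        break
                if near_boundary:
                    break
            if near_boundary:
                row.append(128)
            else:
                row.append(255 if label else 0)
        trimap.append(row)
    return trimap


if __name__ == "__main__":
    # One scanline with a single background-to-foreground transition.
    print(make_trimap([[0, 0, 0, 1, 1, 1]], band=1))
```

In a pipeline of the kind the abstract describes, a matting step would then estimate fractional alpha values only for the pixels labeled unknown, which keeps the per-frame workload small.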


Keywords: Bilayer segmentation, Video matting, Time-of-flight camera



Supplementary material

Nine AVI video files (625 KB to 1.41 MB each), including 11263_2011_471_MOESM3_ESM.avi (1.01 MB).


Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Computer Science, University of Kentucky, Lexington, USA
  2. Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada
  3. Microsoft Research, Redmond, USA
  4. Department of Computing Science, University of Alberta, Edmonton, Canada
