Kinect based real-time synthetic aperture imaging through occlusion


Real-time, high-performance imaging of occluded objects is a major challenge in many computer vision applications. In recent years, camera array synthetic aperture theory has proved to be a potentially powerful way to solve this problem. However, because of the high cost of complex system hardware, the severe blur in occluded object imaging, and the slow speed of image processing, existing camera array synthetic aperture imaging algorithms and systems are difficult to apply in practice. In this paper, we present a novel handheld system to address these challenges. The objective of this work is to design a convenient system for real-time, high-quality object imaging even under severe occlusion. The main characteristics of our work are: (1) To the best of our knowledge, this is the first real-time handheld system for seeing occluded objects in the synthetic imaging domain using color and depth images. (2) A novel sequential synthetic aperture imaging framework is designed to achieve seamless interaction among multiple novel modules; the framework includes object probability generation, virtual camera array generation, and sequential synthetic aperture imaging. (3) In the virtual camera array generation module, a novel feature set iterative optimization algorithm is presented that integrates color and depth information and improves the robustness and accuracy of camera pose estimation even in dynamic occlusion scenes. Experimental results in challenging scenarios demonstrate the superiority of our system in both robustness and efficiency compared with state-of-the-art algorithms.
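For readers unfamiliar with synthetic aperture imaging, the basic idea can be illustrated with the textbook shift-and-average scheme. The sketch below is not the sequential framework described in the abstract; it is a minimal illustration that assumes a planar camera array with known offsets and a single focal plane. The function name `refocus`, its parameters, and the toy one-row images are all hypothetical.

```python
def refocus(views, offsets, disparity):
    """Shift-and-average refocusing on one focal plane (illustrative only).

    views:     list of equally sized grayscale images (lists of rows).
    offsets:   list of (dx, dy) camera positions relative to the reference
               camera, in baseline units (assumed known).
    disparity: pixel shift per baseline unit for the chosen focal plane.
    """
    height, width = len(views[0]), len(views[0][0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]

    for image, (dx, dy) in zip(views, offsets):
        # A point on the focal plane appears shifted by (sx, sy) in this view.
        sx = round(dx * disparity)
        sy = round(dy * disparity)
        for y in range(height):
            src_y = y + sy
            if not 0 <= src_y < height:
                continue
            for x in range(width):
                src_x = x + sx
                if 0 <= src_x < width:
                    total[y][x] += image[src_y][src_x]
                    count[y][x] += 1

    # Average only over the views that actually cover each pixel.
    return [
        [total[y][x] / count[y][x] if count[y][x] else 0.0 for x in range(width)]
        for y in range(height)
    ]


# Toy scene: a target of intensity 90 sits at x=3 in the reference view,
# but a dark occluder hides it there. The side views still see it.
left = [[0, 90, 0, 0, 0, 0, 0]]
center = [[0, 0, 0, 0, 0, 0, 0]]   # target hidden by the occluder
right = [[0, 0, 0, 0, 0, 90, 0]]

result = refocus([left, center, right], [(-1, 0), (0, 0), (1, 0)], disparity=2)
print(result[0][3])  # 60.0
```

Rays from the unoccluded side views align on the focal plane and reinforce the target, while the occluder's contribution is diluted by the average. A practical system must also estimate the camera poses and handle the residual blur, which is where the modules named in the abstract come in.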



This work is supported by the National Natural Science Foundation of China under Grant Numbers 61272288 and 61231016, NPU New AoXiang Star under Grant Numbers G2015KY0301 and 12GH0311, NPU New People and Direction under Grant Number 13GH014604, NSF grants IIS-CAREER-0845268 and IIS-1218156, and the Foundation of China Scholarship Council under Grant Numbers 201206965020 and 201303070083.

Author information



Corresponding author

Correspondence to Tao Yang.

About this article

Cite this article

Yang, T., Ma, W., Wang, S. et al. Kinect based real-time synthetic aperture imaging through occlusion. Multimed Tools Appl 75, 6925–6943 (2016).


Keywords

  • See through occlusion
  • Kinect
  • Synthetic aperture imaging
  • Virtual camera array