Computational Visual Media, Volume 2, Issue 2, pp 131–142

3D modeling and motion parallax for improved videoconferencing

  • Zhe Zhu
  • Ralph R. Martin
  • Robert Pepperell
  • Alistair Burleigh
Open Access
Research Article


We consider a face-to-face videoconferencing system that uses a Kinect camera at each end of the link for 3D modeling and an ordinary 2D display for output. The Kinect camera allows a 3D model of each participant to be transmitted; the background, which is assumed to be static, is sent separately. The Kinect also tracks the receiver’s head, allowing our system to render a view of the sender that depends on the receiver’s viewpoint. The resulting motion parallax gives receivers a strong impression of 3D viewing as they move, yet the system needs only an ordinary 2D display. This is cheaper than a full 3D system, and avoids disadvantages such as the need to wear shutter glasses or a VR headset, or to sit in the particular position required by an autostereo display. Perceptual studies show that users experience a greater sensation of depth with our system than with a typical 2D videoconferencing system.
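The geometry behind viewpoint-dependent rendering can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not say how their renderer is set up. The sketch assumes the display lies in the plane z = 0 with its centre at the origin, that the tracked head position is given in metres in that same coordinate frame, and that the scene sits at negative z (behind the screen). The function names `project_to_screen` and `off_axis_frustum` are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project_to_screen(eye: Vec3, point: Vec3) -> Tuple[float, float]:
    """Intersect the ray from the eye through a scene point with the screen plane z = 0.

    Assumption: the eye is at z > 0 (in front of the screen) and the point
    has a smaller z than the eye.
    """
    ex, ey, ez = eye
    px, py, pz = point
    if ez <= pz:
        raise ValueError("eye must be further from the scene than the point (larger z)")
    t = ez / (ez - pz)  # ray parameter at which z reaches 0
    return (ex + (px - ex) * t, ey + (py - ey) * t)


def off_axis_frustum(
    eye: Vec3, width: float, height: float, near: float
) -> Tuple[float, float, float, float]:
    """Return (left, right, bottom, top) of an asymmetric view frustum at the near plane.

    The frustum passes through the edges of a width x height screen centred
    at the origin, as seen from the tracked eye position.
    """
    ex, ey, ez = eye
    if ez <= 0:
        raise ValueError("eye must be in front of the screen (z > 0)")
    scale = near / ez  # similar triangles: screen plane -> near plane
    left = (-width / 2 - ex) * scale
    right = (width / 2 - ex) * scale
    bottom = (-height / 2 - ey) * scale
    top = (height / 2 - ey) * scale
    return left, right, bottom, top


if __name__ == "__main__":
    behind = (0.0, 0.0, -0.4)    # point 40 cm behind the screen
    on_screen = (0.0, 0.0, 0.0)  # point lying in the screen plane
    for eye in [(0.0, 0.0, 0.6), (0.1, 0.0, 0.6)]:
        print(eye, project_to_screen(eye, behind), project_to_screen(eye, on_screen))
        print("frustum:", off_axis_frustum(eye, width=0.5, height=0.3, near=0.1))
```

With these assumed numbers, moving the head 10 cm to the right shifts the on-screen image of a point 40 cm behind the display by about 4 cm, while a point lying in the screen plane does not move. This depth-dependent shift is the motion parallax cue; the asymmetric frustum bounds are the values one would typically pass to a graphics API's frustum call to render such a view.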


Keywords: naked-eye 3D; motion parallax; videoconferencing; real-time 3D modeling



Copyright information

© The Author(s) 2016

Authors and Affiliations

  • Zhe Zhu (1)
  • Ralph R. Martin (2)
  • Robert Pepperell (3)
  • Alistair Burleigh (3)
  1. TNList, Tsinghua University, Beijing, China
  2. School of Computer Science & Informatics, Cardiff University, Cardiff, UK
  3. Cardiff School of Art & Design, Cardiff Metropolitan University, Cardiff, UK
