
Advanced Scene Sensing for Virtual Teleconference

  • Conference paper
Systems, Signals and Image Processing (IWSSIP 2021)

Abstract

Over the last year, the need for video conferencing has risen significantly due to the ongoing global pandemic. The goal of this project is to improve the user experience, which is currently limited to voice and a plain 2D image, by adding a third spatial dimension and thus creating a more immersive setting. The Azure Kinect Developer Kit combines multiple cameras, namely an RGB camera and a depth camera. The depth camera is based on the time-of-flight (ToF) principle, casting modulated near-infrared illumination onto the scene. The setup uses multiple Azure Kinect devices, synchronized and offset in space, to obtain a non-static 3D capture of a person. The Unity engine, together with the Azure Kinect SDK, is used to process the data gathered by all devices. First, a spatial depth map is created by combining the overlaid outputs from each device. Second, RGB pixels are mapped onto the depth points to provide a final texture for the 3D model. To accommodate the need to export a continuous capture of raw data to a server, body tracking and image processing algorithms are employed. Finally, the processed data can be exported and utilized in AR, VR, or any other 3D-capable interface. This 3D projection aims to enhance the sensory experience by conveying non-verbal communication alongside classical speech in video conferences.
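The geometric core of the pipeline sketched above — back-projecting each device's depth map into 3D, moving the points into a shared world frame via extrinsic calibration, and projecting them into the RGB camera for texturing — can be illustrated with a minimal Python sketch. All names here (`unproject`, `fuse`, the `(fx, fy, cx, cy)` intrinsics tuple, the 4×4 poses) are illustrative assumptions, not APIs from the paper or the Azure Kinect SDK.

```python
# Minimal sketch of the capture pipeline described in the abstract: each
# device's depth map is back-projected to 3D, moved into a shared world
# frame via its extrinsic pose, and the merged points can then be textured
# by projecting them into the RGB camera. Intrinsics are a hypothetical
# (fx, fy, cx, cy) tuple; poses are 4x4 row-major nested lists.

def unproject(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project one depth pixel to a 3D point in metres
    (pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy)."""
    z = depth_mm / 1000.0
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)

def transform(point, pose):
    """Apply a rigid transform taking a device-frame point into the
    shared world frame (only the top 3 rows of the 4x4 matrix matter)."""
    x, y, z = point
    return tuple(
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )

def fuse(devices):
    """devices: iterable of (depth_image, intrinsics, pose).
    Returns one merged world-frame point cloud; a depth value of 0
    marks an invalid measurement and is skipped."""
    cloud = []
    for depth, (fx, fy, cx, cy), pose in devices:
        for v, row in enumerate(depth):
            for u, d in enumerate(row):
                if d == 0:
                    continue
                cloud.append(
                    transform(unproject(u, v, d, fx, fy, cx, cy), pose)
                )
    return cloud

def project(point, fx, fy, cx, cy):
    """Forward-project a 3D point into a camera image to look up the
    RGB texel for it (the texture-mapping step)."""
    x, y, z = point
    return int(round(fx * x / z + cx)), int(round(fy * y / z + cy))
```

In the real system the poses would come from multi-device extrinsic calibration and the intrinsics from the device calibration data, and per-pixel loops like these would be vectorized or run on the GPU; the sketch only shows the coordinate bookkeeping.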



Acknowledgement

The research described in this paper was financially supported by the 2020-1-CZ01-KA226-VET-094346 DiT4LL ERASMUS+ Innovation Project, MonEd - Modern Trends and New Technologies of Online Education in ICT Study Programs in European Educational Space (KEGA 015STU-4/2021), and the Excellent creative team project VirTel.

Author information

Correspondence to Ivan Minárik.



Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Minárik, I., Vančo, M., Rozinaj, G. (2022). Advanced Scene Sensing for Virtual Teleconference. In: Rozinaj, G., Vargic, R. (eds) Systems, Signals and Image Processing. IWSSIP 2021. Communications in Computer and Information Science, vol 1527. Springer, Cham. https://doi.org/10.1007/978-3-030-96878-6_18


  • DOI: https://doi.org/10.1007/978-3-030-96878-6_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96877-9

  • Online ISBN: 978-3-030-96878-6

  • eBook Packages: Computer Science, Computer Science (R0)
