Abstract
Augmented reality is a technology that combines a virtual world with the real world, and improving its realism is an important topic. This paper focuses on two aspects: lighting consistency between the virtual and real worlds, and hand-based interaction with virtual objects. Estimating lighting conditions with traditional methods often requires much prior knowledge of the scene. We propose a method that estimates the light direction from shadows and foreground objects using only a single scene image. We detect an object and its shadow in the scene and use their relative direction to estimate the azimuth of the light, and we use the ratio of the areas of the object and its shadow to estimate the elevation angle of the light. We tested our method on several real scenes. However, because the exact light direction in the real world is difficult to acquire, we further verified the method on a number of virtual scenes with preset light directions. Moreover, hand gesture-based human–computer interaction provides a natural and easy way to interact, whereas traditional augmented reality interactions rely on markers or touch screens. We apply gesture recognition and hand touchable interaction to augmented reality, allowing the user’s hand to occlude the virtual object in the picture and recognizing the user’s gestures. By adding estimated lighting and hand touchable interaction, we enhance the realism of augmented reality.
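The light direction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary masks for the object and its shadow are already available, takes the azimuth as the image-plane angle of the vector from the shadow centroid to the object centroid (measured counterclockwise from the image x-axis), and, as a simplifying assumption of this sketch only, maps the area ratio to an elevation angle via an arctangent, on the reasoning that a lower light casts a larger shadow. The function names `centroid_and_area` and `estimate_light_direction` are hypothetical.

```python
import math


def centroid_and_area(mask):
    """Return ((cx, cy), area) for the nonzero pixels of a 2D 0/1 mask."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        raise ValueError("mask contains no foreground pixels")
    return (sum_x / count, sum_y / count), count


def estimate_light_direction(object_mask, shadow_mask):
    """Estimate (azimuth, elevation) in degrees from object and shadow masks.

    Assumptions of this sketch (not taken from the paper):
    - azimuth is the angle of the shadow-to-object vector in the image
      plane, counterclockwise from the +x axis, in [0, 360);
    - elevation = atan(object_area / shadow_area), so equal areas
      correspond to 45 degrees.
    """
    (ox, oy), object_area = centroid_and_area(object_mask)
    (sx, sy), shadow_area = centroid_and_area(shadow_mask)

    # The shadow falls away from the light, so the vector from the shadow
    # toward the object points toward the light source.
    dx = ox - sx
    dy = oy - sy

    # Image rows grow downward, so negate dy to get a conventional angle.
    azimuth = math.degrees(math.atan2(-dy, dx)) % 360.0
    elevation = math.degrees(math.atan2(object_area, shadow_area))
    return azimuth, elevation
```

In practice the masks would come from a segmentation or shadow detection step, and the mapping from area ratio to elevation would need to be calibrated against scenes with known light directions, such as the virtual scenes mentioned above.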
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file1 (MP4 19886 KB)
Supplementary file2 (MP4 18406 KB)
Supplementary file3 (MP4 3845 KB)
Cite this article
Liu, D.SM., Wu, SJ. Light direction estimation and hand touchable interaction for augmented reality. Virtual Reality 26, 1155–1172 (2022). https://doi.org/10.1007/s10055-022-00624-8
DOI: https://doi.org/10.1007/s10055-022-00624-8