Light direction estimation and hand touchable interaction for augmented reality

  • Original Article
  • Virtual Reality

Abstract

Augmented reality combines a virtual world with the real world, and improving its realism is an important topic. This paper focuses on two aspects: lighting consistency between the virtual and real worlds, and interaction with virtual objects using the hands. Traditional methods for estimating lighting conditions often require much prior knowledge of the scene. We propose a method that estimates the light direction from shadows and foreground objects using only a single scene image. We detect an object and its shadow in the scene and use their relative direction to estimate the azimuth of the light, and we use the ratio of the areas of the object and its shadow to estimate the elevation angle of the light. We tested our method on several real scenes. Because the exact light direction in the real world is difficult to acquire, we further verified the method on a number of virtual scenes with preset light directions. Hand gesture-based human–computer interaction provides a natural and easy way to interact, whereas traditional augmented reality interactions rely on markers or touch screens. We apply gesture recognition and hand touchable interaction to augmented reality, allowing the user's hand to occlude the virtual object in the picture while its gesture is recognized. By adding estimated lighting and hand touchable interaction, we enhance the realism of augmented reality.
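The idea of reading azimuth from the object-to-shadow direction and elevation from the area ratio can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that binary masks of the object and its shadow are already available, that the azimuth is measured in the image plane counterclockwise from the +x axis, and that the shadow is roughly as wide as the object, so that the area ratio approximates height over shadow length and the elevation is `atan(object_area / shadow_area)`. The function names `centroid` and `estimate_light_direction` are hypothetical.

```python
import numpy as np

def centroid(mask):
    """Return the (x, y) centroid of the True pixels in a boolean mask."""
    ys, xs = np.nonzero(mask)
    return np.array([xs.mean(), ys.mean()])

def estimate_light_direction(object_mask, shadow_mask):
    """Estimate (azimuth, elevation) in degrees from two boolean masks.

    Assumptions (not taken from the paper):
    - azimuth is the image-plane angle, counterclockwise from the +x axis;
    - the shadow is about as wide as the object, so
      object_area / shadow_area ~ height / shadow_length = tan(elevation).
    """
    obj_c = centroid(object_mask)
    shd_c = centroid(shadow_mask)

    # The shadow falls away from the light, so the light lies in the
    # direction from the shadow centroid toward the object centroid.
    to_light = obj_c - shd_c

    # Image y grows downward, so negate it to get a conventional angle.
    azimuth = np.degrees(np.arctan2(-to_light[1], to_light[0])) % 360.0

    # Area ratio as a proxy for tan(elevation) under the width assumption.
    ratio = object_mask.sum() / shadow_mask.sum()
    elevation = np.degrees(np.arctan(ratio))
    return float(azimuth), float(elevation)

# Synthetic scene: a square object with an equally sized shadow to its right.
obj = np.zeros((40, 60), dtype=bool)
shd = np.zeros((40, 60), dtype=bool)
obj[10:20, 10:20] = True
shd[10:20, 20:30] = True
azimuth, elevation = estimate_light_direction(obj, shd)
```

In this synthetic scene the shadow lies to the right of the object, so the sketch places the light on the left; a longer shadow lowers the estimated elevation. Real scenes would also need perspective and ground-plane handling, which this sketch ignores.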

Author information

Corresponding author

Correspondence to Damon Shing-Min Liu.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (MP4 19886 KB)

Supplementary file2 (MP4 18406 KB)

Supplementary file3 (MP4 3845 KB)

About this article

Cite this article

Liu, D.SM., Wu, SJ. Light direction estimation and hand touchable interaction for augmented reality. Virtual Reality 26, 1155–1172 (2022). https://doi.org/10.1007/s10055-022-00624-8

  • DOI: https://doi.org/10.1007/s10055-022-00624-8
