Abstract
Local occlusion cues have been successfully exploited to infer depth ordering from a monocular image. However, because occlusion relations are uncertain, inconsistent results frequently arise, especially for images of complex scenes. We propose a depth propagation mechanism that combines local occlusion cues and global ground cues in a probabilistic-to-energetic Bayesian framework. By maximizing the posterior, or equivalently minimizing the energy, of latent relative depth variables with a well-defined pairwise occlusion prior, we recover the correct depth ordering in the monocular setting. Our model guarantees consistent relative depth labeling in an automatically constructed topological graph by transferring the more confident aligned multi-depth cues among segments. Experiments demonstrate that our depth propagation mechanism produces more reasonable and accurate results and outperforms commonly used occlusion-based approaches on complex natural scenes.
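To make the link between maximizing a posterior and minimizing an energy concrete, the following is a minimal sketch and not the method of the article. It assumes each pairwise occlusion cue is a probability that one segment lies in front of another, defines the energy of an ordering as the sum of negative log-probabilities, and finds the minimum by brute-force enumeration instead of propagation on a topological graph. The global ground cue is omitted. The names `order_energy`, `best_ordering`, and `occlusion_prob` are hypothetical.

```python
from itertools import permutations
import math


def order_energy(order, occlusion_prob):
    """Energy (negative log-posterior up to a constant) of one depth ordering.

    order: tuple of segment names, nearest segment first.
    occlusion_prob: dict mapping (a, b) to the assumed confidence that
        segment a occludes, i.e. is nearer than, segment b.
    """
    rank = {seg: i for i, seg in enumerate(order)}
    energy = 0.0
    for (a, b), p in occlusion_prob.items():
        # Clamp so that log() stays finite for cues of exactly 0 or 1.
        p = min(max(p, 1e-6), 1.0 - 1e-6)
        # An ordering that agrees with the cue pays -log(p);
        # one that contradicts it pays -log(1 - p).
        agrees = rank[a] < rank[b]
        energy += -math.log(p if agrees else 1.0 - p)
    return energy


def best_ordering(segments, occlusion_prob):
    """Return the ordering with minimum energy (exhaustive search).

    Only practical for a handful of segments; used here purely for illustration.
    """
    return min(permutations(segments),
               key=lambda order: order_energy(order, occlusion_prob))


# Three local cues that form an inconsistent cycle: A over B, B over C, C over A.
cues = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "A"): 0.6}
result = best_ordering(["A", "B", "C"], cues)
```

In this toy case the three local cues cannot all hold at once. The minimum-energy ordering keeps the two more confident cues and overrides the weakest one, which is the kind of consistency a global formulation provides over purely local decisions.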
Acknowledgements
This research was supported by the National Key Research and Development Program of China (2017YFB1002203), the National Natural Science Foundation of China (61503111, 61501467), and the Anhui Province Key Laboratory of Industry Safety and Emergency Technology.
About this article
Cite this article
Wu, K. Monocular relative depth reordering by propagating confidence of local and global cues. Multimed Tools Appl 78, 27155–27173 (2019). https://doi.org/10.1007/s11042-017-5432-0
DOI: https://doi.org/10.1007/s11042-017-5432-0