Holistic indoor scene understanding by context-supported instance segmentation


We propose a new method that uses pixel-level labeling information for instance-level object detection in indoor scenes from RGB-D data. Semantic labeling and instance segmentation are two different paradigms for indoor scene understanding that are usually carried out separately and independently. We integrate the two tasks so that each benefits from the other's complementary information, giving a more comprehensive understanding of the scene. Our method can build on any deep learning network used for semantic labeling by treating the intermediate layer as the category-wise local detection output. Instance segmentation is then optimized by jointly considering spatial fitness and the relational context encoded by three graphical models: the vertical placement model (VPM), the horizontal placement model (HPM) and the non-placement model (NPM). VPM, HPM and NPM represent three common but distinct indoor object placement configurations: vertical, horizontal and hanging relationships, respectively. Experimental results on two standard RGB-D datasets show that our method can significantly improve small object segmentation, with promising overall performance that is competitive with state-of-the-art methods.



This work is supported in part by the US National Institutes of Health (NIH) Grant R15 AG061833 and the Oklahoma Center for the Advancement of Science and Technology (OCAST) Health Research Grant HR18-069.

