Interactively Co-segmentating Topically Related Images with Intelligent Scribble Guidance
We present an algorithm for interactive co-segmentation of a foreground object from a group of related images. While previous works in co-segmentation have focussed on unsupervised co-segmentation, we use successful ideas from the interactive object-cutout literature. We develop an algorithm that allows users to decide what the foreground is, and then guide the output of the co-segmentation algorithm towards it via scribbles. Interestingly, keeping a user in the loop leads to simpler and highly parallelizable energy functions, allowing us to work with significantly more images per group. However, unlike the interactive single-image counterpart, a user cannot be expected to exhaustively examine all cutouts (from tens of images) returned by the system to make corrections. Hence, we propose iCoseg, an automatic recommendation system that intelligently recommends where the user should scribble next. We introduce and make publicly available the largest co-segmentation dataset yet, the CMU-Cornell iCoseg dataset, with 38 groups, 643 images, and pixelwise hand-annotated groundtruth. Through machine experiments and real user studies with our developed interface, we show that iCoseg can intelligently recommend regions to scribble on, and that users following these recommendations can achieve good-quality cutouts with significantly less time and effort than exhaustively examining all cutouts.
Keywords: Interactive segmentation, Co-segmentation, Scribbles, Energy minimization
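To make the idea of recommending where to scribble next more concrete, the following is a minimal sketch and not the method described in the paper. It assumes that each image in a group already has a per-pixel foreground probability map (for example, from some segmentation model), and it uses a single, hypothetical uncertainty cue: the mean label entropy inside a small window. The function names, the window size, and the toy probability values are all illustrative assumptions; the abstract does not specify which cues iCoseg actually combines.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical input type: per-pixel foreground probability in [0, 1].
Grid = List[List[float]]


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) label; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def window_uncertainty(prob: Grid, top: int, left: int, size: int) -> float:
    """Mean label entropy over a size x size window (assumed cue)."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            total += binary_entropy(prob[r][c])
    return total / (size * size)


def recommend_scribble(
    group: Dict[str, Grid], size: int = 2
) -> Tuple[str, int, int, float]:
    """Return (image name, top, left, score) of the most uncertain window.

    Every image in the group is scanned, so the user is pointed at one
    region instead of having to inspect every cutout.
    """
    best = None
    for name, prob in group.items():
        rows, cols = len(prob), len(prob[0])
        for top in range(rows - size + 1):
            for left in range(cols - size + 1):
                score = window_uncertainty(prob, top, left, size)
                if best is None or score > best[3]:
                    best = (name, top, left, score)
    if best is None:
        raise ValueError("no image is large enough for the requested window")
    return best


# Toy group of two 3x3 probability maps (made-up values for illustration).
group = {
    "confident": [[0.99, 0.98, 0.01], [0.97, 0.99, 0.02], [0.98, 0.03, 0.01]],
    "ambiguous": [[0.9, 0.9, 0.1], [0.9, 0.5, 0.5], [0.1, 0.5, 0.5]],
}
name, top, left, score = recommend_scribble(group)
print(name, top, left, score)
```

In this toy setup the recommendation falls on the window of the second image whose probabilities are all 0.5, since that is where the assumed model is least certain. A real system would likely score regions with richer cues and then re-run the segmentation after each new scribble.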