Abstract
Many successful applications of computer vision to image or video manipulation are interactive by nature. However, parameters of such systems are often trained neglecting the user. Traditionally, interactive systems have been treated in the same manner as their fully automatic counterparts. Their performance is evaluated by computing the accuracy of their solutions under some fixed set of user interactions. In this paper, we study the problem of evaluating and learning interactive segmentation systems which are extensively used in the real world. The key questions in this context are how to measure (1) the effort associated with a user interaction, and (2) the quality of the segmentation result as perceived by the user. We conduct a user study to analyze user behavior and answer these questions. Using the insights obtained from these experiments, we propose a framework to evaluate and learn interactive segmentation systems which brings the user in the loop. The framework is based on the use of an active robot user—a simulated model of a human user. We show how this approach can be used to evaluate and learn parameters of state-of-the-art interactive segmentation systems. We also show how simulated user models can be integrated into the popular max-margin method for parameter learning and propose an algorithm to solve the resulting optimisation problem.
Notes
E.g. ICCV 2007, NIPS 2009 and CVPR 2010.
We will refer to each user interaction in this scenario as a brush stroke.
http://research.microsoft.com/en-us/um/cambridge/projects/visionimagevideoediting/segmentation/grabcut.htm.
The quality of the segmentation results is not affected by this down-scaling.
We started the learning from no initial brushes and let it run for 60 brush strokes. The learned parameters were similar to those obtained when starting from 20 brushes.
Note that one could do even better by considering sequences of two or more brushes and then selecting the optimal one. However, the number of candidate solutions grows exponentially with the number of look-ahead steps.
This behaviour is also observed in our experiments. Note that after each user interaction we obtain the global optimum of our current energy. Also, note that the energy changes with each user interaction.
This is leave-one-out cross validation, i.e. the number of folds equals the number of data points.
However, compared to an exhaustive search over all possible joint settings of the parameters, we are not guaranteed to find the global optimum of the objective function.
Note that the high uncertainty of the “tight trimap” learning indicates that this value cannot be trusted very much.
We write images of size \(n_{x}\times n_{y}\times n_{c}\) as vectors \(\in\mathbb{R}^{n},\:n=n_{x} n_{y} n_{c}\) for simplicity. All involved operations respect the 2d grid structure, which is absent in general n-vectors.
We use the Hamming loss \(\Delta_{H}(y^{*},y^{k})=\mathbf{1}^{\top}|y^{k}-y^{*}|\).
It is in fact the most informative feature with corresponding predictor given by the identity.
To our knowledge, there is no simple graph-cut-like algorithm to do the minimisation in U all at once.
However, the cost is K runs of dynamic graph cuts of size n.
In the end, we can only safely flip a single pixel \(u_{i}^{k}\) at a time to guarantee descent.
We did not fix \(w_{u}\) to 1, as before, to give the system the freedom to set it to 0.
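As an illustration of the Hamming loss used in the notes above, the following minimal sketch counts the pixels at which a segmentation differs from the ground truth. It assumes binary labelings stored as flat vectors, as in the vectorised image notation; the function name `hamming_loss` and the example labelings are hypothetical and not taken from the paper.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> int:
    """Count differing pixels between two binary labelings: 1^T |y_pred - y_true|."""
    if len(y_true) != len(y_pred):
        raise ValueError("labelings must have the same number of pixels")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred))


# Hypothetical 2x3 image, flattened row by row into a length-6 vector.
y_star = [0, 1, 1, 0, 1, 0]  # ground-truth labeling y*
y_k = [0, 1, 0, 0, 1, 1]     # segmentation y^k after the k-th interaction

print(hamming_loss(y_star, y_k))  # prints 2 (pixels 2 and 5 differ)
```

Because the labels are binary, the absolute difference is 1 exactly where the two labelings disagree, so the sum equals the number of mislabelled pixels.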
Acknowledgement
Christoph Rhemann was supported by the Vienna Science and Technology Fund (WWTF) under project ICT08-019.
Cite this article
Kohli, P., Nickisch, H., Rother, C. et al. User-Centric Learning and Evaluation of Interactive Segmentation Systems. Int J Comput Vis 100, 261–274 (2012). https://doi.org/10.1007/s11263-012-0537-4
DOI: https://doi.org/10.1007/s11263-012-0537-4