International Journal of Computer Vision, Volume 100, Issue 3, pp 261–274

User-Centric Learning and Evaluation of Interactive Segmentation Systems

  • Pushmeet Kohli
  • Hannes Nickisch
  • Carsten Rother
  • Christoph Rhemann


Many successful applications of computer vision to image or video manipulation are interactive by nature. However, the parameters of such systems are often trained without taking the user into account. Traditionally, interactive systems have been treated in the same manner as their fully automatic counterparts: their performance is evaluated by computing the accuracy of their solutions under some fixed set of user interactions. In this paper, we study the problem of evaluating and learning interactive segmentation systems, which are used extensively in the real world. The key questions in this context are how to measure (1) the effort associated with a user interaction, and (2) the quality of the segmentation result as perceived by the user. We conduct a user study to analyze user behavior and answer these questions. Using the insights obtained from these experiments, we propose a framework for evaluating and learning interactive segmentation systems that brings the user into the loop. The framework is based on the use of an active robot user, a simulated model of a human user. We show how this approach can be used to evaluate and learn the parameters of state-of-the-art interactive segmentation systems. We also show how simulated user models can be integrated into the popular max-margin method for parameter learning, and we propose an algorithm to solve the resulting optimisation problem.
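To make the idea of evaluation with a simulated user concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy segmentation system (`segment`, which labels each pixel by its nearest seed) and a hypothetical robot user (`robot_user_effort`) that repeatedly clicks the first mislabeled pixel in scan order and reports the number of clicks needed. All names, the click strategy, and the use of click count as the effort measure are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]
Mask = List[List[int]]


def segment(height: int, width: int, seeds: Dict[Pixel, int]) -> Mask:
    """Toy stand-in for a segmentation system (an assumption, not the paper's):
    every pixel takes the label of its nearest seed (squared Euclidean distance)."""
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            nearest = min(seeds, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
            row.append(seeds[nearest])
        result.append(row)
    return result


def mislabeled(pred: Mask, truth: Mask) -> List[Pixel]:
    """Return all pixels where the prediction disagrees with the ground truth."""
    return [
        (r, c)
        for r in range(len(truth))
        for c in range(len(truth[0]))
        if pred[r][c] != truth[r][c]
    ]


def robot_user_effort(truth: Mask, first_click: Pixel, tolerance: int = 0) -> int:
    """Hypothetical robot user: click the first mislabeled pixel (scan order),
    giving it its true label, until at most `tolerance` pixels are wrong.
    Returns the number of clicks, used here as a crude proxy for user effort."""
    height, width = len(truth), len(truth[0])
    seeds: Dict[Pixel, int] = {first_click: truth[first_click[0]][first_click[1]]}
    while True:
        wrong = mislabeled(segment(height, width, seeds), truth)
        if len(wrong) <= tolerance:
            return len(seeds)
        r, c = wrong[0]
        # A seeded pixel is always labeled correctly, so every click is new
        # and the loop ends after at most height * width clicks.
        seeds[(r, c)] = truth[r][c]


if __name__ == "__main__":
    # Ground truth: left half background (0), right half foreground (1).
    truth = [[0, 0, 1, 1] for _ in range(4)]
    print("clicks needed:", robot_user_effort(truth, first_click=(0, 0)))
```

Under these assumptions, two systems (or two parameter settings of one system) could be compared by the average number of clicks the same robot user needs across a set of images, rather than by accuracy under a fixed set of interactions.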


Keywords: Interactive systems, Image segmentation, Learning



Christoph Rhemann was supported by the Vienna Science and Technology Fund (WWTF) under project ICT08-019.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Pushmeet Kohli (1)
  • Hannes Nickisch (2)
  • Carsten Rother (1)
  • Christoph Rhemann (3)

  1. Microsoft Research Cambridge, Cambridge, UK
  2. MPI for Intelligent Systems, Tübingen, Germany
  3. Vienna University of Technology, Vienna, Austria
