Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch

International Journal of Computer Vision

Abstract

Foreground object segmentation is a critical step for many image analysis tasks. While automated methods can produce high-quality results, their failures disappoint users in need of practical solutions. We propose a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and automated methods. The framework is based on a prediction module that estimates the quality of given algorithm-drawn segmentations. We demonstrate the value of the framework for two novel tasks related to predicting how to distribute annotation efforts between algorithms and humans. Specifically, we develop two systems that automatically decide, for a batch of images, when to recruit humans versus computers to create (1) coarse segmentations required to initialize segmentation tools and (2) final, fine-grained segmentations. Experiments demonstrate the advantage of relying on a mix of human and computer efforts over relying on either resource alone for segmenting objects in images coming from three diverse modalities (visible, phase contrast microscopy, and fluorescence microscopy).
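As a rough illustration of the allocation idea described in the abstract, the sketch below assumes each image in the batch already has a predicted quality score for its algorithm-drawn segmentation, and sends the lowest-scoring images to humans until a fixed budget is spent. The function name `allocate_budget`, the score dictionary, and the "lowest predicted quality first" rule are assumptions made for illustration; the abstract does not specify the exact decision rule or how the scores are produced.

```python
from typing import Dict, List, Tuple


def allocate_budget(
    predicted_quality: Dict[str, float], human_budget: int
) -> Tuple[List[str], List[str]]:
    """Split a batch into images for human annotators and images left to algorithms.

    predicted_quality maps a (hypothetical) image id to the predicted quality
    of its algorithm-drawn segmentation, where higher means better.
    human_budget is the number of images humans can annotate.
    """
    if human_budget < 0:
        raise ValueError("human_budget must be non-negative")

    # Rank images from lowest to highest predicted quality.
    # Ties are broken by image id so the result is deterministic.
    ranked = sorted(predicted_quality, key=lambda name: (predicted_quality[name], name))

    # Spend the budget on the segmentations predicted to be worst.
    k = min(human_budget, len(ranked))
    return ranked[:k], ranked[k:]


# Example batch with made-up scores (illustrative values only).
scores = {"a.png": 0.91, "b.png": 0.35, "c.png": 0.62, "d.png": 0.12}
to_humans, to_algorithms = allocate_budget(scores, human_budget=2)
print(to_humans)
print(to_algorithms)
```

Ranking the whole batch, rather than thresholding each image independently, keeps the human workload fixed in advance, which matches the fixed-budget setting the abstract describes.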

Notes

  1. So that each dataset contributes similarly, we randomly sample 2000 segmentations from the MSRA10K dataset.

  2. For the one dataset large enough to train a deep model, MSRA10K, we find that fine-tuning off-the-shelf CNNs (namely, AlexNet, VGG, and ResNet) yields similar or worse performance than the other models tested in our experiments, including those using the frozen CNN features without fine-tuning. This suggests that the proposed features are well matched for the target task.

References

  • Alpert, S., Galun, M., Basri, R., & Brandt, A. (2007). Image segmentation by probabilistic bottom-up aggregation and cue integration. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1–8).

  • Arbelaez, P., Maire, M., Fowlkes, C., & Malik, J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898–916.

  • Arbeláez, P., Pont-Tuset, J., Barron, J., Marques, F., & Malik, J. (2014). Multiscale combinatorial grouping. In IEEE conference on computer vision and pattern recognition (pp. 328–335).

  • Ballard, D. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(2), 111–122.

  • Batra, D., Kowdle, A., Parikh, D., Luo, J., & Chen, T. (2010). iCoseg: Interactive co-segmentation with intelligent scribble guidance. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3169–3176). IEEE.

  • Bernard, O., Friboulet, D., Thevenaz, P., & Unser, M. (2009). Variational b-spline level-set: A linear filtering approach for fast, deformable model evolution. IEEE Transactions on Image Processing, 18(6), 1179–1191.

  • Biswas, A., & Parikh, D. (2013). Simultaneous active learning of classifiers & attributes via relative feedback. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 644–651).

  • Branson, S., Grant, V. H., Wah, C., Perona, P., & Belongie, S. (2014). The ignorant led by the blind: A hybrid human-machine vision system for fine-grained categorization. International Journal of Computer Vision, 108, 3–29.

  • Carlier, A., Charvillat, V., Salvador, A., i Nieto, X. G., & Marques, O. (2014). Click’n’Cut: Crowdsourced interactive segmentation with object candidates. In International ACM workshop on crowdsourcing for multimedia (pp. 53–56).

  • Carreira, J., & Sminchisescu, C. (2010). Constrained parametric min-cuts for automatic object segmentation. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3241–3248).

  • Caselles, V., Kimmel, R., & Sapiro, G. (1997). Geodesic active contours. International Journal of Computer Vision, 22(1), 61–79.

  • Chan, T., & Vese, L. (2001). Active contours without edges. IEEE Transactions on Image Processing, 10(2), 266–277.

  • Cheng, M., Mitra, N. J., Huang, X., Torr, P. H. S., & Hu, S. (2014). Global contrast based salient region detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3), 569–582.

  • Chittajallu, D. R., Florian, S., Kohler, R. H., Iwamoto, Y., Orth, J. D., Weissleder, R., et al. (2015). In vivo cell-cycle profiling in xenograft tumors by quantitative intravital microscopy. Nature Methods, 12(6), 577–585.

  • Cui, J., Yang, Q., Wen, F., Wu, Q., Zhang, C., Gool, L. V., & Tang, X. (2008). Transductive object cutout. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1–8).

  • Endres, I., & Hoiem, D. (2010). Category independent object proposals. In European conference on computer vision (ECCV) (pp. 575–588).

  • Everingham, M., Gool, L. V., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (voc) challenge. International Journal of Computer Vision, 88(2), 303–338.

  • Glenn, D. R., Lee, K., Park, H., Weissleder, R., Yacoby, A., Lukin, M. D., et al. (2015). Single-cell magnetic imaging using a quantum diamond microscope. Nature Methods, 12, 736–738.

  • Grady, L., Jolly, M. P., & Seitz, A. (2011). Segmentation from a box. In IEEE international conference on computer vision (ICCV) (pp. 367–374).

  • Gulshan, V., Rother, C., Criminisi, A., Blake, A., & Zisserman, A. (2010). Geodesic star convexity for interactive image segmentation. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3129–3136).

  • Gurari, D., He, K., Xiong, B., Zhang, J., Sameki, M., Jain, S. D., et al. (2018). Predicting foreground object ambiguity and efficiently crowdsourcing the segmentation(s). International Journal of Computer Vision, 126, 714–730.

  • Gurari, D., Jain, S. D., Betke, M., & Grauman, K. (2016). Pull the plug? predicting if computers or humans should segment images. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 382–391).

  • Gurari, D., Theriault, D., Sameki, M., & Betke, M. (2014). How to use level set methods to accurately find boundaries of cells in biomedical images? Evaluation of six methods paired with automated and crowdsourced initial contours. In Conference on medical image computing and computer assisted intervention (MICCAI): Interactive medical image computation (IMIC) workshop (pp. 9).

  • Gurari, D., Theriault, D., Sameki, M., Isenberg, B., Pham, T. A., Purwada, A., Solski, P., Walker, M., Zhang, C., Wong, J. Y., & Betke, M. (2015). How to collect segmentations for biomedical images? A benchmark evaluating the performance of experts, crowdsourced non-experts, and algorithms. In IEEE winter conference on applications in computer vision (WACV) (pp. 8).

  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In IEEE conference on computer vision and pattern recognition (pp. 770–778).

  • Jain, S. D., & Grauman, K. (2013). Predicting sufficient annotation strength for interactive foreground segmentation. In IEEE international conference on computer vision (ICCV) (pp. 1313–1320).

  • Jain, S. D., Xiong, B., & Grauman, K. (2017). Fusionseg: Learning to combine motion and appearance for fully automatic segmention of generic objects in videos. In 2017 IEEE conference on computer vision and pattern recognition (CVPR) (Vol. 1).

  • Kohlberger, T., Singh, V., Alvino, C., Bahlmann, C., & Grady, L. (2012). Evaluating segmentation error without ground truth. In Medical image computing and computer assisted intervention (MICCAI) (pp. 528–536).

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (NIPS) (pp. 1097–1105).

  • Lankton, S., & Tannenbaum, A. (2008). Localizing region-based active contours. IEEE Transactions on Image Processing, 17(11), 2029–2039.

  • Lempitsky, V., Kohli, P., Rother, C., & Sharp, T. (2009). Image segmentation with a bounding box prior. In IEEE international conference on computer vision (ICCV) (pp. 277–284).

  • Li, C., Kao, C. Y., Gore, J. C., & Ding, Z. (2008). Minimization of region-scalable fitting energy for image segmentation. IEEE Transactions on Image Processing, 17(10), 1940–1949.

  • Li, H., Meng, F., Luo, B., & Zhu, S. (2014). Repairing bad co-segmentation using its quality evaluation and segment propagation. IEEE Transactions on Image Processing, 23(8), 3545–3559.

  • Liu, D., Xiong, Y., Pulli, K., & Shapiro, L. (2011). Estimating image segmentation difficulty. In Machine learning and data mining in pattern recognition (pp. 484–495).

  • Liu, T., Yuan, Z., Sun, J., Wang, J., Zheng, N., Tang, X., et al. (2011). Learning to detect a salient object. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(2), 353–367.

  • Maitra, M., Gupta, R. K., & Mukherjee, M. (2012). Detection and counting of red blood cells in blood cell images using Hough transform. International Journal of Computer Applications, 53(16), 18–22.

  • Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In International conference on computer vision (ICCV) (Vol. 2, pp. 416–423).

  • Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1), 62–66.

  • Rother, C., Kolmogorov, V., & Blake, A. (2004). GrabCut: Interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics, 23(3), 309–314.

  • Settles, B. (2010). Active learning literature survey. Technical report, University of Wisconsin, Madison.

  • Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

  • Vijayanarasimhan, S., & Grauman, K. (2011). Cost-sensitive active visual category learning. International Journal of Computer Vision, 91, 24–44.

  • Wah, C., Maji, S., & Belongie, S. (2015). Learning localized perceptual similarity metrics for interactive categorization. In IEEE Winter conference on applications in computer vision (WACV) (pp. 502–509).

  • Wu, J., Zhao, Y., Zhu, J., Luo, S., & Tu, Z. (2014). MILCut: A sweeping line multiple instance learning paradigm for interactive image segmentation. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 256–263).

Acknowledgements

The authors thank the anonymous crowd workers for participating in our experiments. This work is supported in part by National Science Foundation funding to DG (IIS-1755593), a gift from Adobe to DG, National Science Foundation funding to MB (IIS-1421943), a Google Faculty Award to MB, AWS Machine Learning Research Award to KG, IBM Faculty Award to KG, IBM Open Collaborative Research Award to KG, and a gift from Qualcomm to KG.

Author information

Corresponding author

Correspondence to Danna Gurari.

Additional information

Communicated by Gang Hua.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gurari, D., Zhao, Y., Jain, S.D. et al. Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch. Int J Comput Vis 127, 1198–1216 (2019). https://doi.org/10.1007/s11263-019-01172-6

  • DOI: https://doi.org/10.1007/s11263-019-01172-6
