Power to the Oracle? Design Principles for Interactive Labeling Systems in Machine Learning

KI - Künstliche Intelligenz

Abstract

Labeling is the process of attaching information to some object. In machine learning, it is required as ground truth to leverage the potential of supervised techniques. A key challenge in labeling is that users are not necessarily eager to behave as simple oracles, that is, to repeatedly answer questions about whether a label is right or wrong. In this respect, scholars acknowledge designing interactivity in labeling systems as a promising area for further improvements. In recent years, a considerable number of articles focusing on interactive labeling systems have been published. However, there is a lack of consolidated principles on how to design these systems. In this article, we identify and discuss five design principles for interactive labeling systems based on a literature review and offer a frame for detecting common ground in the implementation of corresponding solutions. With these guidelines, we strive to contribute design knowledge for the increasingly important class of interactive labeling systems.

Notes

  1. We queried the NIPS proceedings back to 2003, as this coincides with our oldest previously identified publication [cf. 20].

  2. The regular proceedings of ICML were already covered by our database search up until 2009. From 2010 onwards we relied on icml.cc and proceedings.mlr.press to retrieve the corresponding articles.

References

  1. Chen NC, Drouhard M, Kocielnik R, Suh J, Aragon CR (2018) Using machine learning to support qualitative coding in social science. ACM Trans Interact Intell Syst 8(2):1–20. https://doi.org/10.1145/3185515

  2. Watson H (2017) Preparing for the cognitive generation of decision support. MIS Q Exec 16(3):153–169. https://aisel.aisnet.org/misqe/vol16/iss3/3/

  3. Baccala M, Curran C, Garrett D, Likens S, Rao A, Ruggles A, Shehab M (2018) 2018 AI predictions—8 insights to shape business strategy. https://doi.org/10.1007/s12193-015-0195-2. https://www.pwc.pl/pl/publikacje/ai-predictions-2018-report-pwc.pdf

  4. Anthes G (2017) Artificial intelligence poised to ride a new wave. Commun ACM 60(7):19–21. https://doi.org/10.1145/3088342

  5. Liu S, Liu X, Liu Y, Feng L, Qiao H, Zhou J, Wang Y (2018) Perceptual visual interactive learning. CoRR abs/1810.10789:1–11. arXiv:1810.10789

  6. Bernard J, Hutter M, Zeppelzauer M, Fellner D, Sedlmair M (2018) Comparing visual-interactive labeling with active learning: an experimental study. IEEE Trans Vis Comput Graph 24(1):298–308. https://doi.org/10.1109/TVCG.2017.2744818

  7. Zhang L, Tong Y, Ji Q (2008) Active image labeling and its application to facial action labeling. In: Forsyth D, Torr P, Zisserman A (eds) Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics). vol 5303 LNCS, Springer, Berlin, Heidelberg, pp 706–719. https://doi.org/10.1007/978-3-540-88688-4_52

  8. Amershi S, Cakmak M, Knox WB, Kulesza T (2014) Power to the people: the role of humans in interactive machine learning. AI Mag 35(4):105–120. https://doi.org/10.1609/aimag.v35i4.2513

  9. Cakmak M, Chao C, Thomaz AL (2010) Designing interactions for robot active learners. IEEE Trans Auton Ment Dev 2(2):108–118. https://doi.org/10.1109/TAMD.2010.2051030

  10. Dudley JJ, Kristensson PO (2018) A review of user interface design for interactive machine learning. ACM Trans Interact Intell Syst 8(2):1–37. https://doi.org/10.1145/3185517

  11. Nalisnik M, Gutman DA, Kong J, Cooper LAD (2015) An interactive learning framework for scalable classification of pathology images. In: International conference on big data, IEEE, pp 928–935. https://doi.org/10.1109/BigData.2015.7363841

  12. Yimam SM, Biemann C, Majnaric L, Šabanović Š, Holzinger A (2016) An adaptive annotation approach for biomedical entity and relation recognition. Brain Inf 3(3):157–168. https://doi.org/10.1007/s40708-016-0036-4

  13. Gligic L, Kormilitzin A, Goldberg P, Nevado-Holgado AJ (2019) Named entity recognition in electronic health records using transfer learning bootstrapped neural networks. CoRR abs/1901.01592:1–11. arXiv:1901.01592

  14. Kim B, Pardo B (2018) A human-in-the-loop system for sound event detection and annotation. ACM Trans Interact Intell Syst 8(2):1–23. https://doi.org/10.1145/3214366

  15. Trivedi G (2016) On interactive machine learning. Available at: https://www.trivedigaurav.com/blog/on-interactive-machine-learning/. Accessed 9 Jan 2020

  16. Settles B (2010) Active learning literature survey. Tech. rep. University of Wisconsin, Madison

  17. Fürnkranz J, Hüllermeier E (2011) Preference learning: an introduction, Springer, Berlin, pp 1–17. https://doi.org/10.1007/978-3-642-14125-6_1

  18. Sen S, Vig J, Riedl J (2009) Tagommenders: connecting users to items through tags. In: Proceedings of the 18th international conference on World Wide Web, ACM, New York, NY, USA, WWW ’09, pp 671–680. https://doi.org/10.1145/1526709.1526800

  19. Webster J, Watson RT (2002) Analyzing the past to prepare for the future: writing a literature review. MIS Q 26(2):13–23. http://www.jstor.org/stable/4132319

  20. Fails JA, Olsen DR (2003) Interactive machine learning. In: Proceedings of the international conference on Intelligent user interfaces, ACM Press, New York, NY, USA, IUI ’03, pp 39–45. https://doi.org/10.1145/604045.604056

  21. Rashid AM, Ling K, Tassone RD, Resnick P, Kraut R, Riedl J (2006) Motivating participation by displaying the value of contribution. In: Proceedings of the SIGCHI conference on human factors in computing systems, ACM, New York, CHI ’06, pp 955–958. https://doi.org/10.1145/1124772.1124915

  22. Thomaz AL, Breazeal C (2008) Teachable robots: understanding human teaching behavior to build more effective robot learners. Artif Intell 172(6–7):716–737. https://doi.org/10.1016/j.artint.2007.09.009

  23. Zikmund WG (2010) Business research methods. South-Western Cengage Learning, UK

  24. Fogarty J, Tan D, Kapoor A, Winder S (2008) CueFlik: interactive concept learning in image search. In: Proceedings of the SIGCHI conference on human factors in computing systems, ACM, New York, NY, USA, CHI ’08, pp 29–38. https://doi.org/10.1145/1357054.1357061

  25. Xu Y, Zhang H, Miller K, Singh A, Dubrawski A (2017) Noise-tolerant interactive learning using pairwise comparisons. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems 30, Curran Associates, Inc., pp 2431–2440. http://papers.nips.cc/paper/6837-noise-tolerant-interactive-learning-using-pairwise-comparisons.pdf

  26. Shivaswamy P, Joachims T (2012) Online structured prediction via coactive learning. CoRR abs/1205.4213:1–8. arXiv:1205.4213

  27. Borovikov I, Harder J, Sadovsky M, Beirami A (2019) Towards interactive training of non-player characters in video games. CoRR abs/1906.00535:1–6. arXiv:1906.00535

  28. Plummer BA, Kiapour MH, Zheng S, Piramuthu R (2018) Give me a hint! navigating image databases using human-in-the-loop feedback. CoRR abs/1809.08714:1–10. arXiv:1809.08714

  29. Amershi S, Fogarty J, Weld D (2012) Regroup: interactive machine learning for on-demand group creation in social networks. In: Proceedings of the ACM annual conference on human factors in computing systems, pp 21–30. https://doi.org/10.1145/2207676.2207680

  30. Self JZ, Vinayagam RK, Fry JT, North C (2016) Bridging the gap between user intention and model parameters for human-in-the-loop data analytics. In: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, ACM, New York, NY, USA, HILDA ’16, pp 1–6. https://doi.org/10.1145/2939502.2939505

  31. Dasgupta S, Poulis S, Tosh C (2019) Interactive topic modeling with anchor words. CoRR abs/1907.04919:1–7. arXiv:1907.04919

  32. Cheng TY, Lin G, Gong X, Liu KJ, Wu SH (2016) Learning user perceived clusters with feature-level supervision. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R (eds) Advances in neural information processing systems 29, Curran Associates, Inc., pp 532–540. http://papers.nips.cc/paper/6260-learning-user-perceived-clusters-with-feature-level-supervision.pdf

  33. MacGlashan J, Ho MK, Loftin R, Peng B, Wang G, Roberts DL, Taylor ME, Littman ML (2017) Interactive learning from policy-dependent human feedback. In: Precup D, Teh YW (eds) Proceedings of the international conference on machine learning, PMLR, International Convention Centre, Sydney, Australia, Proceedings of Machine Learning Research, vol 70, pp 2285–2294. http://proceedings.mlr.press/v70/macglashan17a.html

  34. Porter R, Theiler J, Hush D (2013) Interactive machine learning in data exploitation. Comput Sci Eng 15(5):12–20. https://doi.org/10.1109/MCSE.2013.74

  35. Guo X, Wu H, Cheng Y, Rennie S, Tesauro G, Feris R (2018) Dialog-based interactive image retrieval. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems 31, Curran Associates, Inc., pp 678–688. http://papers.nips.cc/paper/7348-dialog-based-interactive-image-retrieval.pdf

  36. Hebbalaguppe R, McGuinness K, Kuklyte J, Healy G, O’Connor N, Smeaton A (2013) How interaction methods affect image segmentation: User experience in the task. In: Workshop on user-centered computer vision, IEEE, pp 19–24. https://doi.org/10.1109/UCCV.2013.6530803

  37. Acuna D, Ling H, Kar A, Fidler S (2018) Efficient interactive annotation of segmentation datasets with polygon-rnn++. CoRR abs/1803.09693:1–21. arXiv:1803.09693

  38. Lopresti D, Nagy G (2012) Optimal data partition for semi-automated labeling. In: Proceedings of the international conference on pattern recognition, IEEE, pp 286–289

  39. Bryan N, Mysore G (2013) An efficient posterior regularized latent variable model for interactive sound source separation. In: Dasgupta S, McAllester D (eds) Proceedings of the international conference on machine learning, PMLR, Atlanta, Georgia, USA, Proceedings of Machine Learning Research, vol 28, pp 208–216. http://proceedings.mlr.press/v28/bryan13.html

  40. Stumpf S, Rajaram V, Li L, Burnett M, Dietterich T, Sullivan E, Drummond R, Herlocker J (2007) Toward harnessing user feedback for machine learning. In: Proceedings of the international conference on intelligent user interfaces, ACM, New York, NY, USA, IUI ’07, pp 82–91. https://doi.org/10.1145/1216295.1216316

  41. Boyko A, Funkhouser T (2014) Cheaper by the dozen: group annotation of 3D data. In: Proceedings of the annual ACM symposium on user interface software and technology, ACM, New York, NY, USA, pp 33–42. https://doi.org/10.1145/2642918.2647418

  42. Kim B, Glassman E, Johnson B, Shah J (2015) iBCM: Interactive Bayesian case model empowering humans via intuitive interaction. https://dspace.mit.edu/handle/1721.1/96315

  43. Sun Q, DeJong G (2005) Explanation-augmented SVM. In: Proceedings of the international conference on machine learning, ACM, New York, NY, USA, ICML ’05, pp 864–871. https://doi.org/10.1145/1102351.1102460

  44. Early K, Fienberg SE, Mankoff J (2016) Test time feature ordering with FOCUS. In: Proceedings of the ACM international joint conference on pervasive and ubiquitous computing, ACM, New York, NY, USA, UbiComp ’16, pp 992–1003. https://doi.org/10.1145/2971648.2971748

  45. Weigl E, Walch A, Neissl U, Meyer-Heye P, Heidl W, Radauer T, Lughofer E, Eitzinger C (2016) MapView: graphical data representation for active learning. CEUR Workshop Proceedings, Sun SITE, Aachen, 1707:3–8

  46. Datta S, Adar E (2018) CommunityDiff: visualizing community clustering algorithms. ACM Trans Knowl Discov Data 12(1):1–34. https://doi.org/10.1145/3047009

  47. Jain S, Munukutla S, Held D (2019) Few-shot point cloud region annotation with human in the loop. CoRR abs/1906.04409:1–6. arXiv:1906.04409

  48. Wallace BC, Small K, Brodley CE, Lau J, Trikalinos TA (2012) Deploying an interactive machine learning system in an evidence-based practice center: Abstrackr. In: Proceedings of the SIGHIT international health informatics symposium, ACM, New York, NY, USA, pp 819–824. https://doi.org/10.1145/2110363.2110464

  49. Zhu Y, Yang K (2019) Tripartite active learning for interactive anomaly discovery. IEEE Access 7:63195–63203. https://doi.org/10.1109/ACCESS.2019.2915388

  50. Yan Y, Rosales R, Fung G, Dy JG (2011) Active learning from crowds. In: Proceedings of the international conference on machine learning, Omnipress, USA, ICML’11, pp 1161–1168. http://dl.acm.org/citation.cfm?id=3104482.3104628

  51. Cui S, Dumitru CO, Datcu M (2014) Semantic annotation in earth observation based on active learning. Int J Image Data Fusion 5(2):152–174. https://doi.org/10.1080/19479832.2013.858778

  52. Burkovski A, Kessler W, Heidemann G, Kobdani H, Schütze H (2011) Self organizing maps in NLP: exploration of coreference feature space. In: Proceedings of the international conference on advances in self-organizing maps, Springer, Berlin, Heidelberg, pp 228–237. https://doi.org/10.1007/978-3-642-21566-7_23

  53. Rosenthal SL, Dey AK (2010) Towards maximizing the accuracy of human-labeled sensor data. In: Proceedings of the international conference on intelligent user interfaces, ACM, New York, NY, USA, pp 259–268. https://doi.org/10.1145/1719970.1720006

  54. Kagy J, Kayadelen T, Ma J, Rostamizadeh A, Strnadová J (2019) The practical challenges of active learning: Lessons learned from live experimentation. CoRR abs/1907.00038:1–7. arXiv:1907.00038

  55. Benato BC, Telea AC, Falcão AX (2018) Semi-supervised learning with interactive label propagation guided by feature space projections. In: SIBGRAPI conference on graphics, patterns and images, pp 392–399. https://doi.org/10.1109/SIBGRAPI.2018.00057

  56. Harvey N, Porter R (2016) User-driven sampling strategies in image exploitation. Inf Vis 15(1):64–74. https://doi.org/10.1177/1473871614557659


Author information

Corresponding author

Correspondence to Mario Nadj.

About this article

Cite this article

Nadj, M., Knaeble, M., Li, M.X. et al. Power to the Oracle? Design Principles for Interactive Labeling Systems in Machine Learning. Künstl Intell 34, 131–142 (2020). https://doi.org/10.1007/s13218-020-00634-1

  • DOI: https://doi.org/10.1007/s13218-020-00634-1
