Putting Users in the Loop: How User Research Can Guide AI Development for a Consumer-Oriented Self-service Portal

Binder, Frank; Diels, Jana; Balling, Julian; Albrecht, Oliver; Sachunsky, Robert; Philipp, J. Nathanael; Scheurer, Yvonne; Münsch, Marlene; Otto, Markus; Niekler, Andreas; Heyer, Gerhard; Thorun, Christian

doi:10.1007/978-3-031-05434-1_1

Frank Binder⁸ (ORCID: orcid.org/0000-0003-1964-0393),
Jana Diels⁹,
Julian Balling⁸,¹⁰,
Oliver Albrecht¹¹,
Robert Sachunsky¹⁰,
J. Nathanael Philipp¹⁰,
Yvonne Scheurer¹¹,
Marlene Münsch⁹,
Markus Otto¹²,
Andreas Niekler¹⁰,
Gerhard Heyer⁸,¹⁰ &
Christian Thorun⁹

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13324)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

This study investigates three challenges for developing machine learning-based self-service web apps for consumers. First, we argue that user research must accompany the development of ML-based products so that they better serve users’ needs at all stages of development. Second, we discuss the data sourcing dilemma in developing consumer-oriented ML-based apps and propose a way to solve it by implementing an interaction design that balances the workload between users and computers according to the ML component’s performance. To dynamically define the role of the user-in-the-loop, we monitor user success and ML performance over time. Finally, we propose a lightweight typology of ML-based systems to assess the generalizability of our findings to other ML use cases.

Our case study uses a newly developed web application that allows consumers to analyze their heating bills for potential energy and cost savings. Based on domain-specific data values extracted from user-provided document images, an assessment of potential savings is derived and reported back to the user.
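
The idea of adjusting the user's role according to monitored ML performance can be illustrated with a minimal sketch. It assumes that each extracted value is logged as either accepted unchanged or corrected by the user; the class name `LoopMonitor`, the role labels, the window size, and the thresholds are hypothetical and are not taken from the paper.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LoopMonitor:
    """Tracks recent ML extraction outcomes to choose the user's role.

    All names and thresholds are illustrative assumptions.
    """
    window: int = 200               # number of recent extractions considered
    auto_threshold: float = 0.95    # at or above: user only confirms values
    assist_threshold: float = 0.70  # at or above: user corrects prefilled values
    outcomes: deque = field(default_factory=deque)

    def record(self, ml_value_accepted: bool) -> None:
        """Store whether the user accepted the ML-extracted value unchanged."""
        self.outcomes.append(ml_value_accepted)
        if len(self.outcomes) > self.window:
            self.outcomes.popleft()  # keep only the most recent outcomes

    def acceptance_rate(self) -> float:
        """Share of recent ML values that users accepted without correction."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def user_role(self) -> str:
        """Map the monitored acceptance rate to an interaction mode."""
        rate = self.acceptance_rate()
        if rate >= self.auto_threshold:
            return "confirm"         # ML does most of the work
        if rate >= self.assist_threshold:
            return "correct"         # ML prefills, user fixes errors
        return "enter_manually"      # user carries the workload
```

With no recorded outcomes, the sketch falls back to manual entry, so the user carries the workload until the ML component has demonstrated sufficient performance.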

Notes

1. We use the term machine-learning-based systems, or ML-based systems, to refer to software systems that include “one or more components that learn how to perform a task from a given data set” [29]. See Sect. 3 for further terminological considerations.
2. Shared tasks, such as the table detection and recognition challenges of the ICDAR conference series [10, 11], are ubiquitous within machine learning communities. They usually focus on solving or improving a specific ML use case by applying and fine-tuning (highly specialized) machine-learning techniques towards a predefined, shared goal. While they provide helpful motivation and illustration for the specific tasks and the applicable techniques, there is usually no need to further contextualize or generalize beyond the specific setting of the task at hand.
3. See Sect. 3.3 for a brief typology of ML use cases.
4. Further examples and perspectives on human-in-the-loop approaches can be found in [14, 21, 35]. Examples of domain-specific roles for humans include the doctor-in-the-loop [15] and the analyst-in-the-loop [6].
5. Note that there is at least one famous class of “behind the scenes” data annotation scenarios, where users are motivated merely by their will to successfully interact with the annotation tool in order to pass a specific test: reCAPTCHA requires web users to “voluntarily” perform (partly difficult) annotation tasks on (sections of) scans or photos from extensive image collections in order to authenticate themselves as humans [34].
6. It is precisely in those “on stage” settings that the paradigm shift referred to in the invitation to this panel can be expected to be successfully implemented.
7. The technical description of the Smart_HEC web app and its ML component is adopted from the corresponding project’s final report [31].
8. Note that we do not perform any fine-tuning of language models for OCR. We use Tesseract’s pre-trained models for contemporary German as provided. Once the correct ROIs for the target values are identified by the Mask R-CNN, our lever for improving the OCR results lies mainly with ranking and filtering Tesseract’s hypotheses through pattern matching in the post-decoder.
9. This highly dynamic layout with unknown positionings of the target values does not allow for classical form data extraction or otherwise useful table detection heuristics, cf. a similar discussion in [5]. Hence, our ML-based approach attempts to mimic a human visual lookup strategy for finding the required target values on the document page images.
10. Improvements between the two stages were mainly achieved by re-annotating large numbers of ROIs in the ground truth and re-training the Mask R-CNN, after systematic problems with the previous annotations had been discovered.
11. These results could also indicate problems with the exact locations of the identified ROIs in the production environment. Such problems were, however, not observed in the lab setting.
12. For instance, in our case, it might be possible to reduce the human annotation workload through automatically pre-labelling potential ROIs by locating the users’ corrected target values on the corresponding document images.
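
The ranking and filtering of OCR hypotheses through pattern matching mentioned in Note 8 can be sketched as follows. This is a minimal illustration, not the project's post-decoder: it assumes the hypotheses are already available as plain (text, confidence) pairs, and the function name `pick_hypothesis` and the pattern for German-formatted amounts are hypothetical.

```python
import re
from typing import Iterable, Optional, Pattern, Tuple

# Hypothetical pattern for a German-formatted amount such as "1.234,56"
# (dot as thousands separator, comma as decimal separator).
AMOUNT_PATTERN = re.compile(r"^(?:\d{1,3}(?:\.\d{3})*|\d+),\d{2}$")


def pick_hypothesis(
    hypotheses: Iterable[Tuple[str, float]],
    pattern: Pattern[str] = AMOUNT_PATTERN,
) -> Optional[str]:
    """Return the most confident hypothesis that matches the expected pattern.

    `hypotheses` is an iterable of (text, confidence) pairs; how they are
    obtained from the OCR engine is outside the scope of this sketch.
    """
    # Filter: discard hypotheses that do not look like the target value type.
    candidates = [
        (text.strip(), confidence)
        for text, confidence in hypotheses
        if pattern.match(text.strip())
    ]
    if not candidates:
        return None  # nothing plausible; the user would enter the value
    # Rank: among the plausible hypotheses, prefer the highest confidence.
    return max(candidates, key=lambda item: item[1])[0]


if __name__ == "__main__":
    example = [("1.2S4,56", 0.91), ("1.234,56", 0.88), ("1234,5", 0.40)]
    print(pick_hypothesis(example))
```

In this example, the top-scoring hypothesis contains a misread character and is filtered out, so a lower-confidence hypothesis that is well-formed is selected instead.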

References

1. Abdulla, W.: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow. https://github.com/matterport/Mask_RCNN. Accessed 11 Feb 2022
2. Auer, F., Felderer, M.: Shifting quality assurance of machine learning algorithms to live systems. In: Tichy, M., Bodden, E., Kuhrmann, M., Wagner, S., Steghöfer, J.-P. (eds.) Software Engineering und Software Management 2018, pp. 211–212. Gesellschaft für Informatik, Bonn (2018)
3. Baur, N., Blasius, J. (eds.): Handbuch Methoden der empirischen Sozialforschung. Springer, Wiesbaden (2014). https://doi.org/10.1007/978-3-531-18939-0
4. Beede, E., et al.: A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, pp. 1–12. ACM (2020). https://doi.org/10.1145/3313831.3376718
5. Bürgl, K., Reinhardt, L., Binder, F., Müller, L., Niekler, A.: Digitizing Drilling Logs - Challenges of typewritten forms. In: Gesellschaft für Informatik (ed.) 51. Jahrestagung der Gesellschaft für Informatik, INFORMATIK 2021 - Computer Science & Sustainability, Berlin, pp. 709–718. Gesellschaft für Informatik, Bonn (2021). https://doi.org/10.18420/informatik2021-059
6. Chegini, M., et al.: Interactive visual labelling versus active learning: an experimental comparison. Front. Inf. Technol. Electron. Eng. 21, 524–535 (2020). https://doi.org/10.1631/FITEE.1900549
7. Davis, F.D.: Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 319–340 (1989)
8. Dietrich, T., Trischler, J., Schuster, L., Rundle-Thiele, S.: Co-designing services with vulnerable consumers. J. Serv. Theory Pract. 27, 663–688 (2017). https://doi.org/10.1108/jstp-02-2016-0036
9. Engl, E.: OCR-D kompakt: Ergebnisse und Stand der Forschung in der Förderinitiative. Bibliothek Forschung und Praxis (44), 218–230 (2020). https://doi.org/10.1515/bfp-2020-0024
10. Gao, L., et al.: ICDAR 2019 competition on table detection and recognition (cTDaR). In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1510–1515 (2019). https://doi.org/10.1109/ICDAR.2019.00243
11. Göbel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 12th International Conference on Document Analysis and Recognition, pp. 1449–1453 (2013). https://doi.org/10.1109/ICDAR.2013.292
12. He, K., Gkioxari, G., Dollár, P., Girshick, R.B.: Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017). https://doi.org/10.1109/ICCV.2017.322
13. Hesenius, M., Schwenzfeier, N., Meyer, O., Koop, W., Gruhn, V.: Towards a software engineering process for developing data-driven applications. In: 2019 IEEE/ACM 7th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE), pp. 35–41. IEEE (2019). https://doi.org/10.1109/raise.2019.00014
14. Holzinger, A.: Interactive machine learning for health informatics: when do we need the human-in-the-loop? Brain Inform. 3(2), 119–131 (2016). https://doi.org/10.1007/s40708-016-0042-6
15. Holzinger, A., Valdez, A.C., Ziefle, M.: Towards interactive recommender systems with the doctor-in-the-loop. In: Weyers, B., Dittmar, A. (eds.) Mensch und Computer 2016 - Workshopband. Gesellschaft für Informatik e.V., Aachen (2016). https://doi.org/10.18420/MUC2016-WS11-0001
16. Kettner, S.E., Thorun, C.: Verbraucherstudie 2019: Wie erreicht man Verbraucherinnen und Verbraucher im Zeitalter digitaler Informationsangebote. Final report. ConPolicy GmbH, Berlin (2019)
17. Lell, O., Kettner, S.E., Thorun, C., Bendig, T.: Verbraucherschutz digital neu denken: Consumer Protection Technologies - Politische Relevanz, Potential und Handlungsbedarf. ConPolicy GmbH, Berlin (2021)
18. Lewis, C.: Using the “thinking-aloud” method in cognitive interface design. IBM TJ Watson Research Center, Yorktown Heights (1982)
19. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
20. Mahlke, S.: Factors influencing the experience of website usage. In: Extended Abstracts on Human Factors in Computing Systems, CHI 2002, pp. 846–847 (2002)
21. Monarch, R.: Human-in-the-Loop Machine Learning. Manning Publications, New York (2021)
22. Morville, P.: User experience design. https://semanticstudios.com/user_experience_design/. Accessed 11 Feb 2022
23. Moser, C.: User Experience Design. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-13363-3
24. Neudecker, C., et al.: OCR-D: an end-to-end open source OCR framework for historical printed documents. In: Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, Brussels, pp. 53–58. ACM (2019). https://doi.org/10.1145/3322905.3322917
25. Ng, A.: Structured and Unstructured Data: Implications for AI Development. The Batch. https://read.deeplearning.ai/the-batch/structured-and-unstructured-data-implications-for-ai-development/. Accessed 05 Nov 2021
26. Patton, J., Economy, P.: User Story Mapping: Discover the Whole Story, Build the Right Product. 1st edn. O’Reilly Media Inc. (2014)
27. Reder, B.: Machine Learning 2021. IDG Business Media GmbH, München (2021)
28. Reul, C., Springmann, U., Puppe, F.: LAREX: a semi-automatic open-source tool for layout analysis and region extraction on early printed books. In: Proceedings of the 2nd International Conference on Digital Access to Textual Cultural Heritage, Göttingen, pp. 137–142. Association for Computing Machinery (2017). https://doi.org/10.1145/3078081.3078097
29. Riccio, V., Jahangirova, G., Stocco, A., Humbatova, N., Weiss, M., Tonella, P.: Testing machine learning based systems: a systematic mapping. Empir. Softw. Eng. 25(6), 5193–5254 (2020). https://doi.org/10.1007/s10664-020-09881-0
30. Roberts, L.: The value of AI: now and the future (PART 2) AI Failures, Pitfalls, Key Learnings and Success. https://www.linkedin.com/pulse/value-ai-now-future-part-2-failures-pitfalls-key-success-roberts/. Accessed 05 Nov 2021
31. Scheurer, Y., et al.: Abschlussbericht Smart_HEC (Kurzfassung). co2online gGmbH, Berlin (2021)
32. Smith, R.: An overview of the tesseract OCR engine. In: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), pp. 629–633 (2007). https://doi.org/10.1109/ICDAR.2007.4376991
33. Thielsch, M.T., Blotenberg, I., Jaron, R.: User evaluation of websites: from first impression to recommendation. Interact. Comput. 26(1), 89–102 (2014)
34. von Ahn, L., Maurer, B., McMillen, C., Abraham, D., Blum, M.: reCAPTCHA: human-based character recognition via web security measures. Science 321, 1465–1468 (2008). https://doi.org/10.1126/science.1160379
35. Yimam, S.M., Biemann, C., Majnaric, L., Šabanović, Š., Holzinger, A.: An adaptive annotation approach for biomedical entity and relation recognition. Brain Inform. 3(3), 157–168 (2016). https://doi.org/10.1007/s40708-016-0036-4

Acknowledgments

This research was supported by the German Federal Ministry of Justice and Consumer Protection (BMJV) under grants no. 28V2304A19, 28V2304B19, 28V2304C19, 28V2304D19. Partial contributions were funded by the German Federal Ministry of Education and Research (BMBF) under grant no. 01IS20091B, and by the Development Bank of Saxony (SAB) under project number 100335729.

Author information

Authors and Affiliations

Institute for Applied Informatics at Leipzig University (InfAI), Leipzig, Germany
Frank Binder, Julian Balling & Gerhard Heyer
ConPolicy GmbH - Institute for Consumer Policy, 10827, Berlin, Germany
Jana Diels, Marlene Münsch & Christian Thorun
Natural Language Processing Group, Leipzig University, Leipzig, Germany
Julian Balling, Robert Sachunsky, J. Nathanael Philipp, Andreas Niekler & Gerhard Heyer
co2online gGmbH, Hochkirchstr. 9, 10829, Berlin, Germany
Oliver Albrecht & Yvonne Scheurer
SEnerCon GmbH, Hochkirchstr. 11, 10829, Berlin, Germany
Markus Otto

Corresponding author

Correspondence to Frank Binder.

Editor information

Editors and Affiliations

Eindhoven University of Technology, Eindhoven, The Netherlands
Matthias Rauterberg

About this paper

Cite this paper

Binder, F. et al. (2022). Putting Users in the Loop: How User Research Can Guide AI Development for a Consumer-Oriented Self-service Portal. In: Rauterberg, M. (eds) Culture and Computing. HCII 2022. Lecture Notes in Computer Science, vol 13324. Springer, Cham. https://doi.org/10.1007/978-3-031-05434-1_1

DOI: https://doi.org/10.1007/978-3-031-05434-1_1
Published: 16 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05433-4
Online ISBN: 978-3-031-05434-1
eBook Packages: Computer Science, Computer Science (R0)
