Designing a Large Language Model Based Open Data Assistant for Effective Use

  • Conference paper
Design Science Research for a Resilient Future (DESRIST 2024)

Abstract

Open data is widely recognized for its potential positive impact on society and the economy. However, many open data sets remain underutilized because users, such as civil servants and citizens, lack the necessary technical and analytical skills. Additionally, existing open data portals often fall short of providing user-friendly access to data. Although conversational agents equipped with Large Language Models have emerged as a promising solution to these challenges, it is unclear how to design Large Language Model based open data assistants that allow users to formulate their information needs in natural language and ultimately use open data effectively. To address this gap, we undertake a Design Science Research project guided by the theory of effective use. In this first cycle of the project, we present meta-requirements and propose initial design principles for a Large Language Model based open data assistant for effective use. Subsequently, we instantiate our principles in a prototype and evaluate it in a focus group with experts from a medium-sized German city. Our results contribute design knowledge in the form of design principles for open data assistants and inform future design cycles of our Design Science Research project.

Author information

Corresponding author

Correspondence to Till Carlo Schelhorn.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Schelhorn, T.C., Gnewuch, U., Maedche, A. (2024). Designing a Large Language Model Based Open Data Assistant for Effective Use. In: Mandviwalla, M., Söllner, M., Tuunanen, T. (eds) Design Science Research for a Resilient Future. DESRIST 2024. Lecture Notes in Computer Science, vol 14621. Springer, Cham. https://doi.org/10.1007/978-3-031-61175-9_27

  • DOI: https://doi.org/10.1007/978-3-031-61175-9_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-61174-2

  • Online ISBN: 978-3-031-61175-9

  • eBook Packages: Computer Science, Computer Science (R0)
