Abstract
Pre-trained large language models (LLMs) adapt remarkably well to varied tasks, notably data analysis, when supplied with relevant contextual cues. Supplying that context without compromising data privacy, however, can be complicated and time-consuming, and can degrade the model's output quality. To address this, we devised a system that discerns context from the multi-application desktop environments common among office workers. Our approach monitors applications in real time, giving precedence to those engaged recently and for sustained periods. From this activity, the system identifies the dominant data analysis tool based on user engagement, and it aligns concise user queries with the data's inherent structure to determine the most appropriate tool. This carefully sourced context, combined with well-chosen prefabricated prompts, enables LLMs to generate code that reflects user intent. In an evaluation with 18 participants, each using three popular data analysis tools in real-world office and R&D scenarios, we benchmarked our approach against a conventional baseline. Our system achieved a 93.0% success rate across seven distinct data-focused tasks. In conclusion, our method significantly improves user accessibility, satisfaction, and comprehension in data analytics.
Code availability
The prompts supporting the conclusions of this article are provided in the appendix. The complete source code is available from the corresponding author upon reasonable request.
Acknowledgements
This work is supported by the Natural Science Foundation of China under Grant No. 62132010; by the Beijing Key Lab of Networked Multimedia; by the Institute for Guo Qiang, Tsinghua University; by the Institute for Artificial Intelligence, Tsinghua University (THUAI); and by the 2025 Key Technological Innovation Program of Ningbo City under Grant No. 2022Z080. Additionally, we acknowledge the support provided by the Beijing Municipal Science & Technology Commission and the Administrative Commission of Zhongguancun Science Park under Grant No. Z221100006722018.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendices
Appendix 1 Source Code of Background Service
Appendix 2 API Prompts
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jadoon, A.K., Yu, C. & Shi, Y. ContextMate: a context-aware smart agent for efficient data analysis. CCF Trans. Pervasive Comp. Interact. (2024). https://doi.org/10.1007/s42486-023-00144-7
DOI: https://doi.org/10.1007/s42486-023-00144-7