Breaking Bad: Unraveling Influences and Risks of User Inputs to ChatGPT for Game Story Generation

Taveekitworachai, Pittawat; Abdullah, Febri; Gursesli, Mustafa Can; Dewantoro, Mury F.; Chen, Siyuan; Lanata, Antonio; Guazzini, Andrea; Thawonmas, Ruck

doi:10.1007/978-3-031-47658-7_27

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14384))

Included in the following conference series:

International Conference on Interactive Digital Storytelling

678 Accesses

Abstract

This study presents an investigation into the influence and potential risks of using user inputs as part of a prompt, a message used to interact with ChatGPT. We demonstrate the influence of user inputs in a prompt through game story generation and story ending classification. To assess risks, we utilize a technique called adversarial prompting, which involves deliberately manipulating the prompt or parts of the prompt to exploit the safety mechanisms of large language models, leading to undesirable or harmful responses. We assess the influence of positive and negative sentiment words, as proxies for user inputs in a prompt, on the generated story endings. The results suggest that ChatGPT tends to adhere to its guidelines, providing safe and non-harmful outcomes, i.e., positive endings. However, malicious intentions, such as “jailbreaking”, can be achieved through prompting injection. These actions carry significant risks of producing unethical outcomes, as shown in an example. As a result, this study also suggests preliminary ways to mitigate these risks: content filtering, rare token-separators, and enhancing training datasets and alignment processes.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 59.99; Price excludes VAT (USA)

Softcover Book: USD 74.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
Source code and raw data are available at https://github.com/Pittawat2542/chatgpt-words-influence-risks.
2.
https://platform.openai.com/docs/api-reference/chat.
3.
Alignment process is a refinement step that involves further fine-tuning pre-trained LLMs to generate better responses that align with user input and predefined guidelines. In other words, the goal is to ensure that the model’s output aligns with the predefined standards and the user’s intentions or instructions.
4.
System prompt is an instruction given to the model before interacting with users and usually contains guidelines or rules for models to follow throughout that conversation window.
5.
Prompt injection: https://bit.ly/icids-2023-prompt-injection.
Normal conversation: https://bit.ly/icids-2023-direct-prompt.

References

Borji, A.: A categorical archive of ChatGPT failures (2023)
Google Scholar
Cascella, M., Montomoli, J., Bellini, V., et al.: Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios. J. Med. Syst. 47(1), 33 (2023). https://doi.org/10.1007/s10916-023-01925-4
Chen, Y., Skiena, S.: Building sentiment lexicons for all major languages. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 383–389 (2014)
Google Scholar
Dwivedi, Y.K., Kshetri, N., Hughes, L., et al.: Opinion paper: “so what if ChatGPT wrote it?” multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. Int. J. Inf. Manage. 71, 102642 (2023). https://doi.org/10.1016/j.ijinfomgt.2023.102642. https://www.sciencedirect.com/science/article/pii/S0268401223000233
Glukhov, D., Shumailov, I., Gal, Y., et al.: LLM censorship: a machine learning challenge or a computer security problem? arXiv preprint arXiv:2307.10719 (2023)
Greshake, K., Abdelnabi, S., Mishra, S., et al.: Not what you’ve signed up for: compromising real-world LLM-integrated applications with indirect prompt injection. arXiv preprint arXiv:2302.12173 (2023)
de Hoog, N., Verboon, P.: Is the news making us unhappy? The influence of daily news exposure on emotional states. Br. J. Psychol. 111(2), 157–173 (2020). https://doi.org/10.1111/bjop.12389. https://bpspsychub.onlinelibrary.wiley.com/doi/abs/10.1111/bjop.12389
Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2004, pp. 168–177. Association for Computing Machinery, New York (2004). https://doi.org/10.1145/1014052.1014073
Imasato, N., Miyazawa, K., Duncan, C., et al.: Using a language model to generate music in its symbolic domain while controlling its perceived emotion. IEEE Access (2023)
Google Scholar
Islamovic, A.: Meet stable beluga 1 and stable beluga 2, our large and mighty instruction fine-tuned language models (2023). https://stability.ai/blog/stable-beluga-large-instruction-fine-tuned-models
Jones, M., Neumayer, C., Shklovski, I.: Embodying the algorithm: exploring relationships with large language models through artistic performance. In: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1–24 (2023)
Google Scholar
Kshetri, N.: Cybercrime and privacy threats of large language models. IT Prof. 25(3), 9–13 (2023). https://doi.org/10.1109/MITP.2023.3275489
Article Google Scholar
Liu, Y., Deng, G., Xu, Z., et al.: Jailbreaking ChatGPT via prompt engineering: an empirical study (2023)
Google Scholar
Lu, Y., Bartolo, M., Moore, A., et al.: Fantastically ordered prompts and where to find them: overcoming few-shot prompt order sensitivity (2021)
Google Scholar
Markov, T., Zhang, C., Agarwal, S., et al.: A holistic approach to undesired content detection in the real world. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 12, pp. 15009–15018 (2023). https://doi.org/10.1609/aaai.v37i12.26752. https://ojs.aaai.org/index.php/AAAI/article/view/26752
Min, B., Ross, H., Sulem, E., et al.: Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput. Surv. (2021)
Google Scholar
Mökander, J., Schuett, J., Kirk, H.R., et al.: Auditing large language models: a three-layered approach. AI Ethics 1–31 (2023)
Google Scholar
OpenAI: Introducing ChatGPT (2022). https://openai.com/blog/chatgpt
Ross, S.I., Martinez, F., Houde, S., et al.: The programmer’s assistant: conversational interaction with a large language model for software development. In: Proceedings of the 28th International Conference on Intelligent User Interfaces, pp. 491–514 (2023)
Google Scholar
Sallam, M.: ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare 11(6) (2023). https://www.mdpi.com/2227-9032/11/6/887
Simon, N., Muise, C.: TattleTale: storytelling with planning and large language models. In: ICAPS Workshop on Scheduling and Planning Applications (2022)
Google Scholar
Sison, A.J.G., Daza, M.T., Gozalo-Brizuela, R., et al.: ChatGPT: more than a weapon of mass deception, ethical challenges and responses from the human-centered artificial intelligence (HCAI) perspective. arXiv preprint arXiv:2304.11215 (2023)
Stolper, C.D., Lee, B., Henry Riche, N., et al.: Emerging and recurring data-driven storytelling techniques: analysis of a curated collection of recent stories. Technical report, Microsoft (2016)
Google Scholar
Swartjes, I., Theune, M.: Iterative authoring using story generation feedback: debugging or co-creation? In: Iurgel, I.A., Zagalo, N., Petta, P. (eds.) ICIDS 2009. LNCS, vol. 5915, pp. 62–73. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10643-9_10
Chapter Google Scholar
Taveekitworachai, P., Abdullah, F., Dewantoro, M.F., et al.: ChatGPT4PCG competition: character-like level generation for science birds (2023)
Google Scholar
Teubner, T., Flath, C.M., Weinhardt, C., et al.: Welcome to the era of ChatGPT et al. the prospects of large language models. Bus. Inf. Syst. Eng. 65(2), 95–101 (2023)
Google Scholar
Thue, D., Schiffel, S., Guðmundsson, T.Þ, Kristjánsson, G.F., Eiríksson, K., Björnsson, M.V.: Open world story generation for increased expressive range. In: Nunes, N., Oakley, I., Nisi, V. (eds.) ICIDS 2017. LNCS, vol. 10690, pp. 313–316. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71027-3_33
Chapter Google Scholar
Todd, G., Earle, S., Nasir, M.U., et al.: Level generation through large language models. In: Proceedings of the 18th International Conference on the Foundations of Digital Games, FDG 2023. Association for Computing Machinery, New York (2023). https://doi.org/10.1145/3582437.3587211
Touvron, H., Martin, L., Stone, K., et al.: LLaMA 2: open foundation and fine-tuned chat models (2023)
Google Scholar
Wang, Z., Xie, Q., Ding, Z., et al.: Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study (2023)
Google Scholar
Ye, W., Ou, M., Li, T., et al.: Assessing hidden risks of LLMs: an empirical study on robustness, consistency, and credibility. arXiv preprint arXiv:2305.10235 (2023)
Yuan, A., Coenen, A., Reif, E., et al.: Wordcraft: story writing with large language models. In: 27th International Conference on Intelligent User Interfaces, pp. 841–852 (2022)
Google Scholar
Zhou, J., Zhang, Y., Luo, Q., et al.: Synthetic lies: understanding AI-generated misinformation and evaluating algorithmic and human solutions. In: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1–20 (2023)
Google Scholar
Zhuo, T.Y., Huang, Y., Chen, C., et al.: Red teaming ChatGPT via jailbreaking: bias, robustness, reliability and toxicity (2023)
Google Scholar
Zou, A., Wang, Z., Kolter, J.Z., et al.: Universal and transferable adversarial attacks on aligned language models (2023)
Google Scholar

Download references

Author information

Authors and Affiliations

Graduate School of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
Pittawat Taveekitworachai, Febri Abdullah, Mury F. Dewantoro & Siyuan Chen
Department of Information Engineering, Università degli Studi di Firenze, Florence, Italy
Mustafa Can Gursesli & Antonio Lanata
Department of Education, Literatures, Intercultural Studies, Languages and Psychology, Università degli Studi di Firenze, Florence, Italy
Andrea Guazzini
College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
Ruck Thawonmas

Authors

Pittawat Taveekitworachai
View author publications
You can also search for this author in PubMed Google Scholar
Febri Abdullah
View author publications
You can also search for this author in PubMed Google Scholar
Mustafa Can Gursesli
View author publications
You can also search for this author in PubMed Google Scholar
Mury F. Dewantoro
View author publications
You can also search for this author in PubMed Google Scholar
Siyuan Chen
View author publications
You can also search for this author in PubMed Google Scholar
Antonio Lanata
View author publications
You can also search for this author in PubMed Google Scholar
Andrea Guazzini
View author publications
You can also search for this author in PubMed Google Scholar
Ruck Thawonmas
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Pittawat Taveekitworachai .

Editor information

Editors and Affiliations

University of Skövde, Skövde, Sweden
Lissa Holloway-Attaway
University of Central Florida, Orlando, FL, USA
John T. Murray

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Taveekitworachai, P. et al. (2023). Breaking Bad: Unraveling Influences and Risks of User Inputs to ChatGPT for Game Story Generation. In: Holloway-Attaway, L., Murray, J.T. (eds) Interactive Storytelling. ICIDS 2023. Lecture Notes in Computer Science, vol 14384. Springer, Cham. https://doi.org/10.1007/978-3-031-47658-7_27

Download citation

DOI: https://doi.org/10.1007/978-3-031-47658-7_27
Published: 31 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47657-0
Online ISBN: 978-3-031-47658-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Breaking Bad: Unraveling Influences and Risks of User Inputs to ChatGPT for Game Story Generation