Abstract
The growing popularity of chatbots has transformed the way users interact with apps and services. ChatGPT, a state-of-the-art conversational Artificial Intelligence (AI) model, has emerged as a powerful tool capable of tailoring interactions and generating human-like responses. However, as the user base grows and workloads become more dynamic, ChatGPT’s architectural scalability becomes critical to maintaining responsiveness, minimizing latency, and optimizing resource use. This paper presents a comprehensive case study of ChatGPT’s architectural scalability, focusing on its capacity to handle increasing user demand efficiently. Scaling a complex conversational AI model such as ChatGPT poses its own challenges. We examine vertical scaling, which increases the resources of individual instances, and horizontal scaling, which adds instances to handle concurrent user interactions. We conduct performance studies on three cloud platforms, Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, and on the services each offers for scaling ChatGPT. The studies cover both vertical and horizontal scaling scenarios, allowing us to analyze how effectively each platform handles varying workloads and user traffic. The findings offer insights into scaling ChatGPT effectively and emphasize the importance of continuous monitoring and dynamic scaling for reacting to shifting user demand while maintaining high availability.
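The two scaling modes named in the abstract can be illustrated with a minimal sketch. It is not the authors' method or any platform's implementation: the function names, instance sizes, and replica bounds are hypothetical, and the horizontal rule simply follows the proportional formula documented for the Kubernetes Horizontal Pod Autoscaler, ceil(currentReplicas × currentMetric / targetMetric).

```python
import math
from typing import Optional, Sequence


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Horizontal scaling: choose how many instances to run.

    Proportional rule in the style of the Kubernetes Horizontal Pod
    Autoscaler; the bounds are illustrative assumptions.
    """
    if current_replicas < 1 or target_metric <= 0:
        raise ValueError("need at least one replica and a positive target")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the allowed range so a traffic spike cannot scale without limit.
    return max(min_replicas, min(max_replicas, raw))


def smallest_instance(required_vcpus: int,
                      sizes: Sequence[int] = (2, 4, 8, 16)) -> Optional[int]:
    """Vertical scaling: choose the smallest instance size that fits.

    The vCPU sizes are hypothetical. Returns None when no single
    instance is large enough, the point where horizontal scaling
    becomes the only option.
    """
    for size in sorted(sizes):
        if size >= required_vcpus:
            return size
    return None


if __name__ == "__main__":
    # Four instances at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
    # A workload that needs five vCPUs on one machine.
    print(smallest_instance(5))
```

Re-evaluating the horizontal rule on every monitoring interval is one simple form of the dynamic scaling the abstract recommends. The vertical helper shows why the two modes are complementary: a single instance can only grow to the largest available size.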
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mechkaroska, D., Domazet, E., Feta, A., Shikoska, U.R. (2024). Architectural Scalability of Conversational Chatbot: The Case of ChatGPT. In: Arai, K. (eds) Advances in Information and Communication. FICC 2024. Lecture Notes in Networks and Systems, vol 919. Springer, Cham. https://doi.org/10.1007/978-3-031-53960-2_5
DOI: https://doi.org/10.1007/978-3-031-53960-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53959-6
Online ISBN: 978-3-031-53960-2
eBook Packages: Intelligent Technologies and Robotics (R0)