
An analysis of large language models: their impact and potential applications

  • Review
  • Published in: Knowledge and Information Systems

Abstract

Large language models (LLMs) have transformed how human language is interpreted and generated in the rapidly developing field of natural language processing. These models, built on deep learning techniques such as transformer architectures, are trained on massive text corpora. This survey takes an in-depth look at LLMs, covering their architecture, historical evolution, and applications in the education, healthcare, and finance sectors. By interpreting complex linguistic patterns, LLMs produce coherent responses, making them useful in a variety of real-world scenarios. Their development and deployment, however, raise ethical concerns and carry societal ramifications. Understanding both the strengths and the limitations of LLMs is critical for guiding future research and ensuring the responsible use of their enormous potential. This survey traces the influence of these models as they evolve, providing a roadmap for researchers, developers, and policymakers navigating the landscape of artificial intelligence and language processing.
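For readers unfamiliar with the transformer mechanism that underlies these models, the scaled dot-product attention operation of Vaswani et al. [10] can be sketched in a few lines of NumPy. This is an illustrative toy (the function name and the random toy embeddings are our own), not an implementation from any of the surveyed systems:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, the core transformer operation [10]."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V, weights                   # weighted sum of values

# Toy self-attention over 3 token embeddings of dimension 4 (Q = K = V = X)
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
output, attn = scaled_dot_product_attention(X, X, X)
```

Each row of `attn` is a probability distribution over the input tokens, so every output vector is a context-dependent mixture of the value vectors.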


The article includes Figs. 1–5; Fig. 3 is adapted from [10].


References

  1. Arisoy E, Sainath TN, Kingsbury B, Ramabhadran B (2012) Deep neural network language models. In: Proceedings of the NAACL-HLT 2012 workshop: Will we ever really replace the n-gram model? On the Future of Language Modeling for HLT, pp 20–28

  2. Mikolov T, Karafiát M, Burget L, Cernocký J, Khudanpur S (2010) Recurrent neural network based language model. In: Interspeech, vol. 2. pp 1045–1048

  3. Huang J, Chang KCC (2022) Towards reasoning in large language models: a survey. arXiv preprint arXiv:2212.10403

  4. Bharathi Mohan G, Prasanna Kumar R (2022) Survey of text document summarization based on ensemble topic vector clustering model. In: IoT based control networks and intelligent systems: proceedings of 3rd ICICNIS 2022. Springer Nature Singapore, Singapore, pp. 831–847

  5. Li Y, Wang S, Ding H, Chen H (2023) Large language models in finance: a survey. In: Proceedings of the fourth ACM international conference on AI in finance, pp 374–382

  6. Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: a robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692

  7. Yu Y, Zhuang Y, Zhang J, Meng Y, Ratner A, Krishna R, Shen J, Zhang C (2023) Large language model as attributed training data generator: a tale of diversity and bias. arXiv preprint arXiv:2306.15895

  8. Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805

  9. Sennrich R, Haddow B, Birch A (2015) Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909

  10. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Polosukhin I et al. (2017) Attention Is All You Need. In: 31st Conference on neural information processing systems (NIPS 2017), Long Beach, CA, pp 5998–6008

  11. Forsyth D (2019) Hidden Markov models. In: Applied machine learning. Springer, Cham, pp 305–332


  12. Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, Min Y, Zhang B, Zhang J, Dong Z, Du Y (2023) A survey of large language models. arXiv preprint arXiv:2303.18223

  13. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in neural information processing systems, vol 27.

  14. Naveed H, Khan AU, Qiu S, Saqib M, Anwar S, Usman M, Barnes N, Mian A (2023) A comprehensive overview of large language models. arXiv preprint arXiv:2307.06435

  15. Yang Z, Dai Z, Yang Y, Carbonell JG, Salakhutdinov R, Le QV (2019) XLNet: generalized autoregressive pretraining for language understanding. In: Advances in neural information processing systems, vol 32 (NeurIPS 2019), pp 5754–5764

  16. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language models are unsupervised multitask learners. OpenAI Blog 1(8):9


  17. Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S (2020) Language models are few-shot learners. Adv Neural Inf Process Syst 33:1877–1901


  18. Yan L, Sha L, Zhao L, Li Y, Martinez-Maldonado R, Chen G, Li X, Jin Y, Gašević D (2023) Practical and ethical challenges of large language models in education: a systematic literature review. arXiv preprint arXiv:2303.13379

  19. Ellaway RH, Tolsgaard M (2023) Artificial scholarship: LLMs in health professions education research. Adv Health Sci Educ 28:659


  20. Katz A, Shakir U, Chambers B (2023) The utility of large language models and generative AI for education research. https://doi.org/10.48550/arXiv.2305.18125

  21. Meyer JG, Urbanowicz RJ, Martin PC, O’Connor K, Li R, Peng PC, Bright TJ, Tatonetti N, Won KJ, Gonzalez-Hernandez G, Moore JH (2023) ChatGPT and large language models in academia: opportunities and challenges. BioData Min 16(1):20


  22. Milano S, McGrane JA, Leonelli S (2023) Large language models challenge the future of higher education. Nat Mach Intell 5(4):333–334


  23. Aher GV, Arriaga RI, Kalai AT (2023) Using large language models to simulate multiple humans and replicate human subject studies. In: International conference on machine learning. PMLR, pp 337–371

  24. Abd-Alrazaq A, AlSaad R, Alhuwail D, Ahmed A, Healy PM, Latifi S, Aziz S, Damseh R, Alrazak SA, Sheikh J (2023) Large language models in medical education: opportunities, challenges, and future directions. JMIR Med Educ 9(1):e48291


  25. Kasneci E, Seßler K, Küchemann S, Bannert M, Dementieva D, Fischer F, Gasser U, Groh G, Günnemann S, Hüllermeier E, Krusche S (2023) ChatGPT for good? On opportunities and challenges of large language models for education. Learn Individ Differ 103:102274


  26. Bewersdorff A, Seßler K, Baur A, Kasneci E, Nerdel C (2023) Assessing student errors in experimentation using artificial intelligence and large language models: a comparative study with human raters. Comput Educ Artif Intell 5:100177


  27. Bawden R, Yvon F (2023) Investigating the translation performance of a large multilingual language model: the case of bloom. arXiv preprint arXiv:2303.01911

  28. Touvron H, Lavril T, Izacard G, Martinet X, Lachaux MA, Lacroix T, Rozière B, Goyal N, Hambro E, Azhar F, Rodriguez A (2023) Llama: open and efficient foundation language models. arXiv preprint arXiv:2302.13971

  29. Bubeck S, Chandrasekaran V, Eldan R, Gehrke J, Horvitz E, Kamar E, Lee P, Lee YT, Li Y, Lundberg S, Nori H (2023) Sparks of artificial general intelligence: early experiments with gpt-4. arXiv preprint arXiv:2303.12712

  30. Zhu D, Chen J, Shen X, Li X, Elhoseiny M (2023) Minigpt-4: enhancing vision-language understanding with advanced large language models. arXiv preprint arXiv:2304.10592

  31. Bharathi Mohan G, Prasanna Kumar R, Parathasarathy S, Aravind S, Hanish KB, Pavithria G (2023). Text summarization for big data analytics: a comprehensive review of GPT 2 and BERT approaches. In: Sharma R, Jeon G, Zhang Y (eds) Data analytics for internet of things infrastructure. Internet of Things. Springer, Cham. https://doi.org/10.1007/978-3-031-33808-3_14

  32. Azunre P (2021) Transfer learning for natural language processing. Simon and Schuster


  33. Meskó B, Topol EJ (2023) The imperative for regulatory oversight of large language models (or generative AI) in healthcare. NPJ Digit Med 6(1):120


  34. Reddy S (2023) Evaluating large language models for use in healthcare: a framework for translational value assessment. Informat Med Unlocked 41:101304


  35. Sallam M (2023) The utility of ChatGPT as an example of large language models in healthcare education, research and practice: systematic review on the future perspectives and potential limitations. medRxiv preprint

  36. Huang H, Zheng O, Wang D, Yin J, Wang Z, Ding S, Yin H, Xu C, Yang R, Zheng Q, Shi B (2023) ChatGPT for shaping the future of dentistry: the potential of multi-modal large language model. Int J Oral Sci 15(1):29


  37. Alhaidry HM, Fatani B, Alrayes JO, Almana AM, Alfhaed NK (2023) ChatGPT in dentistry: a comprehensive review. Cureus 15(4):e38317. https://doi.org/10.7759/cureus.38317


  38. Liu Y, Han T, Ma S, Zhang J, Yang Y, Tian J, He H, Li A, He M, Liu Z, Wu Z (2023) Summary of chatgpt-related research and perspective towards the future of large language models. Meta-Radiol 1:100017


  39. Liu XY, Wang G, Zha D (2023) Fingpt: democratizing internet-scale data for financial large language models. arXiv preprint arXiv:2307.10485

  40. Gu Y, Zhang S, Usuyama N, Woldesenbet Y, Wong C, Sanapathi P, Wei M, Valluri N, Strandberg E, Naumann T, Poon H (2023) Distilling large language models for biomedical knowledge extraction: A case study on adverse drug events. arXiv preprint arXiv:2307.06439

  41. Brameier DT, Alnasser AA, Carnino JM, Bhashyam AR, von Keudell AG, Weaver MJ (2023) Artificial intelligence in orthopaedic surgery: Can a large language model “write” a believable orthopaedic journal article? JBJS 105(17):1388–1392


  42. Cabrera J, Loyola MS, Magaña I, Rojas R (2023) Ethical dilemmas, mental health, artificial intelligence, and llm-based chatbots. In: International work-conference on bioinformatics and biomedical engineering. Springer Nature Switzerland, Cham, pp 313–326

  43. Cascella M, Montomoli J, Bellini V, Bignami E (2023) Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios. J Med Syst 47(1):33


  44. Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW (2023) Large language models in medicine. Nat Med 29(8):1930–1940


  45. De Angelis L, Baglivo F, Arzilli G, Privitera GP, Ferragina P, Tozzi AE, Rizzo C (2023) ChatGPT and the rise of large language models: the new AI-driven infodemic threat in public health. Front Public Health 11:1166120


  46. Sharaf S, Anoop VS (2023) An analysis on large language models in healthcare: a case study of BioBERT. arXiv preprint arXiv:2310.07282

  47. Yang X, PourNejatian N, Shin HC, Smith KE, Parisien C, Compas C, Martin C, Flores MG, Zhang Y, Magoc T, Harle CA (2022) GatorTron: a large language model for clinical natural language processing. medRxiv preprint

  48. Zhang H, Chen J, Jiang F, Yu F, Chen Z, Li J, Chen G, Wu X, Zhang Z, Xiao Q, Wan X (2023) HuatuoGPT, towards taming language model to Be a doctor. arXiv preprint arXiv:2305.15075

  49. Zhou S, Wang N, Wang L, Liu H, Zhang R (2022) CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records. J Am Med Inform Assoc 29(7):1208–1216


  50. Santos T, Tariq A, Das S, Vayalpati K, Smith GH, Trivedi H, Banerjee I (2022) PathologyBERT-Pre-trained vs a new transformer language model for pathology domain. In: AMIA annual symposium proceedings, vol 2022. American Medical Informatics Association, p 962

  51. Yang H, Liu XY, Wang CD (2023) FinGPT: open-source financial large language models. arXiv preprint arXiv:2306.06031

  52. Yang Y, Tang Y, Tam KY (2023) InvestLM: a large language model for investment using financial domain instruction tuning. arXiv preprint arXiv:2309.13064

  53. Nourbakhsh A, Bang G (2019) A framework for anomaly detection using language modeling, and its applications to finance. arXiv preprint arXiv:1908.09156

  54. Wu S, Irsoy O, Lu S, Dabravolski V, Dredze M, Gehrmann S, Kambadur P, Rosenberg D, Mann G (2023) Bloomberggpt: a large language model for finance. arXiv preprint arXiv:2303.17564

  55. Yang Y, Uy MCS, Huang A (2020) Finbert: a pretrained language model for financial communications. arXiv preprint arXiv:2006.08097

  56. Xie Q, Han W, Zhang X, Lai Y, Peng M, Lopez-Lira A, Huang J (2023) PIXIU: a large language model, instruction data and evaluation benchmark for finance. arXiv preprint arXiv:2306.05443

  57. Shi W, Ajith A, Xia M, Huang Y, Liu D, Blevins T, Chen D, Zettlemoyer L (2023) Detecting pretraining data from large language models. arXiv preprint arXiv:2310.16789

  58. Kojima T, Gu SS, Reid M, Matsuo Y, Iwasawa Y (2022) Large language models are zero-shot reasoners. Adv Neural Inf Process Syst 35:22199–22213


  59. Liddy E (2001) Advances in automatic text summarization. Inf Retr 4(1):82–83


  60. Liu X, Croft WB (2005) Statistical language modeling for information retrieval. Annu Rev Inf Sci Technol 39(1):1–31


  61. Juang BH, Rabiner LR (2005) Automatic speech recognition: a brief history of the technology development. Georgia Institute of Technology, Atlanta; Rutgers University; University of California, Santa Barbara, p 67

  62. Kovačević A, Kečo D (2022) Bidirectional LSTM networks for abstractive text summarization. In: Advanced technologies, systems, and applications VI: Proceedings of the international symposium on innovative and interdisciplinary applications of advanced technologies (IAT) 2021. Springer International Publishing, pp 281–293

  63. Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144

  64. Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training

  65. Akbar NA, Darmayanti I, Fati SM, Muneer A (2021) Deep learning of a pre-trained language model’s joke classifier using GPT-2. J Hunan Univ Nat Sci 48(8)

  66. Floridi L, Chiriatti M (2020) GPT-3: its nature, scope, limits, and consequences. Mind Mach 30:681–694



Funding

The authors have no relevant financial or non-financial interests to disclose.

Author information

Authors and Affiliations

Authors

Contributions

Bharathi Mohan G and Prasanna Kumar R conceptualized the review theme, guided the technical writing of the manuscript, and reviewed the full manuscript. Vishal Krishh P, Keerthinathan A, Meka Kavya Uma Meghana, Sheba Sulthana, and Lavanya G wrote the main manuscript. Srinath Doss reviewed the full manuscript.

Corresponding author

Correspondence to G. Bharathi Mohan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bharathi Mohan, G., Prasanna Kumar, R., Vishal Krishh, P. et al. An analysis of large language models: their impact and potential applications. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02120-8

