Foundation and large language models: fundamentals, challenges, opportunities, and social impacts

Myers, Devon; Mohawesh, Rami; Chellaboina, Venkata Ishwarya; Sathvik, Anantha Lakshmi; Venkatesh, Praveen; Ho, Yi-Hui; Henshaw, Hanna; Alhawawreh, Muna; Berdik, David; Jararweh, Yaser

doi:10.1007/s10586-023-04203-7


Published: 27 November 2023

Volume 27, pages 1–26 (2024)

Cluster Computing

Devon Myers¹,
Rami Mohawesh²,
Venkata Ishwarya Chellaboina¹,
Anantha Lakshmi Sathvik¹,
Praveen Venkatesh¹,
Yi-Hui Ho¹,
Hanna Henshaw¹,
Muna Alhawawreh³,
David Berdik¹ &
Yaser Jararweh¹


Abstract

Foundation and Large Language Models (FLLMs) are models trained on massive amounts of data with the intent to perform a variety of downstream tasks. FLLMs are promising drivers for different domains, such as Natural Language Processing (NLP) and other AI-related applications. These models emerged from a paradigm shift in AI toward the use of pre-trained language models (PLMs) and extensive data to train transformer models. FLLMs have also demonstrated impressive proficiency in a wide range of NLP applications, including language generation, summarization, comprehension, complex reasoning, and question answering. In recent years, there has been unprecedented interest in FLLM-related research, driven by contributions from both academic institutions and industry players. Notably, the development of ChatGPT, a highly capable AI chatbot built around FLLM concepts, has garnered considerable interest from various segments of society. The technological advancement of large language models (LLMs) has had a significant influence on the broader artificial intelligence (AI) community, potentially transforming how AI systems are developed and used. Our study provides a comprehensive survey of existing resources related to the development of FLLMs and addresses current concerns, challenges, and social impacts. Moreover, we highlight current research gaps and potential future directions in this emerging and promising field.


Data Availability

All datasets are open-source, and the sources are cited.

References

Abas, A.R., El-Henawy, I., Mohamed, H., Abdellatif, A.: Deep learning model for fine-grained aspect-based opinion mining. IEEE Access 8, 128845–128855 (2020)
Article Google Scholar
Abdullah, M., Madain, A., Jararweh, Y.: Chatgpt: Fundamentals, applications and social impacts. In: 2022 Ninth International Conference on Social Networks Analysis, Management and Security (SNAMS), pp. 1–8. IEEE, (2022)
Abebe, R., Barocas, S., Kleinberg, J., Levy, K., Raghavan, M., Robinson, D.G.: Roles for computing in social change. In: Proceedings of the 2020 COnference on Fairness, Accountability, and Transparency, (2020)
Abid, A., Farooqi, M., Zou, J.: Persistent anti-muslim bias in large language models. arXiv preprint arXiv:2101.05783 (2021)
Akhila, N. et al.: Comparative study of bert models and roberta in transformer based question answering. In: 2023 3rd International Conference on Intelligent Technologies (CONIT), pp. 1–5. IEEE, (2023)
Al-Hawawreh, M., Aljuhani, A., Jararweh, Y.: Chatgpt for cybersecurity: practical applications, challenges, and future directions. Clust. Comput. pp. 1–16 (2023)
Alan Ramponi, B.P.: Neural unsupervised domain adaptation in nlp–a survey, (2020)
Alkhurayyif, Y., Rahaman Wahab Sait, A.: Developing an open domain arabic question answering system using a deep learning technique. In: IEEE Access (2023)
An, T., Song, J., Liu, W.: Incorporating pre-trained model into neural machine translation. In: 2021 4th International Conference on Artificial Intelligence and Big Data (ICAIBD), pp. 212–216 (2021)
Antoun, W., Baly, F., Hajj, H.: Ara bert: Transformer-based model for arabic language understanding. arXiv preprint arXiv: 2003.00104 (2021)
Araujo, A.F., Gôlo, M.P.S., Marcacini, R.M.: Opinion mining for app reviews: an analysis of textual representation and predictive models. Autom. Softw. Eng. 29, 1–30 (2022)
Article Google Scholar
Arumae, K., Liu, F.: Guiding extractive summarization with question-answering rewards. CoRR, abs/1904.02321 (2019)
Baldini, I., Wei, D., Ramamurthy, K.N., Yurochkin, M., Singh, M.: Your fairness may vary: Pretrained language model fairness in toxic text classification. arXiv preprint arXiv:2108.01250 (2021)
Bani-Almarjeh, M., Kurdy, M.-B.: Arabic abstractive text summarization using rnn-based and transformer-based architectures. Inf. Process. Manag. 60(2), 103227 (2023)
Article Google Scholar
Bartlett, Robert: Morse, Adair, Stanton, Richard. Wallace. Discrimination in the FinTech Era, National Bureau of Economic Research, Nancy (2019)
Google Scholar
Bataa, E., Wu, J.: An investigation of transfer learning-based sentiment analysis in Japanese (2019)
Benjamin, Ruha: Assessing risk, automating racism. Science 366, 421–422 (2019)
Article ADS CAS PubMed Google Scholar
Bhattacharjee, S., Haque, R., de Buy Wenniger, G.M., Way, A.: Investigating query expansion and coreference resolution in question answering on bert. In Elisabeth Métais, Farid Meziane, Helmut Horacek, and Philipp Cimiano, editors, Natural Language Processing and Information Systems, pp. 47–59, Cham (2020). Springer International Publishing
Bi, B., Li, C., Wu, C., Yan, M., Wang, W., Huang, S., Huang, F. Si, Luo: P.: Pre-training an autoencoding &autoregressive language model for context-conditioned generation. arXiv preprint arXiv:2004.07159 (2020)
Bommasani, R., Hudson, D.A, Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E., et al.: On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)
Borji, A.: A categorical archive of chatgpt failures. arXiv preprint arXiv:2302.03494 (2023)
Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. arXiv preprint arXiv:1508.05326 (2015)
Buck, C., Bulian, J., Ciaramita, M., Gajewski, W., Gesmundo, A., Houlsby, N., Wang, W.: Ask the right questions: active question reformulation with reinforcement learning. arXiv preprint arXiv: 1705.07830, (2018)
Büyüköz, B., Hürriyetoglu, Ö.: Arzucan: Analyzing elmo and distilbert on socio-political news classification. Proceedings of AESPEN 2020, 9–18 (2020)
Google Scholar
Caliskan, Aylin, Bryson, Joanna J., Narayanan, Arvind: Semantics derived automatically from language corpora contain human-like biases. Science 356, 183–186 (2017)
Article ADS CAS PubMed Google Scholar
Canete, J., Chaperon, G., Fuentes, R., Ho, J.-H., Kang, H., Pérez, J.: Spanish pre-trained bert model and evaluation data. PML4DC at ICLR 2020 (2020)
Carlini, N., Terzis, A.: Poisoning and backdooring contrastive learning. arXiv preprint arXiv:2106.09667 (2022)
Chang, W.-C., Yu, H.-F., Zhong, K., Yang, Y., Dhillon, I.S.: Taming pretrained transformers for extreme multi-label text classification. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 3163–3171, New York, NY, USA, (2020). Association for Computing Machinery
Chen, G., Ma, S., Chen, Y., Dong, L., Zhang, D., Pan, J., Wang, W.W.: Zero-shot cross-lingual transfer of neural machine translation with multilingual pretrained encoders, Furu (2021)
Chen, K., Meng, Y., Sun, X., Guo, S., Zhang, T., Li, J., Fan, C: Badpre: Task-agnostic backdoor attacks to pre-trained NLP foundation models. arXiv (2021)
Chen, Q., Sun, H., Liu, H., Jiang, Y., Ran, T., Jin, X., Xiao, X., Lin, Z., Niu, Z., Chen, H.: A comprehensive benchmark study on biomedical text generation and mining with chatgpt. bioRxiv, pp. 2023–04 (2023)
Cheuk, Tina: Can AI be racist? Color-evasiveness in the application of machine learning to science assessments. Sci. Educ. 105(5), 825–836 (2021)
Google Scholar
Chronopoulou, A., Stojanovski, D., Fraser, A.: Improving the lexical ability of pretrained language models for unsupervised neural machine translation. arXiv preprint arXiv:2103.10531 (2021)
Clark, K., Luong, M.T., Le, Q.V., Manning, C.D.: Electra: pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555 (2020)
Clinchant, Stéphane.: Jung, Kweon Woo. Nikoulina. On the use of bert for neural machine translation, Vassilina (2019)
Google Scholar
Creel, K., Hellman, D.: The algorithmic leviathan: arbitrariness, fairness, and opportunity in algorithmic decision making systems. In: Proceeding of the 2021 ACM Conference on Fairness, Accountability, and Transparency (2021)
Dabre, R., Chu, C., Kunchukuttan, A.: A survey of multilingual neural machine translation. ACM Comput. Surv. 53(5), 1–38 (2020)
Article Google Scholar
Dafoe, A.: AI governance: a research agenda. Governance of AI program, the Future of Humanity Institute, the University of Oxford, Oxford (2018)
Dai, J., Chen, C., Li, Y.: A backdoor attack against LSTM-based text classification systems. IEEE Access 7, 138872–138878 (2019)
Article Google Scholar
Dang, E., Hu, Z., Li, T.: Enhancing collaborative filtering recommender with prompt-based sentiment analysis. arXiv preprint arXiv:2207.12883, (2022)
de Vries, W., Nissim, M.: As good as new. how to successfully recycle english GPT-2 to make models for other languages. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Association for Computational Linguistics (2021)
Majd Saad Al Deen, M., Pielka, M., Hees, J., Soulef Abdou, B., Sifa, R.:Improving natural language inference in arabic using transformer models and linguistically informed pre-training. arXiv preprint arXiv:2307.14666 (2023)
Delobelle, P., Winters, T., Berendt, B.: Robbert: a dutch roberta-based language model. arXiv preprint arXiv:2001.0628 (2020)
Deng, X., Bashlovkina, V., Han, F., Baumgartner, S., Bendersky, M.: What do llms know about financial markets? a case study on reddit market sentiment analysis. In: Companion Proceedings of the ACM Web Conference 2023, pp. 107–110 (2023)
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding (2018)
Ding, Z., Qi, Y., Lin, D.: Albert-based sentiment analysis of movie review. In: 2021 4th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), pp. 1243–1246 (2021)
Dinh, T.A., Niehues, J.: Perturbation-based qe: An explainable, unsupervised word-level quality estimation method for blackbox machine translation. arXiv preprint arXiv:2305.07457 (2023)
Djandji, M., Baly, F., Antoun, W., Hajj, H.: Multi-task learning using ara bert for offensive language detection. Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, pp. 97–101, (2020)
DoCarmo, T., Rea, S., Conaway, E., Emery, J., Raval, N.: The law in computation: What machine learning, artificial intelligence, and big data mean for law and society scholarship. Law & Policy 43(2), 170–199 (2021)
Article Google Scholar
Dong, L., Mallinson, J., Reddy, S., Lapata, M.: Learning to paraphrase for question answering. arXiv:1708.06022 (2017)
Du, Y., Bosselut, A., Manning, C.D.: Synthetic disinformation attacks on automated fact verification systems. arXiv preprint arXiv:2202.09381 (2022)
Duarte, J.M., Berton, L.: A review of semi-supervised learning for text classification. Artificial Intelligence Review, pp. 1–69 (2023)
Duong, D., Solomon, B.D: Analysis of large-language model versus human performance for genetics questions. medRxiv, pp. 2023–01 (2023)
Edunov, S., Baevski, A., Auli, M.: Pre-trained language model representations for language generation. arXiv preprint arXiv:1903.09722 (2019)
Eisenstein, J., Andor, D., Bohnet, B., Collins, M., Mimno, D.: Honest students from untrusted teachers: Learning an interpretable question-answering pipeline from a pretrained language model. arXiv preprint arXiv:2210.02498, (2022)
Emil, Z., Robbertz, A., Valente, R., Winsor, C:. Towards a more inclusive world: Enhanced augmentative and alternative communication for people with disabilities using ai and nlp. Worcester Polytechnic Institute, (2020)
Erciyes, Necdet Eren, Görür, Abdül Kadir: Deep learning methods with pre-trained word embeddings and pre-trained transformers for extreme multi-label text classification. In: 2021 6th International Conference on Computer Science and Engineering (UBMK), pp. 50–55, (2021)
Faraj, D., Abdullah, M.: Sarcasm det at sarcasm detection task 2021 in arabic using ara bert pretrained model. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop, pp. 345–350 (2021)
Fernandes, P., Deutsch, D., Finkelstein, M., Riley, P., Martins, A.F., Neubig, G., Garg, A., Clark, J.H., Freitag, M., Firat, O.: The devil is in the errors: Leveraging large language models for fine-grained machine translation evaluation. arXiv preprint arXiv:2308.07286, (2023)
Floridi, L., Chiriatti, M.: Gpt-3: Its nature, scope, limits, and consequences. Minds and Machines (2020)
Floridi, Luciano, Chiriatti, M.: Gpt-3: Its nature, scope, limits, and consequences. Minds Mach. 30, 681–694 (2020)
Fuadi, M., Wibawa, A.D., Sumpeno, S.: idt5: Indonesian version of multilingual t5 transformer. arXiv preprint arXiv:2302.00856 (2023)
Fukumoto, D., Kashiwa, Y., Hirao, T., Fujiwara, K., Iida, H.: An empirical investigation on the performance of domain adaptation for t5 code completion. In: 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 693–697. IEEE (2023)
Gao, Y., Gia Doan, B., Zhang, Z., Ma, S, Zhang, J., Fu, A., Nepal, S., Kim, H.: Backdoor attacks and countermeasures on deep learning: a comprehensive review. arXiv preprint arXiv:2007.10760 (2020)
Geetha, M.P., Karthika Renuka, D.: Improving the performance of aspect based sentiment analysis using fine-tuned bert base uncased model. Int. J. Intell, Netw (2021)
Book Google Scholar
Ghourabi, A.: A bert-based system for multi-topic labeling of arabic content. In: 2021 12th International Conference on Information and Communication Systems (ICICS), pp. 486–489 (2021)
Giorgi, John M., Wang, Xindi, Sahar, Nicola, Young Shin, Won, Bader, Gary D., Wang, Bo: End-to-end named entity recognition and relation extraction using pre-trained language models. arXiv preprint arXiv:1912.13415, (2019)
Giovannotti, P.: Evaluating machine translation quality with conformal predictive distributions. arXiv preprint arXiv:2306.01549 (2023)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Proceedings of the 27th International Conference on Neural Information Processing Systems—Volume 2, NIPS’14, pp. 2672–2680, Cambridge, MA, USA, (2014). MIT Press
Gore, Ross Joseph, Diallo, Saikou, Padilla, Jose: You are what you tweet: connecting the geographic variation in america’s obesity rate to twitter content. PloS ONE 10(9), e0133505 (2015)
Article PubMed PubMed Central Google Scholar
Gruetzemacher, Ross, Whittlestone, J.: The transformative potential of artificial intelligence. Futures 135, 102884 (2022)
Guo, B., Wang, H., Ding, Yasan, Wu, Wei, Hao, Shaoyang, Sun, Yueqi, Yu, Zhiwen: Conditional text generation for harmonious human-machine interaction. ACM Trans. Intell. Syst. Technol., 12(2), (apr 2021)
Guo, Junliang, Zhang, Zhirui, Xu, Linli, Chen, Boxing, Chen, Enhong: Adaptive adapters: An efficient way to incorporate bert into neural machine translation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, pp. 1740–1751, (2021)
Gupta, A., Lanteigne, C., Kingsley, S.: SECure: a social and environmental certificate for AI systems. arXiv preprint arXiv:2006.06217 (2020)
Guven, Z.A.: The effect of bert, electra and albert language models on sentiment analysis for turkish product reviews. In: 2021 6th International Conference on Computer Science and Engineering (UBMK), pp. 629–632 (2021)
Han, J.M., Babuschkin, I., Edwards, H., Neelakantan, A., Xu, T., Polu, S., Ray, A., Shyam, P., Ramesh, A., Radford, A.: Sutskever. Unsupervised neural machine translation with generative language models only, Ilya (2021)
Google Scholar
Han, Xu.: Zhang, Zhengyan, Ding, Ning, Gu, Yuxian, Liu, Xiao, Huo, Yuqi, Qiu, Jiezhong, Yao, Yuan, Zhang, Ao, Zhang, Liang, Han, Wentao, Huang, Minlie, Jin, Qin, Lan, Yanyan, Liu, Yang, Zhiyuan Liu, Zhiwu Lu, Qiu, Xipeng, Song, Ruihua, Tang, Jie, Wen, Ji-Rong, Yuan, Jinhui, Xin Zhao, Win, Zhu, Jun: Pre-trained model: Past, present, and future. Elsevier, Amsterdam (2021)
He, Y., Zhu, Z., Zhang, Y., Chen, Q., Caverlee, J.: Infusing disease knowledge into BERT for health question answering, medical inference and disease name recognition (2020). arXiv preprint arXiv:2010.03746
Hegde, C., Patil, S.: Unsupervised paraphrase generation using pre-trained language models. arXiv preprint arXiv:2006.05477 (2020)
Henderson, Peter, Sinha, Koustuv, Angelard-Gontier, Nicolas, Rosemary Ke, Nan, Fried, Genevieve, Lowe, Ryan, Pineau, Joelle: Ethical challenges in data-driven dialogue systems. arXiv preprint arXiv:1711.09050, (2017)
Hossain, Md Rajib, Hoque, Mohammed Moshiul, Siddique, Nazmul: Leveraging the meta-embedding for text classification in a resource-constrained language. Engineering Applications of Artificial Intelligence, 124:106586, (2023)
Hovy, D., Prabhumoye, S.: Five sources of bias in natural language processing. Lang. Linguistics Compass 15, 8 (2021), e12432 (2021)
Hutchinson, B., Prabhakaran, V., Denton, E., Webster, K., Zhong, Y., Denuyl, S.: Social biases in nlp models as barriers for persons with disabilities. Association for Computational Linguistics (2020)
Jacob, D., Chang, M.W., Kenton, L., Kristina, T.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv.org (2019)
Jacobs, P.S.: Joining statistics with nlp for text categorization . In: Third Conference on Applied Natural Language Processing, (1992)
Jagielski, M., Oprea, A., Biggio, B., Liu, C., Nita-Rotaru, C., Li, B.: Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. arXiv preprint arXiv:1804.00308, (2021)
Jain, Praphula Kumar, Quamer, Waris, Pamula, Rajendra: Consumer sentiment analysis with aspect fusion and gan-bert aided adversarial learning. Expert Syst. 40(4), e13247 (2023)
Article Google Scholar
Jin, W., Mao, H., Li, Z., Jiang, H., Luo, C., Wen, H., Han, H., Lu, H., Wang, Z., Li, R., et al.: Amazon-m2: A multilingual multi-locale shopping session dataset for recommendation and text generation. arXiv preprint arXiv:2307.09688, (2023)
Jing, W., Bailong, Y.: News text classification and recommendation technology based on wide amp; deep-bert model. In: 2021 IEEE International Conference on Information Communication and Software Engineering (ICICSE), pp. 209–216 (2021)
Joyce, K., Smith-Doerr, L., Alegria, S., Bell, S., Cruz, T., Hoffman, S.G., Umoja Noble, S., Shestakofsky, B.: Towards a sociology of artificial intelligence: a call for research on inequalities and structural change. Socius (2021)
Phoebe Judge (Host). Pants on fire, February 14, (2014)
Kadaoui, Karima, Magdy, Samar M., Waheed, Abdul, Khondaker, Md Tawkat Islam, El-Shangiti, Ahmed Oumar, Nagoudi, El Moatez Billah, Abdul-Mageed, Muhammad: Tarjamat: Evaluation of bard and chatgpt on machine translation of ten arabic varieties. arXiv preprint arXiv:2308.03051, (2023)
Karimi, A., Rossi, L.: Prati. Improving bert performance for aspect-based sentiment analysis, Andrea (2020)
Karimi, A., Rossi, L., Prati, A.: Adversarial training for aspect-based sentiment analysis with bert. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 8797–8803, (2021)
Khan, Aisha Urooj, Mazaheri, Amir, da Vitoria Lobo, Niels, Shah, Mubarak: Mmft-bert: Multimodal fusion transformer with bert encodings for visual question answering, (2020)
Khan, Wahab, Daud, Ali, Nasir, Jamal A., Amjad, Tehmina: A survey on the state-of-the-art machine learning models in the context of nlp. Kuwait journal of Science, 43(4), (2016)
Kheiri, Kiana, Karimi, Hamid: Sentimentgpt: Exploiting gpt for advanced sentiment analysis and its departure from current machine learning. arXiv preprint arXiv:2307.10234, (2023)
Kiros, Jamie, Chan, William: Inferlite: Simple universal sentence representations from natural language inference data. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, (2018)
Kolides, A., Nawaz, A., Rathor, A., Beeman, D., Hashmi, M., Fatima, S., Berdik, D., Al-Ayyoub, J.Y.: Artificial intelligence foundation and pre-trained models: fundamentals, applications, opportunities, and social impacts. Simul. Model. Pract. Theory 126, 102754 (2023)
Article Google Scholar
Koto, F., Rahimi, A., Lau, J.H., Baldwin, T.: Indolem and indobert: A benchmark dataset and pre-trained language model for indoesian nlp. arXiv preprint arXiv:2011.00677 (2020)
Kowsari, K., Jafari Meimandi, K., Heidarysafa, M., Mendu, S., Barnes, L., Brown, D.: Text classification algorithms: A survey. Information 10(4), 150 (2019)
Article Google Scholar
Kuang W, Qian B, Li Z, Chen D, Gao D, Pan X, Xie Y, Li Y, Ding B, Zhou J: A comprehensive package for fine-tuning large language models in federated learning. arXiv preprint arXiv:2309.00363, (2023)
Kumar, Shobhan, Chauhan, Arun: A finetuned language model for recommending cqa-qas for enriching textbooks. In Kamal Karlapalem, Hong Cheng, Naren Ramakrishnan, R. K. Agrawal, P. Krishna Reddy, Jaideep Srivastava, and Tanmoy Chakraborty, editors, Advances in Knowledge Discovery and Data Mining, pp. 423–435, Cham, (2021). Springer International Publishing
Kuratov, Y., Arkhipov, M.: Adaption of deep bidirectional multilingual transformers for russian language. arXiv preprint arXiv:1905.07213, (2019)
Kurita, K., Michel, P., Neubig, G,: Weight poisoning attacks on pre-trained models. arXiv preprint arXiv:2004.06660 (2020)
Lahire, T.: Actor loss of soft actor critic explained. arXiv preprint arXiv:2112.15568 (2021)
Lample, Guillaume: Conneau. Cross-lingual language model pretraining, Alexis (2019)
Google Scholar
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations, (2019)
Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., Zettlemoyer, L.: Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461. (2019)
Li, Junyi, Tang, Tianyi, Zhao, Wayne Xin, Nie, Jian-Yun. Wen, Ji-Rong. A survey of pretrained language models based text generation (2022)
Li, J., Tang, T., Zhao, W.X., Wen, J.-R.: Pretrained language models for text generation: A survey, (2021)
Li, L., Jiang, X.L.: Pretrained language models for document-level neural machine translation, Qun (2019)
Li, L., Song, D., Li, X., Zeng, J., Ma, R., Qiu, X.: Backdoor attacks on pre-trained models by layerwise weight poisoning. arXiv preprint arXiv:2108.13888, (2021)
Li, P., Li, L., Zhang, M., Wu, M., Liu, Q: Universal conditional masked language pre-training for neural machine translation. arXiv preprint arXiv:2203.09210 (2022)
Li, Qian, Peng, Hao, Li, Jianxin, Xia, Congying, Yang, Renyu, Sun, Lichao, Yu, Philip S., He, Lifang: A survey on text classification: From shallow to deep learning, (2020)
Li, S., Liu, H., Dong, T., Zi Hao Zhao, B., Xue, M., Zhu, H., Lu, J.: Hidden backdoors in human-centric language models. arXiv preprint arXiv:2105.00164, (2021)
Li, X., Bing, L., Zhang, W.L.: Exploiting bert for end-to-end aspect-based sentiment analysis, Wai (2019)
Li, X., Fu, X., Xu, G., Yang, Y., Wang, J., Jin, L., Liu, Q., Xiang, T.: Enhancing bert representation with context-aware embedding for aspect-based sentiment analysis. IEEE Access 8, 46868–46876 (2020)
Article Google Scholar
Lim, S., Lee, K., Kang, J.: Drug drug interaction extraction from the literature using a recursive neural network. PLoS ONE (2018)
Lin, Junyang, Men, Rui, Yang, An, Zhou, Chang, Ding, Ming, Zhang, Uichang, Wang, Peng, Wang, Ang, Jiang, Le, Jia, Xianyan, Zhang, Jie, Zhang, Jianwei, Zou, Xu, Li, Zhikang, Deng, Xiaodong, Xue, Jinbao, Zhou, Huiling, Ma, Jianxin, Yu, Jin, Li, Yong, Lin, Wei, Zhou, Jingren, Tang, Jie, Yang, Hongxia: M6: A chinese multimodal pretrainer. arXiv preprint arXiv:2103.00823, (2021)
Liu, Jiachang, Shen, Dinghan, Zhang, Yizhe, Dolan, Bill, Carin, Lawrence, Chen, Weizhu: What makes good in-context examples for gpt-\(3\)? (2021)
Liu, Shansong, Hussain, Atin Sakkeer, Sun, Chenshuo, Shan, Ying: Music understanding llama: Advancing text-to-music generation with question answering and captioning. arXiv preprint arXiv:2308.11276, (2023)
Liu, Wenbin, Wen, Bojian, Gao, Shang, Zheng, Jiesheng, Zheng, Yinlong: A multi-label text classification model based on elmo and attention. MATEC Web Conference, 309, (2020)
Liu, Yinhan, Ott, Myle, Goyal, Naman, Du, Jingfei, Joshi, Mandar, Chen, Danqi, Levy, Omer, Lewis, Mike, Zettlemoyer, Luke, Stoyanov, Veselin: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692, (2019)
Liu, Zheng:. Sociological perspectives on artificial intelligence: A typological reading. Wiley Online Library, (2021)
Lloret, Elena: Llorens, Hector, Moreda, Paloma, Saquete, Estela, Palomar, Manuel: Text summarization contribution to semantic question answering: New approaches for finding answers on the web. International Journal of Intelligent Systems 26(12), 1125–1152 (2011)
Article Google Scholar
Lock, S.: What is ai chatbot phenomenon chatgpt and could it replace humans? ):‘Book What is AI chatbot phenomenon ChatGPT and could it replace humans, (2022)
Ma, Chunlan, ImaniGooghari, Ayyoob, Ye, Haotian, Asgari, Ehsaneddin, Schütze, Hinrich: Taxi1500: A multilingual dataset for text classification in 1500 languages. arXiv preprint arXiv:2305.08487, (2023)
Ma, Shuming, Yang, Jian, Huang, Haoyang, Chi, Zewen, Dong, Li, Zhang, Dongdong, Awadalla, Hany Hassan, Muzio, Alexandre, Eriguchi, Akiko, Singhal, Saksham, Song, Xia, Menezes, Arul, Wei, Furu: Xlm-t: Scaling up multilingual machine translation with pretrained cross-lingual transformer encoders, (2020)
MacCartney, Bill: Natural Language Inference. Stanford University ProQuest Dissertations Publishing, (2009)
Madhyastha, Pranava Swaroop, Bansal, Mohit, Gimpel, Kevin, Livescu, Karen: Mapping unseen words to task-trained embedding spaces. Proceedings of the 1st Workshop on Representation Learning for NLP, pp. 100–110, (2016)
Mager, Manuel, Astudillo, Ramon Fernandez, Naseem, Tahira, Sultan, Md Arafat, Lee, Young-Suk, Florian, Radu, Roukos, Salim: Gpt-too: A language-model-first approach for amr-to-text generation, (2020)
Mai, Florian, Pappas, Nikolaos, Montero, Ivan, Smith, Noah A.: Henderson. Plug and play autoencoders for conditional text generation, James (2020)
Google Scholar
Maldonado, Abran, Pistunovich, Natalie: GPT-3 powers the next generation of apps, (2021)
Manias, George, Mavrogiorgou, Argyro, Kiourtis, Athanasios, Symvoulidis, Chrysostomos, Kyriazis, Dimosthenis: Multilingual text categorization and sentiment analysis: a comparative analysis of the utilization of multilingual approaches for classifying twitter data. Neural Computing and Applications, pp. 1–17, (2023)
Martin, Louis, Muller, Benjamin, Suárez, Pedro Javier Ortiz, Dupont, Yoann, Romary, Laurent, de la Clergie, Éric Villemonte, Seddah, Djamé, Sagot, Benoit: Camembert: a tasty french language model. arXiv preprint arXiv:1911.03894, (2020)
Marulli, Fiammetta: Verde, Laura, Campanile, Lelio: Exploring data and model poisoning attack to deep learning-based NLP systems. Procedica Computer Science 192, 3570–3579 (2021)
Article Google Scholar
Maslennikova, Elizaveta: Elmo word representations for news protection. CLEF (Working Notes, (2019)
Mathew, Leeja, Bindu, V. R.: A review of natural language processing techniques for sentiment analysis using pre-trained models. In: 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), pp. 340–345, (2020)
McCarley, J.S.: Chakravarti, Rishav. Sil. Structured pruning of a bert-based question answering model, Avirup (2019)
Google Scholar
Arifuzzaman, M., Rakibul Hasan, Md., Maliha, Maisha: Sentiment analysis with nlp on twitter data. IEEE, (2019)
Meftah, Sara, Tamaazoust, Youssef, Semmar, Nasredine, Essafi, Hassane, Sadat, Faitha: Joint learning of pre-trained and random units for domain adaptation in part-of-speech tagging. arXiv preprint arXiv: 1904.03595, (2019)
Meng, Yuxian, Ren, Xiangyuan, Sun, Zijun, Li, Xiaoya, Yuan, Arianna, Wu, Fei, Li, Jiwei: Large-scale pretraining for neural machine translation with tens of billions of sentence pairs, (2019)
Minaee, Shervin, Kalchbrenner, Nal, Cambria, Erik, Nikzad, Narjes, Chenaghlu, Meysam, Gao, Jianfeng: Deep learning–based text classification: A comprehensive review. ACM Comput. Surv., 54(3), (April 2021)
Mitchell, Lewis, Frank, Morgan R., Harris, Kameron Decker, Dodds, Peter Sheridan, Danforth, Christopher M.: The geography of happiness: Connecting twitter sentiment and expression, demographics, and objective characteristics of place. PloS one 8(5), e64417 (2013)
Article ADS CAS PubMed PubMed Central Google Scholar
Mitkov, Ruslan: The upper Oxford Handbook of Computational Linguistics. Oxford University Press Inc., (2004)
Mohawesh, Rami, Al-Hawawreh, Muna, Maqsood, Sumbal, Alqudah, Omar: Factitious or fact? learning textual representations for fake online review detection. Cluster Computing, pp. 1–16, (2023)
Mohawesh, Rami: Liu, Xiao, Arini, Hilya Mudrika, Wu, Yutao, Yin, Hui: Semantic graph based topic modelling framework for multilingual fake news detection. AI Open 4, 33–41 (2023)
Article Google Scholar
Mohawesh, Rami, Xu, Shuxiang, Springer, Matthew, Al-Hawawreh, Muna, Maqsood, Sumbal: Fake or genuine? contextualised text representation for fake review detection. arXiv preprint arXiv:2112.14343, (2021)
Mohawesh, Rami: Xu, Shuxiang, Springer, Matthew, Jararweh, Yaser, Al-Hawawreh, Muna, Maqsood, Sumbal: An explainable ensemble of multi-view deep learning model for fake review detection. Journal of King Saud University-Computer and Information Sciences 35(8), 101644 (2023)
Article Google Scholar
Mohit, Behrang: Natural Language Processing of Semitic Languages. Springer, Berlin, Heidelberg (2014)
Google Scholar
Mumtarin, Maroa, Samiullah Chowdhury, Md., Wood, Jonathan: Large language models in analyzing crash narratives–a comparative study of chatgpt, bard and gpt-4. arXiv preprint arXiv:2308.13563, (2023)
Nadeau, David: Sekine, Satoshi: A survey of named entity recognition and classification. Lingvisticæ Investigationes 30, 3–26 (2007)
Article Google Scholar
Narang, Sharan, Chowdhery, Aakanksha: Pathways language model (palm): Scaling to 540 billion parameters for breakthrough performance. Google AI Blog, (2022)
Narayan, Shashi, Simoes, Gonçalo, Ma, Ji, Craighead, Hannah, Mcdonald, Ryan: Qurious: Question generation pretraining for text generation, (2020)
Usman Naseem, Matloob Khushi, Vinay Reddy, Sakthivel Rajendran, Imran Razzak, and Jinman Kim. Bio albert: A simple and effective pre-trained language model for biomedical named entity recognition. International Joint Conference on Neural Networks, 2021
Nayak, Pandu: Understanding searches better than ever before, (Oct 2019)
Nguyen, Dat Quoc, Nguyen, Anh Tuan: Phobert: Pre-trained language models for vietnamese. arXiv preprint arXiv:2003.00744, (2020)
Nguyen, Thanh Thi, Wilson, Campbell, Dalins, Janis: Fine-tuning llama 2 large language models for detecting online sexual predatory chats and abusive texts. arXiv preprint arXiv:2308.14683, (2023)
Okur, Halil Ibrahim, Sertbaş, Ahmet: Pretrained neural models for turkish text classification. In: 2021 6th International Conference on Computer Science and Engineering (UBMK), pp. 174–179, (2021)
Orgad, Hadas, Belinkov, Yonatan: Debiasing nlp models without demographic information. arXiv preprint arXiv:2212.10563, (2022)
Padilla, Jose J., Kavak, Hamdi, Lynch, Christopher J., Gore, Ross J., Diallo, Saikou Y.: Temporal and spatiotemporal investigation of tourist attraction visit sentiment on twitter. PloS one 13(6), e0198857 (2018)
Penha, Gustavo, Hauff, Claudia: What does BERT know about books, movies and music? probing BERT for conversational recommendation. In: Fourteenth ACM Conference on Recommender Systems. ACM, (sep 2020)
Polignano, M., Basile, P., de Gemmis, M., Semeraro, G., Basile, V.: Alberto: Italian bert language understanding model for nlp challenging tasks based on tweets. CEUR Workshop Proceedings, 2481, (2019)
Etoori, Pravallika, Mamidi, Radhika, Chinnakotla, Manoj: Automatic spelling correction for resource-scarce languages using deep learning. ACL Anthology, (2018)
Qi, Ye, Sachan, Devendra Singh, Felix, Matthieu, Padmanabhan, Sarguna Janani, Neubig, Graham: When and why are pre-trained word embeddings useful for neural machine translation?, (2018)
Qiu, Xipeng, Sun, Tianxiang, Xu, Yige, Shao, Yunfan, Dai, Ning, Huang, Xuanjing: Pre-trained models for natural language processing: A survey. Science China Technological Sciences 63(10), 1872–1897 (2020)
Qu, Chen, Yang, Liu, Qiu, Minghui, Bruce Croft, W., Zhang, Yongfeng, Iyyer, Mohit: BERT with history answer embedding for conversational question answering. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, (jul 2019)
Qu, Yuanbin, Liu, Peihan, Song, Wei, Liu, Lizhen, Cheng, Miaomiao: A text generation and prediction system: Pre-training on new corpora using bert and gpt-2. In: 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC), pp. 323–326, (2020)
Quan, Wei, Zhang, Jinli, Hu, Xiaohua Tony: End-to-end joint opinion role labeling with bert. In: 2019 IEEE International Conference on Big Data (Big Data), pp. 2438–2446, (2019)
Radford, Alec, Narasimhan, Karthik: Improving language understanding by generative pre-training. OpenAI, (2018)
Radford, Alec, Wu, Jeffrey, Child, Rewon, Luan, David, Amodei, Dario, Sutskever, Ilya: Learning to paraphrase: An unsupervised approach using multiple-sequence alignment. ACL Anthology, (2019)
Rae, Jack W., Borgeaud, Sebastian, Cai, Trevor, Millican, Katie, Hoffmann, Jordan, Song, Francis, Aslanides, John, Henderson, Sarah, Ring, Roman, Young, Susannah, et al.: Scaling language models: Methods, analysis & insights from training gopher. arXiv preprint arXiv:2112.11446, (2021)
Raffel, Colin, Shazeer, Noam, Roberts, Adam, Lee, Katherine, Narang, Sharan, Matena, Michael, Zhou, Yanqi, Li, Wei, Liu, Peter J.: Exploring the limits of transfer learning with a unified text-to-text transformer, (2019)
Rahsepar, Amir Ali, Tavakoli, Neda, Kim, Grace Hyun J., Hassani, Cameron, Abtin, Fereidoun, Bedayat, Arash: How ai responds to common lung cancer questions: Chatgpt vs google bard. Radiology, 307(5):e230922, (2023)
Ramponi, Alan, Plank, Barbara: Neural unsupervised domain adaptation in nlp—a survey. arXiv preprint arXiv:2006.00632, (2020)
Ramraj, S., Arthi, R., Murugan, Solai, Julie, M.S.: Topic categorization of tamil news articles using pretrained word2vec embeddings with convolutional neural network. In: 2020 International Conference on Computational Intelligence for Smart Power System and Sustainable Energy (CISPSSE), pp. 1–4, (2020)
Rehman, Abdul, Abbasi, Rabeeh Ayaz, Khattak, Akmal Saeed, et al.: Classifying text-based conspiracy tweets related to covid-19 using contextualized word embeddings. arXiv preprint arXiv:2303.03706, (2023)
Reimers, Nils, Schiller, Benjamin, Beck, Tilmann, Daxenberger, Johannes, Stab, Christian, Gurevych, Iryna: Classification and clustering of arguments with contextualized word embeddings. arXiv preprint arXiv:1906.09821, (2019)
Rezaeinia, Seyed Mahdi, Rahmani, Rouhollah, Ghodsi, Ali, Veisi, Hadi: Sentiment analysis based on improved pre-trained word embeddings. Expert Systems with Applications 117, 139–147 (2019)
Rosario, Barbara, Hearst, Marti A.: Classifying semantic relations in bioscience texts. Proceedings of the 42nd Annual meeting of the association for computational linguistics, (2004)
Roudsari, Arousha Haghighian, Afshar, Jafar, Lee, Charles Cheolgi, Lee, Wookey: Multi-label patent classification using attention-aware deep learning model. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 558–559, (2020)
Sarkar, Sagnik, Singh, Pardeep: Combining the knowledge graph and t5 in question answering in nlp. In: Sentiment Analysis and Deep Learning: Proceedings of ICSADL 2022, pp. 405–409. Springer, (2023)
Saunders, Danielle: Domain adaptation and multi-domain adaptation for neural machine translation: A survey. arXiv preprint arXiv:2104.06951, (2021)
Schmid, Helmut: Part-of-speech tagging with neural networks. arXiv preprint cmp-lg/9410018, (1994)
Sen, Bhaskar, Gopal, Nikhil, Xue, Xinwei: Support-bert: Predicting quality of question-answer pairs in msdn using deep bidirectional transformer, (2020)
Shi, Yucheng, Ma, Hehuan, Zhong, Wenliang, Mai, Gengchen, Li, Xiang, Liu, Tianming, Huang, Junzhou: Chatgraph: Interpretable text classification by converting chatgpt knowledge to graphs. arXiv preprint arXiv:2305.03513, (2023)
Singhal, Karan, Tu, Tao, Gottweis, Juraj, Sayres, Rory, Wulczyn, Ellery, Hou, Le, Clark, Kevin, Pfohl, Stephen, Cole-Lewis, Heather, Neal, Darlene, et al.: Towards expert-level medical question answering with large language models. arXiv preprint arXiv:2305.09617, (2023)
Song, Youwei, Wang, Jiahai, Liang, Zhiwei, Liu, Zhiyue, Jiang, Tao: Utilizing bert intermediate layers for aspect based sentiment analysis and natural language inference, (2020)
Stickland, Asa Cooper, Li, Xian, Ghazvininejad, Marjan: Recipes for adapting pre-trained monolingual and multilingual models to machine translation, (2020)
Strubell, Emma, Ganesh, Ananya, McCallum, Andrew: Energy and policy considerations for deep learning in nlp. arXiv preprint arXiv:1906.02243, (2019)
Sun, Chi, Huang, Luyao, Qiu, Xipeng: Utilizing bert for aspect-based sentiment analysis via constructing auxiliary sentence, (2019)
Sun, Chi, Qiu, Xipeng, Xu, Yige, Huang, Xuanjing: How to fine-tune bert for text classification?, (2019)
Sun, Yu, Wang, Shuohuan, Feng, Shikun, Ding, Siyu, Pang, Chao, Shang, Junyuan, Liu, Jiaxiang, Chen, Xuyi, Zhao, Yanbin, Lu, Yuxiang, Liu, Weixin, Wu, Zhihua, Gong, Weibao, Liang, Jianzhong, Shang, Zhizhou, Sun, Peng, Liu, Wei, Ouyang, Xuan, Yu, Dianhai, Tian, Hao, Wu, Hua, Wang, Haifeng: Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation, (2021)
Suneera, C. M., Prakash, Jay: A bert-based question representation for improved question retrieval in community question answering systems. In Srikanta Patnaik, Xin-She Yang, and Ishwar K. Sethi, editors, Advances in Machine Learning and Computational Intelligence. Springer Singapore, (2021)
Sweeney, Latanya: Discrimination in online ad delivery. arXiv preprint arXiv:1301.6822, (2013)
Tabinda Kokab, Sayyida, Asghar, Sohail, Naz, Shehneela: Transformer-based deep learning models for the sentiment analysis of social media data. Array, page 100157, (2022)
Tanvir, Hasan, Kittask, Claudia, Eiche, Sandra, Sirts, Kairit: Estbert: a pretrained language-specific bert for estonian. arXiv preprint arXiv:2011.04784, (2021)
Terpin, Antonio, Lanzetti, Nicolas, Yardim, Batuhan, Dorfler, Florian, Ramponi, Giorgia: Trust region policy optimization with optimal transport discrepancies: Duality and algorithm for continuous actions. Advances in Neural Information Processing Systems 35, 19786–19797 (2022)
Balaji, T.K., Bablani, Annushree, Sreeja, S.R.: Opinion mining on covid-19 vaccines in india using deep and machine learning approaches. In: 2022 International Conference on Innovative Trends in Information Technology (ICITIIT), pp. 1–6, (2022)
Touvron, Hugo, Lavril, Thibaut, Izacard, Gautier, Martinet, Xavier, Lachaux, Marie-Anne, Lacroix, Timothée, Rozière, Baptiste, Goyal, Naman, Hambro, Eric, Azhar, Faisal, et al.: Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, (2023)
Ulcar, Matej, Robnik-Sikonja, Marko: Training dataset and dictionary sizes matter in bert models: the case of baltic languages. Analysis of Images, Social Networks and Texts (2021)
Uthus, David, Ontañón, Santiago, Ainslie, Joshua, Guo, Mandy: mlongt5: A multilingual and efficient text-to-text transformer for longer sequences. arXiv preprint arXiv:2305.11129, (2023)
van Stegeren, Judith, Myśliwiec, Jakub: Fine-tuning gpt-2 on annotated rpg quests for npc dialogue generation. In: The 16th International Conference on the Foundations of Digital Games (FDG) 2021. Association for Computing Machinery, (2021)
Variš, Dušan, Bojar, Ondřej: Unsupervised pretraining for neural machine translation using elastic weight consolidation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Association for Computational Linguistics, (2019)
Kocaman, Veysel, Talby, David: SparkNLP: Natural language understanding at scale. Elsevier, (2021)
Virtanen, Antti, Kanerva, Jenna, Ilo, Rami, Luoma, Jouni, Luotolahti, Juhani, Salakoski, Tapio, Ginter, Filip, Pyysalo, Sampo: Multilingual is not enough: Bert for finnish. arXiv preprint arXiv:1912.07076, (2019)
Wang, Hai, Yu, Dian, Sun, Kai, Chen, Jianshu, Yu, Dong: Improve pre-trained multilingual models with vocabulary expansion. arXiv preprint arXiv:1909.12440, (2019)
Wang, Shuo, Nepal, Surya, Rudolph, Carsten, Grobler, Marthie, Chen, Shangyu, Chen, Tianle: Backdoor attacks against transfer learning with pre-trained deep learning models. arXiv preprint arXiv:2001.03274, (2020)
Wang, Wenxuan, Jiao, Wenxiang, Hao, Yongchang, Wang, Xing, Shi, Shuming, Tu, Zhaopeng, Lyu, Michael: Understanding and improving sequence-to-sequence pretraining for neural machine translation, (2022)
Wang, Yuhui, He, Hao, Tan, Xiaoyang: Truly proximal policy optimization. In: Uncertainty in Artificial Intelligence, pp. 113–122. PMLR, (2020)
Wei, Xiaokai, Wang, Shen, Zhang, Dejiao, Bhatia, Parminder, Arnold, Andrew: Knowledge enhanced pretrained language models: A comprehensive survey, (2021)
Wiggers, Kyle: (2021)
Wikipedia contributors. Turing test — Wikipedia, the free encyclopedia, (2022). [Online; accessed 26-April-2022]
Wu, Carole-Jean, Raghavendra, Ramya, Gupta, Udit, Acun, Bilge, Ardalani, Newsha, Maeng, Kiwan, Chang, Gloria, Behram, Fiona Aga, Huang, James, Bai, Charles, Gschwind, Michael, Gupta, Anurag, Ott, Myle, Melnikov, Anastasia, Candido, Salvatore, Brooks, David, Chauhan, Geeta, Lee, Benjamin, Lee, Hsien-Hsin S., Akyildiz, Bugra, Balandat, Maximilian, Spisak, Joe, Jain, Ravi, Rabbat, Mike, Hazelwood, Kim: Sustainable ai: Environmental implications, challenges and opportunities. arXiv, (2021)
Xia, Congying, Zhang, Chenwei, Nguyen, Hoang, Zhang, Jiawei, Yu, Philip: Cg-bert: Conditional text generation with bert for generalized few-shot intent detection, (2020)
Xing, Yiran, Shi, Zai, Meng, Zhao, Lakemeyer, Gerhard, Ma, Yunpu, Wattenhofer, Roger: Km-bart: Knowledge enhanced multimodal bart for visual commonsense generation, (2021)
Xu, Haoran, Van Durme, Benjamin, Murray, Kenton: Bert, mbert, or bibert? a study on contextualized embeddings for neural machine translation. ACL Anthology, (2021)
Xu, Hu, Shu, Lei, Yu, Philip S., Liu, Bing: Understanding pre-trained bert for aspect-based sentiment analysis, (2020)
Xue, Linting, Constant, Noah, Roberts, Adam, Kale, Mihir, Al-Rfou, Rami, Siddhant, Aditya, Barua, Aditya, Raffel, Colin: mt5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934, (2021)
Yang, Wei, Xie, Yuqing, Lin, Aileen, Li, Xingyu, Tan, Luchen, Xiong, Kun, Li, Ming, Lin, Jimmy: End-to-end open-domain question answering with. In: Proceedings of the 2019 Conference of the North. Association for Computational Linguistics, (2019)
Yang, Wei, Xie, Yuqing, Tan, Luchen, Xiong, Kun, Li, Ming, Lin, Jimmy: Data augmentation for bert fine-tuning in open-domain question answering, (2019)
Yang, Zhilin, Dai, Zihang, Yang, Yiming, Carbonell, Jaime G., Salakhutdinov, Ruslan, Le, Quoc V.: Xlnet: Generalized autoregressive pretraining for language understanding. CoRR, abs/1906.08237, (2019)
Yu, Wenhao, Zhu, Chenguang, Li, Zaitang, Hu, Zhiting, Wang, Qingyun, Ji, Heng, Jiang, Meng: A survey of knowledge-enhanced text generation. ACM Comput. Surv., (jan 2022)
Zaib, Munazza, Tran, Dai Hoang, Sagar, Subhash, Mahmood, Adnan, Zhang, Wei E., Sheng, Quan Z.: Bert-coqac: Bert-based conversational question answering in context. In Li Ning, Vincent Chau, and Francis Lau, editors, Parallel Architectures, Algorithms and Programming, pp. 47–57, Singapore, (2021). Springer Singapore
Zajko, M.: Artificial intelligence, algorithms, and social inequality: Sociological contributions to contemporary debates. Sociology Compass, (2022)
Zhang, B., Dafoe, A.: Artificial intelligence: American attitudes and trends. Governance of AI program, the Future of Humanity Institute, the University of Oxford, Oxford, UK (2019)
Zhang, B., Yang, H., Liu, X.-Y.: Instruct-fingpt: Financial sentiment analysis by instruction tuning of general-purpose large language models. arXiv preprint arXiv:2306.12659, (2023)
Zhang, H., Li, X., Bing, L.: Video-llama: An instruction-tuned audio-visual language model for video understanding. arXiv preprint arXiv:2306.02858, (2023)
Zhang, H., Song, H., Li, S., Zhou, M., Song, D.: A survey of controllable text generation using transformer-based pre-trained language models, (2022)
Zhang, J., Zhao, Y., Saleh, M., Liu, P.J.: Pegasus: Pre-training with extracted gap-sentences for abstractive summarization (2019)
Zhang, T., Xu, B., Thung, F., Haryono, S.A., Lo, D., Jiang, L.: Sentiment analysis for software engineering: How far can pre-trained transformer models go? In: 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 70–80, (2020)
Zhang, Z., Wu, S., Jiang, D., Chen, G.: BERT-JAM: Maximizing the utilization of BERT for neural machine translation. Neurocomputing 14(460), 84–94 (2021)
Zhu, Jinhua, Xia, Yingce, Wu, Lijun, He, Di, Qin, Tao, Zhou, Wengang, Li, Houqiang, Liu, Tie-Yan: Incorporating bert into neural machine translation, (2020)

Funding

No Funding

Author information

Authors and Affiliations

Duquesne University, Pittsburgh, Pennsylvania, USA
Devon Myers, Venkata Ishwarya Chellaboina, Anantha Lakshmi Sathvik, Praveen Venkatesh, Yi-Hui Ho, Hanna Henshaw, David Berdik & Yaser Jararweh
Al Ain University, Al Ain, Abu Dhabi, United Arab Emirates
Rami Mohawesh
Deakin University, Geelong, Melbourne, Australia
Muna Alhawawreh

Contributions

All authors contributed equally to the paper.

Corresponding author

Correspondence to Rami Mohawesh.

Ethics declarations

Conflict of interest

No competing interest.

Ethical approval and consent to participate:

No ethical issue is involved.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Myers, D., Mohawesh, R., Chellaboina, V.I. et al. Foundation and large language models: fundamentals, challenges, opportunities, and social impacts. Cluster Comput 27, 1–26 (2024). https://doi.org/10.1007/s10586-023-04203-7

Received: 13 May 2023
Revised: 03 November 2023
Accepted: 04 November 2023
Published: 27 November 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s10586-023-04203-7
