What is ChatGPT and how does it work?

Chat Generative Pre-trained Transformer (ChatGPT) is a large natural language processing model that uses deep learning algorithms, trained on vast amounts of internet data, to generate human-like responses to user prompts. It was developed by OpenAI and released in November 2022 [1].

Transformers are an architecture used primarily in natural language processing that solves sequence-to-sequence tasks (e.g. language translation, which amounts to predicting a sequence of words in a target language from a sequence of words in a source language) while handling long-range dependencies with ease. They deliver superior performance while requiring less training time than previous approaches such as recurrent neural networks [2]. In short, ChatGPT examines the words provided as input by the user and predicts which words are most likely to follow, generating a coherent response in a human-like style. It has been fine-tuned for a myriad of language tasks including text generation and completion, translation, sentiment analysis, document summarisation, question answering, and even the generation and explanation of programming code [1].
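
To make this next-word prediction mechanism concrete, the minimal sketch below inspects the next-token probabilities of GPT-2, a small, openly available model from the same family, using the Hugging Face transformers library. It is only an illustration of the underlying mechanism: ChatGPT itself is vastly larger, is not openly available, and layers instruction fine-tuning and reinforcement learning from human feedback on top of plain next-token prediction.

```python
# Minimal sketch of next-word prediction with a small causal language model.
# GPT-2 is used purely as an openly available stand-in for the much larger,
# proprietary model behind ChatGPT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The patient was admitted to the intensive care unit with"
inputs = tokenizer(prompt, return_tensors="pt")

# The model assigns a probability to every candidate next token.
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]
probs = torch.softmax(logits, dim=-1)
top = torch.topk(probs, k=5)
for p, tok_id in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tokenizer.decode(tok_id)!r}: {p:.3f}")

# Generation simply repeats this step, appending one token at a time
# (greedy decoding shown here).
output = model.generate(**inputs, max_new_tokens=20, do_sample=False,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```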

So far, ChatGPT has passed the Wharton MBA examination, law school examinations, the United States Medical Licensing Examination (USMLE), and more [3, 4]. Its role has been explored in different specialties, including intensive care medicine, with variable results [5, 6]. How much of ChatGPT’s “textbook” knowledge remains applicable to “real-world” patients remains to be demonstrated.

Potential applications of ChatGPT in intensive care medicine

ChatGPT and other large language models (LLMs) could significantly affect the clinical practice of intensive care medicine (Table 1).

Table 1 Potential applications, limitations and risks of using large language models such as ChatGPT in medicine and the intensive care unit. For each application, we provide examples of how ChatGPT could be queried

Supporting medical decisions

In the intensive care unit (ICU), where uncertainty is high and where fast, accurate decisions are necessary, ChatGPT could be used as a glorified search engine: quickly providing information on medical conditions, treatments and drug interactions, and offering evidence-based recommendations. It could also analyse (actual) patient data and predict outcomes, all of which could potentially improve patient care [7].

Handling of medical notes

As an artificial intelligence (AI) specialised in language problems, ChatGPT could assist clinicians in a number of tasks related to handling medical data and clinical notes. For example, it can be used to organise clinical notes and write accurate discharge summaries from unstructured input data [8].
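
As a purely hypothetical illustration, the sketch below shows how such a summarisation task could be scripted against a ChatGPT-family model through OpenAI’s Python client. The model name is a placeholder and the note is synthetic; given the privacy concerns discussed below, real patient data should never be submitted this way without appropriate safeguards, and any draft would still require clinician review.

```python
# Hypothetical sketch: drafting a discharge summary from an unstructured,
# entirely synthetic ICU note via OpenAI's Python client. Real patient data
# should never be sent to an external service without appropriate safeguards,
# and any draft requires clinician review before use.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

synthetic_note = (
    "ICU day 3. 67M admitted with community-acquired pneumonia, "
    "intubated day 1, extubated day 3. Ceftriaxone + azithromycin. "
    "Weaning oxygen, haemodynamically stable, plan step-down unit."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; substitute any available chat model
    messages=[
        {"role": "system",
         "content": "You are a clinical scribe. Draft a structured "
                    "discharge summary from the notes provided."},
        {"role": "user", "content": synthetic_note},
    ],
)
print(response.choices[0].message.content)
```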

Medical education

LLMs can also be used as educational tools, whereby medical students or clinical trainees interact with the AI to ask and/or answer clinical questions. Gunther Eysenbach provided a clear demonstration of such an application in diabetes [9].

Much work remains before a biomedical LLM can provide accurate and up-to-date answers to very specific and technical questions, but it has the potential to become a robust educational tool, equal or superior to learning from textbooks or the scientific literature [10].

Enhancing communication

ChatGPT can be used as a communication tool between healthcare providers, patients, and their families, for example to explain or translate technical information. Its ability to summarise both lay and specialised information is often remarkable, making it an opportunity for technical communication among clinicians as well as plain-language communication with patients and their families [11].

Scientific writing

ChatGPT can assist in the writing process of scientific papers: identifying research questions, providing an overview of the current state of the field, and helping with tasks such as formatting, language editing, and content review [12].

Free translation with ChatGPT can reduce costs and facilitate access to publishing for authors from non-English-speaking countries, thereby helping to democratise and diffuse knowledge [13].

Limitations and risks of large language models

However, it is important to note that ChatGPT is a machine learning model and, as such, has limitations and can be used in “wrong” ways, knowingly or unknowingly (Table 1).

Hallucinations

Bluntly put, ChatGPT (at the time of writing) cannot be (fully) trusted, since its output can be flat-out incorrect. The so-called “hallucination” phenomenon refers to the tendency of ChatGPT to confidently produce answers that look believable but may be incorrect or nonsensical. For example, it may invent publications, references or PubMed IDs, or write case reports about non-existent diseases [14]. It cannot replace the expertise of trained healthcare professionals, and its recommendations should always be considered in the context of the patient's unique circumstances.

Misuses

Given their ability to generate realistic-looking data, LLMs lend themselves to a number of possible misuses, including fabricating research data or results to meet funding or publication requirements, generating fake news or misinformation, and plagiarising [7, 10]. This has serious implications for the integrity of the scientific record, given the risk of introducing not only errors but also plagiarised content into publications. It could result in future research or health policy decisions being made on the basis of false information [10].

Perpetuating biases

These models are trained on the existing medical literature, which may itself carry biases, so the AI risks perpetuating those same biases [10]. Since most scientific publications come from high-income countries, ChatGPT’s output will be based on evidence from these countries, and its applicability to low- and middle-income settings must be questioned.

Data privacy and security

A significant privacy concern pertains to the data shared with ChatGPT through user prompts. When asking ChatGPT to answer queries, we may unintentionally disclose sensitive information and expose it to the public. Since every user prompt is retained by the tool, it may be used to enhance the tool’s training and could conceivably be incorporated in replies to other users’ prompts [10].

Automation of jobs and wider societal risks

Finally, advanced LLMs raise wider societal concerns and risks, which have been voiced by machine learning experts and key opinion leaders calling for a pause in the training of these models [15].

While it is unlikely that LLMs will ever replace physicians, if they can pass medical exams, suggest differential diagnoses, investigation and treatment plans, summarise information and communicate effectively with patients and their relatives, what is left for human doctors to do? As stated in the Future of Life Institute letter, “Should we automate away all the jobs, including the fulfilling ones?” [15].

Take-home messages

In the fast-paced, high-stress environment of the ICU, LLMs such as ChatGPT should be considered potentially valuable tools to support medical and non-medical staff in daily activities, education and research. Importantly, users must be aware of the limitations and potential misuses of these tools, including their capacity to generate confident yet completely wrong answers or to perpetuate existing biases and inequalities. However, this is a fast-evolving field, and (rapid) progress in LLM performance, safety and uncertainty awareness is to be expected.