Introduction

The scientific method is globally recognized as a systematic and logical approach to discovering new knowledge and understanding the natural world; it is the foundation of all scientific inquiry and is critical for the advancement of human knowledge (Gauch 2002). It involves posing a hypothesis, collecting data through observation and experimentation, and analyzing and interpreting the results to reach a conclusion. It thus helps to ensure that research is objective, replicable, and based on empirical evidence. In the field of epidemiology, the scientific method is essential for conducting rigorous and transparent studies that can accurately address public health issues. As in many other types of research, in a typical epidemiologic study the general flowchart of the scientific method (Elm et al. 2008) includes identifying a research question, conducting a literature review, designing the study, collecting and analyzing data, and interpreting the results and their implications.

Therefore, given the importance and complexity of this process, it appears essential to define shared rules, standards, and methodologies to guide the conduct of studies and mitigate the risk of misconduct or errors. In other words, it is crucial that epidemiological studies are conducted transparently and rigorously, following acknowledged guidelines. The importance of this has been underlined by many authors (Adami et al. 2011; Alba et al. 2020; Arroyave et al. 2021) and led to the development of the most widely used guidelines globally: the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) statement and guidelines. The STROBE guidelines provide a framework for the transparent reporting of observational studies, ensuring that the study's sections (introduction, methods, results, etc.) are accurately and thoroughly reported (Elm et al. 2008). Adhering to the STROBE guidelines can increase the reliability and reproducibility of the study, as well as facilitate the critical appraisal and synthesis of the evidence (Elm et al. 2008).

Artificial intelligence (AI) is a rapidly growing field with applications across a wide variety of domains (Dwivedi et al. 2021). Among these, one where AI has the potential to make a significant impact is clinical research and epidemiology: it could transform how these fields operate and significantly improve our understanding and management of public health and healthcare issues. Both fields rely on the collection and analysis of large amounts of data, making them well suited to the application of AI technologies (Thiébaut et al. 2018; Topol 2019). One aspect in which AI has yet to show its potential is overall support for conducting epidemiological studies. Given the complexity of epidemiological studies, and the increasingly pressing need to speed up and streamline their execution while preserving scientific quality and integrity, it appears necessary to study applications of AI in supporting the overall conduct of epidemiological studies, i.e., support for study design, data analysis, and interpretation of results. Both the tools and the studies that test and validate them are scarce in this field, and some potentially revolutionary tools, namely AI-based transformers, have only recently been introduced (Castelvecchi 2022; Graham 2022; Stokel-Walker 2022).

ChatGPT and AI-based transformers

In recent years, techniques based on deep learning have been driving the AI revolution. Among them, transformers (Abdel-Aty and Gould 2022; Dai et al. 2021) show prominent results in several applications. A transformer is a type of neural network architecture introduced in the paper “Attention is All You Need” by Vaswani and colleagues in 2017 (Vaswani et al. 2017). It is a deep learning model built around the mechanism of self-attention and is particularly adept at processing sequential data, excelling at tasks such as machine translation, language modeling, and text classification. The transformer incorporates attention mechanisms to weigh the significance of various parts of the input and output sequences, enabling it to effectively capture long-range dependencies and generate more coherent text.
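For illustration, the scaled dot-product self-attention described above can be sketched in a few lines of pure Python. This is a minimal, toy sketch (a single attention head with two-dimensional vectors); real transformers use learned projection matrices, multiple heads, and optimized tensor libraries:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of vectors (lists of floats). Each query attends
    over all keys; the output for a query is a weighted average of the
    value vectors, with weights given by the softmaxed similarity scores.
    """
    d_k = len(K[0])
    outputs = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors
        out = [sum(w * v[i] for w, v in zip(weights, V))
               for i in range(len(V[0]))]
        outputs.append(out)
    return outputs

# Toy example: two tokens with 2-dimensional embeddings.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
attended = scaled_dot_product_attention(Q, K, V)
```

Each query ends up pulled toward the value whose key it most resembles, which is how the mechanism lets every position weigh every other position regardless of distance in the sequence.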

Chat Generative Pre-trained Transformer (ChatGPT) is a recently developed AI model (Cahan and Treutlein 2023). It is a machine learning system capable of generating high-quality scientific text (Brown et al. 2020); it was developed by a team of researchers at OpenAI and has received significant attention in the scientific community due to its impressive performance and potential applications. AI-based transformers such as ChatGPT may represent a significant advance in the field of AI and could impact the way scientific research is conducted and communicated. However, there are also several potential drawbacks to their use for scientific research and publishing, particularly in public health and epidemiology, and their use to support researchers in conducting epidemiological studies has not yet been studied.

Aim of the study

The aim of this study is to describe how AI-based transformer models can assist a researcher in the various steps of an epidemiological study. To do this, we used OpenAI’s ChatGPT to partially simulate an epidemiological study, reformulating the STROBE framework into a list of questions to be answered by the transformer itself. We then qualitatively evaluated the coherence and relevance of the transformer’s outputs and discussed the pros and cons of this innovation in the field of epidemiology.

Methods

In this exploratory study, we assessed how ChatGPT could support an epidemiological study. To do this, we first chose a study to be used as a basis for the simulation and extracted the research question and the specific aim. We then relied on the STROBE checklist as a basis for formulating the questions to be posed to ChatGPT. Specifically, we asked ChatGPT to transform each STROBE checklist item into a question/prompt for itself. We then asked the transformer to answer the questions, to provide support to the researcher in conducting the epidemiological study. Each specific answer was evaluated by three independent senior researchers in terms of coherence and relevance for the purposes of conducting the epidemiological study.

When formulating the various prompts, we used the selected study as a guideline (Subramanian and Kumar 2021). This study by Subramanian and Kumar (2021) analyzed the relationship between the percentage of the population fully vaccinated and new cases of COVID-19 in 68 countries. The study used publicly available COVID-19 data for cross-country analysis. We chose this type of study because of its relatively simple design and straightforward interpretation.

Starting from this article, we extracted the aim of the study, considering this as a purely human activity related to human curiosity to ask questions and to transform them into research questions. The aim of the study served as the basis for contextualizing the use of AI-based transformers and demonstrating their potential usefulness in supporting the conduct of an epidemiological study.

Using STROBE recommendations prospectively

The STROBE guidelines are a set of recommendations for reporting observational studies, including cohort, case–control, and cross-sectional studies. These guidelines provide a framework for transparent and complete reporting of the design, conduct, and results of an observational study, with the aim of improving the quality and transparency of the research and minimizing biases and confounding factors which can affect the validity and reliability of the results.

The STROBE guidelines can also be used prospectively when conducting an epidemiologic study. By following the STROBE guidelines during the design and conduct of the study, researchers can ensure that the study is conducted in a transparent and rigorous manner and that the results of the study are accurately and clearly reported.

The STROBE guidelines provide specific recommendations for the reporting of various aspects of the study, such as the study design, the sampling and recruitment of participants, the measurement of exposures and outcomes, and the analysis and interpretation of the results.

ChatGPT

GPT-3 (Generative Pre-trained Transformer 3) is a third-generation, autoregressive language model developed by OpenAI that uses deep learning to produce human-like text. It is designed to generate human-like text and engage in conversation with users naturally and intuitively. To train the GPT model, researchers at OpenAI fed it a large dataset of human-generated text and trained the model to predict the next word in a sequence. The model’s parameters were adjusted using backpropagation, a technique for computing how the weights and biases should change to minimize the error between the predicted output and the actual output.

After the model was trained, it was able to generate human-like text by predicting the next word in a sequence based on the context provided by the previous words. The model can be fine-tuned for specific tasks, such as translation or question-answering, by training it on a dataset specific to that task.

It is worth noting that ChatGPT, a variant of GPT specifically designed for generating human-like text in a conversational context such as chatbots or virtual assistants, is not the same as GPT. While ChatGPT is trained on a dataset of human conversation and is able to generate text that is appropriate for use in a chat or messaging context, the underlying technology and principles of how ChatGPT works are the same as those of GPT.

One of the key features of ChatGPT is its ability to generate text that is coherent and follows the structure and style of scientific articles. Its underlying model, which has 175 billion parameters, was trained on large datasets using Microsoft’s Azure AI supercomputing infrastructure and has learned to generate text that is similar to human-written scientific text in terms of grammar, style, and content. This makes it a valuable tool for researchers who need to generate large amounts of scientific text, such as descriptions of experimental methods and results, summaries of research findings, or review articles (Radford et al. 2019).

The use of ChatGPT to support the conduct of the study

First, we identified the purpose of the study as follows: [To examine the relationship between the percentage of population fully vaccinated and the incidence of new COVID-19 cases in a given population, and to assess the impact of vaccination on the incidence of COVID-19 cases.]

To maximize the use of ChatGPT, we adopted the STROBE framework, translating its main items/recommendations into prospective questions for the AI transformer. To standardize the use of the transformer as much as possible, we followed the process listed below:

  • STROBE item > question/prompt for ChatGPT > adaptation to the simulated research questions > final answer

For each STROBE recommendation, we asked the transformer (on December 19–23, 2022) the following question:

  • “Transform the following recommendation from the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines into a question for ChatGPT. Recommendation: [STROBE recommendation]”

This provided us with Answer#1. The next step was the specific question:

  • “[Answer#1]+[The study has the following aim: To examine the relationship between the percentage of population fully vaccinated and the incidence of new COVID-19 cases in a given population, and to assess the impact of vaccination on the incidence of COVID-19 cases]”
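The two-step prompting procedure above can be sketched as follows. The function names and the abbreviated STROBE item text are illustrative only; in the study, the prompts were submitted manually through the ChatGPT interface, not through code:

```python
STUDY_AIM = (
    "To examine the relationship between the percentage of population fully "
    "vaccinated and the incidence of new COVID-19 cases in a given population, "
    "and to assess the impact of vaccination on the incidence of COVID-19 cases"
)

def build_reformulation_prompt(strobe_recommendation: str) -> str:
    """Step 1: ask the model to turn a STROBE item into a question for itself."""
    return (
        "Transform the following recommendation from the STROBE (Strengthening "
        "the Reporting of Observational Studies in Epidemiology) guidelines into "
        f"a question for ChatGPT. Recommendation: {strobe_recommendation}"
    )

def build_contextualized_prompt(answer_1: str) -> str:
    """Step 2: pose the reformulated question (Answer#1), contextualized
    with the aim of the simulated study."""
    return f"{answer_1}\nThe study has the following aim: {STUDY_AIM}"

# Example with one STROBE item (abbreviated, illustrative wording):
step1 = build_reformulation_prompt(
    "Clearly define all outcomes, exposures, predictors, potential "
    "confounders, and effect modifiers."
)
# Answer#1 would come from the model; here we use a placeholder string.
step2 = build_contextualized_prompt(
    "What should be included in the definition of outcomes and exposures?"
)
```

The point of templating both steps is to keep the procedure identical across all checklist items, so that differences in the answers reflect the items themselves rather than variation in how they were asked.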

Expert assessment

We assessed the outputs of the transformer by using human annotators to rate the quality and relevance of the model's responses. This was accomplished through a process called “human evaluation,” in which annotators (NN, FT, AB) were asked to rate the responses produced by the model on a 1-to-5-point Likert scale. The annotators were asked to consider factors such as the coherence of the response and its relevance to the prompt.

Given that there are no standardized questions that are specifically designed to evaluate the quality and relevance of the responses produced by ChatGPT or any other language model in terms of coherence and relevance, we evaluated the outputs by asking the expert annotators to consider the following questions:

  1. Is the response coherent and well-formed, or is it difficult to understand? (where 5 is the maximum level of coherence)

  2. Is the response relevant and informative, or is it lacking in substance? (where 5 is the maximum level of relevance)

For each STROBE recommendation, we formulated a prompt for ChatGPT contextualizing it with the aim of the study by Subramanian and Kumar (2021). Each response to the prompts was then evaluated by domain experts to assess its consistency and relevance.
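As a sketch of the scoring procedure, the aggregation of the three annotators’ Likert ratings into mean item scores could be computed as follows. The individual ratings shown are hypothetical examples, chosen only to be consistent with the per-item means reported in the Results:

```python
from statistics import mean

# Hypothetical ratings: for each STROBE-derived item, each of the three
# annotators gives a 1-5 Likert score for coherence and for relevance.
ratings = {
    "item_9":  {"coherence": [5, 5, 5], "relevance": [5, 5, 5]},
    "item_14": {"coherence": [2, 3, 2], "relevance": [2, 2, 1]},
}

def item_means(ratings):
    """Mean coherence and relevance per item, rounded to one decimal."""
    return {
        item: {dim: round(mean(scores), 1) for dim, scores in dims.items()}
        for item, dims in ratings.items()
    }

summary = item_means(ratings)
```

Reporting one mean per item and per dimension keeps the evaluation simple while still allowing the heterogeneity across items to be seen directly in the score table.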

Results

The STROBE checklist includes 35 recommendations (Table S1). From these recommendations we derived 35 specific questions for ChatGPT. Three domain experts evaluated the answers independently, assigning each answer a coherence score and a relevance score (Table 1). The mean scores were heterogeneous: across all items, the overall mean was 3.6 out of 5.0 for coherence and 3.3 out of 5.0 for relevance.

Table 1 Final Q&A and respective scores

For example, item #9 (“What should be included in the definition of outcomes, exposures, predictors, potential confounders, and effect modifiers in an observational study, and how should diagnostic criteria be provided, if applicable, according to the STROBE guidelines?”), in the Methods section, received a very satisfactory response in terms of both coherence (5.0/5.0) and relevance (5.0/5.0) (“According to the STROBE guidelines, in order to clearly define all outcomes, exposures, predictors, potential confounders, and effect modifiers in your observational study, you should provide detailed descriptions of each of these components. Outcomes: In this study, the outcome would be the incidence of new COVID-19 cases in a given population. You should specify how you will measure the incidence of COVID-19 cases (e.g., number of new cases per 100,000 population), and provide any relevant diagnostic criteria for COVID-19 (e.g., symptoms, test results). Exposures: The exposure in this study would be the percentage of the population that is fully vaccinated. You should specify how you will measure the percentage of the population that is fully vaccinated (e.g., number of individuals who have received both doses of the vaccine), and provide any relevant information on the specific vaccine(s) used. Predictors: The predictor in this study would be the percentage of the population that is fully vaccinated. [...]”).

Similarly, item #10, in the Methods section (“What should be included in the description of sources of data and methods of assessment for each variable of interest in an observational study, and how should comparability of assessment methods be described if there is more than one group, according to the STROBE guidelines?”), received a satisfactory response in terms of both coherence (4.7/5.0) and relevance (4.7/5.0). Here, too, in the opinion of the evaluators, the transformer’s response was complete and adequate (“According to the STROBE guidelines, in order to provide a clear and transparent description of the sources of data and methods of assessment for each variable of interest in your observational study, you should include the following information: For each variable of interest (e.g., percentage of population fully vaccinated, incidence of new COVID-19 cases), specify the source of data (e.g., administrative records, survey data). For each variable of interest, describe the method of assessment in detail. For example, if you are using administrative records to measure the percentage of population fully vaccinated, you should describe how the records were collected and how you will extract the relevant information. If you are using survey data to measure the incidence of new COVID-19 cases, you should describe the survey instrument and how you will collect and analyze the data. If there is more than one group in your study (e.g., different populations or time periods), describe the comparability of assessment methods for each variable of interest. For example, if you are comparing the percentage of population fully vaccinated in two different populations, you should describe whether the vaccination programs were similar in the two populations and whether the data sources and methods of assessment were comparable. [...]”).

In contrast, some responses received particularly negative ratings from the domain experts. For example, the response to item/recommendation #14, in the Methods section, received negative ratings for both coherence and relevance (2.3/5.0 and 1.7/5.0, respectively). The question was: “What statistical methods should be described, including those used to control for confounding, according to the STROBE guidelines?”. The answer (“According to the STROBE guidelines, you should describe all statistical methods that you used in your study to examine the relationship between vaccination and COVID-19 incidence, including any methods used to control for confounding factors. This may include methods such as multivariate regression analysis or stratified analysis, depending on the specific design and analysis of your study. It is important to provide a clear and detailed description of your statistical methods in order to allow readers to understand and evaluate the validity and reliability of your findings.”) was not deemed adequate in terms of either coherence or relevance.

Similarly, for item #18 (“What should be included in the report of an observational study in order to describe sensitivity analyses according to the STROBE guidelines?”), the answer was also not considered adequate by the evaluators, who gave mean scores of 2.3/5.0 for coherence and 2.0/5.0 for relevance.

Discussion

In this study, we used an open AI-based transformer, namely ChatGPT (Gordijn and Have 2023; Graham 2023), to reformulate and answer the STROBE recommendations in order to support the conduct of an epidemiological study. We then assessed the coherence and relevance of the transformer’s outputs. We used an observational study that analyzed publicly available data to investigate the relationship between the percentage of population fully vaccinated and new COVID-19 cases. It is important to note that we started from the premise that while AI can potentially assist in reproducing a study, the research question, the aim of the study, and all aspects of originality should remain the sole domain of humans. Therefore, we began our simulation by using a research question and study aim devised by a human.

After ChatGPT had answered all the STROBE recommendations, we asked independent experts to qualitatively evaluate the coherence and relevance of the transformer’s outputs. In the last few months, these innovative systems have proven to be fast and intuitive and an important support for researchers. However, the coherence, relevance, and even the correctness of their answers are not always clear, and there are currently no studies, to our knowledge, that critically analyze the use of these tools in epidemiological research (O’Connor and ChatGPT 2023). The choice of the STROBE checklist and the method of turning its recommendations into questions for ChatGPT represent an attempt to standardize our analysis and make it as reproducible as possible. In fact, by following the STROBE guidelines, researchers can help ensure that their results are accurately and clearly reported, which can help other researchers and policymakers understand and interpret the results of the study. This is particularly important in epidemiology, as the results of epidemiological studies can have important implications for public health policy and practice.

From the results of our analysis, we found that ChatGPT can be a valuable support for researchers, both experienced and inexperienced, in setting up an epidemiological study, particularly an observational one, following internationally recognized guidelines and standards. The mean scores attributed by experts to the transformer’s responses were moderately high (3.6/5.0 for coherence and 3.3/5.0 for relevance), but with a fair amount of variability among individual outputs. This is consistent with suggestions and preliminary analyses in the limited literature available on these tools (Castelvecchi 2022; Graham 2022; Huh 2023; Stokel-Walker 2022), which indicate that AI-based transformers are beginning to be used, but with some skepticism, partly due to their suboptimal ability to respond adequately, in terms of coherence and relevance, to questions posed by domain-expert researchers. Therefore, it seems essential that users of these tools have knowledge of the subject matter and a critical mindset when evaluating their outputs. Blindly accepting the answers of these tools may still be too great a risk to the integrity of science and thus to human society.

It is also interesting to note that the researchers rated some ChatGPT outputs very positively and others very negatively. Notably, both the responses rated very positively and those rated very negatively were in the methods part. This suggests that, when properly queried, the transformer may provide more appropriate answers in the methodological and data analysis and reporting domains, which were found to have the highest level of consistency and relevance.

Even in the introduction section, this AI-based transformer can be useful in placing a study in context. As the literature has shown (O’Connor and ChatGPT 2023), these systems can function as true “scientific writers” and can quickly and adequately contribute to the writing of entire paragraphs. However, our study found that the responses to STROBE recommendations related to the introduction/background and discussion/interpretation sections of scientific articles did not receive high scores, indicating that the tool is not yet fully adequate to support (or potentially replace) researchers in these parts of scientific articles, particularly in epidemiological studies. A possible explanation for this drop in performance may be related to the ChatGPT training data: the AI-based transformer was not specifically trained on scientific articles. This issue may be partially addressed by specific training or fine-tuning (Brown et al. 2020; Raffel et al. 2020).

Our analysis showed the ability of AI-based transformers to generate answers and human-like text, which could potentially be used to conduct epidemiological studies or write research articles. This could potentially save time and resources for researchers, as these activities can be time-consuming and labor-intensive.

However, there are also several potential drawbacks to the use of AI-based transformers for scientific research (King and ChatGPT 2023; O’Connor and ChatGPT 2023). One concern is the risk of bias in the data that are used to train these systems. If the data used to train the AI system are biased, the system may produce biased results or make biased recommendations. This could have serious consequences, particularly in the field of public health and epidemiology, where decisions about interventions and policies may be based on the results of research.

Another concern that is often reported in the critical analysis of these systems (Else 2023; Graham 2023), is the potential for AI-based transformers to replace human researchers, potentially leading to job loss and the devaluation of human expertise (Graham 2022). However, our assessment suggests a possible shift in the “study-experiment-analysis-results-writing-publication” paradigm. It seems more likely that, with the support of these systems, we will move towards a model where the researcher can focus more on the study and experimental phase than on methodological problems and questions.

Another risk associated with the use of AI in scientific research is the potential for fraudulent manipulation of scientific publications at scale. An example involves the scientific publisher IOP, which retracted as many as 850 articles in 2022 after a researcher at the University of Cambridge in the United Kingdom discovered that many of them contained nonsensical phrases produced by artificial intelligence programs attempting to evade anti-plagiarism software. These phrases were repeated a staggering number of times in different articles from different groups. When IOP began investigating, the publisher found other similarities suggesting that the articles came from a so-called paper mill, a company that produces and sells pseudo-articles for a fee.

The use of AI-based solutions capable of generating text in scientific research also brings with it many legal implications. However, it is beyond the scope of this article to address in detail each and every possible legal issue that may arise when using AI tools to generate text to be published in scientific studies. For our purposes, we can highlight that one of the main concerns is the issue of plagiarism and copyright infringement. AI-generated text may be similar or identical to existing copyrighted material, which could lead to legal issues for researchers and their institutions. It is important for researchers to be aware of these potential issues and take steps to ensure that their work does not violate any copyright laws. Using anti-plagiarism software to evaluate the generated text can be a helpful tool in reducing the risk of publishing non-original text.

We look forward to the development and refinement of anti-plagiarism systems, which will help preserve scientific integrity and ethics and will be of great assistance to publishers and other stakeholders. These software programs are designed to detect similarities between a given text and existing material. They can identify potential instances of plagiarism or copyright infringement, allowing researchers to make the necessary changes before publishing their work. It is important to note, however, that anti-plagiarism software may not catch all instances of plagiarism. Therefore, researchers should also use their own judgment and carefully review AI-generated text for any potential instance of plagiarism or copyright infringement.

Another legal implication in using AI-based solutions for text generation in scientific research is the accuracy of the information. As mentioned earlier, AI-generated text may contain errors or inaccuracies that can have serious consequences for the scientific community. For example, if an AI-generated article is published in a reputable journal and it contains inaccurate information, it could lead to other researchers basing their work on flawed data. Therefore, it is crucial for researchers to thoroughly check that the final product is sound and reliable.

In short, it is important to stress that even if AI tools are used to generate text, the responsibility for the research still lies with the humans to whom the work is attributed. This means that researchers and their institutions are responsible for ensuring that their work is legally compliant and that it does not infringe on any copyright laws or contain any inaccuracies. This highlights the need for careful consideration and adherence to ethical and legal guidelines when utilizing AI in scientific research.

Limitations of the study

It is important to note that, because ChatGPT is a machine learning model whose outputs are not deterministic, its answers may differ if the same prompts are re-run after some time. This limits the reproducibility of the results of our study.

Furthermore, there is no single way to formulate questions to the transformer; different questions, even on the same topic, produce different answers, which may be more or less consistent and relevant to the objective of the analysis, representing a limitation of the study. However, in the context of the present study, we tried to standardize this process as much as possible by starting with STROBE recommendations and directly asking the AI-based transformer to adapt them into prompts for itself. This may have at least partially mitigated the risk of variability.

Conclusions

In this exploratory study, we evaluated the coherence and relevance of an open AI-based transformer’s answers regarding questions extracted and reformulated from the STROBE recommendations for observational studies.

From our assessment, the transformer can be considered a valuable support for researchers, both experienced and inexperienced, in setting up an epidemiological study following internationally recognized guidelines and standards. The average score attributed by experts to the transformer’s responses was moderately high, but with a fair amount of variability among individual outputs.

Responses related to the methods, data analysis, and reporting domains and recommendations were found to have the highest level of coherence and relevance, emphasizing that these innovative tools may be useful primarily for methodological and statistical support, rather than for building the conceptual frameworks of the study and the narrative structure of the paper.

Therefore, it seems essential that users of these tools have knowledge about the subject matter and a critical mindset when evaluating their outputs. Blindly accepting the answers of these tools may pose a serious threat to the ethics and integrity of scientific research, including epidemiology. Accordingly, we advocate the necessity for taking a proactive stance towards this evolving context, in order to ensure careful governance of this inevitable process. Maintaining rigor and reliability of scientific research implies establishing a shared, and as much as possible, standardized method for conducting studies, analyzing data, and reporting its results. The potential benefits of AI in scientific research are undeniable, but it is crucial that we address the risks, and the ethical and legal consequences associated with its use. This requires a collaborative effort between researchers, policymakers, and other stakeholders to define guidelines and best practices—and monitor their fulfillment—for the ethically and legally correct use of AI in scientific research.