Key Points for Decision Makers

A combination of desk research and interviews identified a heterogeneous set of signs, symptoms, and impacts of COVID-19, as well as impacts associated with the pandemic overall.

An item bank has been developed to measure signs and symptoms, their associated severity, and disease-related and pandemic-related impacts.

The item bank contains 55 short-term and long-term sign and symptom items, 26 items assessing disease-related impacts, and seven items evaluating pandemic-related impacts, all of which can be individually selected based on research needs.

1 Introduction

On 1 July 2021, about 18 months after a “pneumonia of unknown cause” was first reported to the World Health Organization (WHO), there were 6055 studies on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2 [COVID-19]) registered on clinicaltrials.gov. Over half (n = 3394; 56%) were interventional studies of potential treatments/vaccines for COVID-19 and nearly 1000 of these (n = 996) were sponsored by the biopharmaceutical industry. Most of these studies had primary endpoints related to viral load/titers/antibodies; adverse effects of, or reactions to, treatment; or changes in “clinical status.”

One of the most widely used clinical status scales was the WHO Ordinal Scale for Clinical Improvement [1]. This categorizes people from “death” (score of 8) to “uninfected” (score of 0; no clinical/virological evidence of infection). There are three interim levels: hospitalized with severe disease (score of 5–7 depending on the amount of ventilation/organ support required), hospitalized with mild disease (score of 3–4 depending on the need for oxygen therapy), and ambulatory (score of 1–2; not hospitalized). Ambulatory patients are scored as a 1 if they have no limitations of activities of daily living (ADL) and as a 2 if they have limitations of ADL. However, COVID-19 can be a highly symptomatic condition, which significantly affects quality of life beyond ADL limitations [2]. Symptoms often continue long term even when the clinical infection has resolved. As such, these clinical status categories may not always reflect what is meaningful and relevant to patients’ experiences of COVID-19, nor be an appropriate indicator of the health status of people with COVID-19 infections (current or resolved). Only a minority of industry-sponsored studies in clinicaltrials.gov (July 2021) had endpoints related to symptoms (n = 243; 24%) and even fewer measured functioning (n = 82), quality of life/perceived health status (n = 31), or ADL as a stand-alone endpoint (n = 23). Given the central importance of symptoms and functional limitations for patients with COVID-19, the field needs a comprehensive and compelling measure of the patient’s experience.
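The coarseness of these categories can be illustrated with a minimal sketch that maps an ordinal score to the level described above. The function name and category labels are hypothetical and chosen for illustration only; they are not part of the WHO scale itself.

```python
def who_ordinal_category(score: int) -> str:
    """Map a WHO Ordinal Scale score (0-8) to the level described in the text.

    Hypothetical helper for illustration; the labels are paraphrased.
    """
    if not isinstance(score, int) or not 0 <= score <= 8:
        raise ValueError("score must be an integer from 0 to 8")
    if score == 0:
        return "uninfected"
    if score <= 2:
        return "ambulatory"  # 1 = no ADL limitation, 2 = ADL limitation
    if score <= 4:
        return "hospitalized, mild disease"
    if score <= 7:
        return "hospitalized, severe disease"
    return "death"


# Two non-hospitalized patients with very different symptom burdens
# fall into the same level, which is the limitation noted above.
print(who_ordinal_category(1), "|", who_ordinal_category(2))
```

Under this mapping, any symptom experience that does not change hospitalization status or ADL limitation leaves the score unchanged.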

Symptoms and impacts of COVID-19 and perceived health status and quality of life are most appropriately evaluated by asking patients to provide information about their experiences. Patient-reported outcome (PRO) questionnaires offer a standardized approach to generate such data. Well-defined and ‘fit for purpose’ PRO questionnaires can provide reliable, valid, and interpretable information that can aid an epidemiological understanding of COVID-19 and can be used to assess the direct impact of COVID-19 vaccines and/or treatments.

Numerous PRO questionnaires have been developed to measure symptoms and/or impacts of COVID-19 [3,4,5,6], but their rapid implementation in fast-recruiting clinical trials has meant that many of these lack content validity, i.e., qualitative evidence that the questionnaire comprehensively measures relevant and important concepts for the intended population and use. Content validity is an essential pre-requisite for the generation of reliable, valid, and interpretable information from a PRO item bank or questionnaire [7]. Further, current PRO questionnaires may not include a comprehensive listing of COVID-19 symptoms. For example, the WHO Clinical Progression Scale, proposed by the WHO Working Group on the Clinical Characterization and Management of COVID-19 infection for use in future COVID-19 clinical trials [8], distinguishes between “symptomatic” and “asymptomatic” patients through assessment of only three symptoms: fever, vital signs, and cough. Jin et al. [9] propose the additional measurement of fatigue, shortness of breath, diarrhea, and body pain as prevalent symptoms. Work has been undertaken to develop a larger questionnaire to measure prevalent symptoms during COVID-19 screening [10], including fever, cough, headache, myalgia, and loss of smell, and the US Food and Drug Administration (FDA) has encouraged routine measurement of less prevalent but patient-relevant symptoms such as sore throat, runny nose, nausea/vomiting, and loss of taste [11]. However, an even broader range of symptoms presents heterogeneously among people with COVID-19 in the short and long term, including rhinorrhea, anorexia, and hemoptysis, and a range of central nervous system, cardiovascular/thrombotic, and skin-related symptoms [12,13,14,15,16]. This full range of signs and symptoms is not captured by existing PRO questionnaires. Research has also been undertaken to measure functional limitations associated with COVID-19 (e.g., Klok et al. [6]), but not in the same scale as the symptoms. The FDA has recommended symptom resolution as an endpoint in COVID-19 trials [11], making it essential to have a complete understanding of the symptoms associated with infection and the impacts of these symptoms on daily life.

Given that no PRO questionnaire has sought to measure the heterogeneous nature of symptoms experienced by people with COVID-19 and their impacts on people’s feelings, functioning, and quality of life, a comprehensive approach to capturing the patient’s experience is warranted. This will be best addressed with a comprehensive PRO “item bank” of COVID-19 symptoms and the impacts of a COVID-19 infection and the pandemic on people’s daily lives. An item bank is a collection of questions that allows different components to be utilized in a non-uniform manner, offering researchers an opportunity to be both consistent and tailored in their approach to measurement [17]. An item bank is more appealing than a static questionnaire as researchers can generate multiple customized versions of questionnaires that are fit for purpose to specific contexts of use, including the epidemiological assessment of new variants/mutations and the clinical evaluation of treatments for COVID-19 and vaccines against COVID-19, all of which may need different PRO measurement strategies. Questions can then be selected from the item bank to meet the specific objectives of the study.
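As a rough illustration of how selection from an item bank might work in practice, the sketch below stores items with a concept, domain, and recall period and builds a study-specific questionnaire from them. The class, function, field names, and item identifiers are hypothetical and do not reflect the actual structure or numbering of any published item bank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str  # hypothetical identifier, not an official item number
    concept: str  # e.g. "cough"
    domain: str   # "symptom", "disease_impact", or "pandemic_impact"
    recall: str   # "daily" or "weekly"


def build_questionnaire(bank, concepts, recall):
    """Select items matching the requested concepts and recall period, in bank order."""
    wanted = set(concepts)
    return [item for item in bank if item.concept in wanted and item.recall == recall]


# A toy bank with illustrative entries only.
bank = [
    Item("S01", "cough", "symptom", "daily"),
    Item("S02", "fever", "symptom", "daily"),
    Item("S03", "loss of taste or smell", "symptom", "daily"),
    Item("S04", "cough", "symptom", "weekly"),
    Item("I01", "worry", "disease_impact", "weekly"),
]

# A treatment trial might track a small set of key symptoms each day.
treatment_trial = build_questionnaire(
    bank, ["cough", "fever", "loss of taste or smell"], "daily"
)
print([item.item_id for item in treatment_trial])  # ['S01', 'S02', 'S03']
```

Because every study draws from the same pool, item wording stays consistent across studies while the selection varies with each study's objectives.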

This paper describes the development and content validity testing of PRO item banks containing questions with daily and weekly recall periods related to the prevalence/incidence, severity, and impact of COVID-19 signs/symptoms among people with infection, and the impact of the pandemic on people with and without infection. Development and testing of the content validity of the item bank comprised three steps: a review of the medical literature, a review of patient reports on social media, and interviews with people who had lived with symptomatic COVID-19 infections.

2 Methods

2.1 Literature Review and Social Media Listening

A targeted literature review was conducted at the start of the pandemic to identify publications describing the patient experience of COVID-19 in terms of signs, symptoms, and impacts as well as the impact of the pandemic in general. Searches were conducted in PubMed and Google Scholar. Social media listening was also conducted through Brandwatch, a social listening tool that extracts from sources that include Twitter, Reddit, and Facebook, to incorporate additional signs and symptoms reported by people posting about personal experience (‘patients’) or the experience of loved ones (‘caregivers’). The search assessed data in the United States (US) from December 2019 to June 2020. The data included posted comments from patients and caregivers indicating that they or their family members had experienced (patients) or observed (caregivers) COVID-19. Search terms included terms related to COVID-19 (such as “COVID,” “Coronavirus,” “Corona,” “SARS-CoV-2”), treatment (such as “Hospitalized,” “Redemsivir”), symptoms (such as “Shortness of breath,” “Cough”), and diagnosis (such as “Antigen test,” “PCR test”). A full list of search terms is shown in the Appendix in the Electronic Supplementary Material. Data were extracted and analyzed using a combination of NLP-enabled content analysis, data tagging, and manual analysis (https://www.brandwatch.com/use-cases/market-research/).

2.2 PRO Item Bank Creation

A preliminary conceptual model of COVID-19 was initially developed from the literature review and social media research to organize the various signs, symptoms, and impacts identified. This was the basis for the initial PRO item bank, which contained items assessing the prevalence and severity of 43 COVID-19 signs and symptoms. These assessed various symptomatic experiences such as lower respiratory, upper respiratory, ophthalmic, dermatologic, gastrointestinal, neurological, pain, and overall experience. Items were drafted with both daily and weekly recall periods for testing in interviews for relevance and understandability. A further 32 items were drafted with a weekly recall period to assess impacts due to COVID-19 symptoms and an additional seven items to assess the general impact of the pandemic on people’s lives.

2.3 Patient Interviews

Between June 2020 and October 2020, hybrid concept elicitation (CE) and cognitive debriefing (CDI) interviews were conducted, by multiple trained interviewers, with 20 adult participants living in the US who had previously tested positive for COVID-19 and experienced signs/symptoms. A variety of disease experiences were included and patients could be hospitalized or treated in an outpatient setting. Interviews were conducted over the telephone by a trained moderator and lasted approximately 60 minutes. Participants provided written informed consent and interviews were conducted in accordance with the 1975 Declaration of Helsinki and the regulations of the US FDA. The interview study and all materials received approval from Advarra Institutional Review Board (IRB) in May 2020.

Participants were recruited from the US via two separate recruiting services: Dynata, an online market research firm, and the Survivor Corps Facebook COVID-19-specific support group. Screening criteria were set such that participants were aged 18 years and older, had received a positive COVID-19 test, and had experienced symptoms (all self-reported).

The CE portion of the interviews was conducted first, in which participants were asked to describe their experiences of COVID-19, including the symptoms and signs of infection that they experienced and the way that the infection impacted their daily lives and the lives of their families. Participants who experienced long-term symptoms (also known as “long COVID”; n = 13) were asked to describe their experience in the acute phase of their infection followed by a description of their continued experience. Interviews were recorded and relevant data were compiled in an Excel grid for analysis. Recordings were referenced following the interviews to confirm patient responses. During the CDI portion of the interviews, participants were presented with the draft instructions, items, and response scales and asked about their relevance and clarity. Before the interview, patients were provided a link to a screen sharing platform where they were able to see the interviewer’s screen and respond and provide feedback to the different items. The patients reviewed screenshots from the item bank in ePRO format using WebEx. Participants were also asked about the comprehensiveness of the overall item bank.

Interviews were conducted in two waves (n = 10 per wave), allowing the item bank to be adjusted (items added or removed; wording changed for clarity and patient-friendliness) for wave 2 based on the CE and CDI findings in wave 1. Patients were invited to read each question, confirm its relevance, and explain what the question meant to them in their own words. A revised item bank was then tested in the second wave.

2.4 Conceptual Model Finalization and PRO Item Bank Finalization

The preliminary conceptual model was refined and finalized following participant interviews. This final conceptual model formed the basis for the finalization of the PRO item bank.

3 Results

3.1 Literature Review and Social Media Listening

This research study was conducted early in the pandemic; therefore, the literature review and social media listening were limited by the information available at that time. The literature review identified 30 articles. These articles, mainly providing a clinical perspective on COVID-19, described 25 different signs and symptoms. Fever (estimated prevalence range 72–98%), dry cough (46–82%), fatigue (11–75%), and a reduced sense of smell (30–67%) were reported in the majority of papers. The list of references included in the literature review can be found in the Appendix in the Electronic Supplementary Material.

Keywords associated with COVID-19 and related terms identified 200 million mentions on social media. Chest tightness, sweating/perspiration, dry skin/chapped lips, brittle/dry/frizzy hair, dizziness, and vertigo were identified. The most commonly reported signs and symptoms are shown in Table 1.

Table 1 Reported signs and symptoms of coronavirus disease 2019

3.2 Interviews

Twenty adults participated in the hybrid CE/CDI interviews. Participant ages ranged from 26 to 68 years and the mean age was 50.6 years. Specific demographic data are included in Table 2. Two subgroups of participants can be defined based on their description of their symptom experience over time. Seven participants reported testing positive for COVID-19 and then experiencing a variety of signs and symptoms of differing severities. At the time of the interview, these participants no longer experienced any signs or symptoms. The remaining 13 participants, self-described as “long haulers,” reported a different experience in which they tested positive for COVID-19 and experienced acute signs and symptoms. At the time of the interview several months later, however, these participants were still experiencing a number of symptoms from onset, as well as several new symptoms that were not present at the time of diagnosis. These symptoms ranged in variety and severity and prolonged the impacts associated with the disease.

Table 2 Patient baseline demographics

3.2.1 Wave 1 (n = 10)

In the CE portion of the interviews, participants listed 48 different signs/symptoms (see Tables 1, 3), ten of which (increased skin sensitivity, vision changes, changes in memory/processing, hair loss, sinus pain, reflux, frequent urination, neck pain, post-nasal drip, and excessive thirst) were not identified in the literature or social media. These were added to the item bank at the end of wave 1. No new impacts were identified. Further changes were made to the item bank based on the CE findings; specifically, vertigo was distinguished from dizziness, and sweating/perspiration was expanded to include night sweats. Vomiting, coughing blood, loss of speech or movement, bluish color to lips or face, rash, and discoloration of toes or fingers were included in the preliminary item bank but were not reported by participants. They were retained for wave 2.

Table 3 Saturation grid showing when each sign/symptom was first identified; wave 1 (patients 1–10) or wave 2 (patients 11–20)

In the CDI portion of the interviews, participants confirmed the relevance of most items. Several changes were made to the item wording to enhance clarity by including more patient-friendly language. Specifically, seven items were updated for increased clarity, and one item was added to establish a baseline number of days that the respondent usually works in a week. Six participants had difficulty differentiating the items asking about the impact of the pandemic in general from the items asking about impacts due to symptoms. The instructions for these items were updated for additional clarity.

3.2.2 Wave 2 (n = 10)

In the CE portion of the interviews, four additional symptoms (beyond the literature review, social media listening, and wave 1 interviews) were identified: insomnia, whole body swelling/fluid retention/bloating, dry eyes, and rapid heartbeat (see Tables 1, 3). However, none was identified by more than three participants (30% of wave 2 sample; 15% of total sample). Weakness was described as an overall lack of energy as well as muscle weakness, making everyday physical tasks difficult to perform. Fatigue was described as difficulty getting out of bed and excessive tiredness to the point that participants found themselves falling asleep while trying to complete a task. No new impacts were identified.

In the CDI portion of the interviews, all ten participants confirmed the relevance and clarity of the instructions, COVID-19 symptoms, impacts, and pandemic-related impact items, and they were able to find a response to describe their experience for each item. Two participants initially reported difficulty differentiating between disease-related and pandemic-related impacts but were able to distinguish and respond appropriately following consideration of the items/instructions. Some symptoms were combined or language was revised to make the list of symptoms more patient friendly, easier to navigate, and to avoid duplication.

3.3 Conceptual Model Development

The preliminary conceptual model of COVID-19, developed from the literature review and social media research, included 43 COVID-19 signs and symptoms. Following a review of data from all 20 interviews, the list of signs and symptoms was refined to include more patient-friendly language, combine overlapping concepts, and exclude signs/symptoms that had few mentions (fewer than three) or were not ideal to measure with a PRO tool. The final list included 55 different signs/symptoms that the participants experienced and which they attributed to COVID-19 (see Table 1). Of these 55 symptoms, 22 were either reported across all three streams of research (literature review, social media listening, interviews) and/or were reported by at least half of interview participants. These can be categorized as lower respiratory, upper respiratory, dermatologic, gastrointestinal, neurological, pain, and constitutional symptoms. These are represented in the final conceptual model of COVID-19 along with the 40 medical, cognitive, emotional, social, work, and more general impacts of COVID-19 (Fig. 1).

Fig. 1 Final conceptual model. COVID-19 coronavirus disease 2019, PTSD post-traumatic stress disorder, SARS-CoV-2 severe acute respiratory syndrome coronavirus 2

3.4 PRO Item Bank Finalization

Patient-reported outcome items were finalized from the interview data and the conceptual model of COVID-19. Items are organized into seven groups as shown in Fig. 2. Group A and Group B capture information on the incidence and severity of 55 signs/symptoms. All signs/symptoms except fever are measured on a binary yes/no scale to indicate incidence and a 4-point verbal response scale to indicate severity (very mild, mild, moderate, severe). The 4-point verbal response scale was debriefed with patients to assess the clarity and distinctness of the response options. Patients confirmed that this scale was appropriate for describing their signs and symptoms and were able to distinguish between the individual response options. Fever is reported to one decimal place in Celsius or Fahrenheit. A free text “other” sign/symptom item is also included for respondents to indicate any additional signs/symptoms that they may have experienced. Two item banks were developed for Group A and Group B: one with a 24-hour recall period for each item, the IQVIA COVID-19 Daily Diary (ICDD©) Item Bank, and one with a 7-day recall period, the IQVIA COVID-19 Weekly Diary (ICWD©) Item Bank.
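The response formats described above could be captured electronically along the following lines. This is a minimal sketch only: the function names and returned dictionary layout are hypothetical, and the rule that severity is recorded only when a symptom is reported as present is an assumption made for illustration rather than a documented feature of the item banks.

```python
from typing import Optional

# The four verbal severity options named in the text.
SEVERITY_OPTIONS = ("very mild", "mild", "moderate", "severe")


def record_symptom(present: bool, severity: Optional[str] = None) -> dict:
    """Validate one sign/symptom response: yes/no incidence plus severity.

    Assumption: severity is only collected when the symptom is present.
    """
    if not present:
        if severity is not None:
            raise ValueError("severity applies only when the symptom is present")
        return {"present": False, "severity": None}
    if severity not in SEVERITY_OPTIONS:
        raise ValueError(f"severity must be one of {SEVERITY_OPTIONS}")
    return {"present": True, "severity": severity}


def record_fever(temperature: float, unit: str) -> dict:
    """Store a temperature to one decimal place in Celsius ('C') or Fahrenheit ('F')."""
    if unit not in ("C", "F"):
        raise ValueError("unit must be 'C' or 'F'")
    return {"temperature": round(float(temperature), 1), "unit": unit}


print(record_symptom(True, "moderate"))
print(record_fever(101.3, "F"))
```

Keeping fever as a numeric entry, separate from the verbal severity scale, mirrors the distinction drawn in the text between fever and all other signs/symptoms.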

Fig. 2 Item blocks in the IQVIA COVID-19 Daily Diary (ICDD©) Item Bank and the IQVIA COVID-19 Weekly Diary (ICWD©) Item Bank

Groups C, D, and F capture specific emotional, physical, and functional impacts of COVID-19 infection. Six items assess loneliness and detachment, six items assess impacts on physical functioning, and three items assess worry associated with COVID-19. These items include a 5-point scale (not at all, a little, somewhat, quite a bit, and a lot). Group E assesses the impact of symptoms on work and productivity. These 11 items assess impacts on the number of days a respondent could work or study, whether they sought medical care, and time spent in the hospital. These items include yes/no response options and the number of days in the previous week. Group G measures general impacts of the pandemic and includes a 5-point difficulty scale (not at all, a little, somewhat, quite a bit, and a lot). The impacts related to the pandemic can be administered to people with and without COVID-19 infection/signs or symptoms. All impacts (Groups C–F) have a weekly recall and are therefore included only in the ICWD© Item Bank. Two additional introductory items are included in the ICWD© Item Bank to assess if the respondent has felt sick in the past week and when they started feeling sick (not shown in Fig. 2).

4 Discussion

As the world is rapidly evolving and adapting to the challenges of the COVID-19 pandemic, the healthcare industry is making efforts to better understand COVID-19 and develop treatments and vaccines in response. Four Core Outcome Sets (COSs) have been proposed for use in trials of treatments for COVID-19 [9, 10, 18, 19], all of which recommend measuring patient-reported symptoms as part of their outcome assessments. These recommendations are broadly consistent with FDA guidance on conducting clinical trials for the treatment or prevention of COVID-19 [11]. However, no PRO item bank or fit-for-purpose questionnaires have been developed and validated with patients with COVID-19 to comprehensively measure symptoms before, during, and post-infection. Nor has any PRO item bank or questionnaire sought to comprehensively understand the unique impacts of both COVID-19 infections and the global restrictions resulting from the COVID-19 pandemic on people’s feelings, functioning, and quality of life. This paper describes the development and preliminary content validation of a novel PRO item bank, with daily (ICDD©) and weekly (ICWD©) recall versions for signs/symptoms, and weekly recall for impacts (ICWD©) to meet this need.

The item bank (ICDD©, ICWD©) is unique in that it was developed to be used in vaccine, treatment, or epidemiological studies and constructed in line with industry best practice [20, 21] and in accordance with FDA Guidance for Industry on the development of PRO measures [7]. Specifically, the content for the item bank reflects the conceptual model, which was developed from CE research with people with COVID-19 to understand symptoms and impacts that resolve after a few weeks and those that persist for several weeks or months. Additionally, the item bank builds upon the content of existing scales [3,4,5,6] by including additional attributes identified directly by patients. The CDI research undertaken ensures that the item bank measures the most relevant concepts for understanding experiences of COVID-19 in a manner that is understandable to respondents and interpretable to researchers. As such, the ICDD© and ICWD© can be considered preliminarily content valid.

It is not intended that all items of the item bank are administered in any single research study; selection will be a function of study design, population, and outcomes/endpoints of interest [22]. For treatment trials, a small number of items from the ICDD©/ICWD© may be selected to explore how treatment may improve key signs/symptoms of infection and reduce the burden of disease. For instance, loss of sense of taste or smell along with cough and fever may be symptoms of interest and can be assessed over an extended period of time at different frequencies of assessment. This is particularly important in trials of patients with less severe disease, where improvement may be indexed by reductions in symptom severity. Asking patients about their symptoms may allow for the identification of bothersome symptoms, which do not meet the criteria for hospitalization but can be equally disabling and anxiety provoking and persist long term. For vaccine trials with members of the general population, a larger number of items may be selected to examine the likely onset of infection by monitoring a broad set of signs/symptoms. While treatment and vaccine trials are critical for reducing the mortality and burden associated with COVID-19, epidemiological studies are necessary to further understand the implications of the disease in specific populations, the evolutionary nature of the disease, as well as possible interventions by which disease severity and transmission may be reduced. The impact items from the item bank offer additional insight into how both an infection and the pandemic can impact people’s lives.

Researchers using the item bank are advised to conduct further qualitative research to explore its content validity as new variants emerge and long-term impacts are discovered. The qualitative research described in this article was conducted in 2019–2020, and although we feel confident about the applicability of the item bank to the Delta and Omicron variants (we have monitored the Centers for Disease Control and Prevention [23] to confirm that no updates are needed at this time and the item bank includes an “Other” option in the list of symptoms), additional clinical insight may be beneficial, as well as insight from a broader range of patients with additional mutations/presentations prior to using the item bank. Quantitative data are thereafter needed to explore the prevalence of signs/symptoms and the relationship between these and the impacts and to establish scoring and psychometric properties for the item banks. Item Response Theory (IRT)-based calibration of the symptom items, the items on impacts of a COVID-19 infection, and the items on impacts of the pandemic on people’s daily lives can be considered to allow researchers to compare scores among groups of patients with different experiences and allow further refinement of the item bank. The ICDD© and ICWD© can be further developed into short-form instruments that will provide composite and psychometrically supported scores for comparative research studies.

4.1 Limitations and Next Steps in PRO Item Bank Development

While the ICDD© and ICWD© have preliminary content validity, and provide a way to measure a variety of relevant PROs in different research studies going forward, there are several limitations that should be noted. The social media review was conducted in the US and the interviews were conducted based on a US sample only. The item bank has the potential to further evolve when utilized by international participants. Using a simple analysis across the two waves of ten patients, saturation of concepts was not observed; that is, new symptoms were still being reported in the second half of the patients interviewed in the study (see Table 3). However, the two waves were defined largely for the CDI portion of the interviews. If patients were chronologically ordered into four groups of five to define saturation (as is common in CE research [24]), then saturation was established by the end of the third group, as no new symptoms were identified beyond interview 13 (see Table 3). In addition, some minor updates for patient-friendly language were made during wave 2 of the interviews. As such, the final measure has not been debriefed in full; however, these changes were minor and in line with patient language used during the interviews and should have a minimal effect on patient understanding.
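The dependence of the saturation judgment on how interviews are grouped can be shown with a minimal sketch. The interview data below are invented toy values, not the study's data, and the function names are hypothetical; the sketch simply counts how many concepts are first mentioned in each consecutive group of interviews.

```python
def first_mentions(interviews):
    """Return {concept: 1-based index of the interview where it first appeared}."""
    first = {}
    for index, concepts in enumerate(interviews, start=1):
        for concept in concepts:
            first.setdefault(concept, index)
    return first


def new_concepts_per_group(interviews, group_size):
    """Count concepts first mentioned in each consecutive group of interviews."""
    n_groups = -(-len(interviews) // group_size)  # ceiling division
    counts = [0] * n_groups
    for index in first_mentions(interviews).values():
        counts[(index - 1) // group_size] += 1
    return counts


# Toy data only: eight chronologically ordered interviews.
interviews = [
    {"fever", "cough"},
    {"cough", "fatigue"},
    {"fever", "headache"},
    {"fatigue"},
    {"insomnia", "cough"},
    {"fever"},
    {"cough", "headache"},
    {"fatigue"},
]

print(new_concepts_per_group(interviews, 4))  # [4, 1]
print(new_concepts_per_group(interviews, 2))  # [3, 1, 1, 0]
```

In this toy example, two groups of four show a new concept in the final group, whereas four groups of two show none in the final group, which is the same pattern as the contrast between the two-wave and four-group analyses described above.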

Further research using the item bank will provide a better understanding of differences in the patient experience across patient populations. In the current study, all patients interviewed were younger than 70 years, and thorough demographic data and patient history were not collected. Therefore, the data cannot be considered generalizable. Additional iterations of the item bank may be justified should a broader age group be interviewed in future qualitative research, and should an internationally and culturally representative or globally generalizable sample be included in such studies. Further, there is a possibility that the qualitative interviews conducted in the current study are prone to recall bias as some participants described their experiences of an infection that occurred several months prior. However, most participants described the experience as very easy to remember because of the novel and serious nature of the infection. Helpful insights may be discovered by interviewing patients during the acute phase to supplement the findings of the current research, although it may not be feasible for those who are functionally ill with COVID-19.

5 Conclusions

The ICDD© and ICWD© item banks have preliminary content validity and can be considered for forthcoming treatment and vaccine trials and epidemiological studies to examine the signs, symptoms, and impacts of COVID-19 and the pandemic in general on people’s lives. However, further qualitative and quantitative research is required to fully understand their utility in such studies. The item banks are available for researchers to review and use. Additional information about the item banks can be found via this link: https://www.iqvia.com/library/fact-sheets/iqvia-covid19-daily-diary-and-weekly-diary-item-banks.