COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a major threat to public health and the economy worldwide [1]. Long COVID, defined by the persistence of symptoms beyond 3 months after SARS-CoV-2 infection, is expected to substantially alter millions of lives with a broad spectrum of manifestations, since survivors of COVID-19 now number in the hundreds of millions globally [2, 3]. Although socioeconomic characteristics, lifestyle factors, poor metabolic health, and pre-existing chronic conditions have been suggested as risk factors for COVID-19, the full set of contributing factors remains elusive [4,5,6]. While vaccines, effective treatments, and mitigation measures have been deployed against COVID-19, understanding its risk factors would be of great importance for strengthening population health and preventing acute infectious diseases in the future [7].

Nutrition is widely known to be an essential determinant of human health [8]. It provides a source of energy and substrates for immune system activity [9]. A balanced diet can promote a healthy gut microbiota, which plays a role in training and regulating the immune system [9]. Previous studies have shown that diet is associated with infectious diseases [10,11,12]. Recent evidence has shown that higher dietary intakes of fruit and vegetables and of healthy plant-based foods are related to a decreased risk of COVID-19 infection [13, 14], which sheds light on the relationship between nutrition and the risk of COVID-19.

The NOVA system classifies all foods, beverages, and food products into four groups based on the extent and purpose of the industrial processing they undergo, taking into consideration the physical, biological, and chemical methods used in their manufacture, including the use of additives [15]. Ultra-processed foods (UPFs), one of the four NOVA groups, are industrial formulations of processed food substances (oils, fats, sugars, starch, protein isolates) that are subjected to hydrolysis, hydrogenation, or other chemical modifications and to which flavourings, colourings, emulsifiers, and other cosmetic additives are added [15]. UPFs are often high in energy, added sugars, saturated fats, trans fats, and salt, and low in dietary fibre, protein, vitamins, and minerals [15]. Beyond nutritional composition, UPFs are also a major dietary source of contaminants and neo-formed compounds, which may affect several pathways, such as inflammation, and alter the composition of the gut microbiota [16]. Monotonous diets rich in UPFs may lead to vitamin and mineral deficiencies, impairing the immune system and increasing susceptibility to SARS-CoV-2 [17]. In addition, the dietary imbalance caused by the consumption of UPFs might increase the inflammatory index of the diet, which has been associated with respiratory infection of the airways [18,19,20]. Studies have shown that UPFs are associated with an increased risk of cardiovascular disease, inflammatory bowel disease, cancer, and all-cause mortality among general and high-risk populations [21,22,23,24,25,26]. However, to the best of our knowledge, there is limited evidence on the association between UPF consumption and the risk of COVID-19. In this study, using data from the UK Biobank, a prospective cohort, we explored the association between UPF consumption and the risk of COVID-19 infection.


Methods

This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline for cohort studies.

Study design and participants

The UK Biobank is a prospective cohort of about half a million participants aged 40–69 years, recruited at 22 assessment centres across England, Scotland, and Wales. The objective of this cohort is to explore the determinants of health. Details of this cohort have been described elsewhere [25]. Briefly, baseline information, including socio-demographic characteristics, lifestyle and environmental factors, physical measures, family history, and health and medical history, was collected from 2006 to 2010. Participants were followed up, with their informed consent, to obtain their health outcomes. Ethical approval for the UK Biobank was obtained from the North West Multi-Centre Research Ethics Committee as a Research Tissue Bank (RTB) approval. This study was approved under UK Biobank application number 45676. Participants were excluded if they: 1) withdrew consent or were lost to follow-up; 2) completed fewer than two online 24-h dietary recalls; 3) had no COVID-19 test results or COVID-19 death records; or 4) had an extreme mean energy intake (< 800 kcal/day or > 4200 kcal/day for males; < 600 kcal/day or > 3500 kcal/day for females).
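The exclusion rules above can be sketched as a simple eligibility filter. This is only an illustration under stated assumptions, not the code used in the study: the `Participant` record and its field names are hypothetical, and only the thresholds are taken from the text.

```python
from dataclasses import dataclass

# Plausible mean energy intake limits (kcal/day), as stated in the exclusion criteria.
ENERGY_LIMITS = {"male": (800.0, 4200.0), "female": (600.0, 3500.0)}


@dataclass
class Participant:
    """Hypothetical record layout; not a UK Biobank schema."""
    sex: str                 # "male" or "female"
    withdrew_or_lost: bool   # withdrew consent or lost to follow-up
    n_recalls: int           # number of completed online 24-h dietary recalls
    has_covid_record: bool   # has a COVID-19 test result or COVID-19 death record
    mean_energy_kcal: float  # mean energy intake across recalls, kcal/day


def is_eligible(p: Participant) -> bool:
    """Return True when none of the four exclusion criteria applies."""
    low, high = ENERGY_LIMITS[p.sex]
    return (
        not p.withdrew_or_lost
        and p.n_recalls >= 2
        and p.has_covid_record
        and low <= p.mean_energy_kcal <= high
    )
```

Intakes exactly at a limit are kept here, because the criteria exclude only values strictly below or above the limits.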

Dietary assessment

The Oxford WebQ dietary questionnaire, which records the quantities of up to 206 types of foods and 32 types of drinks consumed over the previous day, was used to assess detailed dietary intake over the previous 24 h at each UK Biobank assessment centre [27]. The online 24-h dietary recall questionnaire was administered five times, with the first assessment from April 2009 to September 2010, and then repeated from February 2011 to April 2011, June 2011 to September 2011, October 2011 to December 2011, and April 2012 to June 2012. The quantity of each food item consumed was obtained by multiplying the number of portions consumed by the corresponding portion size [28]. For each participant, the mean dietary intakes across all available dietary records were taken as the baseline usual dietary intakes.
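The two computations described above (portions multiplied by portion size, then averaging across the available recalls) can be sketched as follows. The item names and portion sizes are hypothetical, and treating an item that is absent from a recall as 0 g is an assumption of this sketch rather than something stated in the text.

```python
from statistics import mean


def recall_weights(portions, portion_size_g):
    """Grams of each food item in one 24-h recall: portions x portion size."""
    return {item: n * portion_size_g[item] for item, n in portions.items()}


def usual_intake(recalls, portion_size_g):
    """Average each item's weight (g/day) across all available recalls.

    Assumption: an item not reported in a recall counts as 0 g for that recall.
    """
    per_recall = [recall_weights(r, portion_size_g) for r in recalls]
    items = set().union(*per_recall)
    return {
        item: mean(w.get(item, 0.0) for w in per_recall)
        for item in sorted(items)
    }


# Usage with made-up portion sizes (grams per portion) and two recalls.
sizes = {"bread": 36.0, "cola": 330.0}
usual = usual_intake([{"bread": 2, "cola": 1}, {"bread": 1}], sizes)
```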

All food and drink items in the online 24-h dietary recall questionnaire were categorized into four groups (unprocessed or minimally processed foods, processed culinary ingredients, processed foods, and UPFs) according to their degree of processing under the NOVA system of food classification [29, 30]. (1) Unprocessed or minimally processed foods are unprocessed foods or foods altered by processes such as removal of inedible or unwanted parts, drying, or crushing. No salt, sugar, oils or fats, or other food substances are added. (2) Processed culinary ingredients, such as oils and fats, sugar, and salt, are obtained directly from unprocessed or minimally processed foods or from nature by industrial processes such as pressing, centrifuging, refining, extracting, or mining. (3) Processed foods are industrial products made by adding salt, sugar, or other substances found in foods in the first two categories, using preservation methods such as canning and bottling, and non-alcoholic fermentation (in the case of bread and cheese). (4) UPFs are formulations of ingredients, mostly of exclusive industrial use (e.g. protein isolates, hydrogenated oils, and modified starches), that result from a series of industrial processes. UPFs are the focus of this study. A detailed description of the NOVA classification has been published elsewhere [15]. Examples of foods in each category are listed in Supplemental Table 2.

The total weight (g/day) of UPFs and of all foods was calculated as the sum of the weights of the corresponding food items, using each participant's average weight of each food item across all dietary recalls. The proportion of UPF intake in the total weight of all food items was then calculated for each participant. This weight-based approach captures the adverse effects of UPFs better than an energy ratio, since some food components do not provide energy [24].
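As a minimal sketch of the exposure definition above, the UPF weight ratio is the weight of NOVA group 4 items divided by the weight of all items. The item names and their NOVA assignments below are illustrative assumptions, not the study's classification.

```python
NOVA_UPF = 4  # NOVA group 4 = ultra-processed foods


def upf_weight_ratio(intake_g, nova_group):
    """Percentage of total food weight (g/day) contributed by NOVA group 4 items."""
    total = sum(intake_g.values())
    if total <= 0:
        raise ValueError("total food weight must be positive")
    upf = sum(g for item, g in intake_g.items() if nova_group[item] == NOVA_UPF)
    return 100.0 * upf / total


# Usage with made-up average intakes (g/day) and hypothetical NOVA assignments.
ratio = upf_weight_ratio(
    {"apple": 300.0, "soft_drink": 100.0},
    {"apple": 1, "soft_drink": 4},
)
```

The same function gives the energy-based ratio used in the sensitivity analysis if the intakes are supplied in kcal/day instead of g/day.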


Outcome

COVID-19 test results for UK Biobank participants in England, Scotland, and Wales have been provided by Public Health England (PHE), Public Health Scotland (PHS), and the Secure Anonymised Information Linkage (SAIL), respectively, since 16 March 2020 [31]. As a supplement to the test records, individuals with a cause of death coded “U07.1” or “U07.2” in the International Classification of Diseases, 10th revision (ICD-10) were also deemed COVID-19 positive. Individuals who tested positive for COVID-19 or died of COVID-19 were identified as COVID-19 infection cases. All outcome data were assessed on 11 December 2021. The dates of the last updates of the COVID-19 test results and death data are listed in Supplemental Table 1.
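The case definition above reduces to a logical OR of two sources. A minimal sketch, assuming each participant's test results are available as a list of booleans and death records as a list of ICD-10 code strings (both representations are hypothetical):

```python
# ICD-10 death codes treated as COVID-19 in the text.
COVID_DEATH_CODES = {"U07.1", "U07.2"}


def is_covid_case(test_results, death_codes):
    """True if any test was positive or any cause of death is U07.1/U07.2."""
    tested_positive = any(test_results)
    died_of_covid = any(code in COVID_DEATH_CODES for code in death_codes)
    return tested_positive or died_of_covid
```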


Covariates

Socio-demographic characteristics, lifestyle characteristics, and health conditions were collected at baseline. The following covariates were adjusted for in the multivariable models as potential confounding factors: gender, age (year of birth), ethnicity (white, mixed, Asian or Asian British, Black or Black British, Chinese, and other ethnic groups), Townsend deprivation index at recruitment (continuous), educational level (college or university degree, A levels/AS levels or equivalent, O levels/GCSEs or equivalent, CSEs or equivalent, NVQ or HND or HNC or equivalent, and other professional qualifications), body mass index (BMI, kg/m², continuous), physical activity (summed MET minutes per week for all activity, continuous), smoking status (never, previous, and current), alcohol intake frequency (three times a week or more, at least once a month, never or special occasions only), comorbidity status, energy intake (kcal/day, continuous), and healthy diet score. Smoking status was derived from a touchscreen questionnaire. Current smokers were defined as those who reported smoking on most or all days or only occasionally in response to the question “Do you smoke tobacco now?”. Non-smokers were defined as those who had never smoked according to their response to the same question and those who had tried smoking only once or twice according to their answer to “In the past, how often have you smoked tobacco?”. Disease history was obtained from the datasets of the National Health Service (NHS) Information Centre (England and Wales) and the NHS Central Register Scotland (Scotland). Comorbidities were defined using ICD-10 codes and included diabetes (E11–E14), chronic kidney disease (N03–N05, N07, N11–N15, N18–N19), chronic obstructive pulmonary disease (J41–J44), asthma (J45–J46), coronary heart disease (I20–I25), hypertension (I10–I15), atrial fibrillation (I48), stroke (G45–G46, I61, I63), dementia (F00–F03, G30), and cancer (C00–C97). Diseases that occurred before COVID-19 infection (or, for participants who tested negative, before the latest update date) were considered comorbidities. The healthy diet score was adopted from previously published studies and estimates adherence to the main items of the Mediterranean diet [32, 33]. It ranges from 0 to 5, with one point for each of five items: vegetable intake at or above the median (four tablespoons each day), fruit intake at or above the median (3 pieces/day), fish intake at or above the median (once a week), unprocessed red meat intake below the median (once a week), and processed meat intake below the median (once a week) [32, 33]. A higher score indicates a healthier diet. The healthy diet score was derived from the food frequency questionnaire, with a reference period of one year, administered at the initial assessment visit at the assessment centres from 2006 to 2010. The directed acyclic graph illustrating the rationale for the selection of confounders is provided in Supplemental Fig. 1.
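The five-item healthy diet score described above can be sketched as a sum of threshold indicators. The argument names and units are assumptions made for illustration, and "less than the median" is read here as a strict inequality.

```python
def healthy_diet_score(
    veg_tbsp_per_day,
    fruit_pieces_per_day,
    fish_per_week,
    red_meat_per_week,
    processed_meat_per_week,
):
    """Return the 0-5 score: one point per item meeting its median-based threshold."""
    return sum([
        veg_tbsp_per_day >= 4,        # vegetables at or above four tablespoons/day
        fruit_pieces_per_day >= 3,    # fruit at or above 3 pieces/day
        fish_per_week >= 1,           # fish at or above once a week
        red_meat_per_week < 1,        # unprocessed red meat below once a week
        processed_meat_per_week < 1,  # processed meat below once a week
    ])
```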

Fig. 1 Flowchart for the study sample

Statistical analysis

Baseline characteristics of the participants included in this study were described as means (standard deviation), medians (interquartile range, IQR), or percentages according to gender-specific quartiles of UPF weight ratio. Analysis of variance (ANOVA), the Kruskal–Wallis test, or the χ² test was used to examine differences in baseline factors across quartiles of UPF consumption.
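Gender-specific quartiles mean that the cut points are computed separately within each gender before participants are assigned to a quarter. A minimal sketch using the Python standard library follows; the record layout is hypothetical, and the quantile interpolation rule (the default of `statistics.quantiles`) may differ from the one used in the study.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import quantiles


def sex_specific_quarters(records):
    """Assign each participant to quarter 1-4 of UPF ratio within their own sex.

    records: list of (participant_id, sex, upf_ratio) tuples.
    Each sex needs at least two participants for cut points to be computed.
    """
    by_sex = defaultdict(list)
    for _, sex, ratio in records:
        by_sex[sex].append(ratio)
    # Three cut points (25th, 50th, 75th percentiles) per sex.
    cuts = {sex: quantiles(values, n=4) for sex, values in by_sex.items()}
    # Quarter = 1 + number of cut points strictly below the participant's ratio.
    return {
        pid: bisect_left(cuts[sex], ratio) + 1
        for pid, sex, ratio in records
    }
```

With this rule, the same UPF ratio can fall in different quarters for men and women when the two distributions differ.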

Logistic regression was used to estimate the relationship between quartiles of UPF weight ratio and the risk of COVID-19, expressed as odds ratios (ORs) with 95% confidence intervals (CIs). The potential confounding factors listed above were adjusted for in these models. Model 1 was adjusted for gender, age (year of birth), ethnicity, Townsend deprivation index at recruitment, and educational level. In addition to the covariates in model 1, model 2 was adjusted for BMI, physical activity, smoking status, alcohol intake frequency, and comorbidity status (defined as having at least one of the following: diabetes, chronic kidney disease, chronic obstructive pulmonary disease, asthma, coronary heart disease, hypertension, atrial fibrillation, stroke, dementia, or cancer). Model 3 was further adjusted for total energy intake. Model 4 was further adjusted for the healthy diet score. P for trend was calculated by entering the quarters of UPF weight ratio as an ordinal variable (1, 2, 3, 4). A restricted cubic spline (RCS) model, adjusted for potential confounding factors, was used to evaluate the non-linear relationship between UPF weight ratio and the risk of COVID-19. The RCS model with 3 knots (at the 10th, 50th, and 90th percentiles) was selected by comparing the Akaike information criterion of models with 3 to 7 knots. Nonlinearity was tested with a likelihood ratio test comparing the model containing only the linear term with the model containing both the linear and the cubic spline terms.
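To make the spline terms concrete, the sketch below builds a restricted cubic spline basis in the truncated-power form that is commonly used for such models. It is an illustration only: the knot values in the usage line are hypothetical, scaling by the squared knot range is one common convention, and the text does not state which implementation the study used. With 3 knots the basis contains the linear term plus a single nonlinear term, which is the term the nonlinearity test evaluates.

```python
def _cube_plus(u):
    """Truncated cubic: u^3 when u is positive, otherwise 0."""
    return u ** 3 if u > 0 else 0.0


def rcs_basis(x, knots):
    """Restricted cubic spline basis at x: [x, nonlinear term 1, ...].

    With k knots there are k - 2 nonlinear terms. Each term is zero below
    the first knot and linear in x above the last knot.
    """
    t = sorted(knots)
    if len(t) < 3:
        raise ValueError("need at least 3 knots")
    scale = (t[-1] - t[0]) ** 2  # assumed convention: squared knot range
    span = t[-1] - t[-2]
    terms = [x]
    for j in range(len(t) - 2):
        term = (
            _cube_plus(x - t[j])
            - _cube_plus(x - t[-2]) * (t[-1] - t[j]) / span
            + _cube_plus(x - t[-1]) * (t[-2] - t[j]) / span
        )
        terms.append(term / scale)
    return terms


# Usage: hypothetical knots standing in for the 10th, 50th, and 90th percentiles.
basis = rcs_basis(30.0, [10.0, 25.0, 45.0])
```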

Sensitivity analyses were conducted under five scenarios in which basic assumptions were changed. First, multiple imputation was used to fill in missing covariate values (maximum missing rate: 14%; the proportions of missing values for each covariate are listed in Supplemental Table 3), whereas only complete cases were included in each model in the main analyses. Second, the energy contribution of UPF (% of total energy intake) was used as the proxy for individual UPF exposure instead of the relative weight proportion. The energy value (kcal) of each food item was derived from McCance and Widdowson’s The Composition of Foods Integrated Dataset 2021 [34]. Third, the absolute weight of UPF was used as the proxy for individual UPF exposure. Fourth, the 5th and 95th centiles of energy intake were used as the plausible energy-intake limits. Fifth, the study population was restricted to participants who completed all five dietary recalls.

In subgroup analyses, we explored the interaction between quarters of UPF weight ratio and the following stratification factors: age (< 65 or ≥ 65 years), educational level (college or university, or lower qualifications), BMI (< 25 or ≥ 25 kg/m²), and comorbidity status (no comorbidities, or at least one disease). P for interaction was calculated, and the OR and 95% CI were estimated in each stratum.

Further, mediation analysis was used to explore the mediating effect of BMI on the relationship between UPF weight ratio and the risk of COVID-19. In this analysis, BMI and UPF weight ratio were used as continuous variables. Two models were fitted. First, a multivariable linear regression model estimated the effect of UPF weight ratio (exposure) on BMI (mediator) after adjusting for confounders. Second, a multivariable logistic regression model estimated the effect of UPF weight ratio (exposure) on COVID-19 incidence (outcome) after adjusting for BMI (mediator) and confounders. The average causal mediation effect (ACME) estimated the part of the effect of UPF weight ratio on COVID-19 incidence that could be explained by BMI. The average direct effect (ADE) represented the effect of UPF weight ratio on COVID-19 incidence independent of BMI. The mediation proportion, describing the proportion of the association that goes through BMI, was also calculated. Confidence intervals for these parameters were obtained with the nonparametric bootstrap method with 500 resamples.
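For orientation, the quantities named above have the following standard counterfactual definitions in the causal mediation framework from which the terms ACME and ADE originate. The notation is introduced here for illustration and is an assumption, since the text does not spell it out: $T$ is the exposure (UPF weight ratio) compared at two levels $t_1$ and $t_0$, $M(t)$ is the BMI that would be observed under exposure $t$, and $Y(t, m)$ is the COVID-19 outcome under exposure $t$ and mediator value $m$.

```latex
\begin{align}
\mathrm{ACME}(t) &= \mathbb{E}\big[\,Y(t, M(t_1)) - Y(t, M(t_0))\,\big] \\
\mathrm{ADE}(t)  &= \mathbb{E}\big[\,Y(t_1, M(t)) - Y(t_0, M(t))\,\big] \\
\mathrm{TE}      &= \mathbb{E}\big[\,Y(t_1, M(t_1)) - Y(t_0, M(t_0))\,\big]
                  = \mathrm{ACME}(t_1) + \mathrm{ADE}(t_0) \\
\text{Proportion mediated} &= \frac{\mathrm{ACME}}{\mathrm{TE}}
\end{align}
```

The decomposition in the third line follows by adding and subtracting $Y(t_1, M(t_0))$ inside the expectation.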

All analyses were conducted using R software, V.4.1.2 (R Foundation). All P values for the tests were two-sided, and P values < 0.05 were deemed statistically significant.


Results

In total, 41,012 UK Biobank participants (18,101 males and 22,911 females) were included in the final analysis (Fig. 1). The mean age of participants in 2020 (the year of the COVID-19 pandemic) was 56.47 (SD 7.98) years. The average UPF intake weight was 1721.50 g (IQR: 822.50, 1621.54). The weight contribution and energy contribution of UPF among all participants were 26.58% (IQR: 19.98, 34.19) and 52.31% (IQR: 43.62, 60.74), respectively. Overall, 6,358 (15.5%) participants tested positive for COVID-19 or died of COVID-19. Baseline characteristics of participants according to gender-specific quarters of UPF weight ratio are listed in Table 1. Compared with participants in the lowest quarter, participants in the highest quarter (high proportion of UPF consumption) were more likely to be younger, white, less educated, and non-smokers, and to have a lower socioeconomic deprivation index, less physical activity, lower alcohol intake, a lower healthy diet score, a higher BMI, and a higher energy intake. They were also more likely to have a higher intake of carbohydrates and fat (saturated fat and polyunsaturated fat) and a lower intake of protein. Participants in the highest quarter were more likely to have a clinical history of hypertension, diabetes, chronic kidney disease, or lung disease. Baseline characteristics of COVID-19 positive and negative participants are listed in Supplemental Table 4.

Table 1 Baseline characteristics of study participants according to quarters of the proportion of ultra-processed food consumption

The relative contribution of each food group to the total weight of UPF is shown in Fig. 2. The food groups with the greatest contributions were drinks (26.86%), dairy products (20.25%), ultra-processed breads and pastries (19.42%), ready-to-eat meals (12.75%), and ultra-processed fruits and vegetables (10.43%). Together, these five groups accounted for 89.71% of UPF weight.

Fig. 2 Relative contribution of each food group to ultra-processed food consumption in the diet. UP, ultra-processed

Compared with individuals in the lowest quarter of UPF weight ratio, participants in the 2nd, 3rd, and highest quarters had higher odds of COVID-19, with ORs of 1.08 (95% CI: 1.00–1.18), 1.33 (95% CI: 1.23–1.44), and 1.57 (95% CI: 1.46–1.70), respectively. The association between UPFs and the risk of COVID-19 was consistent but attenuated after adjusting for potential confounders: the ORs for the 2nd, 3rd, and highest quarters were 1.03 (95% CI: 0.94–1.13), 1.24 (95% CI: 1.13–1.36), and 1.22 (95% CI: 1.12–1.34), respectively (Table 2).

Table 2 Main result and sensitivity analysis of the proportion of ultra-processed food consumption and risk of COVID-19

According to the restricted cubic spline analysis, there was a non-linear association between the proportion of UPF weight and the risk of COVID-19. Fig. 3 suggests that the risk of COVID-19 increased rapidly and then plateaued once the proportion of UPF consumption exceeded about 30%.

Fig. 3 Spline plot for linearity assumption of association between the proportion of ultra-processed food in the diet and the risk of COVID-19

The main results were robust in sensitivity analyses (Table 2). However, the effect sizes decreased when the energy contribution of UPF (% of total energy intake) or the absolute UPF weight (g/day) was used as the exposure. Compared with the lowest quarter, individuals in the highest quarter of UPF energy contribution and in the highest quarter of UPF consumption weight had increased odds of COVID-19, with ORs of 1.09 (95% CI: 1.00–1.19) and 1.16 (95% CI: 1.06–1.27), respectively. When only participants who completed all five dietary recall questionnaires were included, the proportion of UPF weight was positively associated with the risk of COVID-19, although not significantly.

The subgroup analysis showed no interaction between quarters of UPF weight ratio and age group, educational level, BMI, or comorbidity status. The association between UPF weight ratio (quarters) and the risk of COVID-19 did not change in most subgroups, but it was not significant among individuals with a BMI below 25 kg/m² (Fig. 4).

Fig. 4 Subgroup analysis of the proportion of ultra-processed food consumption and risk of COVID-19. Adjustment factors were the same as model 4 listed in Table 2

In a mediation analysis, the results showed that BMI mediated 13.2% (95% CI 8.0% to 23.5%; P < 0.001) of the effect of UPF weight ratio on the risk of COVID-19. The ADE of UPF weight ratio on COVID-19 susceptibility was also significant with an OR of 1.007 (95% CI: 1.004–1.013) per 10% increase in UPF weight contribution. Details are listed in Supplemental Table 5.


Discussion

Using data from the UK Biobank, we found that UPF consumption was associated with an increased risk of COVID-19 infection. This association was similar across subgroups defined by age, educational level, and comorbidity status. In addition, we observed that this association was partly (13.2%) mediated by BMI. However, there was still a direct effect of UPF weight ratio on the risk of COVID-19.

To our knowledge, no previous study has explored the relationship between UPF intake and the risk of COVID-19. Previous studies have shown that nutrition is associated with infectious disease [9]. Most studies have investigated the favourable effects of nutritional factors on the risk of COVID-19. Evidence from the NutriNet-Santé cohort showed that higher dietary intakes of fruit and vegetables, vitamin C, folate (vitamin B9), vitamin K, and fibre were associated with a lower risk of SARS-CoV-2 infection [14]. Merino et al. reported that healthy plant-based foods were related to a lower risk of COVID-19 in the UK Biobank [13]. Consistent with these studies, our study revealed that UPF consumption, which usually indicates lower diet quality, was related to an increased risk of COVID-19.

UPFs, which are frequently present in Western diets, are closely related to the functioning of the immune system [35]. Some potential mechanisms for the association between UPF intake and the risk of COVID-19 are as follows. First, UPFs with excess simple sugars and saturated fats exert pro-inflammatory effects [36], which could affect the production of immune cells and, directly, the functions of these cells [36]. Second, the high saturated fat and low fibre content of UPFs could lead to chronic activation of the innate immune system and inhibition of the adaptive immune system [37]. This is especially relevant to COVID-19 patients given the high rate of infection among lung alveolar epithelial cells and the involvement of lung tissue inflammation and alveolar damage in COVID-19 pathology [38]. Third, UPF consumption may increase exposure to chemicals used in packaging and production and to many non-natural ingredients and additives, such as flavours, colours, emulsifiers, and other cosmetic additives, which could lead to adverse health outcomes [39, 40]. Fourth, a higher proportion of UPF consumption implies a lower proportion of fresh vegetables, fruits, and essential micronutrients, which play key roles in supporting the human immune system and reducing the risk of infections [41, 42].

In this study, we further found that UPFs were still associated with a higher risk of COVID-19 after adjusting for a commonly used healthy diet score. This points to an adverse effect of food processing itself, not merely of nutrient quality, on COVID-19 susceptibility. UPFs have drastically deconstructed food matrices, which modify the kinetics of release within the digestive tract and alter bio-accessibility and bioavailability [16]. A previous study of 98 ready-to-eat foods showed that the degree of food processing correlated with the satiety index and the glycaemic response [43]. This suggests other mechanistic pathways through which UPFs may affect the immune system and the risk of infection.

Additionally, our study indicates that BMI accounted for 13.2% of the association between UPF consumption and COVID-19. Previous studies have shown that overweight and obesity increase the risk of infection by pathogens such as influenza and coronavirus [44, 45], which might partly support our findings. The underlying mechanism may be that obesity impairs the activity of helper T lymphocytes, cytotoxic T lymphocytes, B lymphocytes, and natural killer cells, and reduces antibody and interferon-γ production [46, 47]. On the other hand, evidence has suggested that consumption of UPFs could cause obesity [48, 49]. A randomized controlled cross-over trial suggested that people consumed more calories on an ultra-processed diet than on an unprocessed diet, even though the diets as presented were matched for daily calories, sugar, fat, fibre, and macronutrients [48]. Observational studies suggested that higher consumption of UPF was associated with a gain in BMI and higher risks of overweight and obesity [50]. Our study adds evidence on the role of BMI in the association between UPF and COVID-19 and shows that BMI was an important mediator.

A recent study showed that the consumption of UPF increased markedly during the COVID-19 lockdown, which might impair immunity and make people more susceptible to COVID-19 [51]. Under these circumstances, promoting a healthy diet and a lower intake of UPF would be of great importance.

Our study contributes to ongoing research efforts to better understand the potential risk factors for COVID-19 infection. To our knowledge, this is the first study to evaluate the association between UPF intake and the risk of COVID-19 while adjusting for lifestyle, socio-demographic factors, and physical measurements. We found that UPF consumption was associated with an increased risk of COVID-19 and that BMI was a partial mediator of this relationship. These findings reinforce the value of a healthy diet and reveal a potential adverse effect of UPFs on infectious diseases.

We acknowledge several limitations. First, we could not confirm a causal relationship between UPF and the risk of COVID-19, since this was an observational study. Second, the included population was not a random sample of all UK Biobank participants, since only a limited sample had COVID-19 tests. Therefore, the generalizability of our findings needs to be confirmed in additional studies. Third, social desirability bias might lead to underestimation of UPF consumption, which might dilute the studied associations. Nevertheless, online administration of the dietary questionnaire is expected to minimize reporting bias due to social desirability. Moreover, a previous study showed that this method of dietary assessment is acceptable to the public and a feasible strategy for large population-based studies [25]. Fourth, selection bias due to the exclusion of participants with missing covariate values may have influenced the results. However, the results were robust in sensitivity analyses. Fifth, misclassification in the NOVA categories cannot be ruled out: the UK Biobank collected limited information about food processing procedures, and this insufficient information might lead to misclassification.


Conclusions

Higher UPF consumption in the diet was associated with a significantly increased risk of COVID-19 in this large prospective cohort. This association could be partly mediated by the effect of UPF consumption on BMI. Our findings suggest that public health interventions to improve nutrition and metabolic health may be important for reducing the burden of the COVID-19 pandemic. Further evidence on the underlying mechanisms is needed.