This study is embedded in the Rotterdam Study (RS), a large ongoing population-based cohort study. The RS commenced in 1989 and was designed to study diseases of the elderly in response to the increasing proportion of elderly people in the population. The design and rationale have been described in detail previously. For the purpose of this study, we used two consecutive visits from the second cohort (RSII-3 and RSII-4). All individuals in this cohort were included based on their ZIP code (the suburb Ommoord in Rotterdam) and their age (55 years or older at first inclusion).
We excluded participants who did not undergo ultrasound and participants with missing or unreliable food frequency questionnaires (FFQ), i.e. a caloric consumption below 500 or above 7500 kilocalories (kcal) per day. We also excluded participants with unreliable transient elastography (TE), i.e. an interquartile range (IQR)/median liver stiffness measurement (LSM) ratio > 0.30 in measurements with a median of ≥ 7.1 kilopascals (kPa), and participants with failure of TE (i.e. no LSM obtained after at least ten attempts). Lastly, we excluded participants with well-known risk factors for steatosis, such as viral hepatitis (as measured by HBsAg and anti-HCV), alcohol misuse (measured using FFQs and defined as ≥ 2 units of alcohol per day in women and ≥ 3 units in men), or the use of pharmacy-registered drugs that are known to cause steatosis (i.e. tamoxifen, methotrexate, systemic corticosteroids, and amiodarone).
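The numeric exclusion rules above can be sketched as simple predicates. This is an illustrative Python sketch only, not the study's own code; the function and argument names are hypothetical, and the sex labels are an assumed encoding.

```python
def ffq_is_reliable(kcal_per_day):
    """FFQ counts as reliable if energy intake is present and within 500-7500 kcal/day."""
    return kcal_per_day is not None and 500 <= kcal_per_day <= 7500


def te_is_reliable(median_kpa, iqr_kpa, lsm_obtained=True):
    """TE is unreliable on failure, or if IQR/median > 0.30 when the median is >= 7.1 kPa."""
    if not lsm_obtained or median_kpa is None:
        return False  # failure of TE: no LSM after at least ten attempts
    if median_kpa >= 7.1 and iqr_kpa / median_kpa > 0.30:
        return False
    return True


def alcohol_misuse(units_per_day, sex):
    """>= 2 units/day in women, >= 3 units/day in men ('female'/'male' labels are assumed)."""
    threshold = 2 if sex == "female" else 3
    return units_per_day >= threshold
```

Note that the IQR/median criterion only applies at or above a median of 7.1 kPa, so a low median with a wide IQR is still retained.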
The RS has been approved by the institutional review board (Medical Ethics Committee) of the Erasmus MC University Medical Centre Rotterdam and by the review board of The Netherlands Ministry of Health, Welfare and Sports. Written informed consent was obtained from all participants.
Food frequency questionnaires
Participants were asked to fill in a semi-quantitative 389-item FFQ, specifically developed for Dutch adults, during both research visits (RSII-3 and RSII-4) [24, 25]. This FFQ includes detailed questions on consumption over the last month and covers type of food item, portion size, preparation method and frequency of consumption. Example questions from the FFQ are: “Did you eat eggs last month? If yes, how were they prepared (boiled or baked)? How often did you eat eggs per month (once or 2–3 times) or per week (once, 2–3, 4–5, 6–7 times)? How many eggs did you eat on an average day then?”. The 389 food items were grouped into 28 empirical food groups (supplementary Table 1), based on previous publications [20, 21, 27], and adapted based on the food item and group quantities (i.e. similar groups with a very low median intake were merged).
A priori dietary patterns
We chose to study the Mediterranean Diet Score (MDS), the Dutch Dietary Guidelines (DDG), and the World Health Organization (WHO) recommendations as a priori dietary patterns.
The MDS, first described by Trichopoulou et al., originally has 9 components, of which 7 are regarded as beneficial (i.e. vegetables, legumes, fruits, nuts, whole grains, fish and the mono-unsaturated fatty acids (MUFA)-to-saturated fatty acids ratio) and 2 are regarded as hazardous (i.e. red meat intake and excessive alcohol use). For the purpose of this study, we adapted the MDS by excluding the alcohol component from our calculation, as the MDS cut-off for hazardous alcohol use is 50 g per day, which is a very high cut-off in the context of a hepatic health outcome. Moreover, alcohol consumption was included in the multivariable model as a potential confounder. Hence, our adapted MDS has 8 components. Each component was given a score of 0 (unhealthy) or 1 (healthy) based on sex-specific median cut-offs, and the component scores were summed.
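The adapted 8-component MDS can be illustrated with a short Python sketch. It is not the study's code: the field names are hypothetical, and the handling of values exactly at the median (counted as healthy for beneficial components, unhealthy for red meat) is an assumption that the text does not specify.

```python
from statistics import median

# Hypothetical field names for the 8 components of the adapted MDS.
BENEFICIAL = ["vegetables", "legumes", "fruits", "nuts",
              "whole_grains", "fish", "mufa_sfa_ratio"]
HAZARDOUS = ["red_meat"]


def sex_specific_medians(participants):
    """Median intake of every component, computed separately per sex."""
    medians = {}
    for sex in {p["sex"] for p in participants}:
        group = [p for p in participants if p["sex"] == sex]
        medians[sex] = {c: median(p[c] for p in group)
                        for c in BENEFICIAL + HAZARDOUS}
    return medians


def adapted_mds(person, medians):
    """Sum of 0/1 component scores; range 0-8."""
    cut = medians[person["sex"]]
    # Assumption: at or above the median is healthy for beneficial components.
    score = sum(1 for c in BENEFICIAL if person[c] >= cut[c])
    # Assumption: strictly below the median is healthy for hazardous components.
    score += sum(1 for c in HAZARDOUS if person[c] < cut[c])
    return score
```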
The DDG is a predefined index that was developed in 2015 and provides general advice on following a balanced and healthy dietary pattern. The DDG is scored on the following items: consumption of (1) vegetables (≥ 200 g/day), (2) fruit (≥ 200 g/day), (3) whole-grain products (≥ 90 g/day), (4) legumes (≥ 135 g/week), (5) unsalted nuts (≥ 15 g/day), (6) fish (≥ 100 g/week), (7) dairy (≥ 350 g/day), (8) tea (≥ 150 mL/day), (9) whole grains (≥ 50% of total grains), (10) unsaturated fats and oils (≥ 50% of total fats), (11) red and processed meat (< 300 g/week), (12) sugar-containing beverages (≤ 150 mL/day), (13) alcohol (≤ 10 g/day), and (14) salt (≤ 6 g/day).
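As an illustration, the 14 DDG items can be written as a small lookup table. This Python sketch assumes one point per item met, which the text does not state explicitly; the key names are hypothetical, and the two percentage items are expressed as fractions.

```python
import operator

# (hypothetical intake key, comparison, threshold), in the order listed in the text.
DDG_ITEMS = [
    ("vegetables_g_day", operator.ge, 200),
    ("fruit_g_day", operator.ge, 200),
    ("whole_grain_products_g_day", operator.ge, 90),
    ("legumes_g_week", operator.ge, 135),
    ("unsalted_nuts_g_day", operator.ge, 15),
    ("fish_g_week", operator.ge, 100),
    ("dairy_g_day", operator.ge, 350),
    ("tea_ml_day", operator.ge, 150),
    ("whole_grain_fraction", operator.ge, 0.50),      # share of total grains
    ("unsaturated_fat_fraction", operator.ge, 0.50),  # share of total fats
    ("red_processed_meat_g_week", operator.lt, 300),
    ("sugar_beverages_ml_day", operator.le, 150),
    ("alcohol_g_day", operator.le, 10),
    ("salt_g_day", operator.le, 6),
]


def ddg_score(intake):
    """Number of DDG items met (0-14), assuming one point per item."""
    return sum(1 for key, meets, limit in DDG_ITEMS if meets(intake[key], limit))
```

Keeping the thresholds in one table makes it easy to check each against the listed items.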
We calculated the WHO-score based on the recently revised guidelines of the WHO (October 2018). This score is composed of 6 components, each scored as 0 if unhealthy and 1 if healthy. A component is scored as healthy if it satisfies the following criterion: (1) vegetables and fruit intake of ≥ 400 g per day, (2) sugar intake (added and sugar sweetened beverages) of < 10 g per day, (3) energy percentage from fat intake < 30%, (4) energy percentage from saturated fat intake < 10%, (5) energy percentage from trans fatty acid intake < 1%, and (6) salt intake of < 5 g per day.
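A minimal Python sketch of the 6-component WHO-score follows. The key names are hypothetical, the thresholds are taken as stated in the text, and the conversion of fat grams to energy percentage assumes the conventional factor of 9 kcal per gram of fat.

```python
def energy_pct_from_fat(fat_g, total_kcal):
    """Energy percentage from fat, assuming 9 kcal per gram of fat."""
    return 100 * fat_g * 9 / total_kcal


def who_score(d):
    """Number of WHO criteria met (0-6); thresholds as stated in the text."""
    checks = [
        d["veg_fruit_g_day"] >= 400,
        d["sugar_g_day"] < 10,         # criterion as given in the text
        d["fat_energy_pct"] < 30,
        d["sat_fat_energy_pct"] < 10,
        d["trans_fat_energy_pct"] < 1,
        d["salt_g_day"] < 5,
    ]
    return sum(checks)
```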
A posteriori dietary patterns
A priori dietary patterns are patterns described or identified in previous studies, or specific habits of certain populations. They therefore do not necessarily ‘fit’ every population. For example, the Mediterranean Diet is natural for the Greek population in which it was developed, whereas other populations, such as the Dutch or Asian populations, have different eating habits. We therefore believe it is of interest to also use a posteriori dietary patterns. These are population-specific dietary patterns and were derived using factor analysis on the 28 food groups at baseline with Varimax rotation and minimum residual estimation, using the function “fa” from the R package psych. We included 5 dietary patterns based on the bend in the scree plot (supplementary Fig. 1). The factor loadings for each food group reflect the relationship between the food group and the respective factor (i.e. dietary pattern). Subsequently, we calculated adherence scores, separately for both visits, by multiplying the factors (determined for the food groups at baseline) with the observed values of the food groups at baseline (RSII-3) and follow-up (RSII-4), respectively. Each score at baseline was scaled to have a mean of zero and a standard deviation (SD) of one. The same scaling parameters were used for the corresponding score at follow-up, to optimize comparability.
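The scoring and scaling step described above can be sketched in Python as follows. This is not the psych-based R code used in the study: the per-food-group weights are treated as given (in the study they come from the baseline factor analysis), the function names are hypothetical, and the use of the sample SD is an assumption.

```python
from statistics import mean, stdev


def raw_scores(food_groups, weights):
    """Weighted sum of food-group intakes per participant, for one dietary pattern."""
    return [sum(w * x for w, x in zip(weights, row)) for row in food_groups]


def standardise_with_baseline(baseline_raw, followup_raw):
    """Scale both visits using the baseline mean and (sample) SD only."""
    mu, sd = mean(baseline_raw), stdev(baseline_raw)
    baseline_z = [(v - mu) / sd for v in baseline_raw]
    followup_z = [(v - mu) / sd for v in followup_raw]
    return baseline_z, followup_z
```

Because the follow-up scores reuse the baseline parameters, a follow-up mean different from zero reflects a shift in adherence relative to baseline rather than an artefact of rescaling.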
For the purpose of this study, all participants underwent an abdominal ultrasound (Hitachi HI VISION 900) and TE (FibroScan®, EchoSens, Paris, France). Both examinations were performed at the same visit by an experienced nurse technician. The diagnosis of steatosis was dichotomized as yes or no, because ultrasound has poor sensitivity for grading steatosis but performs well in diagnosing moderate to severe steatosis. Steatosis was defined as hyperechogenic liver parenchyma as compared to the kidney parenchyma. The practical performance of TE has been described in detail previously. In short, both the M and XL probes were available for the liver stiffness measurements (LSM), and the choice of probe depended on the subcutaneous fat layer, as instructed by the manufacturer. Reliability criteria are described in the paragraph above (“Study Cohort”). Additionally, participants with an intra-cardiac device were excluded from the analyses. LSM are expressed in kilopascals (kPa). We used the previously proposed cut-off value of 8 kPa as a proxy for the presence of fibrosis in participants with steatosis, from this point forward referred to as non-alcoholic steatofibrosis (NASF). As the main focus of the present study is NAFLD, participants with an LSM of 8 kPa or higher without steatosis were excluded.
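The resulting outcome categories can be summarised in a short Python sketch. The function name and category labels are illustrative, and the cut-off is applied as "8 kPa or higher", following the wording above.

```python
def classify_liver(steatosis, lsm_kpa, cutoff=8.0):
    """Map ultrasound steatosis (True/False) and LSM (kPa) to an outcome category."""
    if steatosis:
        return "NASF" if lsm_kpa >= cutoff else "simple steatosis"
    # High liver stiffness without steatosis falls outside the NAFLD focus.
    return "excluded" if lsm_kpa >= cutoff else "no steatosis"
```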
All blood samples were drawn after overnight fasting. Automatic enzyme procedures (Roche Diagnostics GmbH, Mannheim, DE) were used to determine lipid profile, glucose, alanine aminotransferase, aspartate aminotransferase and gamma-glutamyltransferase. Insulin and viral hepatitis B or C were determined using an automatic immunoassay (Roche Diagnostics GmbH, Mannheim, DE). Detailed information on drug use was obtained via automated pharmacy linkage (with which 98% of the participants were registered). A 3-h home interview was carried out by trained research nurses and included questions on physical activity, smoking behaviour, education level, medical history, and demographics. In the research centre, anthropometrics were measured (i.e. weight, height, and waist and hip circumference), as well as blood pressure (median value after two measures in an upright position). The metabolic syndrome was defined using the harmonizing consensus criteria from Alberti et al. and contained 5 components on abdominal obesity, lipid profile, blood pressure, and fasting plasma glucose. The comorbidities diabetes mellitus and hypertension were established on the basis of drug use for the respective comorbidity or findings at physical examination, as described in detail previously.
Participant characteristics at baseline and follow-up are summarized using the median and first and third quartile, median and range (for dietary variables), or percentages.
To examine the association between the dietary patterns and micronutrient and macronutrient composition, we calculated and tested Spearman correlation coefficients between (subtypes of) macronutrients and the adherence scores at baseline. The correlations were calculated for the raw nutrient values, for values adjusted for total energy intake, and for values adjusted for both total energy intake and the components of the a posteriori dietary patterns. Differences between energy-adjusted and energy and dietary pattern-adjusted correlation coefficients may be explained by overlapping characteristics of the different a posteriori patterns, which could outweigh each other’s effects.
In addition, as a supplementary analysis, we assessed the cross-sectional association between the different energy-adjusted food groups and NAFLD at baseline using univariable logistic regression. The energy adjustment was carried out using the residual method.
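For readers unfamiliar with the residual method, the idea is to regress the intake of a food group on total energy intake and to use the residuals as the energy-adjusted intake. The Python sketch below illustrates this with an ordinary least-squares fit; the function name is hypothetical, and whether a constant (for example the predicted intake at mean energy) was added back to the residuals in this study is not stated, so plain residuals are returned.

```python
from statistics import mean


def energy_adjust(intake, energy):
    """Residuals of a simple linear regression of intake on total energy intake."""
    e_bar, i_bar = mean(energy), mean(intake)
    sxx = sum((e - e_bar) ** 2 for e in energy)
    sxy = sum((e - e_bar) * (i - i_bar) for e, i in zip(energy, intake))
    slope = sxy / sxx
    intercept = i_bar - slope * e_bar
    # Residual = observed intake minus the intake predicted from energy alone.
    return [i - (intercept + slope * e) for i, e in zip(intake, energy)]
```

A participant with a positive residual consumes more of the food group than expected for their energy intake, irrespective of how much they eat overall.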
To investigate the association of diet with NAFLD over time in the presence of missing values in the covariates, we used Bayesian logistic mixed models, as implemented in the R package JointAI. In this approach, missing values in covariates are imputed simultaneously with the estimation of the regression coefficients of interest, and the added uncertainty in the coefficients due to the missing values is automatically taken into account [41, 42]. This imputation was done using the covariates of our most extensive set (i.e. Model 2, given below). More information on the missing data and imputed values can be found in the separate supplementary data file. The choice of a mixed model allowed us to include data from all participants who fulfilled the above-mentioned inclusion criteria, even when no follow-up measurement was available. A random intercept was included in the mixed model to take into account the correlation between repeated measurements within the same subject. Separate models were fitted for each of the a priori dietary patterns and for the five a posteriori patterns. As the five a posteriori dietary patterns explain approximately 20% of the variation in dietary intake in the population, they were analysed together in one model.
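In generic form, a logistic mixed model with a random intercept can be written as below. The notation is introduced here for illustration only and is not taken from the study: Y_ij is the NAFLD status of participant i at visit j, D_ij the dietary pattern score, x_ij the vector of covariates, and b_i the participant-specific random intercept.

```latex
\operatorname{logit} \Pr(Y_{ij} = 1 \mid b_i)
  = \beta_0 + \beta_1 D_{ij} + \boldsymbol{\gamma}^{\top} \mathbf{x}_{ij} + b_i,
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2)
```

The shared b_i across visits is what induces the correlation between repeated measurements within the same participant.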
Two sets of covariates were created. The first set (“Model 1”) contains baseline age (in years), physical activity (in metabolic equivalent task h/week), education level (low/intermediate/high), sex, energy intake (in kilocalories per day), alcohol consumption (in units per day), and follow-up time (in years). The second set (“Model 2”) additionally contains covariates that reflect potential confounding, colliding, or mediating factors, i.e. baseline type 2 diabetes mellitus, baseline hypertension, and BMI. To allow the effect of diet to change over time and to allow effect modification by BMI, interaction terms between the respective dietary pattern variable(s) and follow-up time (in Model 1 and 2) and BMI (only in Model 2) were included.
To obtain results for Model 1, ten sets of imputed values were extracted from each of the analyses of Model 2, and Model 1 was then fitted on each dataset. Output from the ten repeated analyses per model was combined to calculate overall results. Since none of the interaction terms mentioned above made a relevant contribution to any of the models, and the presence of interaction terms in a model substantially complicates the interpretation of the regression coefficients, we re-fitted Model 1 and Model 2 without the interaction terms (using imputed values from the original models) and present only the results of these simplified models.
We also investigated the role of BMI as a mediating factor between diet and NAFLD: we performed additional analyses with BMI (continuous) as outcome measure. For this, Bayesian linear mixed models were used and incomplete covariates were again simultaneously imputed. The models contained the confounders from Model 2, an interaction term between the dietary patterns and follow-up time, and a random intercept.
We also examined adherence to dietary patterns in relation to NAFLD severity. Due to the low number of NASF patients, we were not able to perform mixed effects logistic regression models on this outcome. In order to gain insight into the association between dietary patterns and NASF, we therefore plotted the (a posteriori and a priori) dietary pattern adherence scores across participants with NASF, participants with ‘simple’ steatosis, and participants without steatosis.
We used non-informative priors for our Bayesian analyses. Results from the Bayesian analyses are presented as posterior means and 95% credible intervals (CI). All analyses were performed using R version 3.5.2 and the packages JointAI (version 0.5.1) and psych (version 1.8.12). More detailed information on the statistical analyses can be found in the supplementary methods.