Sodium and Health Outcomes: Ascertaining Valid Estimates in Research Studies

Purpose of Review The dietary reference intake (DRI) for sodium has been highly debated with persuasive and elegant arguments made for both population sodium reduction and for maintenance of the status quo. After the 2015 Dietary Guidelines Advisory Committee (DGAC) report was published, controversy ensued, and by Congressional mandate, the sodium DRIs were updated in 2019. The 2019 DRIs defined adequate intake (AI) levels by age–sex groups that are largely consistent with the DRIs for sodium that were published in 2005. Given the overall similarities between the 2005 and 2019 DRIs, one may wonder how the recently published research on sodium and health outcomes was considered in determining the DRIs, particularly, the recent studies from very large observational cohort studies. We aim to address this concern and outline the major threats to ascertaining valid estimates of the relationship between dietary sodium and health outcomes in observational cohort studies. We use tools from modern epidemiology to demonstrate how unexpected and inconsistent findings in these relationships may emerge. We use directed acyclic graphs to illustrate specific examples in which biases may occur.
Recent Findings We identified the following key threats to internal validity: poorly defined target intervention, poorly measured sodium exposure, unmeasured or residual confounding, reverse causality, and selection bias. Researchers should consider these threats to internal validity while developing research questions and throughout the research process.
Summary For the DRIs to inform real-world interventions relating to sodium reduction, it is recommended that more specific research questions be asked that can clearly define potential interventions of interest.


Introduction
Dietary reference intakes (DRIs) have been a cornerstone of United States (US) nutrition policy since 1943 [1]. They impact federally funded nutrition programs, and the recommended population level of sodium intake can elicit polarizing responses from scientists [2,3], industry representatives [4,5], and journalists [6,7]. Sodium is a nutrient that has been highly debated with persuasive and elegant arguments made for both population sodium reduction [8,9] and for maintenance of the status quo [10,11].
After the 2015 Dietary Guidelines Advisory Committee (DGAC) report was published, controversy ensued, and by Congressional mandate, the sodium DRIs were updated in 2019. The 2019 DRIs defined adequate intake (AI) levels by age–sex groups that are largely consistent with the DRIs for sodium that were published in 2005. In the 2019 DRIs, the tolerable upper level (UL) was no longer used for sodium. This is a point of difference from the DRIs published in 2005 [14] and is based on a revision in methodology such that ULs are driven by toxicological responses [15]. In 2019, instead of ULs for sodium, the chronic disease risk reduction (CDRR) DRIs were set [13 ••]. The CDRR DRIs are a new feature of sodium DRIs. The CDRR is the intake level above which a reduction in intake is expected to reduce chronic disease risk within an apparently healthy population. Given the overall similarities between the 2005 and 2019 DRIs, one may wonder how recently published research on sodium and health outcomes was considered in determining the DRIs, particularly the recent studies from very large observational cohort studies [16-21]. It may even raise questions as to whether the recommendations from the 2019 committee simply replicated previous knowledge or were driven by a lack of certainty in the newly published results.
To answer these questions, it is important to consider that epidemiologic research on sodium and health has sought answers to causal questions such as "will decreased sodium intake reduce risk of cardiovascular disease (CVD)? If so, by how much and for what populations?" Often, these answers can contribute to establishing nutrition guidelines and associated policies, which will subsequently improve population health. When DRIs are being determined, a consensus panel of scientists systematically reviews the literature to evaluate certainty in the presented results and weigh the individual studies based on the potential for bias [22, 23, 24 • , 25 • ]. Additionally, the DRI committee considers the magnitude and direction of the potential bias and discusses the likelihood that the studies' conclusions would meaningfully change in the absence of bias. They consider findings from all study designs and must often grapple with the paucity of randomized clinical trials and some inconsistency across the observational epidemiologic studies. Moreover, in epidemiology, there has been a shift towards the use of very large datasets to understand exposure-disease relationships, and the use of larger sample sizes is sometimes misinterpreted as confidence in the results obtained. Although a benefit of using very large datasets is improved precision in effect estimation, this does not indicate that these data will yield valid estimates of the relationship between exposure and outcome [26].
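The distinction between precision and validity can be written out in a short worked form. This is a generic sketch under stated assumptions rather than a result from any cited study: the symbols are introduced here for illustration, with \hat{\beta}_n the estimate from n participants, \beta the true effect, b a systematic bias that does not depend on n, and \sigma^2 a variance constant.

```latex
\begin{align}
\mathbb{E}[\hat{\beta}_n] &= \beta + b, \qquad \operatorname{Var}(\hat{\beta}_n) \approx \frac{\sigma^2}{n}, \\
\operatorname{MSE}(\hat{\beta}_n) &= b^2 + \frac{\sigma^2}{n} \;\longrightarrow\; b^2 \quad (n \to \infty), \\
\text{95\% CI} &\approx \hat{\beta}_n \pm 1.96\,\frac{\sigma}{\sqrt{n}}.
\end{align}
```

As n grows, the interval narrows around \beta + b rather than around \beta, so a very large dataset can produce a tight interval that excludes the true effect whenever b is not zero.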
Herein, we aim to outline major threats to ascertaining valid estimates of the relationship between dietary sodium and health outcomes in observational cohort studies. We use tools from modern epidemiology to demonstrate how unexpected and inconsistent findings in these relationships may emerge.

Current Challenges in Estimating Relationships Between Dietary Sodium and Health Outcomes
In observational studies, the main analyses estimate statistical associations between dietary sodium intake and a specific health outcome. Inferring causation from these statistical associations is a difficult task and requires strict assumptions [27]. Nutritional epidemiology, and particularly the study of specific micronutrients, has been criticized as being plagued with methodological issues limiting this inference [28,29]. In the case of sodium, these doubts contribute to the debates about the recommended intake level for this nutrient and a lack of confidence, by some, in the guidelines [30].
In Table 1, we discuss assumptions in inferring causation from observational data that are key to the investigation of the effects of sodium intake on health. We illustrate a few of these assumptions (Figs. 1a-c) using causal directed acyclic graphs (DAGs) adapted from figures presented in a textbook by Hernán and Robins (2020) [27]. In brief, a DAG is a graphical representation of the causal effects between variables. DAGs are constructed from a set of edges (arrows) and nodes (variables) based on a priori assumptions about the causal relations among the exposure, outcome, and covariates. An arrow between two variables implies a direct causal effect. Two variables (e.g., X and Y) may be statistically associated if (1) X directly or indirectly causes Y, (2) X and Y share a common cause (i.e., a confounding variable), or (3) a common effect of X and Y (i.e., a collider), or a descendant of that common effect, has been conditioned on [27, 31-33].
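The three sources of association listed above can be checked mechanically on a small graph. The following is a minimal sketch, assuming a hypothetical DAG that is not taken from Fig. 1: the node names, the edges, and the helper functions (`descendants`, `simple_paths`, `is_open`, `open_paths`) are all illustrative. It enumerates paths between exposure and outcome and applies the standard rule that a path is blocked by a conditioned non-collider or by a collider that is neither conditioned on nor has a conditioned descendant.

```python
# Hypothetical DAG (illustrative only): each edge points from cause to effect.
EDGES = {
    ("calories", "sodium"), ("calories", "cvd"),            # common cause
    ("sodium", "blood_pressure"), ("blood_pressure", "cvd"),  # causal chain
    ("sodium", "ckd"), ("shared_risk", "ckd"),               # ckd is a collider
    ("shared_risk", "cvd"),
}

def descendants(node, edges):
    """All nodes reachable from `node` by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found

def simple_paths(start, end, edges):
    """All non-repeating paths from start to end, ignoring arrow direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt in path:
                continue
            if nxt == end:
                paths.append(tuple(path + [nxt]))
            else:
                stack.append(path + [nxt])
    return paths

def is_open(path, edges, conditioned):
    """Check whether a path transmits association given the conditioning set."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, node) in edges and (nxt, node) in edges
        if collider:
            if node not in conditioned and not (descendants(node, edges) & conditioned):
                return False  # unconditioned collider blocks the path
        elif node in conditioned:
            return False      # conditioned chain or fork node blocks the path
    return True

def open_paths(exposure, outcome, edges, conditioned=frozenset()):
    conditioned = set(conditioned)
    return {p for p in simple_paths(exposure, outcome, edges)
            if is_open(p, edges, conditioned)}

for adjustment in (set(), {"calories"}, {"calories", "ckd"}):
    print(sorted(adjustment), sorted(open_paths("sodium", "cvd", EDGES, adjustment)))
```

In this hypothetical graph, adjusting for `calories` leaves only the causal path open, whereas additionally conditioning on `ckd` opens a non-causal path through the shared risk factors.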
Increasingly, DAGs are being used to help depict different causal structures, thus forcing researchers to be explicit about the research question and the underlying assumptions about how the variables of interest are related. This approach facilitates communication within the research community giving us a framework around which we can align. Additionally, DAGs facilitate appropriate selection of covariates for regression analyses and help elucidate potential sources of bias.

Table 1 Key assumptions in inferring causation from observational data in the context of sodium intake and CVD

Exposure is well-defined
A key assumption for estimating causal effects from observational data is that the exposure is well defined and that the observed and counterfactual outcomes are clear [34]. Moreover, we assume that the effect of an exposure, such as dietary sodium intake, on a specific outcome will be the same regardless of how the exposure is modified [34]. For example, an individual may need to reduce their total dietary sodium intake by 30% to meet DRI levels. There are several ways to reduce total dietary sodium intake (replacement of high-sodium foods, caloric reduction, elimination of table salt, etc.). It is straightforward to challenge the assumption that each of these approaches (i.e., hypothetical interventions) will have consistent causal effects on the outcome of interest. Dietary sodium intake is a complex exposure because it is almost always consumed together with other nutrients.

Exposure is measured without error
Issues relating to measurement of dietary sodium intake are widely acknowledged [28,29,35]. In brief, there are two primary modes of collection: measurements in urine specimens and dietary surveys. Dietary surveys include food frequency questionnaires (FFQs), food diaries, and 24-h recall. These assessments are inexpensive and, with repeated measurements, could capture an individual's usual sodium intake; however, they are vulnerable to recall and reporting biases. Additionally, processing data from these surveys requires researchers to estimate sodium levels from food composition. Accuracy in these conversions is limited due to heterogeneity in sodium content across commonly reported foods. Moreover, sodium intake is likely underreported when condiments, table salt, and sodium used in cooking are not included in survey items (Fig. 1a, top). An advantage of urinary collection methods is that they are objective measurements of sodium recovered in urine. However, to reduce participant burden and study costs, large studies sometimes rely on a single overnight or spot urine collection, which may vary by time of day and time since consumption of sodium. Bias could be introduced if estimating equations are used to deduce 24-h excretion from a spot urine collection. In addition, due to day-to-day variation in sodium intake, one measurement may not accurately represent usual intake. This random error affects the measurements of all participants under study (Fig. 1a, top). An additional concern is a circumstance in which measurement error in sodium is differential by study outcome. This will occur if the amount of sodium excreted in urine varies by variables associated with CVD risk, such as medication use or kidney function when no longer in steady state. As depicted in the bottom half of Fig. 1a, a third variable, kidney function, creates a "back door path" (i.e., a biasing pathway) due to its association with both sodium measurement error and CVD.
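The consequence of random day-to-day error can be illustrated with a small simulation. This is a sketch under assumed conditions, not an analysis of real data: intake is standardized, the day-to-day error variance is set equal to the between-person variance, the true slope is 1.0, and the function names are hypothetical. Under classical (non-differential) error, the expected slope is the true slope multiplied by var(usual) / (var(usual) + var(error) / k), where k is the number of repeated measurements averaged.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate_attenuation(n=20000, true_slope=1.0, n_repeats=1, seed=1):
    """Regress an outcome on the mean of n_repeats noisy intake measurements."""
    rng = random.Random(seed)
    measured, outcome = [], []
    for _ in range(n):
        usual = rng.gauss(0.0, 1.0)  # assumed true usual intake (standardized)
        # Each collection day adds independent random error (assumed SD = 1).
        days = [usual + rng.gauss(0.0, 1.0) for _ in range(n_repeats)]
        measured.append(sum(days) / n_repeats)
        outcome.append(true_slope * usual + rng.gauss(0.0, 1.0))
    return ols_slope(measured, outcome)

slope_one_day = simulate_attenuation(n_repeats=1)    # theory: about 0.5
slope_four_days = simulate_attenuation(n_repeats=4)  # theory: about 0.8
print(f"single measurement: {slope_one_day:.2f}")
print(f"mean of four measurements: {slope_four_days:.2f}")
```

In this setup, a single measurement roughly halves the estimated slope, and averaging several collections recovers part of it. The sketch covers only random error; differential error, such as that induced by kidney function, can bias estimates in either direction.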

Confounding variables are measured and adjusted for
Another key assumption for causal inference from observational studies is that there are no unmeasured variables that affect risk of the outcome and are also differentially distributed by exposure status. For example, when we compare disease risk among participants who have the greatest reported dietary sodium intake vs. those with the least, we assume that this is a valid comparison, that is, that these groups are exchangeable conditional on the measured covariates (i.e., conditional exchangeability). This assumption of conditional exchangeability is challenged when there are large differences in unmeasured (or poorly measured) confounding variables across exposure categories. Major differences in the distribution of sodium intake by key demographic variables may be difficult to fully address with traditional methods of adjustment, resulting in residual confounding. Dietary sodium intake is correlated with overall dietary pattern and total caloric intake. Thus, it requires careful work to isolate the specific effect of sodium on a disease outcome.
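A short simulation can show how a correlate of sodium intake, such as total caloric intake, distorts a crude estimate. This is a hypothetical sketch: the coefficients (0.8, 0.5, 1.0), the standardized variables, and the function names are assumptions chosen for illustration. Adjustment is done by residualizing sodium on calories, which yields the same sodium coefficient as a regression that includes both variables.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def residuals(xs, ys):
    """Residuals of ys after removing its linear dependence on xs."""
    slope = ols_slope(xs, ys)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]

def simulate_confounding(n=20000, seed=2):
    """Return (crude, calorie-adjusted) slopes of outcome on sodium."""
    rng = random.Random(seed)
    calories, sodium, outcome = [], [], []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)                      # total caloric intake
        s = 0.8 * c + rng.gauss(0.0, 1.0)            # sodium tracks calories
        y = 0.5 * s + 1.0 * c + rng.gauss(0.0, 1.0)  # assumed true sodium effect: 0.5
        calories.append(c)
        sodium.append(s)
        outcome.append(y)
    crude = ols_slope(sodium, outcome)
    # Keep only the part of sodium not explained by calories, then regress.
    adjusted = ols_slope(residuals(calories, sodium), outcome)
    return crude, adjusted

crude_slope, adjusted_slope = simulate_confounding()
print(f"crude: {crude_slope:.2f}, adjusted for calories: {adjusted_slope:.2f}")
```

The adjusted estimate recovers the assumed effect only because the confounder here is measured without error and enters linearly. With a poorly measured or unmeasured confounder, residual confounding would remain.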

No reverse causality
Absence of reverse causality implies that the exposure affects the outcome and not vice versa. This is often assumed in a prospective cohort study because the exposure is measured at baseline when disease is absent, and participants are followed for incident disease. However, in Fig. 1b, we illustrate two scenarios in which reverse causation bias may occur. In the first example, participants with pre-existing risk factors or subclinical disease at baseline may use medications that affect valid assessment of the exposure (Fig. 1b, top). In the second example, individuals with subclinical disease are at increased risk of developing the study outcome and may have already reduced their sodium intake via a change in diet. As such, a study of these participants would underestimate sodium intake in the disease group and consequently underestimate the true effect of sodium on a health outcome (Fig. 1b, bottom).
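The second scenario can be sketched numerically. All quantities below are hypothetical and chosen only for illustration: 30% of participants have subclinical disease, they consume on average 1 g/day less sodium, and subclinical disease adds 0.20 to outcome risk, while each additional g/day of sodium adds 0.02. The function name and the 3.0 g/day cut point are likewise assumptions.

```python
import random

def simulate_reverse_causation(n=100000, seed=3):
    """Return (crude, restricted) risk differences, high vs. low sodium."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        subclinical = rng.random() < 0.30
        # Assumption: people with subclinical disease have already cut sodium.
        sodium = rng.gauss(2.5 if subclinical else 3.5, 0.7)  # g/day
        risk = 0.05 + 0.02 * (sodium - 3.0) + (0.20 if subclinical else 0.0)
        risk = min(max(risk, 0.0), 1.0)
        event = rng.random() < risk
        rows.append((subclinical, sodium >= 3.0, event))

    def risk_difference(subset):
        high = [event for _, is_high, event in subset if is_high]
        low = [event for _, is_high, event in subset if not is_high]
        return sum(high) / len(high) - sum(low) / len(low)

    crude = risk_difference(rows)
    # Restrict to participants free of subclinical disease at baseline.
    restricted = risk_difference([row for row in rows if not row[0]])
    return crude, restricted

crude_rd, restricted_rd = simulate_reverse_causation()
print(f"crude risk difference: {crude_rd:+.3f}")
print(f"restricted to no subclinical disease: {restricted_rd:+.3f}")
```

Under these assumptions, the crude comparison suggests that higher sodium is protective even though the simulated effect is harmful, and the sign is restored once participants with subclinical disease are excluded. In practice, that restriction is possible only if subclinical disease is measured.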

Absence of selection bias
The way in which individuals are selected into study analyses can contribute to biased and unexpected results. Studies of sodium and CVD that preferentially recruit sicker patients (e.g., those with type 2 diabetes [T2D] or chronic kidney disease [CKD]), either by design or because of recruitment procedures, can show the counterintuitive result that increased sodium is protective for CVD. For example, increased sodium intake is associated with greater risk of CKD, which is, in turn, associated with increased risk of CVD. Additionally, CKD and CVD share many risk factors. Thus, conducting an analysis within a stratum of participants with CKD is, by design, conditioning on a common effect of the exposure and of risk factors for the outcome (i.e., collider stratification bias). Within this stratum, the study data will underestimate sodium intake and overestimate CVD incidence compared to the target population. This could result in unexpected null associations, or inverse associations between sodium and CVD (Fig. 1c, bottom). Missing data due to loss to follow-up (LTFU) may also result in selection bias if the LTFU is associated with variables under study. For example, if the sickest participants are those that are most likely to be LTFU, then the analytic sample would be depleted of those with the highest exposure and most disease, resulting in a bias towards the null (Fig. 1c, top).
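Collider stratification can be reproduced in a few lines. The sketch below is hypothetical throughout: sodium and a shared risk factor are simulated as independent standardized variables, CKD is assigned by an arbitrary threshold on their sum plus noise, the outcome is a continuous CVD risk score with an assumed sodium effect of 0.2, and the function names are illustrative.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate_collider_stratification(n=100000, seed=4):
    """Return sodium-CVD slopes in the full cohort and in the CKD stratum."""
    rng = random.Random(seed)
    all_sodium, all_cvd, ckd_sodium, ckd_cvd = [], [], [], []
    for _ in range(n):
        sodium = rng.gauss(0.0, 1.0)
        shared_risk = rng.gauss(0.0, 1.0)  # risk factors common to CKD and CVD
        # CKD is a common effect of sodium and the shared risk factors.
        has_ckd = sodium + shared_risk + rng.gauss(0.0, 1.0) > 1.5
        cvd_score = 0.2 * sodium + 1.0 * shared_risk + rng.gauss(0.0, 1.0)
        all_sodium.append(sodium)
        all_cvd.append(cvd_score)
        if has_ckd:
            ckd_sodium.append(sodium)
            ckd_cvd.append(cvd_score)
    return ols_slope(all_sodium, all_cvd), ols_slope(ckd_sodium, ckd_cvd)

slope_full_cohort, slope_ckd_only = simulate_collider_stratification()
print(f"full cohort: {slope_full_cohort:+.2f}")
print(f"CKD stratum only: {slope_ckd_only:+.2f}")
```

Within the CKD stratum, participants with low sodium intake tend to have more of the shared risk factors, which is enough to reverse the sign of the association in this setup even though the simulated effect of sodium is harmful in the full cohort.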

Findings are generalizable to target population
Lastly, effect estimates may differ across sub-populations. Several studies regarding sodium consumption and health include only high-risk patients (e.g., those with end-stage renal disease or type 2 diabetes mellitus). Although it is necessary to learn about the health effects of sodium in these clinical populations, the findings may not be generalizable to the US population. Thus, guidelines directed towards the US population usually do not prioritize these findings.

Discussion and Conclusions
We aimed to outline major threats to ascertaining valid estimates of the relationship between dietary sodium and health outcomes in observational cohort studies. We used directed acyclic graphs to illustrate specific examples in which biases may occur. These are tools that can be used throughout the research process to inform which variables should be measured in research studies, which variables should be adjusted for in multivariable analyses, and how the procedures used to select participants into studies affect the internal validity of study results. They can also be used alongside bias ...
Researchers should consider these threats to internal validity while developing research questions and throughout the research process. A well-defined question with a clearly articulated target intervention can be more easily translated to nutritional policy. Other threats to validity can be eased during the study design process. For example, using multiple modes of sodium measurement such that findings can be contrasted within the same study sample will inform the extent to which measurement error is biasing results. Bias due to confounding and reverse causality can be eased by measurement of auxiliary variables. A DAG developed in collaboration with subject-area experts can be used to identify which variables need to be measured and subsequently adjusted for in statistical analyses.
We also highlight the importance of clearly defining a target population to which the study results should generalize. Threats to external validity, too, have implications for nutritional policy makers. Studies of sodium and disease in clinical and high-risk populations are beneficial in understanding the physiologic mechanisms at play as well as targeted interventions for these groups. These studies should not be prioritized, however, in informing national dietary guidelines, which are focused on establishing recommendations for health promotion and disease prevention across the US. It is imperative that observational research informing national guidelines includes representation of all population subgroups and that the study population is representative of the general US population.

Conclusions
Despite strong opinions about the usefulness of nutritional epidemiology [29,36,37], and the labeling of this field as flawed [29], it may be more productive and informative to think through how the limitations of the methods employed in these studies affect their conclusions. This can help us understand the implications of published analyses, regardless of the size of the dataset, and help inform well-designed studies that can be used to set sodium policies.

Compliance with Ethical Standards
Conflict of Interest The authors declare that they have no conflict of interest.

Human and Animal Rights and Informed Consent This article does not contain any studies with human or animal subjects performed by any of the authors.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.