Drinking is different! Examining the role of locus of control for alcohol consumption

Locus of control (LOC) measures how much an individual believes in the causal relationship between her own actions and her life’s outcomes. While earlier literature has shown that an increasing internal LOC is associated with increased health-conscious behavior in domains such as smoking, exercise or diets, we find that drinking seems to be different. Using very informative German panel data, we extend and generalize previous findings and find a significant positive association between having an internal LOC and the probability of occasional and regular drinking for men and women. An increase in an individual’s LOC by one standard deviation increases the probability of occasional or regular drinking on average by 3.4% for men and 6.9% for women. Using a decomposition method, we show that roughly a quarter of this association can be explained by differences in the social activities between internal and external individuals.


Introduction
The personality trait locus of control (LOC) can be characterized as the "generalized attitude, belief, or expectancy regarding the nature of the causal relationship between one's own behavior and its consequences" (Rotter 1966) and describes whether individuals believe in the effects of their own actions on their life's future outcomes. While an individual with an internal LOC believes that she is in control of the consequences of her own actions, an external individual attributes her life's outcomes to luck, chance, fate or other external forces. LOC has already been shown to have an important effect on behavior and decision-making in areas such as human capital investment (Coleman and DeLeire 2003;Caliendo et al. 2020), job search effort (Caliendo et al. 2015;McGee and McGee 2016), labor force participation (Heckman et al. 2006;Hennecke 2020), savings (Cobb-Clark et al. 2016), labor market mobility (Caliendo et al. 2019) as well as investment behavior (Salamanca et al. 2016;Pinger et al. 2018), and it has also been found to have an insuring effect against negative effects of adverse economic and health-related life events on well-being and mental health (Buddelmeyer and Powdthavee 2016;Schurer 2017;Cuesta and Budría 2015).
Additionally, Cobb-Clark et al. (2014) have shown that LOC is positively linked to health-conscious behavior, such as abstaining from smoking, healthy diets and regular exercise, but also to excessive alcohol consumption, i.e., binge drinking, raising the question of whether "drinking is different". The existing medical and social science literature already provides considerable evidence for that. One notable distinction between alcohol consumption and other forms of unhealthy behavior is the important differentiation between different levels of consumption. While hazardous and excessive drinking are consistently found to be associated with negative consequences on mental and physical health (see, e.g., Chatterji et al. 2004; Marcus and Siedler 2015; Cotti et al. 2014; Grønbaek 2009; Corrao et al. 2004) as well as social and economic outcomes (see, e.g., Francesconi and James 2019; Jones and Richmond 2006; Macdonald and Shields 2004; Mangiavacchi and Piccoli 2018), studies often link moderate and responsible drinking with beneficial medical outcomes (Sayed and French 2016; Grønbaek 2009; Ronksley et al. 2011) as well as higher earnings and employment probabilities (Ziebarth and Grabka 2009; Peters and Stringham 2006; van Ours 2004) and better social networks and social integration (Leifman et al. 1995; Buonanno and Vanin 2013). According to data from the World Health Organization, 71.7% of US-Americans and 79.4% of Germans over the age of 15 consumed alcohol in 2016, while 36.4% (Germany) and 43.1% (USA) of these drinkers engaged in a heavy drinking episode in the past 30 days (World Health Organization 2018). The economic burden of excessive alcohol consumption in the USA has been estimated at $249 billion in 2010 (Sacks et al. 2015). Based on these numbers and the existing findings about distinct differences in the consequences of light and heavy drinking, an empirical investigation of intrinsic drivers of different levels of alcohol consumption is of high importance in order to enable policymakers to challenge the unwanted costs resulting from it.
We contribute to the literature by investigating the role of LOC for alcohol consumption. The existing literature on the association between LOC and alcohol consumption has mainly focused on excessive drinking and is rather inconclusive due to heterogeneous and selective samples and differing definitions of the drinking variable. For example, Steptoe and Wardle (2001) find that the perception of high, health-related external control is associated with a higher probability of frequent alcohol consumption in a sample of European university students, while Mendolia and Walker (2014) analyze the effect of LOC and self-esteem on health-related behavior for a group of adolescents aged 15-16 years and find a weak, positive link between an external perception of control and the frequency of getting drunk when drinking, but no significant association with regular drinking. Lassi et al. (2019) find an association between external LOC and hazardous drinking for samples of teenagers in the UK, and Chiteji (2010) analyzes the effect of self-efficacy (which is strongly linked to LOC) on drinking and exercising and finds a negative association with drinking using the 1972 sample of male household heads in the Panel Study of Income Dynamics (PSID). Using data from the Household, Income and Labour Dynamics in Australia (HILDA) survey, Cobb-Clark et al. (2014) rely on a self-efficacy scale as a proxy for LOC and identify significantly positive effects of LOC on binge drinking.
We extend and generalize previous analyses in several respects: First, we consider different levels of drinking and thus paint a more comprehensive picture of the relationship between LOC and alcohol consumption. Second, we use a representative adult sample for which we can credibly use LOC as an exogenous covariate. Third, we use an extensive list of control variables such as socio-economic information, health status and other personality traits and, via this, substantially reduce the risk of omitted variable bias. Lastly, we use a general, not domain-specific LOC measurement which is closely linked to the original measure developed by Rotter (1966).
Our estimations are based on extensive information available in the Socio-Economic Panel (SOEP 2017), a large representative household panel from Germany. As opposed to the analysis of Cobb-Clark et al. (2014), our outcome variables are defined in a way that considers more common levels of alcohol consumption and focuses on the frequency rather than the amount of drinking. We thus look into whether individuals drink either occasionally or regularly, which includes habitual drinking but is different from excessive drinking such as binge drinking. We find that an internal LOC is associated with a higher probability of reporting occasional or regular drinking even if we control for socio-economic information, health status and other personality and preference measures. Men with a medium or high internal LOC are, on average, about 3.1%-3.7% more likely to be occasional or regular drinkers compared with men with a low internal LOC. Women with a high internal LOC are 6.5% more likely to be occasional or regular drinkers compared with women in the lowest LOC category. As a theoretical explanation for this relationship, we propose that the future risks of alcohol consumption might be underestimated if individuals believe in their own ability to cope with or prevent the negative consequences of unhealthy behavior.
Based on these findings, our paper makes another significant contribution to the literature by discussing social activity as a potential indirect mechanism behind the association. We propose that LOC is especially likely to be highly predictive of individual investment in social networks and thus drinking opportunities, given that attending social gatherings is often inextricably linked with alcohol consumption. We estimate the extent of this indirect effect by including self-reported social activities as mediators into our model and decomposing the estimated coefficients. We find that roughly a quarter of the positive association between LOC and drinking can be explained by different levels of social activity for men and women. These findings are highly policy relevant as increased drinking due to an increase in social interactions is likely to be linked to more moderate levels of alcohol consumption with distinctly different economic and medical consequences as opposed to increased drinking behavior due to a mis-estimation of risks.

Data and empirical approach
Building upon the existing literature, we estimate the relationship between an internal LOC and self-reported alcohol consumption. The estimations are conducted using the extensive information available from the Socio-Economic Panel (SOEP 2017), a large representative longitudinal household panel from Germany (see Goebel et al. 2019, for more information). The SOEP includes detailed socio-economic information and surveys individuals' LOC as well as their health behavior, including alcohol consumption, on a regular basis. While this is also true for other international surveys such as HILDA or NLSY79, the SOEP is the only data source that also enables us to observe important potential confounders, such as risk and time preferences, as well as social interactions of individuals on a regular basis. This enables us to paint a more detailed picture of potential channels behind the estimated relationship. We restrict our sample to all observations for individuals between the ages of 20 and 70 years in the 2006, 2008 and 2010 waves, within which we observe the self-assessed and reported amount of alcohol consumption. The sample is further reduced by item non-response in the LOC and other explanatory variables. Table A1 gives an overview of the sample restriction steps and the observation loss due to item non-response. The final estimation sample comprises 33,765 observations for 14,841 individuals. Of these, 7496 individuals are observed three times, while 3413 and 3932 are observed once and twice, respectively. The later estimations will always be reported separately for men (48% of the sample) and women (52%) to account for important gender-specific heterogeneity, as is common in the personality and health literature (see, e.g., Cobb-Clark et al. 2014). Table A2 provides an overview of the main summary statistics for the sample.

Locus of control
For our sample, LOC is measured in 2005 and 2010, in which years SOEP respondents were asked how closely a series of ten statements (items) characterized their views about the extent to which they influence what happens in life. Responses were measured on a seven-point Likert scale ranging from 1 ('disagree completely') to 7 ('agree completely'). A list of the set of items used, as well as the means of the observed responses in the full sample and separated by gender, can be found in Table 1. As a first step in constructing our LOC variable, we conduct an exploratory factor analysis in which we investigate the way in which these items load onto latent factors. The factor analysis reveals that items 1 and 6 have a negative loading and items 2, 3, 5, 7, 8 and 10 have a positive loading onto a first factor. The factor's eigenvalue is 1.84. A second factor has an eigenvalue of only 0.54 and can be neglected. Item 4 does not clearly load onto the first factor and item 9 has an unintuitive attribution, such that we exclude both in line with the earlier literature.
Notes to Table 1: Significance stars refer to the significance level of a t-test for mean equivalence between men and women: * p < 0.1, ** p < 0.05, *** p < 0.01. Items marked with a (-) are reversed prior to factor analysis. Items 4 and 9 are not included in the analysis.
Subsequently, we use a two-step process to create a continuous, unidimensional LOC factor variable, consistent with previous literature (see, e.g., Piatek and Pinger 2016). Based on the exploratory factor analysis, we first reverse the scores for the external items (items 2, 3, 5, 7, 8 and 10) such that all eight items are increasing in internality. Second, we use confirmatory factor analysis to extract a single factor for each year separately. This has the advantage of avoiding equal weighting of all items and instead relies on the data to determine how each item is weighted in the overall index. As per Piatek and Pinger (2016), simply averaging the items risks measurement error and attenuation bias. The resulting factor is therefore increasing in internal LOC, and its distribution is shown in Fig. 1.
Additionally, Fig. 1 reports the kernel densities of the LOC factor separately for men and women. It can be seen from both the distribution in Fig. 1 as well as most of the items in Table 1 that men are more internal than women. We account for these gender differences in our empirical analysis by using fully separated estimation models and standardizing the continuous LOC factor as well as generating dichotomous indicators separately within both sub-samples.
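The reverse-coding and within-gender standardization described above can be illustrated with a minimal sketch. It is not the factor-analytic procedure used in the paper: as a simplifying assumption, the index below is a plain average of the eight aligned items (the equal-weights variant the text contrasts with the factor approach), and all function names are hypothetical.

```python
from statistics import mean, pstdev

# Item numbers follow the text: items 2, 3, 5, 7, 8 and 10 are external,
# items 1 and 6 are internal; items 4 and 9 are excluded.
EXTERNAL_ITEMS = (2, 3, 5, 7, 8, 10)
INTERNAL_ITEMS = (1, 6)


def reverse_code(score, low=1, high=7):
    """Flip a Likert response so that a higher value means more internal."""
    return low + high - score


def simple_loc_index(responses):
    """Equal-weights LOC index (assumption: a stand-in for the factor score).

    responses maps item number -> score on the 1..7 scale.
    """
    aligned = [responses[i] for i in INTERNAL_ITEMS]
    aligned += [reverse_code(responses[i]) for i in EXTERNAL_ITEMS]
    return mean(aligned)


def standardize_within_group(values_by_group):
    """Z-score each value using the mean and SD of its own group (e.g., gender)."""
    standardized = {}
    for group, values in values_by_group.items():
        mu, sd = mean(values), pstdev(values)
        standardized[group] = [(v - mu) / sd for v in values]
    return standardized
```

Standardizing within each gender means that a value of one corresponds to one standard deviation relative to other men or other women, not relative to the pooled sample.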
Due to the lack of overlap between the observation waves of LOC and alcohol consumption, the information on LOC is imputed forward into the years in which we observe alcohol consumption, i.e., the LOC from 2005 is used as the explanatory variable for alcohol consumption in 2006 and 2008 and the LOC from 2010 is used for consumption in 2010.
Exogeneity of LOC The stability of LOC during adulthood, and thus the exogeneity of the trait in models of decision making, has been heavily discussed in the psychological and economic literature. Although some psychological studies find variation in LOC during adulthood (Nowicki et al. 2018; Specht et al. 2013), the economic literature has widely concluded that, after controlling for age, observed changes are either not large enough to be economically relevant (Cobb-Clark and Schurer 2013) or largely unsystematic, driven by situation-specific and temporary measurement inaccuracy in reporting (Preuss and Hennecke 2018). Preuss and Hennecke (2018) find that the reported LOC does change after an exogenous labor market shock but that this observed change is solely driven by the labor force status during the second interview and that no permanent changes are observable if individuals are reemployed. Endogeneity issues caused by omitted variable bias or non-random variation in the reporting of LOC are thus less likely if a large set of characteristics of, e.g., labor force or family status is included in the models. Given that we observe LOC only once for 34% of the sample and keeping in mind that variation for the remaining part is likely to arise from temporary variation in reporting, we will not be able to use a fixed-effects framework later on. Nevertheless, besides including a very rich set of control variables in the model, we additionally reduce the risk of endogeneity in two steps. First, we tackle potential reverse causality. Endogeneity concerns caused by reverse causality are likely to apply to excessive consumption only, and such consumption occurs rarely in our sample. Nevertheless, in order to minimize the risks in this respect, the forward imputation described above ensures that the LOC factor is never measured after the period in which we measure alcohol consumption. Second, forward imputation would not solve the issue of non-random short-term differences in reporting described by Preuss and Hennecke (2018) if these differences have behavioral implications. Thus, we additionally test the robustness of our results to an alternative specification of the LOC indicator in Sect. 3.3. Instead of a forward imputation, we average LOC over all available observation periods and thus generate a LOC measure which is less likely to be affected by temporary volatility in reporting in certain periods. The average LOC is assumed to draw a more consistent picture of the long-term latent trait as it wipes out all within-variation in LOC.
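The two ways of assigning LOC to consumption years can be sketched as follows. This is an illustration under an assumed rule ("use the most recent LOC measured in or before the outcome year"), which reproduces the mapping stated in the text (2005 for 2006 and 2008, 2010 for 2010); the function names are hypothetical.

```python
from statistics import mean


def forward_impute(loc_by_year, outcome_year):
    """Return the most recent LOC measured in or before outcome_year.

    loc_by_year maps a measurement year to the LOC factor value. The result
    is None if no LOC was measured up to the outcome year, so LOC is never
    taken from a wave after the one in which drinking is observed.
    """
    eligible = [year for year in loc_by_year if year <= outcome_year]
    return loc_by_year[max(eligible)] if eligible else None


def person_average(loc_by_year):
    """Alternative measure: average LOC over all available measurements."""
    return mean(loc_by_year.values())
```

For a respondent with LOC measured in both 2005 and 2010, the forward-imputed series varies over time, whereas the person average is constant and therefore removes all within-person variation.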

Alcohol consumption
In 2006, 2008 and 2010, individuals were asked to rate their consumption of four different types of alcoholic beverages (beer, wine, spirits, and mixed drinks) on a scale from 1 (regularly) to 4 (never). Based on a combination of all those answers and guided by the work of Ziebarth and Grabka (2009), we generate an ordinal measure of alcohol consumption. The main drawback of this measurement is its rather vague and subjective character, as no concrete information about the exact quantity of alcohol consumption is collected. We conduct a sensitivity check to test our measures and results in this respect and show the robustness of our results against the use of a more objective measure of alcohol consumption (see Sect. 3.3 for more detail). The variable categorizes individuals into the following four groups: (1) Abstainers: no consumption of any of the four types; (2) Rare Drinkers: seldom drinking of at least one type, no occasional drinking; (3) Occasional Drinkers: occasional drinking of at least one type, no regular drinking; (4) Regular Drinkers: regular drinking of at least one type. Table 2 provides an overview of the shares of alcohol consumption in the sample. In the full sample, 12% can be characterized as abstainers (no alcohol consumption at all) and 60% of the individuals are counted as being occasional (42%) or regular (18%) drinkers. Of the regular drinkers observed in the data, only 5% report that they drink either spirits or mixed drinks regularly, which corresponds to less than 1% of all individuals in the data and does not allow us to draw any empirical conclusions on excessive drinking from the data at hand. The presented results do not change if these "excessive drinkers" are dropped from the sample. In line with expectations, the share of drinkers is distinctly lower for women (40% occasional drinkers and 9% regular drinkers) than for men (44% occasional drinkers and 27% regular drinkers). Among women, 15% are abstainers, compared with 9% of men.
Additionally, Table 2 summarizes the results of a first descriptive analysis of the relationship between LOC and alcohol consumption. The results of the t-tests for mean equality indicate that for both men and women, the share of individuals who indicate that they are occasional or regular drinkers is significantly higher in the group of internal individuals (individuals with a LOC larger than the sample median). The share of occasional drinkers in the internal men category is 46%, while the share in external men is 41%. Internal men are also more likely to be regular drinkers (29% as opposed to 26%) and less likely to be abstainers (6% as opposed to 11%). All differences hold similarly for women.
Notes to Table 2: Individuals are grouped into internals and externals based on whether their LOC is lower than or equal to (external) or higher than (internal) the median. Significance stars refer to the significance level of a t-test for mean equivalence between externals and internals: * p < 0.1, ** p < 0.05, *** p < 0.01.
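The grouping rule for the ordinal consumption measure can be written compactly: the category is determined by the most frequently consumed beverage type. The sketch below assumes that the two intermediate response codes are 2 ("occasionally") and 3 ("seldom"); the text only states the endpoints 1 (regularly) and 4 (never). The function name is hypothetical.

```python
# Response codes; 2 and 3 are assumed labels for the intermediate answers.
REGULARLY, OCCASIONALLY, SELDOM, NEVER = 1, 2, 3, 4

CATEGORY_BY_CODE = {
    NEVER: "abstainer",
    SELDOM: "rare",
    OCCASIONALLY: "occasional",
    REGULARLY: "regular",
}


def drinking_category(beer, wine, spirits, mixed):
    """Classify a respondent by the most frequent answer across beverage types.

    A lower code means more frequent consumption, so the minimum over the
    four answers identifies the highest reported frequency.
    """
    most_frequent = min(beer, wine, spirits, mixed)
    return CATEGORY_BY_CODE[most_frequent]
```

For example, a respondent who drinks wine occasionally, beer seldom and never touches spirits or mixed drinks is classified as an occasional drinker.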

Estimation strategy
Based on the available data, the obvious modeling choice would be to estimate an ordered response model. However, this model rests on the proportional odds assumption, which can easily be tested with a Brant (1990) test. The statistics of the Brant test for parallel regressions indicate a strong violation of the proportional odds assumption in the full model for both men and women, such that we refrain from using an ordered response model. Instead, we estimate four separate binary choice models based on the four drinking indicators D_j with j = {1, 2, 3, 4} summarized in Table 3. As our main indicator, D_1it, we estimate the average marginal effects of an individual's LOC on her probability of being an occasional or regular drinker as opposed to being an abstainer or rare drinker in Sect. 3.1. The choice of this indicator as our main outcome variable is based on the assumption that rare drinkers are very similar to abstainers in their decision making, while the same holds true for occasional and regular drinkers. This assumption is discussed and empirically tested in Sect. 3.2. In a further step, we take a closer look at potential differences along the extensive and intensive margin by also estimating the relationship between LOC and the probability of drinking any alcohol (D_2it), i.e., the extensive margin, as well as the probability of being an occasional or regular drinker as opposed to being a rare drinker (D_3it) and of being a regular drinker as opposed to being an occasional drinker (D_4it), which serve as two different cutoffs at the intensive margin. Nevertheless, it should be noted that the estimated relationships using the latter two outcome variables are at risk of sample selection bias. Therefore, the estimated results should be interpreted with care and only serve as ancillary evidence in addition to the main indicator.
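The four binary indicators can be derived from the ordinal category as in the following sketch. Table 3 is not reproduced here, so the sample restrictions are assumptions based on the verbal description: D_3 is taken to be undefined for abstainers, and D_4 is taken to be defined only for occasional and regular drinkers. The function name and the category labels are hypothetical.

```python
RANK = {"abstainer": 0, "rare": 1, "occasional": 2, "regular": 3}


def drinking_indicators(category):
    """Return D1..D4 as 0/1, or None where the observation leaves the sample."""
    rank = RANK[category]
    return {
        # D1: occasional or regular vs. abstainer or rare (full sample).
        "D1": int(rank >= 2),
        # D2: any drinking vs. abstaining (extensive margin, full sample).
        "D2": int(rank >= 1),
        # D3: occasional or regular vs. rare (assumed: abstainers excluded).
        "D3": None if rank == 0 else int(rank >= 2),
        # D4: regular vs. occasional (assumed: only these two groups kept).
        "D4": None if rank < 2 else int(rank == 3),
    }
```

The `None` entries make the source of the potential sample selection bias visible: D_3 and D_4 are only observed for respondents who already drink at a given frequency.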
All estimations are based on Equation (1), in which D_jit is one of the four indicators for alcohol consumption j = {1, 2, 3, 4} of individual i at time t and loc_it is the (imputed) LOC of individual i in t. Each model pools observations from the 2006, 2008, and 2010 waves and contains a list of interview characteristics I_it, such as the interview mode, interview year, interview month as well as the day of the week. Additionally, we include an extensive list of exogenous control variables, such as demographic characteristics D_it (age, nationality, region of residence, number of children in the household, an indicator for expecting parents as well as young children (aged 0-1 and 1-7) in the household, family status, and religious affiliation) as well as standardized personality and preference measures averaged over all available time periods P_i (i.e., the Big Five personality traits, general and health-related risk aversion as well as patience and impulsiveness as a proxy for individual time preferences). In order to further investigate the extent of the indirect link between LOC and alcohol consumption, we additionally include a list of potentially endogenous variables in the model. These variables are likely to mediate part of the effect of locus of control. They include educational controls E_it (school degree, vocational degree and university degree), current individual labor market controls LM_it (net household income, gross labor income, labor force status and occupational autonomy) and the individual health status and behavior H_it (indicator for officially assessed, severe disability or working incapability, subjective health, body mass index and mental and physical health scores as well as the frequency of smoking, healthy diets and exercise). See Table A2 for the full list of control variables.
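For orientation, the specification described above could be written out as follows. This is a sketch assembled from the verbal definitions, not a reproduction of Equation (1): the coefficient symbols alpha, beta and gamma_1 to gamma_6 are assumptions, and the logistic link reflects the statement that the equation is estimated with binary logit models.

```latex
\begin{equation*}
\Pr\left(D_{jit} = 1\right)
  = \Lambda\!\left(\alpha + \beta\, loc_{it}
  + I_{it}'\gamma_1 + D_{it}'\gamma_2 + P_{i}'\gamma_3
  + E_{it}'\gamma_4 + LM_{it}'\gamma_5 + H_{it}'\gamma_6\right),
\qquad
\Lambda(z) = \frac{1}{1 + e^{-z}}
\end{equation*}
```

Here D_{jit} on the left-hand side is the drinking indicator, while D_{it} on the right-hand side denotes the demographic controls, following the notation of the text. The reported quantities are average marginal effects of loc_{it}, not the coefficient beta itself.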
Equation (1) is estimated using binary logit models, and average marginal effects are reported in the following. LOC is always standardized and categorized within the sub-samples such that, e.g., having a high LOC corresponds to a high LOC compared to all other individuals within the selected sub-sample (e.g., only drinkers in the analysis at the intensive margin) and within the same gender. Standard errors are clustered at the individual level to account for serial correlation in the error terms, which occurs due to the panel nature of the data.

Main indicator
Table 4 summarizes the average marginal effects based on the logit model using the binary indicator of occasional or regular drinking (D_1) as the dependent variable. Column (1) shows the descriptive raw difference. Consistent with the descriptive differences in Table 2, the estimates show a positive raw gap in the probability of occasional and regular drinking between internals and externals. The more internal an individual is, the higher the probability of reporting at least occasional drinking. We can see that this descriptive raw difference between internals and externals becomes smaller but remains significantly positive when we include additional sets of exogenous control variables in column (2). An increase in an individual's LOC by one standard deviation increases the probability of occasional or regular drinking on average by 2.4 percentage points for men and by 3.4 percentage points for women, holding all other variables constant. This corresponds to a relative effect of 3.4% for men and 6.9% for women based on the sample means of 71% and 49%, respectively (see Table 2). In order to get a more conservative estimate of the association between LOC and alcohol consumption which considers the role of possible mediators, we include a long list of potentially endogenous control variables in columns (3)-(5).
The inclusion of educational controls as well as health behavior has a particularly strong effect on the estimated relationship between LOC and drinking for women, explaining close to two-thirds of the estimated association in column (2). For men, all three groups of variables (education, labor market and health) have a similarly large effect on the estimated relationship and explain roughly half of it. Column (5) contains the results for the full model, which we will use as our main specification and to which we will refer in the following. An increase in an individual's LOC by one standard deviation on average increases the probability of occasional or regular drinking by 1.1 percentage points for men and by 0.9 percentage points for women, holding all other variables constant. This corresponds to a relative effect of 1.5% for men and 1.8% for women. While we observe a substantial decrease in the estimated effect on D_1 from column (2) to column (5), it is important to note that our data contain a very rich set of control variables. The evolution of the estimated effect can be interpreted as evidence that the relationship between LOC and occasional/regular drinking is robust and likely not purely driven by observable mediators. We will further analyze the sensitivity of our results with respect to omitted variables following Oster (2019) in Sect. 3.3. Although these effects appear to be rather small, consistent with the low overall explained variability of alcohol consumption in the model (see pseudo R^2 in Table 4), they hold considerable economic relevance. The magnitude of the effect is similar to the marginal effects of preference measures that are known to be important, such as the willingness to take risks and patience (as a proxy for time preferences), which can be found in Table S.2. To identify potential nonlinearities, we consider indicators for being in a different tercile of the LOC distribution as explanatory variables in columns (6)-(10). The results show a similar picture to that of the continuous LOC measure. For men, having a medium LOC, i.e., a LOC in (LOC_P33, LOC_P66], on average increases the probability of occasional or regular consumption by 2.6 percentage points (3.7%) compared to having a low LOC in [LOC_min, LOC_P33]. Interestingly, having a high LOC in (LOC_P66, LOC_max] increases men's probability of occasional or regular consumption by a similar magnitude of 2.2 percentage points (3.1%). Thus, the association appears to be nonlinear for men, with the highest probability of drinking being found for men with a medium LOC. The overall picture is slightly different for women: only having a high LOC increases a woman's probability of occasional or regular consumption, by 3.2 percentage points (6.5%), while the effect of a medium LOC is not significant when compared to having a low LOC.
Notes to Table 4: In columns (6)-(10), LOC is included as binary indicators for the terciles of LOC (with LOC < LOC_P33 being the reference group). Standard errors (in parentheses) are clustered on the individual level. * p < 0.1, ** p < 0.05, *** p < 0.01.
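The relative effects quoted in this section follow from dividing the average marginal effect (in percentage points) by the baseline share of occasional or regular drinkers (71% for men, 49% for women). A minimal sketch of this arithmetic, with a hypothetical helper name and the rounded figures from the text as inputs:

```python
def relative_effect(ame_pp, baseline_share_pct):
    """Relative effect in percent.

    ame_pp is the average marginal effect in percentage points and
    baseline_share_pct is the sample share of drinkers in percent.
    """
    return 100.0 * ame_pp / baseline_share_pct


# Continuous LOC, exogenous controls only (column 2).
men_col2 = round(relative_effect(2.4, 71), 1)
women_col2 = round(relative_effect(3.4, 49), 1)

# Continuous LOC, full model (column 5).
men_col5 = round(relative_effect(1.1, 71), 1)
women_col5 = round(relative_effect(0.9, 49), 1)
```

Because the inputs are the rounded values reported in the text, the results agree with the published percentages only up to rounding.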

Supplementary indicators: extensive and intensive margin
As the results from the Brant test already indicated, the effect of LOC is likely to differ between different intensities of alcohol consumption. To investigate this further, we examine how the effect might differ at the extensive and intensive margin. Table 5 summarizes the estimated average marginal effects of LOC using the three supplementary binary indicators D_j, j = {2, 3, 4}, as dependent variables. In the first step, we re-estimate the model using an indicator for any drinking (extensive margin) in order to identify whether the estimation results for the main indicator have been driven by differences between abstainers and rare drinkers. In line with the main results, the effects in columns (1) and (2) are significantly positive, indicating a reduced probability of being an abstainer for internal men and women. The effect magnitudes are smaller than for the main outcome variable, indicating some important associations at the intensive margin, but are still of considerable magnitude. In the second step, we then analyze potential associations at the intensive margin in columns (3)-(6) of Table 5. If we abstract from potential sample selection bias caused by the restriction of the sample to non-abstainers, the results reveal that for men (women), having a medium (high) LOC increases the probability of occasional and regular drinking in the sub-sample of individuals who drink at least sometimes. The patterns are comparable to the ones estimated for the main drinking indicator in Sect. 3.1, but effect magnitudes are smaller and the effects of the continuous measure lose significance. In the last step, in columns (5) and (6), we do not find a significant association between LOC and differences between regular and occasional drinking for either men or women.
Based on these findings, we can assume that much of the association between LOC and drinking is driven by differences at the extensive margin as well as, in part, at the cutoff between rare drinkers and occasional drinkers.
Notes to Table 5: Columns (1), (3) and (5) use the continuous LOC factor as the explanatory variable and in columns (2), (4) and (6), LOC is included as binary indicators for the terciles of LOC (with LOC < LOC_P33 being the reference group). All models include the full set of control variables in line with columns (5) and (10) of Table 4. Standard errors (in parentheses) are clustered on the individual level. * p < 0.1, ** p < 0.05, *** p < 0.01.
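The tercile indicators used in the categorical specifications can be sketched as below. The exact percentile rule is not stated in the text, so a nearest-rank percentile is assumed here; the intervals follow the text (low up to and including LOC_P33 as the reference group, medium in (LOC_P33, LOC_P66], high above LOC_P66). In the paper's setup the cutoffs would be computed separately within each gender and estimation sub-sample. Function names are hypothetical.

```python
import math


def nearest_rank_percentile(sorted_values, p):
    """Nearest-rank percentile (assumed rule) of an ascending list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def tercile_indicators(loc_values):
    """Return medium/high LOC dummies per observation; low LOC is the reference."""
    ordered = sorted(loc_values)
    p33 = nearest_rank_percentile(ordered, 33)
    p66 = nearest_rank_percentile(ordered, 66)
    return [
        {"medium": int(p33 < v <= p66), "high": int(v > p66)}
        for v in loc_values
    ]
```

An observation with both dummies equal to zero belongs to the low-LOC reference group.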

Robustness checks
We test the robustness of our results with respect to both the definition of our main explanatory variable and our outcome variable as well as with respect to potentially omitted variables.
Explanatory variable In a first step, we check the robustness of our estimated effects with respect to the construction and imputation of the LOC measure. To this end, we construct two alternative LOC measures and re-estimate our main model for these alternative explanatory variables. The results can be found in Table A3. First, we find that our estimated effects are relatively robust to the use of a simple average over all eight LOC items (panel A1), which assumes equal weights for each item on the latent factor. Only the coefficient of the medium LOC for men seems to be sensitive to this change in the definition of the LOC and thus might be more strongly associated with items with higher loadings (e.g., items 5 and 10). Second, we check whether our estimated effects are sensitive to the use of an averaged LOC imputed over all available observations, which wipes out all within-variation in LOC for those individuals whom we observe more than once. This adjustment is expected to reduce measurement inaccuracies in the situational measurement of LOC. The estimated effects presented in panel A2 are also robust, and effect magnitudes are stronger than for the forward-imputed version of LOC, indicating potential attenuation bias due to noise in the measurement of LOC.
Outcome variable-objective amounts The estimated results might also be biased by the subjective nature of the alcohol consumption variable used. As our main dependent variable is based on the self-assessed amount of consumption, it not only depends on the actual consumption level but also the individual's perception of the terms 'regular' and 'occasional'. If individuals perceive amounts differently based on their LOC, this would bias our results. We can test the reliability of our measure and the sensitivity of our results with respect to the subjectivity of the reported amounts using measures of concrete frequencies and amounts available in the SOEP 2016 wave. In 2016, individuals do not self-assess their consumption but report more distinct amounts and frequencies. An overview of the descriptive statistics for these variables can be found in Table A4.
The new dependent variables are generated based on the reported frequency of consumption and the reported consumption amount per consumption day. LOC is imputed from the 2015 wave. 12 The results of this sensitivity check are reported in panel (B) of Table A3. First, the binary indicator for drinking is one if the individual reports drinking on two or more days per month ("moderate or high frequency"). This behavior is assumed to correspond most closely to "occasional or regular consumption" as per the baseline. The sensitivity check (panel B1) indicates that the results from the baseline are relatively robust with respect to the type of reporting. Although effects lose significance due to the extreme reduction in sample size, the effect sizes remain stable except for the effect of a high LOC for women. 13 When we look at high consumption amounts, defined as three or more drinks per day (panel B2), we see that for women LOC has no effect on the number of drinks consumed per episode. For men, however, a medium as well as a high LOC also has a significant positive effect on consumption amounts.

12 The measurement of LOC in 2015 as well as the construction of the factor is equivalent to those in 2005 and 2010, which is described in detail in Sect. 2.
13 If we concentrate on the full sample of all individuals and thus a higher sample size, effects are stable in significance level and size for both medium and high LOC. The results are available in Table S.4.

Omitted variable bias. Despite our extraordinarily rich set of controls, which include detailed socio-economic characteristics, health status and health behavior in other domains as well as a list of other personality traits and preference measures, we cannot rule out that omitted variables bias our results. In order to address this issue, we (i) add a number of stressful events as additional control variables and (ii) use a bounding analysis.
First, in addition to our extensive list of control variables, we conduct a robustness check in Table A3 (panel C) in which we additionally add a list of potentially stressful events (job loss, marriage, residential moves, separation, death of a spouse and birth of a child), which might be at risk of affecting both alcohol consumption and LOC, as control variables. Reassuringly, controlling for these events does not affect the estimated effects of LOC.
Secondly, Oster (2019) provides a method of calculating consistent estimates of bias-adjusted treatment effects given assumptions about (i) the relative degree of selection on observed and unobserved variables (δ), and (ii) the R-squared from a hypothetical regression of the outcome on the treatment and both observed and unobserved controls (R_max). δ = 1 implies that observed and unobserved factors are equally important in explaining the outcome, while δ > 1 (δ < 1) implies a larger (smaller) impact of unobserved than observed factors. Given the assumed bounds for δ and R_max, researchers can then calculate an identified set for the treatment effect of interest. If this set excludes zero, the results from the controlled regressions can be considered robust to omitted variable bias.
Consequently, we focus on our main result, the estimated effect of LOC on our main indicator D_1 (occasional/regular drinking vs. none/rare), and re-estimate the results reported in Table 4 using OLS. Table A5 presents the results for the LOC terciles. 14 Comparing columns (1) and (2) in Table A5 reveals that for men the estimated effect of a medium (high) LOC on D_1 decreases from 0.067 (0.081) in a model with only interview controls to 0.028 (0.026) in our full specification, which includes all sets of control variables. For women, the estimated effect decreases from 0.067 (0.115) to 0.013 (0.030). Guided by the rule of thumb provided in Oster (2019), the maximum R² is set to 1.3 times the R² in the fully controlled model. Column (3) contains the identified set of coefficients at δ = 1, i.e., a situation in which unobserved variables have explanatory power similar to that of our large set of explanatory variables. The resulting identified set is [0.014; 0.028] ([0.001; 0.026]) for men and would still be positive even if we consider the full set of control variables including the potentially endogenous mediators. In fact, the identified set of coefficients only includes zero if δ exceeds 1.88 (1.05). The identified set for women is [−0.007; 0.013] ([−0.007; 0.030]) if the reduced baseline effect is compared to the controlled effects, which include the potentially endogenous health-related variables, and thus includes zero. In this case δ is 0.79 (0.82). This is driven by the strong effect of the health-related control variables for women. As has already been discussed above, these sets of variables are at risk of introducing endogeneity to the model. We thus re-estimate the selection test for a case in which we exclude them from the fully controlled model. If the health controls are excluded from the fully controlled model in the second panel, the identified set is [0.013; 0.028] ([0.022; 0.051]) and would only include zero if δ exceeds 1.81 (1.64).
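To make the logic of the bounding exercise concrete, the following is a minimal sketch of the simplified approximation to the bias-adjusted coefficient discussed in Oster (2019), β* ≈ β_ctrl − δ (β_base − β_ctrl) (R_max − R²_ctrl) / (R²_ctrl − R²_base). It is not the exact estimator behind Table A5, which may rely on the full solution rather than this approximation. The coefficients 0.067 and 0.028 are taken from the text (medium LOC, men); the two R² values are hypothetical placeholders chosen purely for illustration, and all function names are our own.

```python
def oster_beta_star(beta_base, r2_base, beta_ctrl, r2_ctrl,
                    delta=1.0, r2_max_factor=1.3):
    """Simplified (approximate) bias-adjusted coefficient in the spirit of Oster (2019)."""
    r2_max = min(1.0, r2_max_factor * r2_ctrl)       # rule of thumb: 1.3 times controlled R^2
    movement = beta_base - beta_ctrl                  # coefficient movement when adding controls
    scale = (r2_max - r2_ctrl) / (r2_ctrl - r2_base)  # remaining vs. realized R^2 movement
    return beta_ctrl - delta * movement * scale


def identified_set(beta_base, r2_base, beta_ctrl, r2_ctrl, r2_max_factor=1.3):
    """Interval spanned by the controlled estimate and beta* at delta = 1."""
    beta_star = oster_beta_star(beta_base, r2_base, beta_ctrl, r2_ctrl,
                                delta=1.0, r2_max_factor=r2_max_factor)
    return (min(beta_star, beta_ctrl), max(beta_star, beta_ctrl))


def delta_for_zero(beta_base, r2_base, beta_ctrl, r2_ctrl, r2_max_factor=1.3):
    """Value of delta at which the approximate beta* equals zero."""
    r2_max = min(1.0, r2_max_factor * r2_ctrl)
    return beta_ctrl * (r2_ctrl - r2_base) / ((beta_base - beta_ctrl) * (r2_max - r2_ctrl))


# Coefficients from the text; R^2 values are HYPOTHETICAL placeholders.
BETA_BASE, BETA_CTRL = 0.067, 0.028
R2_BASE, R2_CTRL = 0.02, 0.12

low, high = identified_set(BETA_BASE, R2_BASE, BETA_CTRL, R2_CTRL)
print(f"identified set at delta = 1: [{low:.3f}; {high:.3f}]")
print(f"delta at which beta* = 0: {delta_for_zero(BETA_BASE, R2_BASE, BETA_CTRL, R2_CTRL):.2f}")
```

The sketch illustrates why a large drop in the coefficient combined with a small gain in R² when controls are added (as for women once health controls are included) pushes the identified set toward zero.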
Overall, the robustness analysis is reassuring and shows that the results are quite robust to potentially omitted variables. 15

Sample restriction. In a set of robustness checks, we analyze the sensitivity of our estimation results with respect to the sample restriction steps as described in Sect. 2. First, in panel (D) of Table A3, we further restrict the age range of our sample to working-age individuals (i.e., 25-64 years), as LOC is assumed to be more stable in this age period. Estimation results are robust against this sample restriction. Secondly, the robustness checks presented in panel (E) of Table A3 analyze the role of non-random item non-response in the very early sample restriction steps. We re-estimate the raw effect of LOC (without controls) for the unrestricted sample, which also includes individuals who drop out of our main estimation sample due to missing information on any of the control variables. Estimation results are also robust with respect to this sample restriction step.

Discussion of results
The results from our empirical analysis stand in contrast to the existing findings in the previous literature on the effect of LOC on health-related behavior in other domains such as smoking, exercise and healthy diet. Health investment models such as in Grossman (1972, 2000) might, thus, not be applicable to the relationship between LOC and alcohol consumption. This doubt is prompted by the missing subjective link between current alcohol consumption and future health consequences. Bennett et al. (1998) state that alcohol consumption might be associated with higher levels of uncertainty about future outcomes as individuals do not see alcohol consumption in reasonable amounts as affecting their health too strongly. Although individual considerations about health investments are likely still at play, they might be on average dominated by other mechanisms in the analyzed population. Due to this uncertainty, we can assume that individual perceptions are highly important in those situations. Individuals must build their own expectations about the probabilities with which their behavior is associated with certain outcomes. In the present case, individuals estimate the likelihood with which their alcohol consumption entails negative future consequences for their health. For example, Sloan et al. (2013) find that heavy drinkers in the USA on average tend to overestimate their ability to handle alcohol while Lundborg and Lindgren (2002) find that young people in Sweden on average tend to overestimate the risks associated with drinking. 16 In line with the definition of LOC, it is natural to expect that an internal LOC entails lower levels of these risk perceptions. Multiple studies have already found that LOC has an important effect on individual perceptions about personal risk, e.g., with respect to myocardial infarction and cancer (see, e.g., Stürmer et al. 2006; Källmén 2000; Sjoberg 2000). In line with this literature, Cobb-Clark et al. (2014) argue that an increased perception of control might be correlated with a stronger belief about the ability to cope with and prevent the consequences of drinking. An increased perception of individual control might reduce the perceived importance of risk for life's outcomes. The future risks of alcohol consumption might be underestimated if the individual control is overestimated (Slovic 1992). 17 Increased alcohol consumption due to misestimated risk probabilities is likely an important explanation for the association observed above.

15 In line with the nonlinearity of effects discussed above, the estimates for the continuous measure (see Table S.5) are found to be more sensitive to the selection test. In this case, δ exceeds 1 only if health controls are excluded from the fully controlled model.
16 As is shown in Ziebarth (2018) and Lundborg and Lindgren (2002), the accuracy of these estimations is affected by the available information such as the education about alcohol.
Potential additional explanations for the observed positive correlation between LOC and alcohol consumption include the ability to afford alcohol, the relationship of both LOC and alcohol consumption with behavior in other health domains (see, e.g., Nguyen 2019), and the correlation between LOC, individual risk preferences and self-control problems as well as present-biased decision-making. However, all of these explanations have largely been ruled out through the inclusion of earnings, household income, behavior in other health domains, willingness to take risks, and patience and impulsiveness (as proxies of individual time preferences) in the main estimation model. Nevertheless, one potential factor remains, which we explore in the next subsection.

The (mediating) role of social activities
Based on the existing psychological literature on peer effects of alcohol consumption in adolescence (Lundborg 2006; Buonanno and Vanin 2013), a likely remaining mechanism behind the association between LOC and drinking might be the link via differences in the importance of peer and networking effects. Alcohol consumption is associated with important positive effects on social networks. Drinking is common at social events and abstinence has been shown to be linked to strong penalties with respect to social integration (see, e.g., Leifman et al. 1995). For example, Peters and Stringham (2006) and Ziebarth and Grabka (2009) discuss the association between alcohol consumption and social networks as likely channels for their identified positive effect of alcohol consumption on earnings. As they note, alcohol consumption remains a social norm in modern Western societies, which inevitably links drinking and the attendance of social events. Thus, moderate drinking produces social capital and can be labeled as a productive activity. In line with the argument about LOC and investment in future outcomes, which has been raised, for example, in Coleman and DeLeire (2003) and Caliendo et al. (2015), internals are expected to invest more in social capital than externals, as they expect higher future returns from it such as a network of social support or professional contacts. This can easily be achieved by attending social gatherings and thus drinking. Hence, by default they might be more likely to drink alcohol in moderation. As opposed to excessive and uncontrolled alcohol consumption, drinking behavior that can be explained by this mechanism might be connected with less severe negative or even positive economic and medical consequences, which is why it is important to separate it from other potential explanations.
Table notes: Columns (1) and (3) use the continuous LOC factor as the explanatory variable; in columns (2) and (4), LOC is included as binary indicators for the terciles of LOC (with LOC < LOC_P33 being the reference group). All models include the full set of control variables in line with columns (5) and (10) of Table 4. Standard errors (in parentheses) are clustered on the individual level. * p < 0.1, ** p < 0.05, *** p < 0.01

In order to check this hypothesis, we analyze whether LOC can be associated with higher levels of social activity using information on spare time activities available in the SOEP. We measure social activity with a set of ordinal variables which are based on the self-reported frequency of three social activities, namely "going out eating and drinking," "attending social gatherings" and "visiting friends and neighbors." 18 For the first stage analysis of the relationship between LOC and these social activities, the activities are summarized into a continuous variable, which counts the number of activities that are carried out at least once per week. Table 6 gives the estimated effects of this analysis, estimated using a linear estimation model. As expected, LOC is associated with a higher likelihood of regular participation in these social activities.
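The construction of the summary variable described above can be sketched as follows. The frequency labels, their ordinal codes and the variable names are hypothetical stand-ins; the actual SOEP response categories and variable names may differ.

```python
# HYPOTHETICAL ordinal coding of self-reported activity frequency.
FREQ_CODES = {
    "never": 0,
    "less than monthly": 1,
    "monthly": 2,
    "weekly": 3,
    "daily": 4,
}
WEEKLY_THRESHOLD = FREQ_CODES["weekly"]

# HYPOTHETICAL names for the three social activities named in the text.
ACTIVITIES = ("eating_drinking_out", "social_gatherings", "visiting_friends")


def weekly_activity_count(responses):
    """Count how many of the three activities are carried out at least once per week."""
    return sum(
        1 for activity in ACTIVITIES
        if FREQ_CODES[responses[activity]] >= WEEKLY_THRESHOLD
    )


example = {
    "eating_drinking_out": "weekly",
    "social_gatherings": "monthly",
    "visiting_friends": "daily",
}
print(weekly_activity_count(example))
```

The resulting count ranges from 0 to 3 and serves as the dependent variable in the first-stage linear model, while the decomposition that follows uses the full ordinal versions of the activities.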
Table notes: Columns (1) and (4) show the reduced model, which represents the specification as presented in Eq. (1), with the only difference being that residuals of the social activity variables are included as right-hand side variables. Columns (2) and (6) ...

Based on this, we investigate whether internals are simply more likely to be exposed to alcohol, as they are socially more active and outgoing, by considering social activities as a mediator in our model. For this purpose, we decompose the estimated relationship between LOC and drinking into a direct and an indirect effect via the full ordinal versions of all social activities analyzed above using the method proposed in Karlson and Holm (2011) and Breen et al. (2013) (KHB method). The KHB method allows for the comparison of estimated coefficients between two nested nonlinear probability models by accounting for the fact that coefficients and error variances in these models are not separately identified and coefficients, thus, cannot be directly compared between the reduced and the full model. 19 It does so by augmenting the reduced model with the residuals from a regression of the mediator variable (i.e., social activity) on the key explanatory variable (i.e., LOC) and thus allows for a separation of the difference due to mediation and the difference due to a rescaling with different error variances. The results of this decomposition are reported in Table 7. The decomposition is based on the coefficients of the nonlinear estimation model and average partial effects are computed for both models and reported in square brackets. The results indicate that for both men and women, a significant share of the effect can be attributed to differences in social activities between internal and external individuals. They explain about 23.8% (30.8%) of the association between a medium (high) LOC and drinking for men and 23.2% of the association between a high LOC and drinking for women. The overall effect of a high LOC drops from 2.2 (3.4) to 1.5 (2.6) percentage points for men (women). The remaining associations are only statistically significant for men with a medium LOC and women with a high LOC and we can, thus, conclude that a substantial part of the estimated associations can be explained by differences in social activity and social networking between internals and externals.
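As a rough plausibility check on the reported mediation shares, the share of the total effect that runs through the mediator can be approximated as (total − direct) / total. The sketch below applies this to the rounded average partial effects for a high LOC quoted above (2.2 to 1.5 percentage points for men, 3.4 to 2.6 for women). This is not the KHB computation itself, which operates on the rescaled coefficients of the nonlinear model, so small deviations from the reported 30.8% and 23.2% are to be expected.

```python
def mediated_share(total_effect, direct_effect):
    """Share of the total effect that is accounted for by the mediator."""
    if total_effect == 0:
        raise ValueError("total effect must be non-zero")
    return (total_effect - direct_effect) / total_effect


# Rounded average partial effects (percentage points) for a high LOC, from the text.
share_men = mediated_share(2.2, 1.5)
share_women = mediated_share(3.4, 2.6)

print(f"men:   {share_men:.1%}")
print(f"women: {share_women:.1%}")
```

With these rounded inputs the shares come out at roughly 32% for men and 24% for women, close to the reported values.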

Conclusions
Most studies in the pre-existing economic and psychological literature show that internal individuals live a healthier life. They are more likely to invest in their future health outcomes by following a healthy diet, exercising regularly, and abstaining from smoking. Although we would initially expect this to translate into drinking less or abstaining from alcohol, drinking seems to be different. We find a significant positive link between LOC and alcohol consumption. Men with a medium or high internal LOC are on average about 3.1%-3.7% more likely to be at least occasional drinkers compared to men with a low LOC. Based on this observed nonlinearity in the link between LOC and drinking, we can thus assume that among men an external LOC is linked to less drinking rather than an internal LOC being linked to more drinking. Women with a high internal LOC are even 6.5% more likely on average to be occasional or regular drinkers than women with a low internal LOC with the link being much more linear. These findings are robust to controlling for an extensive list of explanatory variables, the variation in the LOC construct, the definition of the outcome variable and they also largely pass a test for potentially omitted variables based on Oster (2019).
We argue that this finding is likely driven by the fact that the link between drinking and future outcomes is subject to more uncertainty than behavior in other health domains and that especially moderate drinking is largely associated with positive medical, economic and social outcomes. The commonly observed association between LOC and health investments is, hence, of less importance for levels of responsible alcohol consumption. Instead, we suggest that internal individuals more strongly believe in or overestimate their ability to cope with and prevent the negative consequences of drinking. Thus, they might underestimate the risk associated with drinking. In the same way, external individuals might overestimate the potential future risks of drinking and thus underestimate their ability to deal with it responsibly. In addition, we show that large parts of the positive relationship can be explained by differences in social behavior and investments into social networks. Internal individuals invest in social networks more strongly by being socially more active. While attending social events, meeting friends, and going out, they are more exposed to alcohol and have more opportunities to drink. A decomposition analysis, in which measures for social activities are included in the model as mediators, indicates that much of the association can be explained by this indirect effect via different levels of social interaction for both men and women.
It is important to note that the two mechanisms are expected to have very distinct economic and medical consequences. Whereas drinking as an investment decision might improve occupational and economic success while being related to rather moderate amounts of alcohol consumption, an underestimation of risks is potentially associated with regular drinking and the economic costs involved, e.g., through the strain that it places on individual health care expenditures and labor market prospects. However, as excessive alcohol consumption and addiction are relatively rare, we are unable to identify a sufficient number of individuals involved in this kind of behavior to make statements about the link between LOC and extreme forms of drinking behavior. Further disentangling the association with respect to the underlying channels is not possible with the data at hand. This might be an important path for future research.
Our paper adds an interesting new aspect to the literature on the behavioral implications of LOC as well as on the determinants of moderate levels of alcohol consumption, and again supports the finding that drinking is different. It became clear that, more strongly than other forms of unhealthy behavior, alcohol consumption involves multiple opposing behavioral considerations and particular degrees of uncertainty. The underlying mechanisms (social investments and/or misestimation of risks) have many layers and underline the individual complexity behind drinking decisions. Knowing about these specific intrinsic drivers of drinking can, e.g., crucially contribute to the efficacy of interventions with the goal of reducing habitual and dangerous alcohol consumption in the population while not adversely affecting, or even promoting, light and moderate drinking.

Table notes: Columns (1) and (3) use the continuous LOC factor as the explanatory variable; in columns (2) and (4), LOC is included as binary indicators for the terciles of LOC (with LOC < LOC_P33 being the reference group). All models, except the raw model presented in panel (E), include the full set of control variables in line with columns (5) and (10) of Table 4. Standard errors (in parentheses) are clustered on the individual level. * p < 0.1, ** p < 0.05, *** p < 0.01

Source: SOEP, 2016 wave, version 33, https://doi.org/10.5684/soep.v33, own calculations.

Table notes: Individuals are grouped into Internals and Externals based on whether their LOC is lower/equal or higher than the median. Significance stars refer to the significance level of a t-test for mean equivalence between externals and internals: * p < 0.1, ** p < 0.05, *** p < 0.01

Table notes: Columns (1) and (2) display the estimation results of a linear estimation with the binary indicator for occasional or regular drinking as dependent variable and the binary indicators for the terciles of LOC (with LOC < LOC_P33 being the reference group) as explanatory variables. In the baseline models in column (1), only interview characteristics and time fixed effects are included as control variables, while the controlled models in column (2) include the full set of control variables. The full set of control variables is varied between the two panels for each gender, with the upper panel including health controls and the lower panel not including health controls in the controlled setting. Standard errors (in parentheses) are clustered on the individual level. * p < 0.1, ** p < 0.05, *** p < 0.01. R_max is set to 1.3 times the R² of the controlled model and reported in the bottom row of each panel. Column (3) reports the identified set, which is bounded below by β and above by β* at R_max and δ = 1. Column (4) shows the value of δ that would produce β = 0 given the values of R_max