Gambling with lottery stocks?

In this article, we assess whether German private investors gamble in the stock market. Other studies that have analyzed private investors’ preferences with regard to lottery-like characteristics have used retail or discount brokerage data. They have shown that stock trading has common entertainment features with traditional gambling. In particular, clients of discount brokers may invest for speculative purposes and thus have disproportional preferences for lottery-like characteristics. In consequence, assessing preferences by solely using a subset of investors—associated brokerage clients—may lead to substantially biased results. We assess this issue by using SHS-base data from Deutsche Bundesbank which captures the aggregate holdings of the German private sector. In line with the research, we find that German private investors overinvest in stocks with lottery-like features. Yet, when assessing the economic significance of the aggregate overinvestment, the effect is negligible. Further, we do not find consistent evidence of skewness that positively affects the aggregate holdings of the private sector. As studies have identified preferences for skewness as a driving force for retail investors’ stock purchases, our results challenge the preconceived notion of which characteristics actually induce (disproportional) private sector investments.


Introduction
There are various motives for individual investors to participate in the stock market, such as speculation (Oehler 1995; Shefrin and Statman 2000). Kumar (2009) demonstrates that individual investors in the USA are drawn to stocks which simultaneously have the following three lottery-like characteristics: high idiosyncratic volatility, high idiosyncratic skewness, and low price. Furthermore, Kumar (2009) shows that individual investors suffer from (over)investing in those so-called lottery-like stocks because these stocks tend to significantly underperform their counterparts.
Stocks with lottery-like characteristics have been the subject of a variety of academic studies (Bali et al. 2011; Kumar 2009). We add to this body of work in several respects. So far, the majority of studies that assess lottery-like stocks focus on the US and, correspondingly, on US private investors' preferences. By using the German stock market and aggregate portfolio data on private investors, we bring diversity to a widely discussed, yet still mostly US-focused, research area.
Studies have relied on discount brokerage data to examine private investors' holdings regarding lottery-like stocks (Han and Kumar 2013; Kumar 2009; Meng and Pantzalis 2018). Dorn and Sengmueller (2009) provide evidence that investors who use discount brokers (partly) consider trading as entertainment and thus engage in it excessively. Dorn et al. (2015) provide further evidence that the clients of discount brokers substitute between participating in lottery games and financial market gambling. Hence, investors who access the stock market via discount brokers may use stocks to gamble (Barberis and Huang 2008; Statman 2002). In contrast, investors holding stocks via the deposit account of their house bank (i.e., retail broker) may follow a buy-and-hold approach that involves stocks for which lottery-like characteristics (Bali et al. 2011; Kumar 2009) are less common. Since discount brokerage data, as applied in other studies, only capture a fraction of all investor holdings, the results may be biased. We describe the holding preferences of individual investors for lottery-like stocks with data from the German central bank's (Deutsche Bundesbank) Securities Holdings Statistics (SHS). This database (SHS-base) captures the aggregate holdings of the entire German private sector; that is, our results are not subject to preselection distortions. Thus far, studies that have covered lottery-like gambling in the stock market mostly use one distinct approach to characterizing these stocks. We apply Kumar's (2009) lottery-like stock definition, which comprises idiosyncratic volatility, idiosyncratic skewness, and low price, as well as its extension by Bali et al. (2011), in which only extreme past daily returns are considered; applying both provides a more comprehensive overview.
Moreover, regarding lottery-like characteristics, most studies focus on investors' preferences for domestic stocks; however, respective preferences for foreign stocks may be substantially different. Since we assess German private investors' preferences for German (domestic) as well as US (foreign) stocks, a comparison between domestic and foreign preferences for holdings is possible.
In contrast to Kumar (2009), we do not discover significant pricing differentials with regard to lottery-like stocks. This is not overly surprising as (sophisticated) investors acquire knowledge about mispricing through academic publications; as rational investors trade against mispricing, the effect decays or disappears (McLean and Pontiff 2016). However, evidence exists that lottery-like stocks as defined by Bali et al. (2011) continue to underperform their counterparts. As these stocks have high levels of idiosyncratic risk, repressed arbitrage may cause a more persistent mispricing as indicated in the literature (McLean and Pontiff 2016; Pontiff 1996, 2006; Treynor and Black 1973).
Analyzing aggregate holdings of the private sector, we find evidence that German private investors significantly overinvest in stocks with lottery-like characteristics as defined by Kumar (2009). Further, German private investors only overinvest in German lottery-like stocks as defined by Bali et al. (2011). We reconcile private investors' preferences for a subgroup of domestic (i.e., German) stocks with seemingly different preferences for a similar subgroup in a foreign (i.e., US) equity market by pointing to the interrelation of familiarity and risk perception (Heath and Tversky 1991) as well as ambiguity aversion (Ahn et al. 2014; Baltzer et al. 2015; Bossaerts et al. 2010; Boyle et al. 2012; Fox and Tversky 1995).
Using the dataset with German and US stocks, the results from our regression analysis show that German private investors have preferences for low-priced stocks as well as for stocks with high (idiosyncratic) volatility. Furthermore, we find evidence that private investors gravitate toward stocks with high maximum daily returns.
In contrast to other studies, we do not find consistent evidence that (idiosyncratic) skewness drives private sector investments (Brunnermeier and Oehmke 2013; Kane 1982; Kumar et al. 2019; Mitton and Vorkink 2007). As private investors have limited capabilities with regard to perceiving and processing information (Kahneman 1973), they may struggle to identify higher distribution moments like skewness. They may, however, more easily identify features like price or maximum daily returns, which are therefore reflected in their aggregate holdings.
The remainder of this article is structured as follows: in "Literature review" section, we provide a short review of the related literature. In "Data and methodology" section, we describe our data and methodological approach. Subsequently, in "Results and discussion" section, we present and discuss the results. "Conclusion" section concludes.

Literature review
Standard neoclassical finance theory (Markowitz 1952; Sharpe 1964) fails to explain why investors engage in excessive trading which deteriorates their performance (Barber and Odean 2000, 2001; Odean 1999) or why investors participate in negative-sum games like purchasing lottery tickets (Ariyabuddhiphongs 2011; Davis et al. 1992). In this context, Statman (2002) discusses several behavioral effects which may explain why private investors display such seemingly irrational patterns.
Within the available investment universe, Kumar (2009) argues that stocks that simultaneously have a low share price, high (idiosyncratic) volatility, and high (idiosyncratic) skewness resemble lottery tickets and thus are especially appealing to private investors.
Other studies have addressed the importance of skewness for asset pricing (Arditti 1967, 1971; Barone-Adesi 1985; Kraus and Litzenberger 1976; Sears and Wei 1985). Prior to Kumar's (2009) publication, Barberis and Huang (2008) postulated that investors who behave according to Tversky and Kahneman's (1992) cumulative prospect theory are inclined to overweight low-probability events and thus have a corresponding preference for positively skewed stocks. Furthermore, investors' preferences for skewness are addressed by Kane (1982), Brunnermeier et al. (2007), Mitton and Vorkink (2007), and Kumar et al. (2019). Directly addressing lotteries, Garrett and Sobel (1999) provide evidence that the skewness of prize distributions may explain why risk-averse individuals accept unfair gambles. Dorn and Huberman (2010) describe the preferences of private investors for volatile stocks. Evidence for retail investors' preferences for low-priced stocks is provided by Kumar and Lee (2006).
Introducing a more viable definition, Bali et al. (2011) characterize lottery-like stocks in terms of extreme daily returns. Similar to Kumar (2009), Bali et al. (2011) show a statistically significant underperformance of lottery-like stocks in comparison to their counterparts. The studies on private investors have widely discussed biases such as overconfidence (Barber and Odean 2000, 2001; Odean 1999) and attention (Barber and Odean 2008; Da et al. 2011) and their corresponding effects on performance. 1 If private investors preferred lottery-like payoffs and thus overinvested in lottery-like stocks, their overall portfolio performance would suffer. First, by (substantially) overweighting a subgroup of assets, investors deviate from the market portfolio, the optimal investment choice in neoclassical finance (Sharpe 1964), and thus forgo diversification benefits. Second, overinvestment in stocks that significantly underperform their counterparts deteriorates performance (Bali et al. 2011; Kumar 2009). Fong (2013) argues that individuals are drawn to lottery-like stocks because they seek risk and are prone to sentiment. While risk-averse investors generally avoid lottery-like stocks, risk seekers are strongly attracted to this category when their sentiment is positive. When sentiment wanes, the preference reverses. Investigating the characteristics of stocks with a high proportion of retail trading, Han and Kumar (2013) find strong lottery-like features. Furthermore, Han and Kumar (2013) indicate that collectively speculative retail trading has an effect on stock prices.
Regarding institutional investors, Kumar (2009) shows a collective underinvestment in lottery-like stocks. Agarwal et al. (2019) show that certain institutional investors, that is, actively managed US equity funds, might be prone to investing in lottery-like stocks, the reasons being their catering to investor preferences as well as shifting risk. Hsu et al. (2016) examine whether lottery-like characteristics affect institutional participation in share allocation around seasoned equity offerings (SEO) as well as the issuing firms' post-issue long-run performance: firms with lottery-like characteristics have lower pre-SEO levels of institutional ownership. However, regarding these particular firms, SEOs result in a sharp increase in institutional ownership. Moreover, lottery-like characteristics are negatively associated with long-run performance after the SEO's issue.

Stock market data
As a basis for the empirical analysis, we built a dataset containing German and US equities. Only stocks with data for at least 7 months were considered.
We use the CDAX, which is a broad German stock index that comprises all prime and general standard equities, as a proxy for the German stock market. To create a dataset (relatively) free of survivorship bias, we obtained data on monthly index compositions from Thomson Reuters Datastream (Datastream) for the period from July 2000 to August 2020. 2 Subsequently, we consolidated all International Security Identification Numbers (ISINs) and removed any duplicates. The consolidation led to 1059 different ISINs for the period from January 1990 to August 2020 that corresponded to individual companies that were in the CDAX between July 2000 and August 2020. Daily and monthly returns were calculated from Datastream's total return index; time series data were queried for the period from January 1990 to August 2020. To calculate monthly returns, we used the daily total return index on the first and the last day of each month for each stock considered. Monthly values for share price and market capitalization were obtained as the means of the corresponding daily values.
We merged this dataset with the market, size, and book-to-market factors (Fama and French 1993) as well as the momentum factor (Carhart 1997). The daily and monthly factors were obtained from the Kenneth French Data Library (KFDL); as factor data are provided by region, we applied the corresponding factors for Europe.
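The monthly aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the input layout (date strings paired with values) are assumptions, and the return is taken from the first to the last index value observed within each calendar month, as the text suggests.

```python
from collections import defaultdict
from statistics import mean


def _group_by_month(daily):
    """Group (date 'YYYY-MM-DD', value) pairs, sorted by date, by calendar month."""
    by_month = defaultdict(list)
    for date, value in daily:
        by_month[date[:7]].append(value)
    return by_month


def monthly_returns(daily_total_return_index):
    """Monthly return from the first and last daily total return index value."""
    grouped = _group_by_month(daily_total_return_index)
    return {m: v[-1] / v[0] - 1.0 for m, v in grouped.items()}


def monthly_means(daily_values):
    """Monthly value (e.g., price or market capitalization) as the mean of daily values."""
    grouped = _group_by_month(daily_values)
    return {m: mean(v) for m, v in grouped.items()}
```

Note that a first-to-last-day return leaves out the price change between the last day of one month and the first day of the next; whether the original analysis handles this differently cannot be determined from the text.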
For US stocks, we used daily as well as monthly data from the Center for Research in Security Prices (CRSP). We included all stocks in the CRSP universe that were listed on one of the three major US exchanges: NYSE, AMEX, or NASDAQ. As CRSP data have the advantage of being free of survivorship bias, no further adjustments were necessary. Factors for North America are, in turn, obtained from the KFDL.
The market, size, and book-to-market factors (Fama and French 1993) are available from July 1990; data for the momentum factor (Carhart 1997) start in November 1990.

Portfolio sorts
In this subsection we describe the construction of different portfolios which are subsequently used to assess private sector holding preferences. As CRSP data were provided in USD, we converted all CDAX data into USD to eliminate currency effects.
As in Kumar (2009), each month we form three distinct portfolios based on idiosyncratic volatility, idiosyncratic skewness, and price: Lottery, NonLottery, and Others. To compute monthly idiosyncratic volatility, we follow Kumar (2009) and use the standard deviation of the residuals obtained by fitting the four-factor model (Carhart 1997) to the time series of the respective daily stock returns. We run the regression on the daily stock returns of the previous 6 months (i.e., months t − 6 to t − 1). For the computation of monthly idiosyncratic skewness, we likewise follow Kumar (2009) and apply the Harvey and Siddique (2000) method. In this context, idiosyncratic skewness is measured as the third moment of the residuals obtained by regressing daily stock returns on a two-factor model of the market excess return and the square of the market excess return. As before, we obtain the residuals by running the regression on the daily stock returns of the previous 6 months. The Lottery portfolio contains stocks in the lowest kth price percentile (measured as the average price in the previous month), the highest kth idiosyncratic volatility percentile, and the highest kth idiosyncratic skewness percentile. As in Kumar (2009), for the major part of the analysis we choose k = 50, so that these stocks are above the median idiosyncratic volatility, above the median idiosyncratic skewness, and below the median price. We identify these stocks as lottery-like. In contrast, the NonLottery portfolio is composed of stocks that are assigned to the highest kth stock price percentile, the lowest kth idiosyncratic volatility percentile, and the lowest kth idiosyncratic skewness percentile, that is, stocks featuring below-median idiosyncratic volatility, below-median idiosyncratic skewness, and above-median price. The portfolio labeled Others comprises all stocks that are neither in the Lottery nor in the NonLottery portfolio.
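For k = 50, the three-way classification reduces to comparing each stock with the cross-sectional medians. The following sketch illustrates this for a single month and market; the function name, the input layout, and the strict inequalities at the median are assumptions made for illustration.

```python
from statistics import median


def classify_kumar(stocks):
    """Assign each stock to Lottery, NonLottery, or Others for k = 50.

    stocks: dict mapping a stock identifier to a tuple
            (price, idiosyncratic volatility, idiosyncratic skewness).
    """
    price_med = median(s[0] for s in stocks.values())
    ivol_med = median(s[1] for s in stocks.values())
    iskew_med = median(s[2] for s in stocks.values())

    labels = {}
    for name, (price, ivol, iskew) in stocks.items():
        if price < price_med and ivol > ivol_med and iskew > iskew_med:
            labels[name] = "Lottery"
        elif price > price_med and ivol < ivol_med and iskew < iskew_med:
            labels[name] = "NonLottery"
        else:
            labels[name] = "Others"
    return labels
```

A stock that meets only one or two of the three criteria, for example a low-priced, volatile stock with below-median skewness, ends up in Others.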
Furthermore, we use another definition of lottery-like stocks: we follow Bali et al. (2011), who define stocks with extreme past daily returns as lottery-like. Stocks are sorted based on their maximum daily return over the previous month. Stocks in the highest kth percentile, that is, stocks with the highest maximum daily return over the previous month, are categorized as lottery-like. Similarly, stocks in the lowest kth percentile are classified as nonlottery-like. The corresponding portfolios are labeled Max and NonMax. As a variation, decile portfolios are formed based on the average of the five highest daily returns in the previous month. Accordingly, stocks in the highest and lowest kth percentiles are categorized as lottery-like (Max5) and nonlottery-like (NonMax5). In accordance with Bali et al. (2011), we set k = 10.
In order to analyze the preferences of the German private sector regarding lottery-like characteristics on a broader level, we construct several more portfolios. In this context, we sort portfolios on the constituent characteristics of Kumar's (2009) lottery-like stocks. The resulting portfolios are as follows: low/high price (LPrice/HPrice), high/low total volatility (HTVol/LTVol), high/low idiosyncratic volatility (HIVol/LIVol), high/low total skewness (HTSkew/LTSkew), and high/low idiosyncratic skewness (HISkew/LISkew). Stocks in the highest/lowest kth percentile of each sorting criterion are assigned to the corresponding portfolio. When sorting portfolios on one criterion, we set k = 10.
Furthermore, portfolios are simultaneously sorted by using various combinations of the (constituent) characteristics of lottery-like stocks. Hence, we construct additional portfolios based on low/high price and high/low total volatility (LPrice&HTVol/HPrice&LTVol), low/high price and high/low idiosyncratic volatility (LPrice&HIVol/HPrice&LIVol), low/high price and high/low total skewness (LPrice&HTSkew/HPrice&LTSkew), low/high price and high/low idiosyncratic skewness (LPrice&HISkew/HPrice&LISkew), high/low total volatility and high/low total skewness (HTVol&HTSkew/LTVol&LTSkew), and high/low idiosyncratic volatility and high/low idiosyncratic skewness (HIVol&HISkew/LIVol&LISkew). Stocks in the highest or lowest kth percentile are assigned to the corresponding portfolio. When sorting portfolios on two criteria, we choose k = 25.
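The Max and Max5 sorts can be illustrated with a short sketch. The helper names are hypothetical, and the way the k% tails are cut (floor of k% of the number of stocks, at least one stock) is an assumption; the text does not state how ties or fractional counts are handled.

```python
def max_return(daily_returns, n=1):
    """Average of the n highest daily returns of one stock in the previous month.

    n = 1 gives the Max criterion, n = 5 the Max5 criterion.
    """
    top = sorted(daily_returns, reverse=True)[:n]
    return sum(top) / len(top)


def tail_portfolios(scores, k=10):
    """Split stocks into the highest and lowest k% by their sorting score.

    scores: dict mapping a stock identifier to its Max (or Max5) value.
    Returns (lottery_like, nonlottery_like) as two sets of identifiers.
    """
    ranked = sorted(scores, key=scores.get)   # ascending by score
    n = max(1, len(ranked) * k // 100)        # assumed tail size
    return set(ranked[-n:]), set(ranked[:n])
```

With 20 stocks and k = 10, each tail portfolio holds two stocks.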
Given this methodology, there are overlaps among several of the constructed portfolios in which stocks may be assigned to various portfolios at the same time. Summary statistics for all portfolios are displayed in Table 1.

Performance analysis
We conduct a performance analysis on all the portfolios. In this context, we compute the mean monthly raw returns by averaging the value-weighted monthly returns for each portfolio. Additionally, the performance is measured via risk-adjusted returns, which are calculated as the regression intercept ($\alpha_i$) from Carhart's (1997) four-factor model:

$$R_{i,t} - RF_t = \alpha_i + \beta_i \, RMRF_t + s_i \, SMB_t + h_i \, HML_t + w_i \, WML_t + \varepsilon_{i,t}$$

where $R_{i,t}$ denotes the value-weighted return of portfolio $i$, $RF_t$ is the risk-free return, and $RMRF_t$ represents the return of the market portfolio net of the risk-free return for month $t$. $SMB_t$ and $HML_t$ reflect the size and book-to-market factors as described by Fama and French (1993). $WML_t$ is a factor that captures momentum as identified by Jegadeesh and Titman (1993). As described, stock market data were obtained from January 1990. Since the Carhart (1997) factors were used to compose some of the portfolio sorting criteria, factor data availability marks the inception of the respective analyses (see "Stock market data" section).
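As an illustration of how the intercept of a four-factor regression can be estimated, the following sketch solves the ordinary least squares normal equations in plain Python. It is not the authors' implementation: the function name and input layout are assumptions, and in practice a statistics package that also reports t-statistics would be used.

```python
def four_factor_fit(excess_returns, factors):
    """OLS fit of portfolio excess returns on four factors plus an intercept.

    excess_returns: list of R_i,t - RF_t, one value per month.
    factors:        list of rows [RMRF_t, SMB_t, HML_t, WML_t], one row per month.
    Returns [alpha, b_RMRF, b_SMB, b_HML, b_WML].
    """
    rows = [[1.0] + list(f) for f in factors]   # prepend the intercept column
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, excess_returns))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(a[i][c]))
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for i in range(k):
            if i != c:
                f = a[i][c]
                a[i] = [vi - f * vc for vi, vc in zip(a[i], a[c])]
    return [a[i][k] for i in range(k)]
```

The first element of the returned list is the monthly alpha; a negative value indicates underperformance relative to the factor model.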
The results for the portfolios sorted according to Kumar (2009) and Bali et al. (2011) are displayed in Table 2. The results for the remaining portfolios described in the previous subsection are displayed in Tables 5 and 6 of Appendix.
Considering all the constructed portfolios, statistically significant mispricing is rare. In contrast to Kumar (2009), we do not find consistent evidence that the Lottery portfolio significantly underperforms. The Lottery portfolio of the US market shows an alpha of −0.39% per month, however, with weak statistical significance at the 5% level (see Table 2, Panel B, (1)). For German lottery-like stocks, we do not find evidence of any underperformance (see Table 2, Panel A, (1) to (3)). Pricing differentials are insignificant in both markets. Regarding the Max and Max5 portfolios, the underperformance found by Bali et al. (2011) still prevails in both the US and the German stock markets. 3 Further evidence of mispricing for stocks with extreme maximum daily returns is provided by Annaert et al. (2013). Yet, in the US market the economic magnitude of the effect, as well as its statistical significance, is weaker than reported by Bali et al. (2011). To explain why mispricing decays, instead of disappearing, after its publication, McLean and Pontiff (2016) point to frictions hindering arbitrage from completely eliminating the effect. 4 Max and Max5 stocks have very high levels of idiosyncratic risk represented by idiosyncratic volatility (see Table 1). As reported in the research, idiosyncratic risk restrains the amount investors are willing to invest in mispriced assets, thereby inhibiting arbitrage (McLean and Pontiff 2016; Pontiff 1996, 2006; Treynor and Black 1973). Thus, regarding the Max and Max5 portfolios, mispricing may be fairly persistent (Annaert et al. 2013). Considering all other portfolios sorted, we report evidence of a statistically significant underperformance of high-risk stocks, that is, stocks with simultaneously high (idiosyncratic) volatility and high (idiosyncratic) skewness, in both markets. In this context, significant performance differentials may be attributed to the widely known low-volatility anomaly, which can be traced back to Black (1972) and Haugen and Heins (1975). Contradicting the Capital Asset Pricing Model (Sharpe 1964), the low-volatility anomaly states that low-risk assets, irrespective of the applied risk measure, have superior returns. 5

Table 1. In this table, we report the summary statistics for the different portfolios described in "Portfolio sorts" section. Furthermore, all portfolios are depicted and described in Table 8 (Panel B) of Appendix. Price refers to the mean monthly stock price; MCap depicts the mean monthly market capitalization. $TVol_{t-1}$/$TSkew_{t-1}$ and $TVol_{t-6}^{t-1}$/$TSkew_{t-6}^{t-1}$ depict the mean monthly values for total volatility/skewness, respectively, measured by using the daily returns of the previous month ($t-1$) and the previous 6 months ($t-6$ to $t-1$). $IVol_{t-1}$/$IVol_{t-6}^{t-1}$ is the mean monthly idiosyncratic volatility computed as the standard deviation of the residuals obtained by fitting a four-factor model (Carhart 1997) to the respective time series of daily stock returns that cover the previous month and the previous 6 months. $ISkew_{t-1}$/$ISkew_{t-6}^{t-1}$ is the mean monthly idiosyncratic skewness that is measured as the third moment of the residuals obtained by regressing the daily stock returns for the previous month and the previous 6 months on a two-factor model, where the two factors are the market excess return and the square of the market excess return (Harvey and Siddique 2000). The analysis is conducted for the German (Panel A) as well as the US (Panel B) stock market. The CDAX is a proxy for the German stock market. Regarding the USA, the analysis contains all common shares in the CRSP universe which are listed on the NYSE, AMEX, or NASDAQ.

Table 2. Value-weighted portfolio returns. In this table, we report the key figures for performance, including performance differentials, for the value-weighted portfolios described in "Portfolio sorts" section. Columns (1) and (2), respectively, display the results for the Lottery and NonLottery portfolios sorted according to Kumar (2009). Columns (4)/(7) and (5) [...] portfolios that are sorted according to Bali et al. (2011). Columns (3), (6), and (9), respectively, display the results for portfolios that contain stocks which, regarding the corresponding sorting approach, have not been assigned to any of the previous categories. Furthermore, all portfolios are depicted and described in Table [...]. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. All variables are depicted and described in [...]

Securities holdings statistics data
Data on private sector holdings come from the SHS-base, which is a reasonable indicator for the distribution of listed securities among German households. The SHS-base is a collection of obligatory reports filed by all financial institutions domiciled in Germany with Deutsche Bundesbank (Bade et al. 2017). The available data contain quarterly observations from the fourth quarter of 2005 to the fourth quarter of 2012; each observation is from the end of the last month of the quarter. Starting in January 2013, the SHS-base changed to monthly observations. Monthly data are obtained up until June 2017. Accordingly, they reflect end-of-month security holdings. In the following, SHS-base data points are labeled security-month observations.
The reports comprise data on all debt securities, shares, and mutual funds stored at the reporting institutions that correspond to German households. Security holdings are reported by their ISIN. For each security-month observation, Deutsche Bundesbank provides the aggregated market value of the shares owned by German households that comprises the aggregated number of shares multiplied by the corresponding end-of-month market price in EUR. In contrast to discount and retail brokerage data which mirror portfolios of a corresponding client base, SHS-base aggregated market values are based on the shares owned by the entirety of German households. Hence, the SHS-base dataset, as applied in this analysis, gives information about the actual (unbiased) distribution of German private sector funds across the considered securities (Oehler and Wanger 2020).
In order to assess the holdings of the German private sector with regard to the previously described portfolios, we merge the SHS-base with the applied proxies for the German (CDAX) and the US (CRSP) stock markets. The CRSP data do not contain ISINs. Hence, SHS-base data cannot be directly merged with the CRSP dataset. Applying ticker symbols as common identifiers, we access Datastream to obtain ISINs for the corresponding CRSP securities. Matching CRSP securities with ISINs proves to be rather difficult. When merging the CRSP dataset, supplemented by all accessible ISINs, with the SHS-base aggregated market values for the private sector, only about half of all security-month observations can be matched. The poor matching results are explained by the difficulties in acquiring ISINs for CRSP securities as well as by the particular composition of the CRSP database. Regarding the latter, CRSP contains a variety of securities that correspond to relatively unknown US companies that are unlikely to be a pertinent part of German private sector holdings. 6 We address this issue by using the S&P1500 as an alternative proxy for the US stock market, which leads to vastly superior matching results. S&P1500 data are in turn obtained from Datastream. 7
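The matching step can be pictured as a join on (ISIN, month) followed by a check of how many market observations found a counterpart in the holdings data. The sketch below is purely illustrative; the function name, the data layout, and the use of None for securities without an ISIN are assumptions.

```python
def match_rate(market_obs, shs_obs):
    """Share of security-month observations that can be matched to holdings data.

    market_obs: list of (isin, month) pairs from the market dataset; isin is
                None when no ISIN could be obtained for the security.
    shs_obs:    set of (isin, month) pairs for which aggregated household
                holdings are available.
    Returns (match rate, list of matched observations).
    """
    matched = [obs for obs in market_obs
               if obs[0] is not None and obs in shs_obs]
    return len(matched) / len(market_obs), matched
```

A rate near 0.5 would correspond to the situation described above, in which only about half of the security-month observations can be matched.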

Unexpected portfolio weights
In this subsection, we assess if the German private sector, as mirrored by SHS-base data, disproportionally invests in any of the previously described portfolios. In this context, we construct the unexpected portfolio weights ($EW^h_{p,t}$), which are composed as follows:

$$EW^h_{p,t} = \frac{w^h_{p,t} - w^m_{p,t}}{w^m_{p,t}}$$

where $w^h_{p,t}$ is the relative weight of portfolio $p$ held by the private sector in month $t$ in relation to all corresponding private sector holdings; accordingly, $w^m_{p,t}$ is the relative market weight of portfolio $p$ in month $t$. The unexpected portfolio weights are, respectively, composed for the proxies for the German (CDAX) and the US (S&P1500) stock markets. The relative private sector weight is constructed as the funds assigned to the respective portfolio divided by all funds assigned to German and US stocks for which SHS-base data are available. Accordingly, the relative market weight is constructed as the market value of the respective portfolio divided by the total market value of all German and US stocks; stock-month observations which cannot be matched to SHS-base data are not included when constructing the relative portfolio market weights. The results are displayed in Table 3.

5 For evidence on the low-volatility/low-risk anomaly see Ang et al. (2006, 2009), Baker and Haugen (2012), Bali et al. (2017), Blitz et al. (2013), Blitz and van Vliet (2007), Haugen and Baker (1991), Jagannathan and Ma (2003), and Leote de Carvalho et al. (2012).
6 Complying with Deutsche Bundesbank's data privacy protection, we are only able to use aggregated private sector data when a corresponding security is stored by at least three distinct reporting institutions.
7 Data on monthly S&P1500 compositions as well as daily time series data for individual stocks come from Datastream for the period covered by SHS-base data. The S&P1500 dataset is constructed by using the approach applied for CDAX securities (see "Stock market data" section).
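A small sketch may help to make the weight construction concrete. It assumes, in line with Kumar (2009), that the unexpected weight is the household weight minus the market weight, scaled by the market weight; the function name and the dictionary layout are hypothetical, and only stocks with holdings data enter both denominators.

```python
def unexpected_portfolio_weight(held_value, market_value, portfolio):
    """Unexpected weight of one portfolio in one month.

    held_value:   dict stock -> aggregated market value held by households.
    market_value: dict stock -> market capitalization, restricted to the
                  stocks that also appear in held_value.
    portfolio:    set of stocks assigned to the portfolio.
    Returns (w_h - w_m) / w_m, the assumed definition of the unexpected weight.
    """
    w_h = sum(held_value[s] for s in portfolio) / sum(held_value.values())
    w_m = sum(market_value[s] for s in portfolio) / sum(market_value.values())
    return (w_h - w_m) / w_m
```

For example, if a portfolio accounts for 30% of household holdings but only 10% of total market value, the function returns 2.0, that is, an exposure 200% higher than justified by market capitalization.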

Regression analysis
Furthermore, we use a regression analysis to assess the preferences of the private sector for lottery-like characteristics. Following Goetzmann and Kumar (2008) and Kumar (2009), we apply the unexpected weight allocated to each stock as the dependent variable. The measure is constructed as follows:

$$EW^h_{i,t} = \frac{w^h_{i,t} - w^m_{i,t}}{w^m_{i,t}}$$

where $w^h_{i,t}$ is the relative weight of stock $i$ held by the private sector in month $t$ in relation to all corresponding private sector holdings; $w^m_{i,t}$ depicts the relative market weight of stock $i$ in month $t$. 8 The baseline regression model explains this measure with the following variables, all of which refer to stock-month observations. Vol/Skew depicts (idiosyncratic) volatility/skewness that is measured using the daily returns of the previous month and previous 6 months, Price is the stock price during the previous month, DDomestic is a dummy variable which equals one if the corresponding stock is listed in the CDAX, lnMCap is the natural logarithm of the corresponding firm's market capitalization during the previous month, SSkew is the systematic skewness that is measured by using the daily returns of the previous month and previous 6 months, RMax is the maximum daily return attained in the previous month, and R is the monthly return over the previous month. Furthermore, we report the results for a second regression model that includes the following dummy variables. DVol/DSkew is a dummy variable that equals one if the corresponding stock's (idiosyncratic) volatility/skewness, measured by using the daily returns of the previous month and previous 6 months, is within the highest kth percentile of its domestic market; DPrice depicts a dummy variable which equals one if the corresponding stock's price during the previous month is within the lowest kth percentile.
DVolSkew, DPriceVol, and DPriceSkew depict dummy variables that equal one if the corresponding stock is simultaneously within the highest kth percentile with regard to the volatility and the skewness measures, or within the lowest kth percentile with regard to the price and the highest kth percentile with regard to the volatility or skewness measure, respectively. DPriceVolSkew is a dummy variable which equals one if the corresponding stock is simultaneously in the lowest kth price percentile, the highest kth (idiosyncratic) volatility percentile, and the highest kth (idiosyncratic) skewness percentile. DRMax depicts a dummy variable equal to one if the stock is within the highest kth percentile with regard to the maximum daily return of the previous month.
All variables are displayed and summarized in Table 8 of Appendix. The results are reported in Table 4.
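The percentile dummies and their joint versions can be sketched as follows for a single market and month. The function names are hypothetical, and the rule for cutting the k% tail (floor of k% of the number of stocks) is an assumption, since the text does not specify it.

```python
def percentile_dummy(values, k, highest=True):
    """Dummy equal to 1 if a stock lies in the highest (or lowest) k% of values.

    values: dict stock -> characteristic (e.g., volatility, skewness, price).
    """
    ranked = sorted(values, key=values.get, reverse=highest)
    n = len(ranked) * k // 100          # assumed size of the k% tail
    selected = set(ranked[:n])
    return {s: int(s in selected) for s in values}


def joint_dummy(*dummies):
    """Dummy equal to 1 only if all input dummies equal 1 for the stock."""
    return {s: int(all(d[s] for d in dummies)) for s in dummies[0]}
```

Under these assumptions, a DPriceVol-style variable is obtained as `joint_dummy(percentile_dummy(price, k, highest=False), percentile_dummy(vol, k))`, and a DPriceVolSkew-style variable adds the skewness dummy as a third argument.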

Weighting
Our results presented in Table 3 show that German private investors overinvest in stocks with lottery-like characteristics. The results are in line with the research that has reported that private investors have a strong preference for stocks with lottery-like features (Bali et al. 2017; Doran et al. 2012; Han and Kumar 2013; Kumar 2009; Kumar and Lee 2006). The German private sector overweights both domestic and foreign lottery-like stocks as defined by Kumar (2009). The exposure to domestic lottery-like stocks is 107% higher (see column (7)) and the exposure to US lottery-like stocks is 25% higher (see column (14)) than justified by the stocks' market capitalization. However, the households only overinvest in the domestic Max and Max5 portfolios as defined by Bali et al. (2011). The German private investors marginally overweight the domestic NonLottery portfolio but seem to underinvest in the foreign NonLottery portfolio. Furthermore, they underweight the domestic NonMax and NonMax5 portfolios. Stocks with relatively low maximum daily returns, that is, stocks without large (positive) outliers, are unlikely to capture (extra) attention from private investors (Barber and Odean 2008; Odean 1999). Thus, the underinvestment in stocks with low maximum daily returns may be driven by this lack of attention.
Note: In line with the approach described in "Unexpected portfolio weights" section, the unexpected stock weights are computed with regard to the proxies for the German (CDAX) and the US (S&P1500) stock markets. The relative private sector weight is constructed as the aggregated funds assigned to the respective stock i divided by all funds assigned to German and US stocks for which SHS-base data are available. Accordingly, the relative market weight is constructed as the market capitalization of stock i divided by the total market capitalization of all German and US stocks with available SHS-base data.
As argued by Dorn and Sengmueller (2009), private investors to some extent consider trading as entertainment. Therefore, stocks assigned to the NonMax and NonMax5 portfolios may be unpopular choices as they do not trigger investors' excitement. However, in contrast to their domestic equivalents, foreign NonMax and NonMax5 stocks appear to be overweighted by private investors. For the German Max and Max5 portfolios, the mean of the relative market weight ($w^m_{p,t}$) exceeds the mean of the relative household portfolio weight ($w^h_{p,t}$), yet the mean of the excess weight ($EW^h_{p,t}$) indicates an average overinvestment (see Table 3, Panel A). This can be attributed to two positive outliers in the excess weight; the robustness of this pattern therefore appears to be weak.
Differences with regard to relative weights assigned to a domestic portfolio and its foreign counterpart may be driven by the interrelation of familiarity and risk perception (Heath and Tversky 1991). It is well documented that investors are subject to ambiguity aversion (Ahn et al. 2014; Baltzer et al. 2015; Bossaerts et al. 2010; Boyle et al. 2012; Fox and Tversky 1995). Due to their geographic remoteness, distant stocks correspond to a greater sense of unfamiliarity and thus investors perceive them as being riskier (Baltzer et al. 2015; Goetzmann and Kumar 2008; Huberman 2001). 9 In this context, when investing abroad, investors may be drawn to stocks which have low levels of idiosyncratic risk.
Furthermore, when taking into account Shefrin and Statman's (2000) Behavioral Portfolio Theory, a preference for both certain high-risk stocks and their low-risk counterparts does not pose a contradiction; as investors segregate their portfolios into mental accounts that correspond to different aspirations, assets at both ends of the risk spectrum may appear as suitable investment choices (Oehler et al. 2018a; Oehler and Horn 2019, 2021).
In addition, our results with regard to US stocks may be partly driven by the market proxy. As described in "Portfolio sorts" section, lottery-like stocks are defined in relative terms (Bali et al. 2011; Kumar 2009). While capturing a large portion of its market capitalization, our proxy for the US market, the S&P1500, only includes a fraction of available US equities. We acknowledge that, with regard to the classification of US lottery-like stocks, the applied benchmark may lead to distortions.
Regarding disproportional investments, private investors substantially overinvest in low-priced stocks as well as in stocks with high levels of (idiosyncratic) volatility. In contrast, they underweight stocks with a high level of idiosyncratic skewness. The results are displayed in Table 7 (Panel A) of the Appendix.
Further, private investors overweight the portfolio that contains low-priced stocks which simultaneously have high levels of (idiosyncratic) volatility. Moreover, private investors overinvest in the portfolio that contains low-priced stocks which simultaneously have high levels of (idiosyncratic) skewness. They also overweight the domestic portfolio that contains high (idiosyncratic) volatility and high (idiosyncratic) skewness stocks; regarding its foreign counterpart, there is no evidence of a significant disproportional investment. The results are displayed in Table 7 (Panel B) of the Appendix.
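The portfolio weights discussed in this section can be illustrated with a minimal Python sketch. It assumes that the excess weight is the relative deviation of the household weight from the market weight, in the spirit of Kumar (2009). The function names and the toy figures are hypothetical and are not taken from the SHS-base.

```python
def relative_weights(values):
    """Scale a mapping of stock -> value so that the weights sum to one."""
    total = sum(values.values())
    return {stock: value / total for stock, value in values.items()}


def excess_weight(holdings, market_caps, portfolio):
    """Excess weight of a portfolio, assumed here to be (w_h - w_m) / w_m."""
    w_h = relative_weights(holdings)      # relative household weights
    w_m = relative_weights(market_caps)   # relative market weights
    held = sum(w_h[stock] for stock in portfolio)
    market = sum(w_m[stock] for stock in portfolio)
    return (held - market) / market


# Hypothetical toy data in EUR millions; stock "A" plays the lottery-like role.
holdings = {"A": 10.0, "B": 40.0, "C": 50.0}
market_caps = {"A": 50.0, "B": 450.0, "C": 500.0}

ew_lottery = excess_weight(holdings, market_caps, ["A"])
print(f"excess weight of the lottery-like portfolio: {ew_lottery:.2f}")
```

In the toy data, stock "A" makes up 10% of household holdings but only 5% of market capitalization, so its excess weight is 1, that is, an exposure 100% higher than justified by market capitalization.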

Regression analysis
The results of the regression analyses are displayed in Table 4. In line with Kumar (2009), we find evidence that private investors prefer low-priced stocks and stocks with high (idiosyncratic) volatility.
The regression model depicted in Eq. (5) yields significantly positive coefficients for DPriceVolSkew and DRMax which reflect lottery-like characteristics according to Kumar (2009) and Bali et al. (2011), respectively. This is evidence that private investors show preferences for stocks that meet the established definitions of lottery-like stocks.
Surprisingly, we do not find consistent evidence that (idiosyncratic) skewness drives overinvestment in the private sector; this finding holds across the applied regression models. This is in contrast to the theoretical and empirical literature which highlights the importance of skewness with regard to investors' preferences (Brunnermeier and Oehmke 2013; Kane 1982; Kraus and Litzenberger 1976; Kumar et al. 2019; Mitton and Vorkink 2007).
There are several factors which may drive the obtained results. 10 As individuals have inherently limited capabilities to perceive and process information (Kahneman 1973), the assumption that private investors are sufficiently able to assess a stock's (idiosyncratic) skewness appears rather presumptuous. Even when financial literacy among investors is generally high, identifying and evaluating skewness may pose a challenge. In line with this argument, van Rooij et al. (2011) find that financial literacy is predominantly limited to basic knowledge. 11 Share price and (idiosyncratic) volatility are features that may be identified much more easily by private investors. Accordingly, regarding the price and the idiosyncratic volatility feature, the conducted regression analysis yields unambiguous results. Furthermore, investors' expected skewness may not exactly match the applied skewness measures which are based on past daily returns. 12 Drerup et al. (2022) assess heterogeneity in skewness expectations, providing evidence that individuals disagree on the magnitude of skewness as well as on its sign. 13 In this study, we extrapolate past return skewness ($Skew_{t-1}$ / $Skew^{t-1}_{t-6}$) into the future (Barberis et al. 2016; Kumar 2009). While being a reasonable proxy, this approach may not directly capture private investors' skewness expectations. Moreover, as skewness may not be persistent over time (Adcock and Shutes 2005; DeFusco et al. 1996; Harvey and Siddique 1999; Singleton and Wingender 1986), investors may exhibit preferences for skewness when choosing stocks but, at the aggregate level, may not rebalance their portfolios when stock and/or portfolio characteristics change (Calvet et al. 2009). The latter behavior might even be beneficial for households since excessive trading and rebalancing might considerably hamper their investment performance (Anderson 2005; Barber and Odean 2000; Bauer et al. 2007; Horn and Oehler 2020).
Finally, the observation period coincides with the emergence of innovations in financial markets that are popular among private investors, which may have an impact on the reported results. These innovations include Contracts for Difference (CFDs) as well as various forms of Social Trading. CFDs are leveraged financial instruments which enjoy popularity among private investors. As they allow investors to take highly levered positions in financial instruments without taking actual physical positions, their nature is highly speculative (Brown et al. 2010; Corbet and Twomey 2014; Lee and Choy 2014; Twomey and Corbet 2014). Social Trading is a social network-based innovation where private investors may delegate their investment decisions to other private investors (Horn et al. 2020; Oehler et al. 2016). A first attempt to study gambling behavior in the context of Social Trading is made by Schneider and Oehler (2021). Popular Social Trading platforms like eToro (www.etoro.com) and ZuluTrade (www.zulutrade.com) additionally offer CFD trading. Given these new possibilities, private investors may no longer rely on stocks in order to include skewness in their overall portfolios.
Table 3 Weighting. The table presents the characteristics of the relative household portfolio weight ($w^h_{p,t}$), relative market weight ($w^m_{p,t}$), and the resulting unexpected or excess weight ($EW^h_{p,t}$) for six portfolios with lottery- and nonlottery-like features (Bali et al. 2017; Kumar 2009); portfolio composition is described in "Portfolio sorts" section. Furthermore, all portfolios are depicted and described in the Appendix. In columns (1) and (8), (2) and (9), and (3) and (10), we report the Mean, Median, and Standard Deviation (SD), respectively, that correspond to the relative household portfolio weights. Columns (4) and (11), (5) and (12), and (6) and (13), respectively, display the Mean, Median, and SD that correspond to the relative market weights of each portfolio. Columns (7) and (14) display the Mean of the unexpected portfolio weight (see "Unexpected portfolio weights" section); we conduct one-sample t-tests in order to determine whether the underlying means are significantly different from zero. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. Columns (1) to (7) cover the holdings of the German private sector with regard to German stocks represented by the CDAX; columns (8) to (14) correspond to US stocks represented by the S&P1500.
Table 4. Panel A displays the panel regression estimates for the regression model of Eq. (4) in "Regression analysis" section. In columns (1), (2), (3), (4), (9), (10), (11), and (12), the regression has been estimated using total volatility and total skewness measures; columns (5), (6), (7), (8), (13), (14), (15), and (16) depict regression specifications with idiosyncratic volatility and idiosyncratic skewness. Columns (1), (2), (5), (6), (9), (10), (13), and (14) depict panel regression estimates with time fixed effects and stock-month clustered standard errors that account for potential serial and cross-correlations (Petersen 2009). Columns (3), (4), (7), (8), (11), (12), (15), and (16) show the Fama and MacBeth (1973) cross-sectional regression estimates with Newey and West (1987) adjusted t-statistics. Following Kumar (2009), all variables are winsorized at their 0.5 and 99.5 percentile levels to ensure that extreme values do not affect the results. Likewise, as in Kumar (2009), all variables (dependent and independent) are standardized (Mean set to zero and SD set to one); thus, the coefficient estimates may be directly compared within and across the depicted regression specifications. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. All variables are depicted and described in the Appendix.
Panel B presents the Fama and MacBeth (1973) cross-sectional regression estimates with Newey and West (1987) adjusted t-statistics; the regression model is depicted in Eq. (5) in "Regression analysis" section. In columns (1) and (2), we use dummy variables that correspond to total volatility and total skewness; columns (2) and (3) [...]. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. All variables are depicted and described in Table 8 (Panel D) of the Appendix.
Panel C presents the Fama and MacBeth (1973) cross-sectional regression estimates with Newey and West (1987) adjusted t-statistics; the regression model is depicted in Eq. (5) in "Regression analysis" section. In columns (1) and (2), we use dummy variables that correspond to total volatility and total skewness; columns (2) and (3) [...]. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. All variables are depicted and described in the Appendix.
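The backward-looking proxies discussed in this section, total skewness and the maximum daily return, can be sketched from a window of daily returns as follows. This is a minimal illustration under stated assumptions: skewness is taken as the third central moment scaled by the cubed standard deviation (population moments), the function names are hypothetical, and the return series are invented. The idiosyncratic variant would apply the same formula to factor-model residuals, which is not shown here.

```python
def total_skewness(returns):
    """Third central moment scaled by the cubed (population) standard deviation."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    return m3 / m2 ** 1.5


def max_daily_return(returns):
    """Largest daily return within the measurement window."""
    return max(returns)


# Hypothetical daily returns over a short window.
symmetric = [-0.02, -0.01, 0.0, 0.01, 0.02]   # no outlier, skewness close to zero
jumpy = [-0.01, -0.01, 0.0, 0.0, 0.10]        # one large positive outlier

for name, window in (("symmetric", symmetric), ("jumpy", jumpy)):
    print(name, round(total_skewness(window), 3), max_daily_return(window))
```

The second series shows why a single large positive return raises both the skewness measure and the maximum daily return, while an investor looking at prices alone would most readily notice only the latter.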

Economic significance
As in other studies, we find that private investors on an aggregate level overinvest in stocks with lottery-like features. The statistical significance of this disproportional investment is high. Yet, due to the small overall size of the Lottery, Max, and Max5 portfolios, the economic effect is limited. From October 2005 until June 2017, the mean market value of the German Lottery portfolio is 6.1 billion EUR or 7.8 billion USD. The German Lottery portfolio, on average, accounts for 0.5% of the total market capitalization of the CDAX. Thus, on an aggregate level, investors should assign 0.5% of their funds designated for domestic equities to the Lottery portfolio. Yet, the average weight assigned to the domestic Lottery portfolio is 1.0%. As German private investors hold 145.2 billion EUR in domestic stocks on the aggregate level, the expected aggregate investment in the German Lottery portfolio is 726 million EUR. As the actual funds assigned to the Lottery portfolio are about twice as high, the average aggregate overinvestment is also about 726 million EUR. Considering the entirety of German private investors, an aggregate overinvestment of 726 million EUR does not seem to be particularly relevant.
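The arithmetic behind these figures can be retraced in a few lines. The inputs are the rounded values quoted in the text (145.2 billion EUR, 0.5%, and 1.0%); the variable names are hypothetical and the result is only as precise as those rounded inputs.

```python
domestic_equity_eur_bn = 145.2  # aggregate private sector holdings of domestic stocks
market_weight = 0.005           # Lottery portfolio share of CDAX market capitalization
held_weight = 0.010             # average weight assigned by private investors

# Convert billions to millions and apply the two weights.
expected_eur_mn = domestic_equity_eur_bn * 1000 * market_weight
actual_eur_mn = domestic_equity_eur_bn * 1000 * held_weight
overinvestment_eur_mn = actual_eur_mn - expected_eur_mn

print(round(expected_eur_mn), round(actual_eur_mn), round(overinvestment_eur_mn))
```

With these inputs the expected investment is 726 million EUR, the actual investment is 1,452 million EUR, and the difference is again 726 million EUR, which is about 0.5% of the domestic equity holdings.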
Considering our results, one could make the argument that German private investors hold substantial parts of their public equity investments in foreign lottery-like stocks which are listed in a country other than the USA. However, considering the previously discussed home bias phenomenon and the associated overall overinvestment in domestic assets (Cooper and Kaplanis 1994; French and Poterba 1991; Tesar and Werner 1995), this does not seem to be likely.
Thus, while in relative terms the aggregate overinvestment in stocks with lottery-like features may appear to be large, when considering the absolute invested funds, the effect appears relatively minor.
10 Several studies challenge the prevailing narrative that investors strictly exhibit preferences for positive skewness. Yang and Nguyen (2019) provide empirical evidence that Japanese investors show a preference for positively skewed assets, but do not dislike assets which are negatively skewed. Taking a theoretical approach, Brockett and Garven (1998) provide examples where a decision maker prefers the less skewed option when faced with the choice between two prospects with equal means and equal variances but different levels of skewness. That is, differences in higher moments can offset skewness preferences. Brünner et al. (2009) provide experimental evidence that skewness has an impact at the individual level, yet its direction is found to substantially differ across subjects.
11 Financial literacy in Germany is addressed by Oehler and Werner (2008) and Oehler et al. (2018b). An international comparative study on financial literacy is provided by OECD/INFE (2016).
12 In order to forecast skewness, Boyer et al. (2010) use lagged skewness as well as additional predictive variables.
13 While being intrapersonally stable, stock market expectations vary substantially between individuals (Dominitz and Manski 2011).
When assessing lottery-like stocks as defined by Kumar (2009), we do not discover significant pricing differentials. This is not overly surprising as sophisticated investors acquire knowledge about mispricing through academic publications; as rational investors trade against mispricing, the effect decays or disappears (McLean and Pontiff 2016). In contrast, there is evidence that lottery-like stocks as defined by Bali et al. (2011) still tend to underperform their counterparts. As these stocks have high levels of idiosyncratic risk, impeded arbitrage and thus a more persistent mispricing are in line with these studies (McLean and Pontiff 2016; Pontiff 1996, 2006; Treynor and Black 1973).
Taking into account aggregate private sector holdings (SHS-base), we find evidence that German private investors overinvest in stocks with lottery-like characteristics as defined by Kumar (2009). Further, German private investors only overinvest in domestic lottery-like stocks as defined by Bali et al. (2011). We attribute the preferences for a subgroup of domestic stocks and the seemingly differing preferences for a similar subgroup in a foreign equity market to the interrelation of familiarity and risk perception (Heath and Tversky 1991) and ambiguity aversion (Ahn et al. 2014; Baltzer et al. 2015; Bossaerts et al. 2010; Boyle et al. 2012; Fox and Tversky 1995).
We conduct a regression analysis and find evidence that private investors prefer low-priced stocks and those with high (idiosyncratic) volatility. Furthermore, our results show that private investors gravitate to stocks with high maximum daily returns. As opposed to the literature, we do not find evidence that (idiosyncratic) skewness drives the (over)investments of the private sector (Brunnermeier and Oehmke 2013; Kane 1982; Kumar et al. 2019; Mitton and Vorkink 2007). Taking into account limited capabilities to perceive and process information (Kahneman 1973), we argue that private investors may struggle to identify higher distribution moments like skewness. Features like price, (idiosyncratic) volatility, or maximum daily returns may be identified more easily and thus are reflected in the aggregate holdings of the private sector. Moreover, private investors may be subject to heterogeneous skewness expectations which are not captured by the applied proxies and/or may be reluctant to rebalance their portfolios when skewness characteristics change. In addition, given the rise of financial innovations like CFDs and Social Trading which enjoy great popularity, private investors may no longer rely on stocks to include skewness in their overall portfolios.
Finally, while in relative terms the aggregate overinvestment in stocks with lottery-like features may appear to be large, it has a relatively minor effect with regard to the absolute invested funds. Nonetheless, German private investors may still engage in excessive gambling in the financial market.
Columns (11) and (12) depict the results for the portfolios that are jointly sorted on idiosyncratic volatility and idiosyncratic skewness (HIVol&HISkew and LIVol&LISkew). Furthermore, all portfolios are depicted and described in Table D (Panel B) of the Appendix. The analysis contains the value-weighted mean monthly portfolio return (MeanRet), the respective standard deviation (SD), and the regression intercept alpha ($\alpha$) from a four-factor model (Carhart 1997) as performance measures. With regard to the regression, the factor exposures to the market (RMRF), size (SMB), value (HML), and momentum (WML) factors are reported. The analysis is conducted for the German (Panel A) as well as the US (Panel B) stock markets. The CDAX is used as a proxy for the German stock market. Regarding the US, the analysis contains all common shares in the CRSP universe which are listed on the NYSE, AMEX, or NASDAQ. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. All variables are depicted and described in Table D (Panel A) of the Appendix.

The table presents the characteristics regarding the relative household portfolio weight ($w^h_{p,t}$), relative market weight ($w^m_{p,t}$), and the resulting unexpected or excess weight ($EW^h_{p,t}$) for the portfolios described in "Portfolio sorts" section. Furthermore, all portfolios are depicted and described in Table 8 (Panel B) of the Appendix. Panel A presents the results for the portfolios sorted on one of the three constituent lottery-like features identified by Kumar (2009). Panel B presents the results for the portfolios sorted on two of Kumar's (2009) three constituent lottery-like features. In columns (1) and (8), (2) and (9), and (3) and (10), we report the Mean, Median, and SD, respectively, that correspond to the relative household portfolio weights. Columns (4) and (11), (5) and (12), and (6) and (13), respectively, display the Mean, Median, and SD that correspond to the relative market weights of each portfolio. Columns (7) and (14) display the Mean of the unexpected portfolio weight (see "Unexpected portfolio weights" section); we conduct one-sample t-tests in order to determine whether the underlying means are significantly different from zero. The symbols ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively; t-statistics are displayed in parentheses. Columns (1) to (7) cover German private sector holdings with regard to German stocks represented by the CDAX; columns (8) to (14) correspond to US stocks represented by the S&P1500. Data on aggregate private sector holdings come from the Deutsche Bundesbank's SHS-base (see "Securities holdings statistics data" section).

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

$EW^h_{p,t}$: Unexpected portfolio weight following Kumar (2009); disproportional investment of the German private sector, mirrored by SHS-base data, in portfolio p measured in month t
Panel D: Regression variables
$w^h_{i,t}$: Relative weight of stock i held by the German private sector in month t (Kumar 2009) based on SHS-base data. Measured as the sum of funds assigned to stock i divided by the sum of funds assigned to the respective benchmark (CDAX and S&P1500)
$w^m_{i,t}$: Relative market weight of stock i in month t (Kumar 2009). Measured as the market capitalization of stock i divided by the total market capitalization of the respective benchmark (CDAX and S&P1500)
$EW^h_{i,t}$: Unexpected stock weight following Kumar (2009); disproportional investment of the German private sector, mirrored by SHS-base data, in stock i measured in month t
TVol: Total volatility is the standard deviation of the daily returns measured over the previous month ($TVol_{t-1}$)/previous 6 months ($TVol^{t-1}_{t-6}$)
IVol: Idiosyncratic volatility is the standard deviation of the residual obtained by fitting Carhart's (1997) four-factor model to the daily returns of the previous month ($IVol_{t-1}$)/previous 6 months ($IVol^{t-1}_{t-6}$)
TSkew: Total skewness is the scaled third moment of daily returns measured over the previous month ($TSkew_{t-1}$)/previous 6 months ($TSkew^{t-1}_{t-6}$)
ISkew: Idiosyncratic skewness is the scaled third moment of the residual obtained by following Harvey and Siddique (2000): fitting a two-factor model (RMRF and RMRF^2) to daily returns of the previous month ($ISkew_{t-1}$)/previous 6 months ($ISkew^{t-1}_{t-6}$)
SSkew: Systematic skewness/co-skewness; coefficient of the RMRF^2 variable obtained by fitting a two-factor model (Harvey and Siddique 2000), RMRF and RMRF^2, to daily returns of the previous month ($SSkew_{t-1}$)/previous 6 months ($SSkew^{t-1}_{t-6}$)
Price: Price of stock i measured over the previous month
lnMCap: Natural logarithm of the market capitalization that corresponds to stock i and is measured over the previous month
R: Monthly return of stock i over the previous month
RMax: Maximum daily return of stock i measured during the previous month
DDomestic: Dummy variable that equals one if stock i is of German origin (listed in CDAX)
DVol: Dummy variable that equals one if stock i is within the highest kth total/idiosyncratic volatility percentile in the previous month
DSkew: Dummy variable that equals one if stock i is within the highest kth total/idiosyncratic skewness percentile in the previous month
DPrice: Dummy variable that equals one if stock i is within the lowest kth price percentile in the previous month
DPriceVol: Dummy variable that equals one if stock i is simultaneously within the lowest kth price percentile and the highest kth total/idiosyncratic volatility percentile in the previous month
DPriceSkew: Dummy variable that equals one if stock i is simultaneously within the lowest kth price percentile and the highest kth total/idiosyncratic skewness percentile in the previous month
DVolSkew: Dummy variable that equals one if stock i is simultaneously within the highest kth total/idiosyncratic volatility percentile and the highest kth total/idiosyncratic skewness percentile in the previous month
DPriceVolSkew: Dummy variable that equals one if stock i is simultaneously within the lowest kth price percentile, the highest kth total/idiosyncratic volatility percentile, and the highest kth total/idiosyncratic skewness percentile in the previous month
DRMax: Dummy variable that equals one if stock i, based on the maximum daily return of the previous month, is within the highest kth percentile
The stock market data are acquired from Thomson Reuters Datastream and CRSP. The Fama and French (1993) factors and the momentum factor (Jagannathan and Ma 2003) come from the KFLD. The SHS-base data come from the Deutsche Bundesbank and are used to mirror German private sector holdings.