Missing the wealthy in the HFCS: micro problems with macro implications

Wealth aggregates implied by the Household Finance and Consumption Survey (HFCS) usually yield much lower amounts than macroeconomic statistics reported in the National Accounts. An important source of this gap may be the under-representation of the wealthiest households in the HFCS. This article therefore combines a semi-parametric Pareto model estimated from top survey data and observations from rich lists with a non-parametric stratification approach to quantify the impact of the missing wealthy households on component-specific micro-macro gaps. We find that unadjusted micro data substantially underestimates wealth inequality. The largest effects are documented for equity. For other components, the missing wealthy explain less than ten percentage points of the micro-macro gap. We find that differences in oversampling strategies limit the cross-country comparability of unadjusted survey-implied wealth distributions and that our top tail adjustment leads to measures that are internationally better comparable.


Introduction
The prime goal of Distributional National Accounts (DINA) is to add distributional structure to the macroeconomic statistics reported in the national accounts in order to reveal the distribution of national income and wealth across different sections of society (Fesseau and Mattonetti 2013; Alvaredo et al. 2020). Piketty et al. (2017) name this gap in national accounts as the most important limitation that needs to be overcome to rigorously measure inequality and thus call for establishing DINA. Due to better data availability there is already substantial progress with regard to income, whereas there are important gaps concerning wealth-related indicators. Such indicators, however, are needed to accommodate the renewed interest in wealth distributions, and in particular in wealth inequality, among the greater public as well as academics, politicians, and policy-makers.
Sofie R. Waltl: sofie.waltl@liser.lu; sofie.waltl@wu.ac.at
We tackle this gap by discussing how to align the Household Finance and Consumption Survey (HFCS) with national accounts and by proposing a method for correcting a major drawback of the HFCS, the under-representation of wealthy households, without losing the benefits of micro data. To this end, we make use of the 2nd wave of the HFCS carried out in a harmonized fashion in 20 European countries and published in 2016 (see HFCN 2016a). Though our core case study here isolates Austria and Germany, the methodology developed is transferable to all other participating countries and all future survey waves, thus providing a powerful toolkit. We refer to the broader applicability throughout the article whenever insightful.
Although the HFCS provides ample information on different wealth components next to detailed information on the survey participants, accurately aggregating results to macroeconomic indicators requires the participation of members of all parts of the wealth distribution, including the wealthiest members of society, to ensure precise measurement. Surveys have the distinct advantage of a long list of variables (including a comprehensive list of socio-economic characteristics of the household) compared to the (scarcely available) administrative data. Indeed, in this regard surveys appear to be even better suited than administrative data alone, which usually only focus on a single wealth component (like capital gains taxes or register data) or only provide information at a certain moment in the life cycle (like estate or transaction taxes). Above all, limited prevalence of wealth-related taxes 1 hampers data collection, while data protection and privacy concerns limit the use of existing administrative data even by public institutions. Neither in Austria nor in Germany is comprehensive administrative wealth micro-level data available for research purposes, and, therefore, wealth surveys constitute a distinct source of information.
Macro and micro data are not, and cannot always be, completely aligned, and definitions as well as concepts differ in certain instances. 2 We therefore focus here on a set of highly comparable components of wealth to demonstrate the steps needed to move from pure survey data to top tail-adjusted figures and to link the result with national accounts. All of these well-comparable items are part of the financial accounts (FA), which thus form the core of our analysis. Waltl (2021) develops supplemental methodology towards a comprehensive integration building on the first steps shown here.
However, voluntary surveys suffer from different kinds of shortcomings including a well-documented lack of observations from the wealthiest section of society. These "missing wealthy households" 3 contribute to the micro-macro gap through two channels (see also Avery and Elliehausen 1986; Avery et al. 1988): unit-non-response 4 as well as a lack of precision due to too few observations at the very top. Therefore, many wealth survey conductors rely on strategically drawing more observations from the population of wealthy households. While this oversampling yields an increase in precision, it does not eliminate a likely unit-non-response bias.
Due to the high concentration of wealth at the very top (see Davies and Shorrocks 2000), a high degree of precision in this part of the distribution would be necessary. Yet, most surveys, and particularly wealth-related surveys, usually fail to capture these households appropriately as they are both harder to contact and engage, and an over-proportional number of such households would be required to obtain more precise estimates (Bover et al. 2014; Woodburn 1999, 1999a; Kennickell 2019; Schröder et al. 2020). For Austria and Germany alike, a separate quantification of precision and unit-non-response bias is problematic as no external household level data on wealth can be used for such an analysis. However, "[i]n a wealth survey, the sensitivity of the subject and the time cost of being interviewed, for people with complex assets, should be enough to raise a priori concerns" (Kennickell 2008, page 405).
Acknowledging these shortcomings, we propose a strategy for combining HFCS survey data with observations from national rich lists to adjust for an underestimation of wealth held by the wealthiest members of society. This strategy tackles both unit non-response and a limited degree of precision simultaneously. Concretely, we rely on a Pareto adjustment extending the work by Vermeulen (2016, 2018). 5 Such an adjustment, however, comes at the cost of losing the insightful micro-structure inherent in surveys. We therefore go a step further and propose a subsequent step to restore this micro-structure by developing both an analytical and a simulation approach. Both approaches are formally equivalent, but have complementary advantages: the analytical approach is fast and easy to use whereas the simulation approach additionally produces empirical measures of variability. In a final step, we demonstrate how such top-tail adjusted survey data needs to be treated when aiming for a macro-micro link with national accounts. This link provides a basis for quantifying the reliability of several survey instruments and provides the basis for DINA.
Austria and Germany both lack additional administrative micro data on wealth to strategically oversample the wealthy, and correct and cross-check self-reported wealth components without introducing specific linking variables. While Austria refrains from oversampling at all, Germany performs oversampling based on geography, i.e., more households are drawn from areas with higher average incomes yet no household-specific information on wealth or income is used. We thus expect larger effects for these countries. Above that, "[b]oth the composition and the distribution of wealth in Germany and in Austria exhibit considerable similarities" (Fessler et al. 2016, page 29) justifying the comparison of results.

1 Wöhlbier (2015) reports that "[m]ost net-wealth taxes were removed or scaled down by [EU] Member States between 1995 and 2007."
2 See also Kavonius and Törmälehto (2010), Kavonius and Honkkila (2013), Henriques and Hsu (2014), Andreasch and Lindner (2016), Baranyai-Csirmaz et al. (2017), Chakraborty et al. (2019), EG-LMM (2020).
3 Other sources of this gap include conceptual discrepancies (differences in the definition of instruments, different valuation methods), population discrepancies (different scopes of the survey and financial accounts), reporting errors (intentional under- or over-reporting and related behavioural effects, reporting errors due to a lack of knowledge; see D'Aurizio et al. 2006), rounding errors (rounding of amounts by respondents and additional rounding for the purpose of anonymization), and sampling errors (errors in the sample design). Although financial accounts are supposed to be exhaustive, they are not a perfect benchmark due to, for instance, balancing across accounts, difficulties in the valuation of unquoted assets and potential coverage problems of wealth held by residents abroad (see Zucman 2013). See EG-LMM (2020) for more details regarding population and conceptual discrepancies.
4 Unit-non-response bias refers to the bias introduced when certain households systematically do not participate or participate less often in a survey. It has to be noted, though, that, when the non-participation is random, low response rates do not necessarily lead to strong bias and vice versa.
5 Vermeulen (2016, 2018) makes use of the Forbes list to adjust total wealth measured by several surveys and demonstrates his approach for the US, the UK and several Euro area countries. Bach et al. (2019) build on this work and replace the short Forbes lists by more comprehensive national rich lists for several Euro area countries exclusively using the HFCS. Eckerstorfer et al. (2016) propose a method to determine appropriate cut-off points when adjusting the top tail and apply their approach to Austrian HFCS data and a national rich list.
Our findings show that when relying on the HFCS only, one concludes that the richest 1% hold roughly 25% of total wealth in Austria and 24% in Germany. Replacing the top tail by a rich list-calibrated Pareto distribution increases this share to roughly 43% in Austria and 36% in Germany. Previous studies find similar results: Vermeulen (2018) finds shares of 32%-34% for Germany and 31%-32% for Austria. Bach et al. (2019) find 33% for Germany and Eckerstorfer et al. (2016) 38% for Austria. We find stronger effects for Austria than for Germany (in terms of wealth shares and changes in the effective oversampling rate) indicating that completely lacking an oversampling strategy for the wealthy as in the case of Austria introduces systematic bias for HFCS results hampering cross-country comparability.
Adjusting for the missing wealthy can lead to structural changes affecting the relation between different components of wealth. This stems from the different portfolio structure of the very rich. We find that equity has an increased importance in the portfolios of the very rich and our adjustments thus have the largest impact on this instrument. In Austria, aggregates for equity increase by roughly 125% and in Germany by roughly 38%. While the survey usually leads to under-coverage for most instruments, aggregates for equity calculated from the HFCS tend to exceed the financial accounts counterparts (see EG-LMM 2020). This rather surprising result adds an empirical argument to the conceptual conclusions of the EG-LMM that equity as such is not readily comparable between the two data sources. HFCS aggregates increase even more after applying our methodology. This is another hint that some of the underlying instruments (in particular the value of self-employed businesses on the HFCS side, and unlisted shares and other equity on the financial accounts side) are not readily comparable and that differences in valuation concepts are an issue.
Our analysis further shows that for other instruments, namely deposits, bonds and liabilities (loans), changes are much less pronounced as these instruments seem to be less important for the very rich. For these instruments, coverage ratios 6 increase by roughly 0.5 to 20 percentage points in Austria and by 2 to 9 percentage points in Germany, still leaving a significant gap. This shows that the missing wealthy explain only a rather small fraction of the macro-micro gap for financial instruments other than equity. The sources of the remaining gap thus need further exploration.
While it is not new that Austria and Germany have similar degrees of net worth inequality (Fessler et al. 2016, page 29f.), we also find strikingly strong similarities in (the high degree of) inequality on a more disaggregated level. For instance, average financial wealth among the poorest 20% of households equals -11,506 EUR in Austria (127 EUR in Germany) as compared to 325,920 EUR (186,789 EUR) in the top quintile.
The remainder of this article is organized as follows: First, Section 2 elaborates on the distribution of net worth and describes how a Pareto model may be used to adjust the distribution at the top and how this information can be used when linking survey data to the financial accounts. Methodological approaches are derived in Section 3. Data are presented in Section 4, and Section 5 reports empirical results. Finally, Section 6 concludes. The supplementary materials add technical details, perform comprehensive sensitivity analyses, and provide further supporting results.

The top of the wealth distribution
In an ideal world, a survey would represent the entire population, i.e., also the most affluent households. Unfortunately, surveys, and particularly wealth-related surveys, usually fail to capture these households as they are less willing to participate in a wealth survey and are harder to contact. For this reason, surveys may not be the ideal source for measuring a complete wealth distribution, and administrative data, combined with suitable techniques, appear generally more appropriate whenever they exist. Indeed, various techniques are applied to do so, for instance, the capitalization method inferring wealth holdings from income flows or the estate multiplier method (see Kopczuk 2015; Saez and Zucman 2016, for further details about these methods and comparisons to other techniques). However, in the absence of such data, as well as when aiming for detailed breakdowns by components of wealth or even other categories like income or household types, surveys appear to be a prime information source (see also Waltl 2021).
Our goal is to focus on improving the top tail of survey data. In this context, many countries participating in the HFCS already perform some kind of oversampling of wealthy households, i.e., strategically contacting proportionally more wealthy households. Oversampling leads to more observations in a particular part of the distribution than implied by the original sample frame, which makes it necessary to down-weight each observation in this part to guarantee correct population totals.
In principle, a survey with a sophisticated sample design yields an unbiased estimate of the aggregate, even without any kind of strategic oversampling of wealthy households. Oversampling the wealthy, however, may help to decrease the variance of the aggregate and hence increases precision. In the case of a highly skewed distribution or, equivalently speaking, a high degree of inequality, the upper tail largely contributes to the aggregate and thus oversampling this important part of the distribution may be very effective to get more reliable results for a single survey wave. At the same time, oversampling the wealthy without increasing the total number of observations negatively affects other parts of the distribution due to larger weights per observation outside the right tail.
It is important to note though, that oversampling does not remove any kind of bias resulting from unit-non-response. Differences in the measurement of the right tail may lead to a limitation in the comparability of tail-sensitive estimates across survey waves and across different wealth surveys (Kennickell 2019).
Countries with comprehensive data on wealth or income (e.g., register data maintained for the sake of collecting wealth-related taxes or income tax files, that are used to project wealth based on investment returns, see Kennickell 1999b) can use this extra information to strategically contact larger numbers of wealthy households. However, in Austria and Germany no such data is used (mainly) due to data protection concerns and oversampling is hence largely restricted. Austria refrains from oversampling whereas Germany relies on a regional oversampling strategy (i.e., more observations are drawn from wealthy municipalities or regions within cities; see HFCN 2016a; Schmidt and Eisele 2013, and footnote 10).
Ex-post adjustments relying on parametric models are a suitable approach to address both issues related to the missing wealthy: precision and bias. There is evidence that the top of the wealth distribution can be more accurately approximated by a Pareto distribution, aka the "power law" (see Pareto 1895). 7 Hence, relying on survey observations only for the less-wealthy and a Pareto model for the wealthy, i.e., a semi-parametric approach, seems to be well-suited to tackle this problem. This idea is not new and has been widely used before (see e.g., Cowell 2011a; Bach et al. 2019; Eckerstorfer et al. 2016; Vermeulen 2016; Vermeulen 2018).
In general, a Pareto model may be estimated from survey data only. As the very top of the tail is not appropriately represented by a survey, the estimated distribution is likely to generate too flat a tail. We therefore follow the strategy of several articles that add observations from rich lists to the estimation procedure to increase confidence in the estimated tail (see Bach et al. 2014; Eckerstorfer et al. 2016; Vermeulen 2016, 2018).

The Pareto distribution
The standard Pareto distribution is a two-parameter distribution with cumulative distribution function (CDF)
$$F_Y(y) = 1 - \left(\frac{y_0}{y}\right)^{\vartheta}, \quad y \geq y_0,$$
and density function
$$f_Y(y) = \frac{\vartheta y_0^{\vartheta}}{y^{\vartheta+1}}, \quad y \geq y_0,$$
where $y_0 > 0$ denotes the scale (or threshold) parameter and $\vartheta > 0$ the shape parameter. Decreasing $\vartheta$ leads to a reduction of probability mass at the threshold $y_0$ and at the same time a prolongation of the tail.
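The role of the two parameters can be checked numerically. The following is a minimal sketch, not taken from the article: the function names and the parameter values are illustrative assumptions, and the CDF is the standard Pareto form $1 - (y_0/y)^{\vartheta}$.

```python
def pareto_cdf(y, y0, theta):
    """Standard Pareto CDF: 1 - (y0 / y)**theta for y >= y0, and 0 below y0."""
    return 0.0 if y < y0 else 1.0 - (y0 / y) ** theta


def pareto_pdf(y, y0, theta):
    """Standard Pareto density: theta * y0**theta / y**(theta + 1) for y >= y0."""
    return 0.0 if y < y0 else theta * y0 ** theta / y ** (theta + 1)


def pareto_quantile(p, y0, theta):
    """Inverse CDF for 0 <= p < 1."""
    return y0 * (1.0 - p) ** (-1.0 / theta)


# Illustrative values only: a threshold of one million and two shape parameters.
y0 = 1_000_000.0
for theta in (2.0, 1.2):
    density_at_threshold = pareto_pdf(y0, y0, theta)
    share_above_ten_y0 = 1.0 - pareto_cdf(10 * y0, y0, theta)
    print(theta, density_at_threshold, share_above_ten_y0)
```

With the smaller shape parameter, the density at the threshold is lower while the probability of exceeding ten times the threshold is higher, which is the "prolongation of the tail" described above.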
Several estimators for the shape parameter are suggested in the literature. When aiming for an estimator based on survey data, a method that accounts for survey weights is needed. Vermeulen (2018) suggests a pseudo maximum likelihood and a weighted regression estimator.  extend robust estimators for survey data. 8 In this article, we rely on two robust estimators and Vermeulen's regression method. Additionally, we derive a robust version of the regression method based on quantile regression that is expected to be less sensitive toward extreme observations. Details are provided in Supplementary Material A.
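To illustrate how survey weights can enter the estimation of the shape parameter, the sketch below implements the textbook weighted pseudo maximum likelihood form, $\hat{\vartheta} = \sum_k w_k / \sum_k w_k \ln(y_k / y_0)$. Whether this matches the exact specification used in the cited work is an assumption; the function name and the data are hypothetical, and the robust and regression-based estimators used in this article are not shown.

```python
import math


def pareto_shape_pseudo_ml(values, weights, y0):
    """Weighted pseudo-ML estimate of the Pareto shape parameter.

    Only observations with value >= y0 enter the estimate. Each observation
    contributes in proportion to its survey weight (hypothetical helper).
    """
    weight_sum = 0.0
    weighted_log_sum = 0.0
    for y, w in zip(values, weights):
        if y >= y0:
            weight_sum += w
            weighted_log_sum += w * math.log(y / y0)
    if weighted_log_sum <= 0.0:
        raise ValueError("need at least one weighted observation strictly above y0")
    return weight_sum / weighted_log_sum


# Hypothetical tail observations (net worth in EUR) and survey weights.
net_worth = [1.2e6, 1.5e6, 2.4e6, 4.0e6, 9.5e6, 0.4e6]
weights = [900.0, 850.0, 600.0, 400.0, 150.0, 1200.0]
theta_hat = pareto_shape_pseudo_ml(net_worth, weights, y0=1.0e6)
print(theta_hat)
```

Rich list observations could be appended to the same input lists with a weight of one each; this is one possible reading of how such observations enter and is likewise an assumption of the sketch.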
On the contrary, there is no universally agreed estimation procedure to determine the threshold y 0 . In general, the threshold should be large enough to guarantee that observations follow a Pareto law. The threshold, however, must also be small enough so that there are enough observations above the threshold that can be used to estimate the Pareto shape parameter. For reasons of comparability, we use the same threshold for Austria and Germany, namely one million Euro. Supplementary Material B shows that this is a suitable choice for both countries and performs robustness checks accordingly.

Portfolio structure at the very top
A correlation between portfolio structures and the relative position in the wealth distribution is well known. While portfolios of households belonging to the lower deciles of the wealth distribution consist mainly of deposits, the shares of more risky financial assets (mutual funds, bonds, publicly traded shares) increase when moving up the wealth distribution (HFCN 2016b). The probability of owning a private business is significantly higher for households belonging to the top 20% of the net wealth distribution (Arrondel et al. 2016). Supplementary Material C provides detailed distributional information on portfolio structures calculated from second wave HFCS data for several euro area countries.
We even find large variation in portfolio structures within the tail: Portfolios of the "wealthy" still differ from portfolios of the "extremely wealthy." To show that, we split up the top tail into four strata Q1, Q2, Q3, and Q4 as demonstrated in Fig. 1. The number of observations per stratum and average observation weights are given in Table 1. 9, 10 Figure 2 shows the average portfolio structure for each tail quartile: For the top 25% of the tail population, i.e., Q4, equity is much more important than for the rest of the tail population. In contrast, real estate assets are more important for the lower 50% (Q1 and Q2) of the tail population. The share of total assets held as deposits, bonds, and mutual funds decreases when moving to the very top of the distribution. Figure 2 also shows the ratio of liabilities over net worth. This ratio also decreases with net worth.
The portfolio allocation for individual observations and for the entire tail can be found in Acknowledging this variation, we distinguish between "types of households" for Q1 to Q4 and summarize the results in Fig. 3. A household is classified to be of type "equity" when its largest portfolio position is equity and so forth. The share of "real estate households" decreases when moving up the distribution while the share of "equity households" increases. Portfolios consisting mainly of assets other than real estate and equity are generally more frequent in Q1 to Q3 than in Q4.
Although we observe clear tendencies in changes in portfolio structures, which we exploit in our analysis, the survey (and in particular surveys without oversampling) most probably still lacks representative portfolios at the very top and the accuracy of our results may thus be limited. 11 For instance, Kennickell (2008) finds that in the 2004 US Survey of Consumer Finances, out of roughly 400 interviewed households that had direct holdings of government or commercial bonds, approximately 90% of these cases entered the survey through oversampling. Indeed, we find that bond holdings are surprisingly low among households in the top tail in Austria.
Under the assumption that the portfolio structures of the wealthiest of the wealthy do not differ strongly across European countries, we thus demonstrate in Supplementary Material B how "borrowed portfolios" from other HFCS countries can be incorporated into our approach to increase reliability. We find that there are only small changes to general results and thus do not borrow portfolio structures in the remaining analysis.

9 Due to very few tail observations in Austria, we slightly increased Q4 by including the top 26% instead of the top 25% and thus shrunk Q3 accordingly. This has the effect that two additional observations enter Q4 that stabilize participation rates in bonds since, according to the HFCS, the top ten observations in terms of wealth hold zero assets in the form of bonds. This adaptation has virtually no effect on other instruments.
10 Table 1 reveals two other important facts worth noting: First, the overall HFCS sample size for Germany is strikingly small, implying large average weights per observation. (Note that Germany is roughly ten times the size of Austria in terms of population; see also Table 3.) Second, the geographic oversampling strategy applied by Germany leads to substantially lower weights in the tail than for the rest of the population. In contrast, the average weight of tail observations in Austria is larger, indicating that Austria was indeed facing problems contacting wealthy households.
11 As Austria completely refrains from oversampling the very rich, we particularly expect such shortcomings for this country. Additionally, Austria is much smaller than Germany in terms of population size (see also Table 3) and there are thus in general much fewer tail observations. Therefore, the results for Austria in Fig. 3 are less smooth and accuracy, especially for Q4, would most likely increase with the introduction of oversampling.

Fig. 1 Illustration of the top tail of the net worth distribution
Notes: The figure depicts the top of the net worth distribution (for this illustration German HFCS data has been used). Below the Pareto threshold, the empirical distribution is plotted. Above the threshold, the parametric model, i.e., the theoretical Pareto distribution, takes over. Q1 to Q4 represent the quartiles of the tail distribution. Source: HFCS

Notes: The thresholds for the strata are determined by quartiles of the adjusted tail distribution. Source: HFCS

Fig. 2 Change in portfolio structure in the tail
Notes: The figure shows bonds, deposits, equity, mutual funds, and real estate assets as share of total assets and the ratio of liabilities (loans) over net worth (see Table 5 for definitions). Shares are calculated separately for each net worth tail quartile (quartile thresholds are calculated from the semi-parametric Pareto model), e.g., Q4 refers to the wealthiest 25% in the tail. For the figures instrument-specific aggregates are calculated for each quartile and divided by total assets (total net worth) in the respective quartile. Source: HFCS

Methodological approaches
We propose a Pareto top-tail correction on HFCS data combined with national rich lists and two methodological approaches to break down the resulting adjusted total wealth held by the top by instruments: an analytical and a simulation approach. The analytical approach derives closed-form formulae enabling fast and straightforward calculations. The simulation approach approximates the same formulae numerically and complements the result with empirical standard errors yielding a measure of confidence. The second step is crucial as the Pareto adjustment comes at the cost of losing microstructure: the dependency among instruments (as each instrument-specific aggregate is calculated independently of all other instruments) as well as the valuable variability resulting from using individual rather than average portfolio shares are lost. Without this step, calculating, for instance, the share of total assets (or liabilities) held by the top 1%, or distributional measures such as Gini coefficients, is not possible.

Fig. 3 Types of households in the tail
Notes: The figure shows the share of types of households in the tail split up by quartile. A household is of type "equity" when its largest position in the portfolio is equity and so forth. For Austria, results are less smooth due to fewer tail observations (see Table 3 for totals). Source: HFCS
There are three fundamental assumptions for this approach: (i), the validity of a Pareto model describing the shape of the top wealth distribution as well as the applied approaches to estimate model parameters; (ii), the quality of rich lists and the comparability of concepts across the HFCS and rich lists; and (iii), the appropriateness to extrapolate observed portfolio structures throughout the (stratified) top tail.
We perform ample sensitivity analyses in Supplementary Material B to validate these assumptions and further technical choices.
As a final step, we derive formulas facilitating the computation of distributional national accounts figures from such top-tail adjusted survey data. We limit the analysis here to a set of components of wealth that are conceptually readily comparable between the HFCS and National Accounts following an assessment by the ECB Expert Group on Linking Macro and Micro Data for the Household Sector (see also EG-LMM 2020).

An analytical approach
In the case of $\vartheta > 1$ (which holds true for all our estimations), the Pareto distribution has a finite first moment, i.e., the mean exists and is given by
$$E(Y) = \frac{\vartheta y_0}{\vartheta - 1}.$$
In this case, the total tail wealth can be estimated by multiplying the number of households belonging to the tail, $N_{tail}$, by the mean, i.e.,
$$N_{tail} \cdot E(Y) = N_{tail} \frac{\vartheta y_0}{\vartheta - 1}.$$
In a survey setting, $N_{tail}$ equals the sum of weights of all households with a net worth greater than $y_0$. As input, the analytical method needs a Pareto model for the top tail. The parameters of this model are ideally derived by combining survey and rich list observations as described in Section 2. We split the Pareto tail into four strata as shown in Fig. 1. Four strata seem appropriate for our case study. However, the method can easily be adjusted if a finer split-up is needed. Keep in mind the practical adjustment for Austria as mentioned in footnote 9.
We calculate average wealth for each stratum Q. To this end, we calculate expected net worth Y conditional on net worth realizing in stratum Q. Strata are defined via quartiles of the Pareto distribution, which are denoted by $\{q_p : p \in \{0, .25, .5, .75, 1\}\}$ with $q_p = y_0 (1 - p)^{-1/\vartheta}$, so that a stratum is $Q = [q_{p_1}, q_{p_2})$. Stratum-specific expected wealth is given by
$$E(Y \mid q_{p_1} \leq Y < q_{p_2}) = \frac{\vartheta y_0}{\vartheta - 1} \cdot \frac{(1 - p_1)^{1 - 1/\vartheta} - (1 - p_2)^{1 - 1/\vartheta}}{p_2 - p_1}.$$
The derivation of this formula is given in Supplementary Material D.
Total stratum-specific net worth is thus given by
$$N_{tail} \, (p_2 - p_1) \, E(Y \mid q_{p_1} \leq Y < q_{p_2}).$$
Note that in case of a split-up by quartiles $p_2 - p_1 = 0.25$. As a next step, we calculate stratum-specific portfolio shares. Let $w_k$ denote the weight of household $k$ and $s_k^j$ the assets held by household $k$ in the form of instrument $j$ over net worth. Instrument- and stratum-specific tail aggregates are then given by
$$A_j(Q) = N_{tail} \, (p_2 - p_1) \, E(Y \mid q_{p_1} \leq Y < q_{p_2}) \cdot \frac{\sum_{k \in Q} w_k s_k^j}{\sum_{k \in Q} w_k}.$$
Total instrument-specific aggregates are obtained by summing up stratum-specific tail aggregates and adding them to the non-tail aggregate $A_j(NT)$, which is calculated respecting survey weights:
$$A_j = A_j(NT) + \sum_{Q \in \{Q1, \ldots, Q4\}} A_j(Q).$$
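The analytical steps can be traced with a short numerical sketch. It assumes the closed-form conditional mean of a Pareto variable between two of its quantiles, $\frac{\vartheta y_0}{\vartheta - 1} \cdot \frac{(1-p_1)^{1-1/\vartheta} - (1-p_2)^{1-1/\vartheta}}{p_2 - p_1}$, and a weighted average of household-level shares per stratum; all function names, parameter values, and portfolio shares below are hypothetical and not results from the article.

```python
def stratum_mean(y0, theta, p1, p2):
    """Expected net worth of a Pareto(y0, theta) variable between its
    p1- and p2-quantiles (requires theta > 1 and 0 <= p1 < p2 <= 1)."""
    a = 1.0 - 1.0 / theta
    return theta * y0 / (theta - 1.0) * ((1.0 - p1) ** a - (1.0 - p2) ** a) / (p2 - p1)


def stratum_total(n_tail, y0, theta, p1, p2):
    """Total net worth held by the tail households falling into the stratum."""
    return n_tail * (p2 - p1) * stratum_mean(y0, theta, p1, p2)


def weighted_mean_share(shares, weights):
    """Survey-weighted average of household-level shares s_k^j."""
    return sum(w * s for s, w in zip(shares, weights)) / sum(weights)


# Hypothetical inputs: threshold, shape, and weighted number of tail households.
y0, theta, n_tail = 1.0e6, 1.5, 1000.0
bounds = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]
totals = [stratum_total(n_tail, y0, theta, p1, p2) for p1, p2 in bounds]

# Hypothetical equity shares and weights of observed households per stratum.
equity_shares = [[0.05, 0.10], [0.10, 0.20], [0.20, 0.30], [0.50, 0.70]]
household_weights = [[300.0, 200.0], [250.0, 150.0], [100.0, 100.0], [40.0, 60.0]]

equity_tail = sum(
    total * weighted_mean_share(s, w)
    for total, s, w in zip(totals, equity_shares, household_weights)
)
print(totals, equity_tail)
```

The four stratum totals add up to the overall tail total $N_{tail} \vartheta y_0 / (\vartheta - 1)$, which offers a quick consistency check on any implementation.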

A simulation approach
The fact that the top tail of an empirical distribution is replaced by a parametric model can also be exploited for simulation. A simulation creates micro data files that are adjusted for the too flat tail. These micro files allow one to analyze distributional patterns at a disaggregated level. The simulation, which here combines Monte Carlo simulation and bootstrapping, perfectly mirrors the analytical approach meaning that empirical confidence intervals constructed as bootstrap quantiles supplement the analytical results. The proposed algorithm aims to recover observed portfolio structures and thus also enables analyses in this regard.
We combine the semi-parametric model for net worth with a wealth-stratified bootstrap algorithm to conserve the correlation between portfolio structure and net worth. The bootstrap algorithm is a fully non-parametric 12 procedure.
For each household in the tail, total net worth is simulated from the Pareto model. The household is then allocated to its respective net worth stratum and a suitable portfolio is drawn.
Again, we stratify tail observations into four strata (Q1 to Q4) depending on their position in the net worth distribution. Note that in principle the algorithm works for any stratification as long as there are sufficient observations per stratum. The choice of four strata here is in accordance with the analytical method. A household with simulated net worth of roughly one million Euro is allocated to the lowest quartile and thus a portfolio is drawn from the Q1 stratum. In contrast, households with large simulated wealth are assigned a portfolio structure from Q4. Supplementary Material A describes how to calculate quartiles from a Pareto distribution (step 2) and how to sample from a Pareto distribution (step 4).

12 The correlation between portfolio structures and net worth does not seem to be linear but rather follows distinct functional patterns. Hence, relying on a parametric model such as a Beta regression (the Beta distribution is ideally suited to model shares as it is bounded between 0 and 1) might (and in fact does) heavily depend on the exact model specification. Semi-parametric approaches with data-driven functional forms (such as Generalized Additive Models) would be an obvious candidate for overcoming this issue. However, the observed tail is very sparsely populated and it was (at least in our analysis) impossible to estimate stable and trustworthy functional forms.
In step 5, we draw a random portfolio allocation. In this context, this means that we draw a full list of shares of assets/net worth but no amounts. Specifically, this list includes the ratio of liabilities over net worth, as well as the shares of total assets invested in bonds, deposits, equity, and real estate. The portfolio is then constructed in three steps. First, we calculate the amount of liabilities by multiplying the drawn ratio by simulated net worth. Second, we add the Euro amount of liabilities to simulated net worth to obtain total assets. Third, we multiply total assets by the drawn shares of bonds, deposits, equity, and real estate to obtain the respective Euro amounts.
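As a worked illustration of this portfolio construction, the following sketch converts a drawn portfolio structure into Euro amounts. The function name, the dictionary keys, and all numbers are hypothetical; the only relationship relied upon is that net worth equals total assets minus liabilities.

```python
def build_portfolio(net_worth, liab_ratio, asset_shares):
    """Turn a drawn portfolio structure into Euro amounts.

    liab_ratio   : liabilities / net worth (drawn from the household's stratum)
    asset_shares : share of total assets per asset class (drawn from the same stratum)
    """
    liabilities = liab_ratio * net_worth        # first: liabilities in Euro
    total_assets = net_worth + liabilities      # second: net worth = assets - liabilities
    amounts = {name: share * total_assets       # third: Euro amount per asset class
               for name, share in asset_shares.items()}
    amounts["liabilities"] = liabilities
    return amounts

# Hypothetical draw for a household with a simulated net worth of 2 million Euro
shares = {"bonds": 0.05, "deposits": 0.15, "equity": 0.30, "real_estate": 0.50}
portfolio = build_portfolio(2_000_000.0, 0.10, shares)
```

In this example the shares happen to sum to one, so assets minus liabilities recover the simulated net worth; with further asset classes in the portfolio the listed shares would sum to less than one.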
Thus, although we replicate observed patterns in terms of portfolio structure, simulated amounts are (on average) larger as net worth is drawn from a Pareto model, which generally increases total tail wealth.

Incorporating top-tail adjusted survey data into financial accounts
Top tail-adjusted survey data ultimately leads to an adjusted distributional structure, which should be respected when compiling distributional national accounts (DINA) based on surveys.
We focus on the most comparable instruments in the financial accounts (see EG-LMM 2020) to circumvent conceptual differences across these two data sources, which would require a finer harmonization procedure.
The aim of DINA is to complement national accounts aggregates by distributional indicators. In general, there are many different indicators that would add useful information.
Here, we focus purely on indicators based on the distribution of net worth as we can directly make use of our methodology for this exercise. Other indicators may include a split up by income, household structure, or other characteristics as shown by Waltl (2021) extending the algorithms developed in this article.
We compute an indicator splitting up financial accounts aggregates by n net worth groups. For instance, n = 5 when compiling a split-up by net worth quintiles. Such indicators show how much of the total aggregate of a certain instrument is held by, for instance, the poorest 20% or wealthiest 20% of the population.
Let $y_j$ denote the FA aggregate for instrument $j$ and $x_{\cdot j}$ the HFCS aggregate of instrument $j$. Thereby, $x_{ij}$ is the HFCS aggregate for instrument $j$ within the net worth group $i$, so that $x_{\cdot j} = \sum_{i=1}^{n} x_{ij}$. Without adjusting for the missing wealthy in the HFCS, distributional indicators for instrument $j$ and net worth group $i$ are given by

$$d_{ij} = x_{ij} \cdot \frac{y_j}{x_{\cdot j}}.$$

Note that each $x_{ij}$ is scaled up by the inverse instrument-specific coverage ratio, $y_j / x_{\cdot j}$. The scaling guarantees that values across all net worth groups add up to the financial accounts total, i.e.,

$$\sum_{i=1}^{n} d_{ij} = y_j.$$

Taking into account the adjustments for the missing wealthy, the HFCS aggregates $x_{\cdot j}$ change whereas the FA aggregates $y_j$ remain constant. To simplify notation, assume that with the adjustment, only the highest net worth group is affected, i.e., group $n$. Indeed, where broad indicators are concerned, such as a split-up by quartiles or quintiles, only group $n$ is affected. Let $x^*_{nj}$ denote the adjusted HFCS aggregate for instrument $j$ and net worth group $n$. The adjusted HFCS total is then given by

$$x^*_{\cdot j} = \sum_{i=1}^{n-1} x_{ij} + x^*_{nj}.$$

Note that in general $x^*_{\cdot j} > x_{\cdot j}$, as our methodology leads to larger amounts held by the wealthy.
Distributional indicators are defined analogously via

$$d^*_{ij} = x_{ij} \cdot \frac{y_j}{x^*_{\cdot j}} \quad \text{for } i < n, \qquad d^*_{nj} = x^*_{nj} \cdot \frac{y_j}{x^*_{\cdot j}}.$$

Again, the values across all net worth groups add up to the financial accounts total:

$$\sum_{i=1}^{n} d^*_{ij} = y_j.$$

Our adjustment has the following effect: For the unadjusted indicators $d_{ij}$ the gap between HFCS and FA aggregates is "filled up" by equally scaling up the respective numbers by the inverse coverage ratio. This means that the "missing" portions are agnostically distributed proportionally across wealth groups. Our methodology exclusively quantifies the contribution of the missing wealthy to the total gap. Thus, this information is used to allocate this portion of the FA aggregate directly to the top quintile.
To fill the remaining gap, aggregates are then once more scaled up proportionally. This last step likely does not reflect the "true" full distribution of errors within either survey or FA data, and more adjustments of this kind for both types of data sources would be required to picture reliable full wealth distributions in line with macroeconomic statistics. Also, we have solely discussed one survey error and left out other types of errors, including incomplete information regarding portfolio structures at the very top, which arises because this group is not well represented in the survey. Importantly, errors in the FA are also not discussed here.
In theory, if one knew to which net worth quintiles each proportion of the remaining gap truly belonged, no scaling would be necessary at all. Also, a scaling of FA totals may be necessary yet falls outside the scope of this analysis. Further research is needed in this respect.
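A small numerical sketch may help to see how the proportional scaling works with and without the top-tail adjustment. It assumes that each group aggregate is multiplied by the FA total divided by the (adjusted or unadjusted) HFCS total, which is what scaling by the inverse coverage ratio implies; all figures and names are hypothetical.

```python
def dina_indicators(x, y_j):
    """Scale group aggregates x (one entry per net worth group) to the FA total y_j."""
    total = sum(x)                       # HFCS total for the instrument
    return [x_i * y_j / total for x_i in x]

# Hypothetical HFCS aggregates by net worth quintile for one instrument
x = [1.0, 2.0, 4.0, 8.0, 25.0]
y_j = 60.0                # hypothetical FA aggregate for the same instrument
x_star_top = 35.0         # hypothetical top-tail adjusted aggregate of the top quintile

d = dina_indicators(x, y_j)                            # unadjusted indicators
d_star = dina_indicators(x[:-1] + [x_star_top], y_j)   # adjusted indicators
```

Both sets of indicators add up to the FA total. Since only the top group's aggregate grows, the adjusted indicator rises for the top quintile and falls for every other quintile.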

Data
The analysis makes use of three different data sources: The first source is data for Austria and Germany collected in the second wave of the HFCS. Second, we use aggregates from financial accounts (FA) as benchmark measures. Third, we use national rich lists that yield information on net worth of the wealthiest individuals/families. Table 2 reports the codes used in the HFCS and financial accounts as defined in the European System of National and Regional Accounts (ESA2010).

HFCS data
The HFCS collects household-level data on households' finances and consumption. The second wave, which was released in December 2016, was conducted in 18 euro area countries as well as the non-euro area countries Hungary and Poland in a harmonized way while still leaving room for a country-specific implementation. 13 While some European countries have a long tradition of conducting wealth-related household surveys, a harmonized survey for such a large number of countries is a recent development, and results of the first wave were only released in 2013. In Austria and Germany, the HFCS (labelled Panel on Household Finances (PHF) in Germany) is the first comprehensive household survey on wealth, which is why there is as yet very limited knowledge about the wealth distribution in these countries.
The variables in the HFCS follow common standards and definitions. Therefore, the instruments used in the analysis (assets and liabilities) are in principle comparable between Austria and Germany. In both countries, the survey has been mainly conducted via Computer Assisted Personal Interviews.
The HFCS uses probability sampling and the sample is representative both at the country and at the euro area level. For Austria, the gross sample size was 6,308, which led to 2,997 observations (response rate of 49.8%) representing 3.9 million households, while in Germany the gross sample size was 16,221, yielding 4,461 observations (response rate of 19%) 14 representing 39.7 million households. Table 3 provides HFCS summary statistics. Specifically, details are provided for the wealthiest households, here defined as all households with a net worth of at least one million Euro.
The HFCS relies on multiple imputation to fill in gaps due to item-non-response and to erroneous and implausible entries. 15 Imputed values are inherently uncertain, which is why five implicates are provided. For our analysis, for each imputed value we use the average across all five implicates.
Our methodology critically depends on an accurate estimate of the number of households owning at least one million Euro. We estimate this number from the HFCS by summing up the weights of the wealthiest households. Results are also provided in Table 3. According to the World Wealth Report (Capgemini 2016), there were 114,000 (121,000) high net worth individuals (HNWIs) 16 in 2014 (2015) in Austria and 1,141,000 (1,199,000) in Germany. Although the definitions of HNWIs and millionaires in the HFCS are not perfectly aligned, the very similar results increase our confidence in HFCS numbers.
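The two data-handling steps described here, averaging across implicates and summing survey weights above the one-million threshold, can be sketched as follows. The record layout and the field names `weight` and `net_worth_implicates` are assumptions made for illustration and do not correspond to HFCS variable names; averaging net worth directly is a simplification of averaging each imputed value.

```python
def count_millionaires(households, threshold=1_000_000.0):
    """Sum the survey weights of households whose implicate-averaged net worth
    reaches the threshold."""
    total_weight = 0.0
    for hh in households:
        implicates = hh["net_worth_implicates"]
        mean_net_worth = sum(implicates) / len(implicates)   # average over implicates
        if mean_net_worth >= threshold:
            total_weight += hh["weight"]                     # households represented
    return total_weight

# Hypothetical records: survey weight and net worth in each of the five implicates
households = [
    {"weight": 1200.0, "net_worth_implicates": [0.9e6, 1.1e6, 1.2e6, 1.0e6, 1.3e6]},
    {"weight": 1500.0, "net_worth_implicates": [2.0e5, 2.5e5, 2.2e5, 2.1e5, 2.4e5]},
    {"weight": 800.0, "net_worth_implicates": [3.0e6, 3.2e6, 2.9e6, 3.1e6, 3.0e6]},
]
n_millionaires = count_millionaires(households)
```

The first household shows why the averaging matters: one implicate lies below the threshold, but the average lies above it, so its weight is counted.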

Financial accounts data
Aggregates compiled under the ESA2010 framework generally provide a detailed and consistent description of an economy and aim to be exhaustive. Although the data are intended to be exhaustive, this is not yet fully achieved in practice: for example, the value of real estate held by the household sector is not yet fully included. Within the European Union, data are well comparable as Member States are legally obliged to follow common accounting rules. For our comparisons, we use financial accounts data for the household sector, which form an integral part of the ESA2010 accounting framework.
13 See HFCN (2016a) for a general documentation and https://www.hfcs.at/en/ (Austria) and http://www.bundesbank.de/Navigation/EN/Bundesbank/Research/Panel on household finances/ panel on household finances.html (Germany) for a more detailed country-specific documentation.
14 Response rate considering households that are interviewed for the first time (Germany has a panel component). The response rate including panel households is 29% (see HFCN 2016a).
15 When a respondent generally agrees to participate in the survey but declines to answer some particular questions the missing values are classified as item-non-responses.
16 HNWIs are defined as individuals having at least one million USD of investable assets, excluding primary residence, collectibles, consumables, and consumer durables.
Aggregates from financial accounts are reported in Table 4. The period of measurement used in financial accounts is either a quarter or a year. As fieldwork periods of the HFCS are usually longer than a quarter but shorter than a year, we calculate weighted averages of quarterly financial accounts data using all quarters that overlap with the respective fieldwork period: If, say, the fieldwork period was three months, of which one month falls into quarter A and two months fall into quarter B, then quarter A will be weighted with 1/3 and quarter B with 2/3.
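The weighting rule can be written down in a few lines. The sketch below reproduces the example from the text (one month in quarter A, two months in quarter B); the function name and the aggregate values are hypothetical.

```python
def fieldwork_weighted_average(quarter_values, months_in_quarter):
    """Average quarterly FA values, weighting each quarter by the number of
    fieldwork months that fall into it."""
    total_months = sum(months_in_quarter.values())
    return sum(quarter_values[q] * m / total_months
               for q, m in months_in_quarter.items())

# Three months of fieldwork: one month in quarter A, two months in quarter B
values = {"A": 300.0, "B": 330.0}     # hypothetical quarterly FA aggregates
months = {"A": 1, "B": 2}             # implies weights 1/3 and 2/3
avg = fieldwork_weighted_average(values, months)
```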

Comparability of HFCS and FA instruments
Although the definitions of variables are similar in both the HFCS and FA, they are not fully comparable. The ECB Expert Group on Linking Macro and Micro Data for the Household Sector (EG-LMM) developed a catalogue assessing the conceptual comparability for each individual instrument (see EG-LMM 2020). Their findings are summarized in Table 5.
Table notes: See Table 2 for definitions.
Whereas liabilities, bonds, deposits, and mutual funds are in principle highly comparable, the case is more complicated for equity. 17 While listed shares as part of equity are highly comparable between the HFCS and ESA2010, other parts of equity are not. In particular, there is a valuation discrepancy for unlisted shares and other equity/(non) self-employment businesses: The HFCS relies on self-evaluation of the market value elicited via the question "What is the net value of your/your household's share of the business? That is, what could you sell it for, taking into account all (remaining) assets associated with the business and deducting the (remaining) liabilities?" In financial accounts, "the valuation for unlisted shares and in particular holdings of other equity is less accurate [than for items with quoted market prices] as their valuation requires assumptions and modelling" (EG-LMM, 2020, page 10).
We look only at the aggregate of equity since the EG-LMM assessed the comparability of the more detailed break down, i.e., at the level of unlisted shares and other equity, as low.
In general, we refrain from directly comparing net worth between the HFCS and FA. Chakraborty et al. (2019) and EG-LMM (2020) analyse different wealth concepts and conclude that at this early stage a comparison at the level of net wealth suffers from many conceptual difficulties. However, we do provide some results for financial wealth, which we define here as the sum of bonds, deposits, equity and mutual funds minus liabilities.
Similarly, we do not link real estate assets yet. This comparison will be possible once data coverage in the national accounts is improved. 18
Another important issue concerning the comparability between HFCS and FA data emerges from the distinction between producer households (part of the household sector) and quasi-corporations and corporations (part of the non-financial corporations sector) in the system of national accounts. This distinction affects the composition of the household sector balance sheet. Generally speaking, if an unincorporated enterprise is considered to be a separate unit, it is recorded in the corporate sector and shares held by households are recorded as other equity on a net basis. In contrast, unincorporated enterprises classified as producer households are part of the household sector, and their assets and liabilities are spread across all instruments. In the HFCS, a concept comparable to producer households does not exist. Assets and liabilities of self-employed businesses are recorded as net values. From this aspect alone, we would thus expect over-coverage for equity and under-coverage for the remaining instruments (for a more in-depth discussion of this issue and further details, see Chakraborty et al. 2019, EG-LMM 2020).
17 Equity as defined here differs from the HFCS definition of "business wealth," which includes properties used for business purposes. These assets are added to non-business real estate here and show up as part of "real estate." In Austria and Germany, the variable DA1140 also includes land and buildings that are part of small farms (excluding the part used as main residence). In this regard the distinction between business real estate and other business wealth is far from perfect. This explains why HFCS data yields larger total equity than its closest yet still imperfect match in the FA as shown in Table 8.
18 The ESA Transmission Programme requires the transmission of annual data on land only by end-2017. The EG-LMM has not yet made a final decision regarding the conceptual comparability, but expects it to be at least of medium comparability. In general, national accounts separate real estate into AN.111 Dwellings, AN.112 Other buildings/structures and AN.211 Land.

Rich lists
In many countries, journalists compile lists of the wealthiest individuals and families, the most famous of which is probably the Forbes World's Billionaires List (e.g., Forbes 2015). Each year, Forbes lists the names and net worth of individuals and families holding at least one billion US dollars. It uses a number of different sources and rather opaque methodologies to estimate the wealth of the financial elite around the globe. 19 Though such rich lists are in general neither expected to be complete nor completely accurate, they constitute the only public data source reporting an estimate of the individual wealth of the most affluent members of society.
Next to the Forbes list targeting all billionaires in the world, there are several national rich lists. For Austria, the weekly business magazine Trend (Trend 2015) publishes a list of the wealthiest 100 Austrians every year. For Germany, the Manager Magazine (Manager Magazin 2014) publishes the list of the wealthiest 500 Germans.
In this article, we rely on national rich lists. National lists are more comprehensive as they do not only list billionaires. For instance, in the case of Austria, there are only roughly ten billionaires on the Forbes list per year, whereas the Trend list reports the 100 wealthiest individuals/families. Additionally, the Trend and Manager Magazine lists are estimated in Euros, which, in contrast to the Forbes list using USD, does not introduce an additional source of ambiguity due to changes in exchange rates (e.g., the number of Austrian names on the Forbes list likely dropped from eleven in 2014 to eight in 2015 due to a drop in the EUR-USD exchange rate).
Next to various measurement issues there is another major problem when using rich lists to adjust the tail of the wealth distribution: The rich lists of Forbes, Trend, and Manager Magazine fail to distinguish between households and family clans, whereas both the HFCS as well as financial accounts use households as unit of recording. This is why we do not directly rely on exact amounts reported in the rich lists but rather only use these observations to increase the quality of our Pareto estimates. We also perform some robustness checks in this regard as reported in Supplementary Material B.

Trend list:
The magazine lists the names, net worth and major sources of wealth of the richest individuals or families in Austria. Additionally, they classify the "type" of assets: assets held in the form of a private foundation ("Stiftungsvermögen"), listed and unlisted shares ("Betriebsvermögen"), and inherited assets. This classification shows that in Austria the wealthy hold large parts of their assets in the form of private foundations. 20 We perform robustness checks (see Supplementary Material B) on how estimates change when leaving out rich list observations that hold their assets exclusively as foundations.
For individuals with net worth of less than 500 million EUR, Trend does not report the exact amount but only a range. Altogether, there are 60 observations with exact amounts, 19 observations with a net worth between 300 and 500 million, and 21 observations with net worth ranging between 100 and 300 million EUR. To avoid cluster effects in the estimation of the Pareto tail, we assume that the net worth of range observations is uniformly distributed within the respective range and assign them an accordingly drawn random number (the result is shown in Fig. 4).
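The treatment of range observations can be sketched as below. The counts of range observations are taken from the text, while the seed, the function name, and the list layout are illustrative assumptions; amounts are in million EUR.

```python
import random

def impute_range_observations(ranges, rng):
    """Replace each (low, high) net worth range by one uniform draw from that range."""
    return [rng.uniform(low, high) for low, high in ranges]

rng = random.Random(2015)   # fixed seed so that the draws are reproducible

# 19 observations between 300 and 500 million EUR, 21 between 100 and 300 million EUR
ranges = [(300.0, 500.0)] * 19 + [(100.0, 300.0)] * 21
imputed = impute_range_observations(ranges, rng)
```

Drawing a separate value for every observation spreads the range observations across their interval, which avoids artificial clusters at the interval bounds or midpoints when the Pareto tail is estimated.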
Like Forbes, Trend uses a large number of different sources to compile its rich list. The list is published in June. We use the data from the 2015 list, as this best matches the fieldwork period in Austria. For listed shares, they use valuations corresponding to the beginning of May 2015.

Manager Magazine list:
The Manager Magazine publishes the list of the wealthiest 500 individuals and families in Germany together with their major source of wealth (e.g., real estate assets or the name of a company). As with the Forbes and Trend lists, a wide range of sources and methodologies is used when compiling the list. In contrast to the Trend list, "exact" amounts are reported rather than ranges and therefore no adjustment has to be made in this respect. The list does not provide information about the type of assets.
As the fieldwork period for Germany falls squarely in the year 2014, we make use of the 2014 list. For listed shares, values used are from September 2014.

The adjusted wealth distribution
Pareto estimation results depend on the choice of method for estimating the shape parameter. As mentioned, in this article we rely on four different estimation techniques described in Supplementary Material A. We choose the estimation procedure according to graphical inspection. 21
20 We see many private foundations ("Privatstiftungen") in Austria among the rich list as foundations serve as a way to secure the wealth of the family (e.g., the founder of a company can prevent fragmentation of business assets) and a private foundation can benefit from tax-preferred treatments (e.g., to reduce inheritance taxes as well as taxes on capital income). The majority of private foundations in Austria follow private purposes and not charitable or public purposes (Schneider et al. 2010). In the national accounts, claims by households in these private foundations are recorded in the household sector (see Andreasch et al. 2015). The national questionnaire of the Austrian HFCS has a dedicated question asking for the value of private foundations, but in practice this question was hardly ever answered. Therefore, adding information from the rich list in Austria is particularly important and complements the HFCS rather well. The analysis by Andreasch et al. (2015) on equity stakes held by private foundations furthermore indicates that wealth in private foundations is rather concentrated even inside the group of private foundations showing similar patterns as the rich list and a Pareto-like behaviour.
21 From theory, we know that the logged random variable to be modelled (here: net worth) and the log of one minus the CDF have a linear relationship. Thus, we look for a straight line that best fits our observations. This is shown in Fig. 5.
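The linear relationship between logged net worth and the log of one minus the CDF suggests a simple way to read off the shape parameter. The sketch below is a plain least-squares fit on the weighted empirical complementary CDF; it is an illustrative simplification and not the exact regression estimator described in Supplementary Material A. All names are hypothetical.

```python
import math

def pareto_shape_by_regression(wealth, weights):
    """OLS slope of log(1 - CDF) on log(wealth); returns the implied shape parameter."""
    pairs = sorted(zip(wealth, weights), reverse=True)    # richest household first
    total = sum(weights)
    xs, ys, cum = [], [], 0.0
    for w, wt in pairs:
        cum += wt                           # weight of households at least this rich
        xs.append(math.log(w))
        ys.append(math.log(cum / total))    # log of the empirical complementary CDF
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope                           # for a Pareto tail the slope equals -theta

# Check on exact Pareto quantiles with threshold 1 million and shape 1.5
theta_true, n = 1.5, 200
wealth = [1e6 * (i / n) ** (-1.0 / theta_true) for i in range(1, n + 1)]
theta_hat = pareto_shape_by_regression(wealth, [1.0] * n)
```

On data that follow a Pareto distribution exactly, the points lie on a straight line and the fit recovers the shape parameter; on survey data combined with rich list observations, deviations from the line are what the graphical inspection looks for.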
Both the wPDC and wISE estimators are designed to be robust against outliers. As all our rich list observations are "outliers" to some extent, these procedures take these observations into account to a lesser degree. Table 6 reports the respective numbers. For both countries, the shape parameter is chosen according to the regression method as it seems to be best suited to capture the richest households appropriately. 22
We draw M = 100 samples, each of size $N_{\text{tail}}$, from the Pareto($y_0$, $\theta$) distribution. We analyze the net worth distribution when replacing the observed tail by the estimated one. Table 7 reports average results across all M = 100 replicates as well as coefficients of variation (CoV). For comparison, the table also reports results when exclusively relying on survey data.
For both countries, applying the semi-parametric Pareto model leads to higher degrees of wealth inequality compared to calculations relying on survey data only. Figure 7 plots the shift of the distribution: for Austria and Germany, mainly the top 1-2% are affected.
The regression and robust regression method are successful in combining the information from the HFCS as well as the rich lists and lead to almost indistinguishable results. Vermeulen (2018) compares several other methods and also concludes that the regression method including observations from the Forbes list is best suited.
22 However, we do not find a consistent pattern across methods for the two countries: In Austria, the wISE and wPDC estimators imply the lowest tail wealth whereas for Germany these methods yield the largest tail wealth. Results are quite sensitive towards this choice. In particular for Austria, the gap between the lowest observation on the rich list and the vast majority of observed wealth holdings in the HFCS is substantial (partly as a consequence of lacking an oversampling strategy), and judging which method is the most appropriate is challenging. A more flexible method than what is used here and in the current literature, and which does not impose a constant shape parameter for the entire tail, may be more appropriate. We leave this for future research.
24 To plot the Lorenz curve, among the 100 simulated Pareto samples the one selected is that which best fits the theoretical Pareto distribution according to a Kolmogorov-Smirnov test. Gini coefficients reported here are the average coefficient across all 100 replicates.

Fig. 5 Estimated Pareto distributions
Notes: The figures plot the (empirical) complementary cumulative distribution function (which is defined as 1-CDF) on a log-log scale. Survey observations as well as observations from rich lists are included. The graphs are used to decide which estimation method is most appropriate for each country. Sources: HFCS, Trend, Manager Magazine
In Austria, the top 1% is estimated to own roughly 43% of total wealth whereas this number is estimated to be 25% when relying on the survey only. In the case of Germany, this kind of inequality increases from 24% to 36%. Together all millionaires in Austria own roughly 54% of total wealth (compared to 37% when relying on the survey only) and in Germany 48% (compared to 39% when relying on the survey only). Likewise, household-level Gini coefficients increase from 72% to 80% in Austria and from 76% to 80% in Germany. Figure 6 shows household-level Lorenz curves before and after adjusting for the missing wealthy. The Lorenz curve is read as follows: The poorest x% of the households hold y% of total wealth (Fig. 7).
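For readers who want to reproduce inequality measures of this kind, the following sketch computes Lorenz curve points and a Gini coefficient. It is unweighted and assumes non-negative values for simplicity, whereas survey-based figures require household weights and must handle negative net worth; the function names and the toy data are hypothetical.

```python
def lorenz_points(values):
    """Points (share of households, share of total wealth), poorest households first."""
    xs = sorted(values)
    total = sum(xs)
    points, cum = [(0.0, 0.0)], 0.0
    for i, x in enumerate(xs, start=1):
        cum += x
        points.append((i / len(xs), cum / total))
    return points

def gini(values):
    """Gini coefficient of non-negative values via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    ranked_sum = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * ranked_sum / (n * sum(xs)) - (n + 1.0) / n

equal = gini([1.0, 1.0, 1.0, 1.0])         # everyone holds the same wealth
concentrated = gini([0.0, 0.0, 0.0, 1.0])  # one household holds everything
curve = lorenz_points([0.0, 0.0, 0.0, 1.0])
```

Each Lorenz point reads exactly as described in the text: the poorest x% of households (first coordinate) hold y% of total wealth (second coordinate).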
These findings are in line with prior analyses based on the first wave of the HFCS (fieldwork conducted in 2010 and 2011), which consistently find that adjusting for the missing wealthy pushes up measured total net worth and in particular the net worth at the very top. For Austria, Eckerstorfer et al. (2016) measure an increase in aggregate wealth of 28 percentage points. The share of the richest 1% rises by 15 percentage points to 38%. Vermeulen (2016) finds that the share held by the top 1% increases by 6-7 percentage points in Germany and by 8-11 percentage points in Austria when adding observations from the Forbes list.
Notes: The implied tail wealth is calculated using formula (1) and is reported in mEUR. Estimation is performed for the combined survey and rich list data
Notes: Amounts are in mEUR. Shares (e.g., wealth of the top 1%) refer to the relative amount of (un-)adjusted survey totals of households but not individuals. Likewise, the Gini coefficient is calculated at the household level. CoV refers to the coefficient of variation, i.e., the standard deviation divided by the mean. Tail refers to households with a net worth of at least one million EUR. Analytical refers to the theoretical mean using the estimated parameters and formula (1). Simulation results are based on M = 100 draws from the respective Pareto distribution. "pp" refers to percentage points. Source: HFCS
Rich list observations can also be used as a tool to assess the success of a survey in capturing the full wealth distribution. According to the rich lists, there are 31 billionaires in Austria (according to the Forbes list, in 2014 there were 11 and in 2015 only eight USD billionaires) and 142 in Germany. In the survey, however, not a single billionaire has been interviewed, indicating that the HFCS fails to capture this segment of the population, which is very important in terms of aggregate wealth. When drawing from the Pareto distribution, we create on average eight billionaires in Austria and 121 in Germany, which matches the numbers on the rich lists reasonably well.
Furthermore, we simulate households to assess the quality of the HFCS sampling strategy with regard to the top tail. In fact, most HFCS countries try to oversample the wealthy as it is known that their influence on total wealth is substantial while at the same time these households are less likely to participate in surveys. The success of an oversampling strategy can be measured by the effective oversampling rate (see also HFCN 2016a), which is defined for the top x% as

$$\text{EOR}(x) = \frac{S(x) - x}{x},$$

where $S(x)$ is the share of sample households among the wealthiest x%.
If, e.g., the share of very wealthy (top 1%) households in the net sample is exactly 1%, then the effective oversampling rate of the top 1% would be zero. In general, oversampling is thus successful when the effective oversampling rate is greater than zero.
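The effective oversampling rate can be computed directly from net worth and survey weights. The sketch assumes the rate equals (S − x)/x, where S is the unweighted share of sample households that belong to the weighted top x%; this form is consistent with a rate of zero when the sample share equals x. Function and variable names are hypothetical.

```python
def effective_oversampling_rate(net_worth, weights, top_share):
    """(S - x) / x, with S the unweighted sample share of the wealthiest x = top_share."""
    order = sorted(range(len(net_worth)), key=lambda i: net_worth[i], reverse=True)
    total = sum(weights)
    cum, n_top = 0.0, 0
    for i in order:                       # walk down from the richest household
        if cum >= top_share * total:      # the weighted top x% is already filled
            break
        cum += weights[i]
        n_top += 1
    sample_share = n_top / len(net_worth)
    return (sample_share - top_share) / top_share

net_worth = [float(i) for i in range(1, 101)]   # hypothetical sample of 100 households

# Equal weights: the top 10% of the population is 10% of the sample, rate close to 0
no_oversampling = effective_oversampling_rate(net_worth, [1.0] * 100, 0.10)

# The 20 richest households carry small weights: they form the top 10% of the
# population but 20% of the sample, so the rate is close to 1 (i.e., 100%)
weights = [2.25] * 80 + [1.0] * 20
oversampled = effective_oversampling_rate(net_worth, weights, 0.10)
```

A negative rate arises in the opposite case, when wealthy households carry above-average weights because fewer of them were interviewed than the sample design implies.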
Austria is one of the very few countries not oversampling at all, resulting in a negative effective oversampling rate of -33% for the top 1%, which in turn means that there are (substantially!) fewer observations in the top tail than implied by the sample design. Thus, every single observation at the top is up-weighted thereby increasing their impact. This likely has a negative effect on the precision of aggregate estimates. Germany uses regional indicators for their oversampling strategy yet no information on individual households. This results in an effective oversampling rate of 131% for the top 1%.
Fig. 7 The top tail of the wealth distribution
Notes: The figure plots the top tail of the Austrian and German net worth distribution once estimated using the HFCS data only and once using the semi-parametric Pareto model, i.e., the combination of observed survey data and a theoretical Pareto tail estimated from survey and rich list observations. Source: HFCS
Augmenting the survey with simulated observations sheds even more light on the sampling success. The Pareto adjustment changes the distribution at the very top and pushes up the 99th percentile in the case of Austria. For Germany, the 99th percentile does not change -only the total wealth of the top 1% as well as higher percentiles. Because of this shift in Austria, the effective oversampling rate drops from -13% to -50% after the adjustment and acknowledging the extra information provided in rich lists (see Table 7). This indicates that measuring the distribution of wealth in Austria would be much more reliable if an oversampling strategy were to be introduced.
As we focus here on households with a net worth of at least one million Euro, we also report the effective oversampling rate for the group of millionaires: 25 Similarly to the results for the top 1%, we find an effective oversampling rate of 171% for Germany and -14% for Austria.

Effects on instrument level
While the effect on net worth and the overall wealth distribution can be somewhat expected when Pareto estimates are used, the effect on instrument-specific coverage ratios and instrument-specific distributions is less clear. In general, we find that coverage ratios indeed increase when substituting the top tail by a Pareto model thus confirming the hypothesis of under-representation of the most affluent households in the survey. Table 8 summarizes the main results and Fig. 8 graphically shows the changes in coverage ratios for highly comparable instruments. We report point estimates calculated via the analytical method. Any additional distributional information as well as confidence intervals are obtained from the simulation method.
Overall, coverage is higher for Germany than for Austria. All ratios increase due to our adjustment. The impact on instrument-specific coverage ratios obviously correlates with the proportion of an asset type held by the wealthiest and the exact distribution within the tail. The impact on instruments that are less important for the wealthiest of the wealthy (deposits and bonds as well as liabilities) is lower, while the impact on assets which are largely held by the wealthy (funds and equity) is more pronounced. 26 In general, we find very similar patterns for Austria and Germany, although our methodology leads to larger changes for Austria. This is likely due to the differences in the German and Austrian HFCS in terms of oversampling: Our methodology includes an ex-post adjustment for missing wealthy households. As Germany has implemented an oversampling strategy, outcomes are more likely to better reflect the upper part of the distribution than in Austria (see Fessler et al. 2016, page 29f for a comparison between the Austrian and German HFCS). Waltl (2021) shows that measured differences between wealth concentrations in Austria and Germany can indeed be linked to a differential sampling success in these two countries.
Changes are most pronounced for equity. Wealthier households hold much larger shares in equity than the rest of the population and this share increases even further in the top quarter of the tail. The EG-LMM has already documented coverage ratios exceeding 100% when relying on HFCS data only. Most likely, the reason is that there are substantial conceptual and methodological differences between national accounts and the HFCS (see the discussion in Section 4). The Pareto adjustment of the wealth distribution pushes up these numbers even further. Thus, better aligning the HFCS and financial accounts definition of equity is needed when aiming for distributional indicators. It is important to note that confidence intervals are also largest for equity. Thus, not surprisingly, the sensitivity analysis in Supplementary Material B finds that modelling choices also have the largest impact for equity.
Next, we compute figures for financial wealth defined here as the sum of bonds, deposits, equity and mutual funds minus liabilities. Given that parts of equity are recorded as net positions in the HFCS but spread across all instruments in the financial accounts (gross recording), we would expect high coverage ratios (or even over-coverage) for this level of aggregation as the net value in the HFCS could also include real assets. Adjusting for the missing wealthy indeed leads to formally measured over-coverage of financial wealth mainly driven by the large increase of equity. For equity the differences in net and gross recording are not the only conceptual problem as, for instance, differences in valuation also play a crucial role (EG-LMM 2020).
One particular advantage of the simulation approach is that it allows us to calculate empirical confidence intervals. These intervals take into account the uncertainty related to the Pareto sampling as well as the portfolio bootstrapping. Figure 9a depicts the share of instrument-specific aggregates held by the tail population, i.e., jointly by all millionaires. Shares consistently increase as compared to shares calculated from HFCS data only. Changes are least pronounced for liabilities, bonds and deposits. We find that millionaires in Austria and Germany hold roughly 90% of total equity.
Similarly, Fig. 9b shows the share of instrument-specific aggregates held by the top 1% of the population. Again, shares strongly increase when applying our methodology. In comparison to the breakdown for millionaires (see Fig. 9a), changes are even more pronounced for equity and real estate for the top 1% as these asset classes are most important for the richest of the rich. While millionaires hold roughly 90% of total equity, the top 1% still hold roughly between 70% and 80%, revealing an extremely high degree of wealth inequality for this particular asset class.
Fig. 9 Breakdown of the wealth of the wealthiest. Notes: The figures show the shares (in %) of total assets and total liabilities held by the tail population, i.e., by all millionaires (Fig. 9a) and the top 1% of the wealth distribution (Fig. 9b). Simulation results also include 95% confidence intervals.
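Empirical confidence intervals of the kind mentioned above are commonly obtained as percentiles of the simulated statistic across replications. The following sketch assumes this percentile approach; the article does not spell out the exact construction here, and the simulated draws are random stand-ins rather than results from the HFCS.

```python
import random


def percentile(sorted_values, q):
    """Percentile (q in [0, 1]) of an ascending list, using linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def empirical_ci(draws, level=0.95):
    """Two-sided percentile interval from simulation draws."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    return percentile(ordered, tail), percentile(ordered, 1 - tail)


rng = random.Random(0)
# Stand-in for the simulated equity share of the tail (in %), one value per replication.
draws = [rng.gauss(90.0, 2.0) for _ in range(1000)]
lower, upper = empirical_ci(draws)
print(f"95% interval: [{lower:.1f}, {upper:.1f}]")
```

Each replication would, in an actual application, combine one Pareto sample with one bootstrap draw of portfolios, so that both sources of variation enter the interval.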

Truncated distributional national accounts
So far, we have discussed the impact of the missing wealthy within the HFCS. As a next step, we make use of external data from the financial accounts to benchmark the quantitative changes of our proposed methodology and demonstrate how such an exercise may lead to (truncated) DINA. We therefore match FA and survey data as outlined in Table 5. Figure 10 shows the truncated DINA results graphically. Tables E.14 and E.15 in the appendix report full numerical results. As we only adjust for the wealthy, it is the top quintile that receives much larger weights. At the same time, all other quintiles "lose" (as fewer euros are distributed to them due to scaling). Note that this is a mechanical effect that holds true whenever adjusting for the missing wealthy leads to larger instrument-specific aggregates in the top net worth group. The proof of this statement is given in Supplementary Material D.
Notes to Fig. 10 (see Table E.14): Shaded bars indicate shares without adjustments, i.e., $d_{ij}/y_j$, whereas full bars indicate adjusted shares, i.e., $d^{*}_{ij}/y_j$.
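The mechanical effect can be illustrated numerically. The sketch below assumes that scaling is proportional, i.e., every quintile's survey amount for an instrument is multiplied by the same factor so that the quintile amounts sum to the financial accounts aggregate. All amounts are hypothetical and the function names are introduced here for illustration only.

```python
def quintile_shares(amounts):
    """Share of each quintile in the survey total of one instrument."""
    total = sum(amounts)
    return [a / total for a in amounts]


def scale_to_aggregate(amounts, fa_total):
    """Distribute the financial accounts aggregate according to survey shares."""
    return [share * fa_total for share in quintile_shares(amounts)]


# Hypothetical survey amounts per net worth quintile (bottom to top).
unadjusted = [1.0, 3.0, 6.0, 15.0, 75.0]
# Adjusting for the missing wealthy raises the top quintile only.
adjusted = unadjusted[:4] + [125.0]
fa_total = 200.0  # hypothetical financial accounts aggregate

before = scale_to_aggregate(unadjusted, fa_total)
after = scale_to_aggregate(adjusted, fa_total)

for q, (b, a) in enumerate(zip(before, after), start=1):
    print(f"quintile {q}: {b:7.2f} -> {a:7.2f}")
```

Because the benchmark total is fixed, raising the top quintile's survey amount lowers the scaling factor applied to the unchanged lower quintiles, so each of them receives fewer euros after the adjustment.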
In general, we find very high degrees of instrument-specific inequality which is most dramatic for equity. Inequality is measured to be even larger when applying our corrections. For instance, the share of total mutual funds held by the top quintile is adjusted upwards from 74.2% to 83.4% in Austria and from 77.4% to 81.1% in Germany. Thus, ignoring the missing wealthy leads to a misleading picture of the distribution of wealth within societies.
Instrument-specific quintile shares are very similar between Germany and Austria, i.e., we find similar degrees of inequality for both countries even at instrument-level. Differences are larger before adjusting for the missing wealthy than after adjustment: the difference, measured by the sum of squared differences, decreases with the adjustment. This finding again suggests that measured differences between Austria and Germany in terms of wealth inequality may, at least to a certain extent, be driven by the different treatment of the wealthiest households in the survey: oversampling versus non-oversampling. Additionally, it shows that our methodology appears to be able to overcome parts of this shortcoming.
Furthermore, we compile distributional figures for total financial wealth split up by the same net worth quintiles as before, i.e., we sum up financial assets for each net worth quintile (bonds, deposits, mutual funds and equity) and deduct quintile-specific liabilities. A complete set of adjusted DINA figures is provided in Table 9.
The split-up is technically fully consistent with financial accounts aggregates: summing over instrument-specific split-ups leads to financial accounts aggregates (as a direct consequence of scaling) and likewise summing over the financial wealth split-up yields total financial wealth as reported by financial accounts. Figure 11 shows the result. In general, the top quintile holds the lion's share of total financial wealth whereas the bottom quintile has negative financial wealth in Austria and Germany. In both countries the 3rd quintile has less financial wealth than the 2nd quintile (in Germany the 3rd quintile even has negative financial wealth). This is because this part of the population to a large extent replaces financial wealth with non-financial wealth, i.e., these households own more real estate (non-financial assets), which is often financed via mortgages (financial liabilities).
Our adjustment further increases financial wealth for the top quintile while reducing the financial wealth of all other quintiles. For Germany, we find that the top quintile holds 1,480 billion EUR in financial wealth whereas the 4th quintile holds 108 billion EUR only. In Austria, the top quintile holds 250 billion EUR compared to roughly 6 billion EUR held by the 4th quintile.
Figure 12 shows average financial wealth per net worth quintile. In theory, each quintile consists of the same number of households (namely 20% of total households); however, due to survey weights the number of households per quintile varies to some extent: for Austria the number ranges between 771,819 and 772,994, and for Germany between 7,920,061 and 7,951,902. The average financial wealth per household varies strongly across the distribution: in Austria (Germany), a household belonging to the lowest quintile has on average -11,506 EUR (-25,127 EUR) compared to 325,920 EUR (186,789 EUR) for a household belonging to the top quintile.
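Why survey weights lead to slightly unequal quintile sizes can be seen in a small sketch. It assumes one particular assignment rule (a household belongs to the quintile in which the midpoint of its cumulative weight interval falls); the article does not state which rule is used, and all values below are hypothetical.

```python
def weighted_quintiles(net_worth, weights):
    """Assign households to net worth quintiles (0 = bottom, 4 = top).

    Assumed rule: a household falls into the quintile that contains the
    midpoint of its interval on the cumulative survey weight scale.
    """
    order = sorted(range(len(net_worth)), key=lambda i: net_worth[i])
    total = sum(weights)
    groups = [0] * len(net_worth)
    cumulative = 0.0
    for i in order:
        midpoint = cumulative + weights[i] / 2
        groups[i] = min(int(5 * midpoint / total), 4)
        cumulative += weights[i]
    return groups


def group_summary(values, weights, groups):
    """Weighted number of households and weighted mean of `values` per quintile."""
    households = [0.0] * 5
    totals = [0.0] * 5
    for v, w, g in zip(values, weights, groups):
        households[g] += w
        totals[g] += v * w
    means = [t / h if h > 0 else float("nan") for t, h in zip(totals, households)]
    return households, means


# Hypothetical sample: net worth and financial wealth in thousand EUR.
net_worth = [-5, 0, 10, 20, 50, 80, 200, 900]
weights = [120, 90, 100, 100, 90, 110, 150, 40]
financial_wealth = [-10, -2, 1, -3, 4, 8, 60, 400]

groups = weighted_quintiles(net_worth, weights)
households, means = group_summary(financial_wealth, weights, groups)
print(households)  # weighted household counts differ across quintiles
print(means)
```

Since each sampled household carries an indivisible weight, the weighted counts cannot hit exactly 20% of the population in every quintile, which is the source of the small deviations reported above.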

Conclusions
Core macroeconomic aggregates are compiled in the system of national accounts of which financial accounts form an integral part. Financial accounts are expected to be exhaustive and thus provide an important source of information on the entire economy. Yet, financial accounts aggregates per se do not provide any details about the distribution of assets and liabilities within the population. This gap is filled by making use of micro data collected from administrative sources or surveys. This article demonstrates how to use the Household Finance and Consumption Survey (HFCS) for such an endeavour.
In an ideal world, with no measurement errors or ambiguities in valuation, identical definitions of variables in the survey and financial accounts, a system of national accounts without disturbing balancing effects, and a perfect survey sample, identical aggregate results would emerge from either data source, i.e., a coverage ratio of 100%. In the real world, however, coverage ratios are far from perfect.
To compile distributional national accounts (DINA) that provide distributional information consistent with aggregates, we need to understand where this gap comes from. This article adds to the literature in this respect as it aims to quantify the impact of the missing rich on the (instrument-specific) gap between HFCS and financial accounts aggregates, and demonstrates how such a quantification can be used in the compilation of DINA.
We make use of the second wave of the HFCS and analyze the impact of the missing rich in the HFCS on conceptually highly comparable instruments (liabilities, bonds, deposits, and mutual funds).
We perform a case study for Austria and Germany as these countries do not have access to administrative data to strategically oversample the wealthiest households in the HFCS nor to perform plausibility checks and corrections on self-reported survey data. We would therefore expect the largest effects for these countries. Whereas Germany uses geographical data for oversampling, Austria entirely refrains from oversampling. We find that the Pareto adjustment is much larger for Austria than for Germany indicating that lacking an oversampling strategy negatively affects the reliability of the Austrian HFCS.
Previous findings in the literature suggest that the HFCS underestimates net worth at the top of the distribution. We therefore adjust the distribution by replacing the top tail by a Pareto model. The Pareto model is estimated based on a combined sample of observations from the HFCS and national rich lists.
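To make the idea of a Pareto tail concrete, the sketch below estimates the tail index by weighted (pseudo) maximum likelihood above a threshold and derives the implied tail aggregate. This is only one possible estimator and is used here as an assumption; the article's own estimation method and its treatment of rich list observations may differ. The synthetic sample merely checks that the estimator recovers a known parameter.

```python
import math
import random


def pareto_alpha(wealth, weights, threshold):
    """Weighted pseudo-ML estimate of the Pareto tail index above `threshold`."""
    weight_sum = 0.0
    log_sum = 0.0
    for x, w in zip(wealth, weights):
        if x >= threshold:
            weight_sum += w
            log_sum += w * math.log(x / threshold)
    return weight_sum / log_sum


def pareto_tail_total(alpha, threshold, n_households):
    """Aggregate wealth implied by a Pareto tail (mean exists only for alpha > 1)."""
    if alpha <= 1:
        raise ValueError("the Pareto mean is infinite for alpha <= 1")
    return n_households * alpha * threshold / (alpha - 1)


# Synthetic check: draw from a Pareto distribution with a known tail index.
rng = random.Random(42)
true_alpha, threshold = 1.5, 1_000_000.0
sample = [threshold * (1.0 - rng.random()) ** (-1.0 / true_alpha)
          for _ in range(20_000)]
weights = [1.0] * len(sample)  # equal weights in this synthetic example

alpha_hat = pareto_alpha(sample, weights, threshold)
print(f"estimated alpha: {alpha_hat:.3f}")
print(f"implied tail total: {pareto_tail_total(alpha_hat, threshold, len(sample)):.3e}")
```

A smaller tail index implies a heavier tail and thus a larger adjusted aggregate, which is why the choice of estimation method matters for the results.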
Additionally, we use portfolio structures observed in the HFCS to break down the Pareto-adjusted wealth at the top tail by instruments. For this purpose, we propose an analytical as well as a simulation approach based on stratified bootstrapping. Whereas the analytical approach is fast and easy-to-implement, the simulation provides additional valuable information on variation and distributional patterns within the tail. Results for aggregates are identical as the simulation method converges to the analytical approach.
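A stylised version of the simulation approach may help to fix ideas. The sketch assumes two wealth strata within the tail, portfolio shares expressed relative to net worth, and uniform resampling of portfolios within a stratum (survey weights are ignored for brevity). The strata bounds, portfolios and parameter values are hypothetical and do not come from the HFCS.

```python
import random


def draw_pareto(rng, alpha, threshold):
    """One draw from a Pareto distribution via the inverse CDF."""
    return threshold * (1.0 - rng.random()) ** (-1.0 / alpha)


def simulate_tail(rng, n, alpha, threshold, strata_bounds, portfolios_by_stratum):
    """Draw n tail households and break their net worth down by instrument.

    Each household receives a portfolio resampled from its own wealth stratum.
    """
    totals = {}
    total_wealth = 0.0
    for _ in range(n):
        wealth = draw_pareto(rng, alpha, threshold)
        stratum = sum(wealth >= bound for bound in strata_bounds)
        portfolio = rng.choice(portfolios_by_stratum[stratum])
        for instrument, share in portfolio.items():
            totals[instrument] = totals.get(instrument, 0.0) + share * wealth
        total_wealth += wealth
    return totals, total_wealth


# Hypothetical portfolios: shares of net worth; assets minus liabilities sum to 1.
portfolios_by_stratum = [
    [{"deposits": 0.3, "real_estate": 0.6, "equity": 0.2, "liabilities": 0.1},
     {"deposits": 0.4, "real_estate": 0.5, "equity": 0.2, "liabilities": 0.1}],
    [{"deposits": 0.1, "real_estate": 0.3, "equity": 0.7, "liabilities": 0.1}],
]
strata_bounds = [5_000_000.0]  # lower bound of the upper stratum

rng = random.Random(1)
totals, total_wealth = simulate_tail(rng, 5000, 1.5, 1_000_000.0,
                                     strata_bounds, portfolios_by_stratum)
print({k: round(v / total_wealth, 3) for k, v in totals.items()})
```

Without stratification, every tail household would be assigned the same average portfolio, which ignores that the composition shifts towards equity as wealth rises within the tail.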
We find that coverage ratios consistently increase when explicitly adjusting survey data to better reflect the top tail of the net worth distribution. Changes vary by instrument and are most pronounced for the conceptually not readily comparable item, namely equity. Although the "missing wealthy" notion explains up to 20% (but usually less than 10%) of the gap between HFCS and financial accounts aggregates, there still remains a substantial gap whose source needs further exploration. Our findings for equity point towards the urgent need to better align the HFCS definition of equity with the globally agreed definition in the financial accounts. At the same time, the appropriateness of the valuation concept for equity in the financial accounts (which is often based on book values) should be reconsidered.
We conduct extensive robustness checks which find that the most crucial assumption in our analysis is the choice of Pareto estimation method. The exact treatment of the rich list is also important. Less influential is the choice of the Pareto threshold and the inclusion of further portfolios to better represent portfolio structures of the wealthiest of the wealthy households. However, as we only rely on portfolio structures collected in the HFCS, results could in fact be even more extreme in the sense that equity could be even larger. We also find that refraining from stratification yields unreliable results and this emphasizes the importance of a more sophisticated approach than just taking a simple average, as proposed in this article.
We compile distributional account figures and show how our adjustments can be used to enhance them. We detect large instrument-specific inequality and find that the top quintile holds the lion's share of financial wealth whereas the bottom quintile has negative financial wealth. In Austria (Germany), a household belonging to the lowest quintile has on average −11,342 EUR (−25,127 EUR) compared to 323,397 EUR (186,789 EUR) for a household belonging to the top quintile. In general, patterns are very similar in Austria and Germany.
As a by-product, we gain new estimates on the distribution of net worth. In line with prior studies, we find that the HFCS substantially underestimates wealth inequality in Austria and Germany. Our analysis suggests that wealth inequality measured by a household-level Gini coefficient is higher than suggested by pure survey data: the Gini coefficient increases from 72% in Austria and from 76% in Germany to roughly 80% in both countries when adjusting for the missing wealthy.
In terms of instrument-level inequality, we find that the wealthiest 1% of the population owns 70-80% of total equity and roughly 30% of total real estate assets, but less than 10% of total liabilities. The top wealth quintile holds roughly 98% of total equity in Austria and 97% in Germany, whereas the bottom three wealth quintiles together, i.e., the lowest 60%, hold practically nothing in both countries. We also find very high degrees of instrument-specific inequality for mutual funds and bonds.
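For reference, a household-level Gini coefficient with survey weights can be computed from the weighted Lorenz curve. The sketch below uses the trapezoidal rule and assumes positive total wealth; it is a generic textbook formulation rather than the article's implementation, and the sample values are made up.

```python
def weighted_gini(values, weights):
    """Gini coefficient from the weighted Lorenz curve (trapezoidal rule).

    Assumes that total weighted wealth is positive.
    """
    pairs = sorted(zip(values, weights))
    total_weight = sum(w for _, w in pairs)
    total_value = sum(v * w for v, w in pairs)
    cumulative_value = 0.0
    area = 0.0
    for v, w in pairs:
        previous_share = cumulative_value / total_value
        cumulative_value += v * w
        current_share = cumulative_value / total_value
        area += (w / total_weight) * (previous_share + current_share)
    return 1.0 - area


# Hypothetical net worth (thousand EUR) with equal weights.
survey_only = [10, 50, 100, 500, 2000]
top_adjusted = [10, 50, 100, 500, 5000]  # only the top observation is raised
weights = [1, 1, 1, 1, 1]

print(f"survey only:  {weighted_gini(survey_only, weights):.3f}")
print(f"top adjusted: {weighted_gini(top_adjusted, weights):.3f}")
```

Raising wealth at the very top while leaving the rest of the distribution unchanged pushes the Lorenz curve further away from the diagonal, so the coefficient increases.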

Funding
Open access funding provided by Vienna University of Economics and Business (WU).

Data Availability
We use data from the Household Finance and Consumption Survey (HFCS), which can be accessed by researchers via a formal request to the ECB. Additionally, we use publicly available National Accounts data and observations from rich lists published by the journals Trend and Manager Magazine. The results published, and the related observations and analyses may not correspond to results or analyses of the data producers.

Declarations
Disclaimer We have carried out large parts of this work during our employment at and Sofie Waltl's subsequent consultancy work for the European Central Bank. This article should, however, not be quoted to represent the views of the ECB, the Deutsche Bundesbank nor any other member of the Eurosystem. The views expressed are those of the authors.

Conflict of Interests
There are no further potential conflicts of interest to declare.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.