1 Introduction

In many developed economies, income inequality has increased sharply during the recent decades (see, for example, OECD 2015). This increase is often connected with several intensive and intertwined processes, including the technological and skill bias change, globalisation, financial deepening, and so on (see ibid.). This paper considers the contribution of financial deepening to income inequality in developed economies, stressing the importance of its interaction with the difference between the (lending) interest rate (r) and the gross domestic product (GDP) growth rate (g).

Most theoretical models, at least of a closed economy, predict that the removal of financial constraints and increasing amounts of borrowing lead unconditionally to the reduction in income inequality (see, for example, Banerjee and Newman 1993; Galor and Zeira 1993) or at least for more developed countries due to the inverted U-shaped relationship in terms of development level (Greenwood and Jovanovic 1990). The previous econometric evidence, especially using earlier data periods, also corresponded rather well with these predictions (see, for example, Levine 2005; Clarke et al. 2006; Beck et al. 2007; Kim and Lin 2011). Recently, contrary empirical findings have started to accumulate evidence that a larger financial deepening may actually have increased inequality instead of reducing it (see, for example, Claessens and Perotti 2007; Kunieda et al. 2014; Denk and Cournede 2015; Haan and Sturm 2016; Jauch and Watzka 2016; Battisti et al. 2018; Brei et al. 2018), which is especially striking in developed economies where the downwards sloping part is expected according to the earlier predictions.

Concentrating on bank credit that is presumably less inequality-inducing than market-based financing (see Brei et al. 2018), this paper explores whether the ambiguity of these findings is related to the influence of the difference between the real interest rate and the GDP growth rate, which we show to be predicted by Kunieda et al. (2014) theoretical model due to the presence of certain non-linearity (an interaction between financial deepening and ‘\(r-g\)’).Footnote 1 Taken separately, this difference (\(r-g\)) plays a central role in the framework of (wealth) inequality as advanced by Piketty (2014) (see also Piketty and Zucman 2015). It is also the decisive factor in assessing whether an economy is dynamically efficient or inefficient (Abel et al. 1989). Relying on Galor and Zeira (1993), Battisti et al. (2018) also show empirically that the level of (world) interest rates itself might be an important state-determining factor for inequality. However, to our knowledge, the importance of \(r-g\) for income inequality due to its interaction with the degree of financial deepening has not been previously explored.Footnote 2

In terms of the Kunieda et al. (2014) model of open economies, the \(g-r\) term represents a synthetic index of realised opportunity for agents of an economy to join investors earning productivity-specific income that are larger than that of lenders who gain the same income (interests) irrespective of their potential productivity differences.Footnote 3 On the one hand, we show that the growth rate of an economy g defined by the model is linked to the participation intensity (as investors). Thus, higher growth rate signals better escape from being an insufficiently productive and thus staying as a pure lender who obtains a lower level of consumption. On the other hand, r itself is a threshold that must be passed and the higher are interest rates, the smaller mass of agents can be sufficiently productive to repay the borrowed capital with interests. Thus, \(r-g\) can be thought as jointly (synthetically) summarising the initial (objective) constraint (r) together with the actual realisation (as measured by g), just in the contrary direction of \(g-r\).

Intuitively, the difference \(r-g\) and, maybe even more importantly, its sign signal if an economy fails/succeeds to generate sufficient income growth to cover the obligations to the financial sector. In the latter case, the benefits from financing are spreading to the whole economy instead of concentrating mostly within the finance sector, that is, benefiting mostly the owners and workers of it. At the same time, greater net returns could make capital owners more capable and/or willing to share the surplus with workers in comparison with the situation where interest rates are squeezing their profits. Thus, the influence of the financial deepening on income inequality might be conditional on the sign and the size of \(r-g\).

A brief look at data already points to a potential role of this difference. Comparison of dynamic patterns of income inequality, bank credit, and \(r-g\) suggests (see Fig. 1 in Appendix A) that the median \(r-g\) and inequality levels share similar regime change in countries entering the Organisation for Economic Co-operation and Development (OECD) by around 1978–1980. At the same time, the median bank credit levels were quite steadily increasing during the period under discussion. This is likely to be one of the reasons for the varying results that may be found in the previously described empirical literature. Whenever one employed the early data period, an increasing level of bank credit could have been pointing to the reduction in inequality levels. Meanwhile, the later periods (or whenever their weight became dominant) would associate the increasing credit levels with the observed upwards shift in inequality.

This paper contributes to the empirical literature on income inequality along three main lines. First, it elaborates and extends the implications of the Kunieda et al. (2014) model of financially open economies for the link between financial deepening, income inequality (both the overall, Gini-based, and as measured by top income shares) and its interaction with \(r-g\). Second, it estimates the effect of financial development, \(r-g\), and their interaction on income inequality in several panels of relatively homogeneous developed economies. Third, it proposes a joint estimation procedure for the top income shares which exploits the fact that a higher income range is well characterised by the Pareto distribution.

Relying on the Kunieda et al. (2014) model, the implications for the top income shares are also derived, defining the conditions for the top income inequality to increase with the relaxation of the financial constraint. Consequently, both the overall inequality as captured by the Gini index and the top income inequality as measured by the top income shares received by the 1%, 5%, and 10% largest income earners will be taken under consideration. This paper is concentrated on these three shares, leaving out the even smaller ones, because a smaller share makes the precision of the corresponding income estimate likely to be less accurate.

Following the main prediction by Kunieda et al. (2014) that, in financially open economies, financial deepening leads to increasing inequality as measured by the Gini index, this paper further tests the significance of non-linearity emerging due to the interaction between the financial development and other components, which was not considered by these authors in their empirical application. Using a few approximations, it is revealed to be linked to the \(r-g\) impact on inequality through an interaction term with the financial deepening.

Then, an empirical analysis of the sign of the impact of ’\(r-g\)’ on income inequality is performed. Since income inequality is one of the sources of changing wealth inequality, our results are relevant for the Piketty theory. In the specification predicted by the model, the sign of the impact of \(r-g\) alone (without taking the bank credit into account) is negative (inequality decreasing), which is consistent with the view advanced in Krusell and Smith (2015), or Acemoglu and Robinson (2015) and which is in contrast with the Piketty prediction. However, the interaction term of \(r-g\) with the private bank credit share in GDP has the inequality-increasing effect whenever \(r>g\), and therefore, it is potentially consistent with the Piketty prediction provided that a sufficiently large bank credit penetration coexists with the previously defined condition.

It is important to stress that the analysis is performed for a relatively homogeneous set of countries. The consideration of many countries at various levels of economic development might be tailored (and very useful) for the identification of the factors that are crucial for development, that is, the factors explaining the differences of inequality at lower and higher income levels. However, this might hide the drivers of inequality in developed economies alone because they might be dominated or insufficiently strong to be observable in a mixed sample of countries.

Consequently, this paper concentrates only on the developed (open) economies. Namely, the two panels are under consideration of countries entering the OECD and the European Union (EU). The latter set of countries is more homogeneous in general, but the interest in the EU member states emerges mainly because of their highly bank-biased financing systems (Langfield and Pagano 2016) and also because of their higher mutual integration and similarity relative to other countries. Consequently, it is expected that similar principles apply in countries with a more uniform impact of financial deepening in terms of bank credit (Benczúr et al. 2019). Furthermore, consideration of a group of similar countries relaxes the need to control for many variables that would otherwise be importantly shaping the differences in development. Hence, a smaller set of other control variables is expected to be sufficient as compared with the case where a diversity of countries is under consideration.

Finally, from the methodological point of view, the contribution of this paper is the proposed simultaneous estimation of the impact of financial deepening measures on the top income shares that exploits the fact that a higher income range is well characterised by the Pareto distribution (see, for example, Atkinson et al. 2011). This simultaneous estimation is introduced to solve the problem of the small number of observations that is caused by that the number of countries with the data on top income shares is much scarcer.

The rest of the paper is structured as follows: Sect. 2 draws some predictions from the Kunieda et al. (2014) theoretical model of open economies and derives its implications for the top income shares. Section 3 describes the data and defines the econometric framework that we have employed. Section 4 presents the empirical results. Finally, Sect. 5 concludes this paper. Some additional details are delegated to Appendix.

2 Some theoretical implications

In the sequel, the implications from an open economy version of the model proposed by Kunieda et al. (2014) are employed to study the financial deepening effect on inequality. The model is used to motivate the specifications that will be estimated later. First, the summary of the model is presented in Sect. 2.1. Then, the implications for the Gini index are discussed in Sect. 2.2. Finally, the results for the top income shares are obtained in Sect. 2.3 (the respective derivations are presented in Appendix C).

2.1 A summary of the model

In a framework with (overlapping) generations of selfish individualsFootnote 4 living for the two periods and deriving their utility from individual consumption in the second period, Kunieda et al. (2014) introduce an endogenous growth model (due to learning by doing) with agents that are heterogeneous in their productivity at creating individual capital. In the first period the agents work and earn a homogeneous wage that is determined by the aggregate technology, whereas in the second period their earnings (which are spent on consumption) depend on individual productivity because persons endogenously choose at the end of the first period to become lenders (earning interest rates) or capital builders (earning from creation of capital) depending on their private productivity, which is distributed uniformly over [0, 1] and constant over time.

It should be pointed out that the (working) population mass (\(L_t\)) living in the first period is homogeneous, and therefore, the heterogeneity stems from the second period generation. Consequently, all of the results on inequality will concentrate only on it. Furthermore, in such a model inequality of consumption is directly indicative of the (final) income inequality. Nevertheless, following Kunieda et al. (2014), the discussion is centred around the consumption patterns.

At an individual level, an agent derives utility from the second period consumption (\(c_{t+1}\)) and, at the end of the first period, chooses to either lend in the second period the previously earned wage (\(w_t\)) or to invest it in the creation of individual capital (\(k_t\)), possibly with additional borrowing or lending (\(b_t<0\) or \(b_t>0\), respectively), depending on his/her random (uniformly distributed on [0, 1]) productivity (\(\phi \)), which is known to him/her but unobserved by others.

Given the credit constraint \( b_t \le \frac{\mu }{1+\mu }, \ \mu \ge 0\),Footnote 5 an agent solves the following problem:

$$\begin{aligned}&\underset{b_t,k_t}{\max }\ c_{t+1}, \text {subject to: } \\&c_{t+1} \le \phi q_{t+1} k_t + R_{t+1} b_t,\\&- \frac{\mu }{1-\mu }w_{t} \le b_t \le w_t,\\&k_t \ge 0,\ k_t+b_t\le w_t,\\&\text {given } q_{t+1}, R_{t+1}, w_t >0, \end{aligned}$$

where \(q_{t+1}\) and \(R_{t+1}\) are the next period’s real price of capital and the gross real interest rate paid/received for the borrowed/lent means, respectively. Different productivities define the choice of an agent to become a lender or a borrower of capital in this economy. Namely, when the individual productivity is sufficiently high to render the rate of return of real capital investment higher than the interest rate on borrowing/lending, an agent becomes a real capital investor and, therefore, also a borrower.

Defining a ratio \(\phi _{t}:=R_{t+1}/q_{t+1}\), the aggregate capital level that fully depreciates over a period is consequently given by

$$\begin{aligned} Z_{t+1} = k_tL_t\int _{\phi _t}^{1} \phi \mathrm {d}\phi = \frac{w_t(1-\phi _t^2)}{2(1-\mu )}L_t, \end{aligned}$$

because, for individuals with productivity \(\phi > \phi _{t}\), the optimal choice is to invest by choosing \(k_t=\frac{w_t}{1-\mu }\) and (borrowing) \(b_t=-\frac{\mu w_t}{1-\mu }\), whereas for less productive agents with \(\phi \le \phi _{t}\), it is optimal to lend the means (\(b_t=w_t\)) without any capital creation (\(k_t=0\)). Consequently, the consumption of those lending and borrowing–investing is given by

$$\begin{aligned} c_{t+1}=\phi _{t}\alpha A^\frac{1}{\alpha }w_{t} \end{aligned}$$
(1)

and

$$\begin{aligned} c_{t+1}(\phi )=\frac{\phi -\mu \phi _{t}}{1-\mu }\alpha A^\frac{1}{\alpha }w_{t}, \end{aligned}$$
(2)

respectively.

At the aggregate level, the first-order conditions under the perfect competition and the technological constraint of production

$$\begin{aligned} Y_t = AZ_t^\alpha H_t^{1-\alpha } \end{aligned}$$

yield further \(q_t=\alpha A^\frac{1}{\alpha }\) and \(w_t=(1-\alpha )A^\frac{1}{\alpha }Z_t/L_t\). Here, the labour force \(L_t\) is enhanced with a learning by doing-implied augmentation parameter (\(y_t=Y_t/L_t\)) that yields the augmented labour force \(H_t=y_tL_t\). Consequently, due to the described learning by doing and technological constraint, production satisfies \(Y_t=A^\frac{1}{\alpha }Z_t\). Assuming a constant labour force (\(L_t=L\)), the growth rate of such an economy is given by

$$\begin{aligned} g_{t+1} = A^\frac{1}{\alpha }\cdot \kappa \frac{1-\phi _t^2}{1-\mu } - 1, \end{aligned}$$
(3)

where \(\kappa = \frac{1-\alpha }{2}\). Hence, in an open economy where \(\phi _t\) is exogenously given by some \(\bar{\phi _t}<1\), which will be defined shortly, an increase in \(\mu \) enhances economic growth as long asFootnote 6\(\mu <1\).

2.2 Implications for the Gini index

In a small open economy version of the characterised economy, the relaxation of the borrowing constraint \(\mu \) produces a larger relative amount of financing and increases the consumption inequality as measured by the Gini index (see Proposition 2 in Kunieda et al. 2014).

Namely, the Gini index is given (see ibid.) by

$$\begin{aligned} G_t = \frac{2{{\bar{\phi }}}_{t-1}^3-3{{\bar{\phi }}}_{t-1}^2+1}{3\left( {{\bar{\phi }}}_{t-1}^2-2\mu {{\bar{\phi }}}_{t-1}+1\right) }, \end{aligned}$$
(4)

where \(\mu \) represents the financial deepening and

$$\begin{aligned} {{\bar{\phi }}}_{t-1} = \frac{{\bar{R}}_t}{\alpha A^\frac{1}{\alpha }} \in (0,1). \end{aligned}$$
(5)

Here \({\bar{R}}_t>0\) stands for the gross (real) borrowing costs that are exogenously given, whereas \(\alpha \in (0,1)\) and \(A\ge 0\) are the aggregate capital-linked parameter and the total factor productivity in the previously defined Cobb–Douglas production function, respectively.

Provided that the ratio is non-negative,Footnote 7 it is indeed clear from Eq. (4) that the increasing financial deepening (\(\mu \)) increases inequality. However, the impact of \(\mu \) on G depends also on the value of \({{\bar{\phi }}}_t\) due to the interaction term in the denominator of Eq. (4), that is, the product term \(\mu {{\bar{\phi }}}_{t-1}\). Motivated by this, in the empirical estimations the additional interaction terms of financial deepening with the real interest rates and growth rates will be used as proxies for the influence connected with the nominator and denominator of Eq. (5).

In particular, up to the first-order effects, we can obtain the following formal guidance. First, let \( r_t : = {\bar{R}}_t - 1\). From Eqs. (3) and (5), it follows that

$$\begin{aligned} {{\bar{\phi }}}_{t-1} = {\eta (\mu )} \frac{1+ r_t}{1+g_t} - {\eta (\mu )}\frac{1+ r_t}{1+g_t}\cdot {{\bar{\phi }}}_{t-1}^2, \end{aligned}$$
(6)

where \(\eta (\mu ) = \frac{\kappa }{\alpha (1-\mu )}\). Since the positive \({{\bar{\phi }}}_t <1\) [see the condition in Eq. (5)], it also follows that the second term on the right side of Eq. (6) with the \({{\bar{\phi }}}_{t-1}^2\) is only of the second order.Footnote 8 Hence, noticing that \(\frac{1+ r_t}{1+g_t}=1+\frac{r_t-g_t}{1+g_t}\) and considering the first-order effects, the following approximation of the interaction term emergesFootnote 9

$$\begin{aligned} \mu {{\bar{\phi }}}_{t-1} \ \approx \eta \mu + \eta \mu \frac{r_t-g_t}{1+g_t}, \ \eta =\frac{1-\alpha }{2\alpha }, \end{aligned}$$
(7)

leading to the appearance of \(r-g\) in the measurement of the impact. It can be further observed that, because the value of \(g_t\) is in most cases close to zero, \(\frac{r_t-g_t}{1+g_t}\) and \(r_t-g_t\) will usually be good proxies for one another.

It is clear that, given the highly stylised model and highly nonlinear relationship in Eq. (4), we cannot expect the functional forms to hold exactly, but we at least expect the signs of the interaction terms to be correct, that is, inequality increasing whenever \(r_t-g_t\) is greater than zero and vice versa. In the sequel, both the linear and the interaction terms appearing on the right side of Eq. (7) will be under consideration in the empirical estimations.Footnote 10

2.3 Implications for the top income shares

This section will give the main results about the top income shares. Since they hold for each fixed period, its index is dropped hereafter for the sake of simplicity of presentation providing the generic case. The proof of Proposition 1 is contained in Appendix C, whereas some shorter derivations are presented in the footnotes.

Let \(p\in [0,1]\) stand for a population shareFootnote 11 and the consumption share of \(1-p\) largest consumers by \(S_p\in [0,1]\). Proposition 1 states the main implication derived for the top income shares from the Kunieda et al. (2014) model, which is summarised in Sect. 2.1, whenever some \(p\in [0,1]\) is under consideration that satisfies the condition

$$\begin{aligned} p \ \in \ ({{\bar{\phi }}},1). \end{aligned}$$
(8)

Proposition 1

In the open economy version of the Kunieda et al. (2014) model, for any p that satisfies condition (8), the share of consumption of the \(1-p\) largest consumers is given by

$$\begin{aligned} S_p&= (1-p)\frac{1+p-2\mu {{\bar{\phi }}}}{{{\bar{\phi }}}^2-2\mu {{\bar{\phi }}}+1}. \end{aligned}$$
(9)

Corollary 1

Given \(\mu <1\), the necessary and sufficient condition for \(\frac{\mathrm {d}S_p}{\mathrm {d}\mu } \ge 0\) isFootnote 12

$$\begin{aligned} p \ \ge \ {{\bar{\phi }}}^2 . \end{aligned}$$
(10)

It is obvious that the condition (10) holds whenever the initial condition required in Eq. (8) is satisfied, because of the restriction given in Eq. (5). In practical terms, when the value of the considered p is larger, it becomes more feasible that the financial development will be inequality increasing, that is, top income shares will raise with the financial deepening.

It is again clear from Eq. (9) that a higher financial penetration (a larger \(\mu \) value) affects inequality through the interaction term \(\mu {{\bar{\phi }}}\). Therefore, similarly to the case of the Gini index, we again can expect that the (interaction) terms defined in Eq. (7) will be of importance.

3 Data and econometric specifications

3.1 Data

In the sequel, we consider the impact of financial deepening on income inequality by separating the later one into the overall inequality—as represented by the Gini index—and the top income inequality—as represented by the top income shares. The Gini index data on market income inequality are taken from the Standardised World Income Inequality Database (SWIID) that provides inequality estimates for a large set of countries for quite long periods.Footnote 13 The top income data are taken from the World Wealth and Income Database (WID).Footnote 14

The financial deepening variable under consideration is the (logarithm of) domestic bank credit to private sector relative to GDP. Two data sources are used for the bank credit: the World Bank Global Financial Development Database (GFDD) and the Bank for International Settlements (BIS) Credit to Non-financial Sector database. The GFDD provides information on a larger set of countries and therefore will be used in the basic estimations, whereas the BIS data being adjusted for structural breaks are available for a smaller set of countries and will be used in robustness checks.

In addition to the real interest rates and GDP (per capita) growth rates, the usual set of further control variables to be used in robustness analysis includes initial income per capita, government consumption expenditure to GDP, trade openness to GDP, human capital, intensity of redistribution, inflation (of consumer prices), and the Chinn and Ito (2006) index of capital account openness.Footnote 15 The WDI database is a source of initial income, government consumption expenditure to GDP, trade openness to GDP, GDP per capita growth rates, and real interest rates of loans. The financial openness indicator is downloaded from the Chinn and Ito indicator website.Footnote 16 Intensity of redistribution is measured by the absolute reduction in Gini index from market to the net Gini using the respective SWIID data. The data on human and real capital stem from the Penn World Table.Footnote 17 Appendix B contains more details of the variables and data sources that we used.

The logarithmic transformation is applied to most of the variables, except for the interest rate, growth rates, capital openness, and inflation. The inverse hyperbolic sine transform of CPI-based inflation is used due to the presence of negative values of inflation in some cases.

Some additional instrumental variables will be employed in the robustness analysis including the legal origin, latitude, the political risk index of the International Country Risk Guide (ICRG), the durability of the stable period without severe political turmoils/changes, the share of foreign banks among total banks, and the real interest rate of US as a (good) proxy of world interest rate (see Battisti et al. 2018).

For the basic empirical estimations that are presented hereafter, all of the available data will be employed. However, the resulting panels are highly unbalanced: for a few countries and variables the series begin as early as 1965, but in most of the cases the data are only available from much later periods. Therefore, the effective number of data varies substantially both with the sets of countries under consideration (OECD and EU) and in the particular set of (control or instrumental) variables. In the tables that will follow both the effective number of observations and the countries under consideration will be always reported.

Given that most of the control variables were insignificant when used with the main specification of interest (containing the interaction term of \(r-g\) and credit) and at the same time reducing further the number of observations and degrees of freedom, the main estimations are presented in Sect. 4.1 only with the main variables of interest in the regression functions. The sensitivity analysis with additional specifications and control variables is discussed in Sect. 4.2 with the related results reported in Online Appendix: Robustness checks.

3.2 Econometric specifications

In this section, we will present the econometric models that were applied for the analysis of the impact of financial deepening on overall inequality measured by the Gini index and the top income shares, respectively. A standard dynamic panel model is employed to model the former one. In the case of the top income shares, there are only data available for a few countries (e.g. only seven from the EU). Therefore, only the OECD case is considered because it comprises the largest (effective) set of thirteen countries. Since the number of countries is still small even in this case, a special framework is developed that overcomes the issue of a small number of observations by jointly estimating the underlying parameters of financial deepening impact on the top income sharesFootnote 18 by relying on certain properties that are implied by the Pareto distribution, which is well known to reasonably characterise the actual income data (see, for example, Saez 2001; Atkinson et al. 2011).

3.2.1 Models of the Gini index

Let \(i\in \{1,2,\dots ,N\}\) and \(t \in \{1,2,\dots ,T\}\) stand for the country and period indices, respectively. Relying on the insights from the previous sections, the following econometric model underlies the evaluation of the impact of financial deepening on overall inequality of income in countries as measured by the Gini index:

$$\begin{aligned} \alpha (L)G_{i,t} = \beta _0 B_{i,t}+\beta _1 h(r_{i,t},g_{i,t})+\beta _2 B_{i,t}h(r_{i,t},g_{i,t})+{\varvec{\theta }}^\prime {\varvec{x}}_{i,t}+\lambda _i+\varepsilon _{i,t},\nonumber \\ \end{aligned}$$
(11)

where

\(\alpha (L)= 1-\alpha _1 L-\dots -\alpha _k L^k\) is a lag polynomial of order \(k\in N\) with real-valued parameters;

\(G_{i,t}\) stands for the (natural) logarithm of the Gini index;

\(B_{i,t}\) denotes the logarithm of private bank credit to GDP;

\(h(r_{i,t},g_{i,t})\) signifies either \(\frac{r_{i,t}-g_{i,t}}{1+g_{i,t}}\) or \(r_{i,t}-g_{i,t}\), where \(r_{i,t}\) represents the real bank lending interest rates and \(g_{i,t}\) denotes the real growth rate;

\({\varvec{x}}_{i,t}\) comprises various other time-varying control variables;

\(\lambda _i\) stands for the country-specific fixed effects;

\({\varvec{\beta }}=(\beta _0,\beta _1,\beta _2)^\prime \) and \({\varvec{\theta }}\) are real-valued vectors of parameters of proper lengths;

\(\varepsilon _{i,t}\) denotes the usual zero-mean error term.

Specification (11) is informative about the direction of the impact of \(r_{i,t}-g_{i,t}\) on inequality. It is clear that its impact will be determined by \((\beta _1 +\beta _2 B_{i,t})\). In connection with the Piketty prediction of \(r-g\) impact on inequality, the expected sign condition is \((\beta _1 +\beta _2 B_{i,t})>0\), whereas in the alternatives that were considered, for instance, by Krusell and Smith (2015) or Acemoglu and Robinson (2015), the opposite would be the case.

Given that model (11) is a dynamic panel model, the usual fixed effects estimator is inconsistent for fixed T due to the presence of incidental parameters (Nickell 1981). Furthermore, inequality is also highly likely to be endogenous with financial deepening, growth rate, and/or interest rates. Therefore, we employ the generalised method of moments (GMM)-based estimatorFootnote 19 originated for the dynamic panels by Arellano and Bond (1991) and extended by Arellano and Bover (1995) and Blundell and Bond (1998). In particular, we apply system GMM conditioning on both lags of differences and levels of the dependent variable. In the robustness analysis in Sect. 4.2, we also discuss the conditioning only on regular (non-GMM) instruments.

We use yearly data for the estimation of the models while avoiding pre-aggregation of initial data (e.g. into periods of 5 or 10 years). Although such an aggregation aims to capture the longer-term impact and the removal of business cycle effects, it is rather questionable because the business cycles can differ across time and countries both in terms of length and phases, whereas it can lead to pre-aggregation biases and result in a substantial reduction in number of observations.Footnote 20 However, the use of yearly data might result in higher lags of serial dependence and we can take this into account by either proper lagging of instruments or selecting the lag order of the polynomial \(\alpha (L)\) based on the significance of the autoregressive terms.Footnote 21

3.2.2 Modelling of top income shares

The Pareto distribution is known to well characterise the top tails of the income or wealth distribution (see, for example, Atkinson et al. 2011; Blanchet et al. 2017), and therefore, the top incomes can be properly characterised with a few parameters that, potentially, are time and country varying. Furthermore, it follows that the respective top income shares depend only on a straightforward function of the shape parameter of the Pareto (Type I) distribution besides the considered share itself (see, for example, Blanchet et al. 2017; Jones 2015; Jones and Kim 2018). Hence, allowing for some random deviation from the Pareto distribution-implied shares, the actual share of top income for any quantile level p is given by

$$\begin{aligned} s_{i,t}^{(p)}= (100/p)^{\eta _{i,t}-1 +\varepsilon _{i,t}^{(p)}} , \end{aligned}$$
(12)

where \(\eta _{i,t}\) is an inverse of the Pareto distribution shape parameter and \(\varepsilon _{i,t}^{(p)}\) stands for the p-specific zero-mean error; thus, for \(\varepsilon _{i,t}^{(p)}=0\) the observed shares are consistent with the Pareto distribution, whereas there is only an approximate correspondence otherwise. The term \(\varepsilon _{i,t}^{(p)}\) is assumed to be independent of the remaining part of the process under consideration with its variance shrinking towards zero as \(p\rightarrow 1\) quickly enough to ensure that \(s_{i,t}^{(p)}\in [0,1]\).

Simple transformation of Eq. (12) yields

$$\begin{aligned} {\tilde{s}}_{i,t}^{(p)} :=&\frac{\ln \left( s_{i,t}^{(p)}\right) }{\ln (100/p)} +1 \nonumber \\ =&\eta _{i,t}+ \varepsilon _{i,t}^{(p)}, \end{aligned}$$
(13)

and therefore, \({\tilde{s}}_{i,t}^{(p)}\) and \(\eta _{i,t}\) share the same conditional expectations, provided that \(\varepsilon _{i,t}^{(p)}\) is independent from the conditioning set for all p under consideration. Given some vector of explanatory variables \({\varvec{z}}_{i,t}\), suppose it is given by

$$\begin{aligned} \mathrm {E}(\eta _{i,t} | {\varvec{z}}_{i,t},\eta _{i,t-1},\eta _{i,t-2},\dots ,\eta _{i,t-k} )={\varvec{z}}_{i,t}^\prime {\varvec{\psi }}+ \alpha _1\eta _{i,t-1} + \alpha _2\eta _{i,t-2}+\dots + \alpha _k\eta _{i,t-k},\nonumber \\ \end{aligned}$$
(14)

which allows also for some autoregressive effects. Recall the notation of the lag polynomial \(\alpha _k(L)=1+\alpha _1L +\alpha _2L^2+\dots +\alpha _kL^k\), for some positive integer k. Then, the following system of equations of top incomes shares of \(1\%\), \(5\%\), and \(10\%\) of richest population will be under estimation

$$\begin{aligned} {\left\{ \begin{array}{ll} \alpha _k(L){\tilde{s}}_{i,t}^{(1)} = &{} {\varvec{z}}_{i,t}^\prime {\varvec{\psi }}+ \xi _{i,t}^{(1)}, \\ \alpha _k(L){\tilde{s}}_{i,t}^{(5)} = &{} {\varvec{z}}_{i,t}^\prime {\varvec{\psi }}+\xi _{i,t}^{(5)},\\ \alpha _k(L){\tilde{s}}_{i,t}^{(10)} = &{} {\varvec{z}}_{i,t}^\prime {\varvec{\psi }}+\xi _{i,t}^{(10)}, \end{array}\right. } \end{aligned}$$
(15)

where \(\xi _{i,t}^{(p)} = \nu _{i,t} + \alpha _k(L)\varepsilon _{i,t}^{(p)}, \ p\in \{1,5,10\}\), and \( \nu _{i,t} = \eta _{i,t} - \mathrm {E}(\eta _{i,t} | {\varvec{z}}_{i,t},\eta _{i,t-1},\eta _{i,t-2},\dots ,\eta _{i,t-k} )\).Footnote 22

It is clear that system (15) is featured by the cross-equation parameter restrictions. To estimate the parameters of such a dynamic system, we employ the GMM.Footnote 23 It should be pointed out that, in the approximate case whenever \(\varepsilon _{i,t}^{(p)}\not =0\), \(\xi _{i,t}^{(p)}\) is a moving average term that by the construction correlates with lags of \({\tilde{s}}_{i,t}^{(p)}\) for \(\alpha _p(L)\not =1\); hence, in the main specification, we exclude lags of \({\tilde{s}}_{i,t}^{(p)}\) from the conditioning set when estimating with the GMM and use only those of \({\varvec{z}}_{i,t}\). In such a case, the consistency of estimation does not even depend on the presence of serial correlation in errors. On the other hand, if shares were generated by the Pareto distribution, then \(\xi _{i,t}^{(p)} = \nu _{i,t}\) even with \(\alpha _k(L) \not = 1\); therefore, errors of the system under consideration would be serially uncorrelated. Consequently, the lags of \({\tilde{s}}_{i,t}^{(p)}\) would also be valid instruments and the over-identifying restrictions would not be rejected, for example, by the Sargan–Hansen test.Footnote 24 The empirical results that will be presented shortly look to be consistent with such a case.

4 Estimation results

In this section, the basic estimations with Gini index and top income shares are characterised in Sect. 4.1. Afterwards, various extensions and robustness checks are performed in Sect. 4.2.

4.1 Main findings

This subsection will first report the estimation results for the Gini index of market and net income. It will then present the findings for the joint estimation of the top income system of equations. In either case, the dependent variable is logarithmically transformed to reduce the heteroscedasticity of the errors. In all of the cases that will be presented, the robust inference relies on standard errors adjusted for clustering by countries while the employed number of instruments is set to be below 70% of the number of countries in order to stay sufficiently below the number of cross sections (see the discussion in Roodman 2009a,b). Further shrinkage of the relative number of instruments almost up to twice is explored in the robustness analysis in Sect. 4.2.Footnote 25 To sufficiently reduce the number of instruments to the intended level, we use the principal components of instruments.

Aside from a few marginal cases, the instrument admissibility in terms of over-identifying restrictions is not rejected at standard significance levels by the Sargan/Hansen test(s) practically for all specifications of main interest, i.e. involving the terms discussed in Sects. 2.2 and 2.3 . However, in a critical mass of cases with the Gini index the hypothesis of the absence of serial dependence of errors at second and third lag orders was rejected significantly.Footnote 26 Hence, the set of instruments here initiates from the fourth lag of the dependent variable.Footnote 27

Table 1 Estimation results with the Gini index of income (line two indicates the type of income)

Table 1 reports the results of estimation for the Gini indexes of market and net income. In columns (1)–(4), the specifications include only the (logarithm of) bank credit without the \(r-g\) and interaction terms. The coefficients have the expected positive sign predicted by Kunieda et al. (2014), but are marginally significant, if at all.Footnote 28

Next, columns (5)–(8) augment the results with the \(\frac{r_t-g_t}{1+g_t}\) term and its interaction with the bank credit variable. These two additional terms are significant and have the expected signs while the bank credit term stays insignificant. Removing it produces columns (9)–(12) where all series are significant at least at the 10% significance level. As expected, the sign of \(\frac{r_t-g_t}{1+g_t}\) is negative while at the same time its interaction with the (logarithm) of bank credit to GDP has a positive coefficient.

The results remain practically unchanged when, instead of \(\frac{r_t-g_t}{1+g_t}\) appearing in Table 1, and motivated by the arguments presented in Sect. 2, the unscaled \(r_t-g_t\) is used for the estimations (see Sect. 4.2). It is also of interest to point out that the \(r-g\) term does not proxy for any of its constituents: the hypothesis of equally sized parameters with the opposite sign cannot be rejected in the corresponding unconstrained regression.

The significance of the two terms implies that the sign of \(r-g\) impact on inequality is conditional and depends on the actual level of financial penetration.Footnote 29 For instance, taking the point estimates from column (9) of Table 1 when the market income inequality in countries from the OECD is considered, the prediction is that the inequality reducing impact of \(\frac{r_t-g_t}{1+g_t}\), conditionally on \(r_t>g_t\), is outweighed by the interaction term whenever the bank credit is greater than, approximately, 43% of GDP. Therefore, it might not be a mere coincidence that the median bank credit in the OECD countries plotted in Fig. 1 provided in Appendix A (see the top right figure) also passes this threshold around 1980, when the inequality starts rising and it also becomes a norm for the real interest rates to be greater than the corresponding real GDP growth rates. The analogous thresholds derived from estimates presented in columns (9)–(12) range from 43 to 55%, but the difference between them is statistically insignificant.

Turning to the top income inequality, Table 2 reports the related estimation results using the (effectively available) sample of OECD countries. Recall that the estimates rely on the GMM estimation of the system (15) using jointly the 1%, 5%, and 10% top income shares.Footnote 30

Table 2 Estimation results for the top income shares

Columns (1) and (2) show that, when taken alone, the effect of the financial deepening on top income shares is statistically significant,Footnote 31 but the coefficient is negative. It changes to positive when the two additional explanatory terms are added in columns (3) and (4). Here all variables have the expected signs and are statistically significant. Columns (5) and (6) additionally present the results when only the last two terms are retained, although this disregards the significance of the former term.

The significance of all three components in columns (3) and (4) implies that both the financial deepening and \(r-g\) influence on inequality are conditional. However, the \(r-g\)-linked threshold at which the bank credit initiates to have an inequality reducing impact is quite high (the growth rate g needs to be larger than the interest rate r by about 4–5 percentage points). Hence, at least historically, the higher levels of bank credit were mostly connected with the increasing inequality as measured by the top income shares.

In terms of the emerging non-linearity, the same pattern of impact appears as was found previously with the Gini index. However, the estimated levels of bank credit to GDP rendering the total impact of ‘\(r-g\)’ inequality increasing (given \(r>g\)) are higher and, as derived from columns (3)–(6), range between 76 and 81.

The findings provided in Tables 1 and 2 are consistent with the expectations stated in Sect. 2; there is a conditional (nonlinear) impact of financial deepening (as well as \(r-g\)) on inequality due to the presence of the interaction term.

4.2 Some further checks

In this subsection, we will check the robustness of the main findings reported in the previous subsection presenting the related tables in Online Appendix D: Robustness checks. This analysis is concentrated on the case of the Gini index because of larger number of observations available for it and because of some restrictions on the empirical implementation of the somewhat nonstandard model considered for the top income shares.

In the sequel, we perform eleven robustness checks which cover the estimations with different variables or data sources, with some additional control variables (among which are standard controls like initial income, openness, human capital, etc., as well as nonlinear terms of financial deepening, the US interest rate as a proxy of the world interest rate and/or other proxies of period effects), with the varying number of the GMM instruments and the usage of other estimators (including the two-step GMM and the estimation with regular (non-GMM) instruments that are supposed to be exogenous), as well as the estimation of the dynamic panel threshold model.

First, Table D1 reveals that the use of the \(r-g\) term instead of \(\frac{r-g}{1+g}\) yields practically the same results, as was already expected in Sect. 2.2. There is no change in the significance of coefficients, and even the point estimates of coefficients are barely affected.

Next, we explore the importance of the varying number of GMM instruments. In the basic estimation, the number of instruments made up, approximately, 70% of the number of cross sections (countries). Further shrinkage of the relative number of instruments to 60%, 50%, and 40% is considered in Table D2. Let us recall that the high correlation between linear terms with their interaction is an important aspect of our analysis, which increases the variance of estimators on its own. The instrumental variable estimation further increases this variance and more so whenever the correlation between instruments and explanatory variables becomes lower. Hence, a too small number of instruments coupled with the presence of multicollinearity might result in huge variance of the estimators and, consequently, apparent insignificance of coefficients. Despite this possibility, our results are quite robust to the decreasing number of observations retaining the correct sign of \(r-g\) and its interaction with credit along all changes, although the significance becomes weaker and the variation in point estimates greater when the number of instruments shrinks to 40%.

Table D3 investigates next the sensitivity of the results to the inclusion of other control variables (i.e. capital openness, initial income per capita, government consumption expenditure to GDP, trade openness to GDP, human capital, intensity of redistribution, and inflation). The main issue with such an exercise is the potentially insufficient number of degrees of freedom for precise estimation of many additional parameters. We proceeded therefore by investigating the significance of individual control variables by their stepwise inclusion. As can be seen, none of the considered control variables becomes significant having even contradictory signs in different specifications or even in the same specification when different (OECD and EU) samples are under consideration. At the same time, the two terms of primary interest always retain their expected signs and even point estimates remain quite similar. These results would suggest that, as expected in introduction, the countries under consideration are indeed quite similar in terms of inequality patterns after the conditioning on the financial deepening and \(r-g\) (as well as time-invariant factors), thus rendering the further controls less informative.

Table D4 controls for the potential presence of two additional non-linearities. First, it includes the square of (lagged) income in relation to the potential importance of the Kuznets curve, although in the sample of quite homogeneous developed economies we are unlikely to detect its presence. Second, it includes the squared term of the financial deepening itself (both log-transformed and non-transformed series are considered). None of the additional terms become significant while the non-linearity due to the interaction remains significant at least at the 10% significance level with the Gini index of market income.Footnote 32

Battisti et al. (2018) reveal that the behaviour of inequality depends on the level of global interest rates. They also show that the dynamics of the US interest rate is very similar to that of the world interest rate. We therefore include in Table D5 the US interest rate as an additional time-varying control variable in order to check if our findings are not driven by unconscious proxying of global interest rates with the \(r-g\) term.Footnote 33 This augmentation has changed the sign and significance of the credit term when considered alone [see columns (1)–(4)], but it is always insignificant and has no visible effect on the specifications with \(r-g\) and the interaction term. Hence, it looks that our result has its separate content.

Table D6 presents some further robustness checks of the inclusion of time-varying components in the main specifications. Namely, the trend and its square as well as the logarithm of the trend and the square of such a transformation are used as some ‘basis functions’ which linear combination can approximate the potentially changing dynamics common to all countries.Footnote 34 As in the previous analysis with the included US interest rate, only the bank credit (alone) is sensitive to such an inclusion changing even its sign, whereas the \(r-g\) and the interaction terms are barely affected as compared with the main findings in Table 1.Footnote 35

In our relatively small sample, the two-step GMM does not necessarily perform better (see the arguments in, for example, Hwang and Sun 2018); therefore, the main results presented in Table 1 relied on the one-step GMM estimator. The results of the two-step GMM are additionally reported in Table D7. Here the signs of the two main terms of interest remain correct; however, the significance is retained only when the market income Gini is under consideration.

The main results reported in Sect. 4.1 use the private bank credit series from the GFDD. Table D8 contains the outcome when the BIS credit series are used instead that are adjusted for structural breaks but are available only for a smaller set of countries. Although in a few cases the statistical significance is lost, the results in general are very similar to those seen in Table 1.

The main estimations provided in Table 1 relied on the (broader and longer) SWIID data set, but it might have some quality issues (see Jenkins 2015). Therefore, Table D9 additionally reports the robustness check obtained using only the high-quality WIID Gini dataFootnote 36 which has shorter series. Unfortunately, the available number of observations shrinks substantially: for gross income, the reduction is about four and six times in the OECD and EU samples, correspondingly, whereas for the disposable income the reduction is by more than 40% and 30%, respectively. Despite the much smaller number of observations, the signs of the estimates remain as shown in Table 1 in all the cases.Footnote 37 Since the estimation with the WIID high-quality data relies mainly on the data only after 1978Footnote 38 whenever \(r>g\) was predominant, the \(r<g\) state cannot be properly captured, and therefore, there is no wonder that the credit series now becomes always positive [see columns (1)–(8)] and also more significant than the \(r-g\) and/or the interaction term.

Next, Table D10 addresses the potential endogeneity issue using the following set of regular instruments. Besides the usual time-invariant characteristics representing the historical tradition (five dummies of British, French, German, Scandinavian, and Socialist legal origin) and the geographical positioning (latitude), we include further the political risk index of the International Country Risk Guide (ICRG), the (logarithm of) durability of the stable period without severe political turmoils/changes, the share of foreign banks among total banks, and, motivated by Battisti et al. (2018), the US real interest rate. In order to avoid potential repercussions, we drop the US from the OECD estimation sample (and it is absent from the EU sample by the definition). Apart from the first two instruments that do not vary over time, each time-varying instrument is additionally lagged (by one year) in order to further ensure the absence of potential endogeneity.

Table D10 presents the estimation results,Footnote 39 revealing that only in a few cases the significance is lost as compared with Table 1 and only for the specification whenever all the three explanatory variables are included. This is likely to occur due to the earlier discussed high correlation between the linear and interaction terms resulting in the increased variance of the estimator. Despite this, the results based on the instrumental variables seem to be reassuring, although the estimated thresholds of the conditional inequality-increasing impact of \(r-g\) now are spread over a wider range (from 36% to 82%).

Up until now we considered econometric models that are linear in terms of parameters, but nonlinear in terms of variables and lead to a nonlinear impact of financial deepening on inequality because of the presence of the significant interactionFootnote 40 between the credit penetration and ‘r-g’-driven terms together with the regime changes of the later series as characterised in Appendix A. Table D11 presents in addition the estimation results of a model nonlinear in parameters. Namely, the dynamic panel threshold model of Seo and Shin (2016) with a threshold in \(\frac{r-g}{1+g}\) is estimatedFootnote 41 using the implementation by Seo et al. (2019).Footnote 42 The requirement of balanced panel that is needed for the implementation by Seo et al. (2019) restricts substantially the effective number of observations in the OECD and EU samples. Therefore, we additionally included the results with a sample of all countries at hand and not only those from EU and/or OECD. Furthermore, fixing \(T=10\) to be able to capture at least some variation in variables over time, we searched over the available consecutive periods for a window with a balanced panel/subsample (from the whole sample of unbalanced panel) that maximises the number of observations. The results with all available countries are reported also for other consecutive periods that are outside the main selected window, but they have (much) smaller number of observations, like in the OECD and EU samples. The findings in larger samples seem to corroborate the main previous conclusion about the importance of \(r-g\) for the increasing impact of bank credit on inequality: the null hypothesis of linearity is significantly rejectedFootnote 43 in favour of the presence of a parameter changing threshold, and the impact of credit penetration on inequality either increases (when positive) or switches the sign from negative to positive whenever the threshold level around zero is surpassed.Footnote 44

5 Concluding remarks

Our analysis of both the Gini indexes of market and net income, and also the top income shares points to that the impact of the financial deepening as measured in terms of private bank credit on income inequality is conditional on the sign and size of the gap between the real (bank lending) interest rates and the real GDP growth rates. Our estimates show that, in the considered sample of open economies, the bank credit expansion under \(r>g\) increases inequality as measured by all indicators, which is consistent with the prediction derived from the Kunieda et al. (2014) model.

This is likely to be one of the reasons for the varying results that may be found in the empirical literature. Whenever one employed the early data period where the negative \(r-g\) values dominated, an increasing level of bank credit could have been pointing to the reduction in inequality levels. Meanwhile, the later periods were featured by the prevailing positive \(r-g\) that would associate the increasing credit levels with the observed upwards shift in inequality.

From the policy perspective, this implies that, in principle, inequality can be reduced by either shrinking the amounts of bank credit under the positive gap or seeking for some structural policies that, even under the same amount of bank credit, would ensure that the real GDP growth rates are above the real lending rates. The latter alternative seems to be very attractive because ‘bad credit’ might soon become inequality reducing under \(r<g\). Therefore, further understanding of the ‘\(r-g\)’ gap drivers is needed.

The presence of the established interaction term of bank credit with the gap also has some implications for the discussion about the sign of the \(r-g\) impact on income inequality. Whenever \(r<g\), the sign of \(r-g\) impact on inequality as measured by both the Gini index and top income shares is negative. Given that \(r>g\), which was the predominant case during the latest few decades, the impact of \(r-g\) on inequality is conditional on credit penetration. For instance, the impact of \(r-g\) on the Gini index of market income, as estimated from the OECD country sample, is negative until bank credit to GDP becomes greater than approximately 40% of GDP. Afterwards, the interaction-driven term dominates and larger \(r-g\) leads to increasing inequality. For the top income shares, this threshold seems to be higher.

The dependence of bank credit impact on income inequality on the \(r-g\) is fascinating because it implies that countries with the same path of credit can reach very different outcomes. For instance, if some Southern European countries faced additional risk premiums that increased this gap, then their inequality would be expected to reach higher levels than in countries with lower-risk premiums, given similar or even lower bank credit penetration. At the same time, this suggests that the financial integration of some Central and Eastern European countries into the EU could have softened the increase in inequality that was associated with the quick increase in bank credit penetration in these countries.

Nevertheless, it remains an open question not only what drives \(r-g\), but also if its different components (risk premium vs. base rate, growth rates of various GDP components, etc.) have the same influence on inequality and whether the interaction has different impact depending on the particular composition of finance.

Fig. 1
figure 1

Dynamics of median yearly values in the OECD member countries (MC) and the distribution shift of ’\(r-g\)’. Note: the availability of data for different countries varies over time