Introduction

In recent years, many governments have designed and implemented national strategies for financial education (OECD/INFE, 2013, 2015a, 2015b; OECD, 2013, 2016a, 2016b). They have done so because they are convinced of the advantages that financial literacy brings to citizens’ economic outcomes. Policymakers from different countries even believe that citizens would have suffered less from the recent crisis had they possessed more financial knowledge. In other words, “the crisis seems to have been a rather ‘efficient promoter’ of the importance of improving financial literacy” (OECD/INFE, 2009, p. 9).

Income inequality is another issue that has recently gained significant importance, so much so that reducing it is the tenth Sustainable Development Goal (UNDP, 2020). Several organizations point to income inequality as a problem that must be solved (especially within countries) to avoid economic disasters (World Inequality Lab, 2018; Oxfam International, 2017, 2018). In addition, the UNDP (2018) pointed out that “inequality in income contributes the most to overall inequality” (p. 4), and the latest Human Development Report focuses on inequality (UNDP, 2019).

Given the recent importance of both issues, we ask whether financial knowledge influences income inequality. Answering this question is our aim, especially given the shortage of empirical evidence on it (see the “Literature Background” section). One probable reason for this shortage is that surveys usually force individuals to choose the range of annual income that fits their situation rather than letting them report their exact annual income, which makes it difficult to calculate a measure of inequality such as the Gini Index. Another disadvantage of surveys is their lack of a longitudinal design (they usually have a cross-sectional or time-series design). Furthermore, even when the same survey is conducted periodically, the same individuals may not always participate or, when the units are countries, their number may change. In other words, attrition could occur. A further possible problem is subjectivity bias (all these issues are discussed in more detail in the “Empirical Analysis” section).

Recently, Oliver-Márquez et al. (2021) have created a financial knowledge index (FKI) whose design is longitudinal. This means that this FKI measures the financial knowledge of several countries over several years. Furthermore, this FKI overcomes the problems mentioned in the previous paragraph since it has not been created using surveys, but rather macroeconomic variables from reliable secondary sources (mainly the World Bank, UNESCO, and UNDP). Therefore, this FKI makes it possible to analyze how financial knowledge is related to variables that go beyond the intrinsic characteristics of individuals (gender, age, educational attainment, occupation, etc.) or the composition of their portfolio, which are those traditionally contained in surveys.

Specifically, in this paper we use this novel FKI to analyze whether financial knowledge influences the income inequality that exists in countries, measured through the Gini Index. Precisely because the FKI is longitudinal, we estimate panel data models, which are rare in the literature on financial knowledge. Estimating this type of model can be an advantage because panel data provide “more informative data, more variability, less collinearity among variables, more degrees of freedom and more efficiency” (Gujarati & Porter, 2009) in comparison with the cross-sectional (or time-series) models that could usually be estimated using survey data.

In what follows, we review previous related papers in the “Literature Background” section. After that, in our empirical analysis (the “Empirical Analysis” section), we describe the data and methods we use, and we present and discuss our results. Lastly, we present our conclusions in the “Conclusions” section.

Literature Background

Although the first formal definition of financial literacy was provided by Noctor et al. (1992), it was Mandell (1998) who set the starting point for the analysis of financial knowledge as it is known today in economic research. Subsequently, different contributions regarding its measurement, its determining factors, and its effect on people’s economic outcomes have emerged. In the face of the continuous proliferation of increasingly complex and accessible financial products and services, the OECD (2005) was the first international organization to point out the growing importance of financial knowledge.

So much so that many governments and policymakers pointed to financial illiteracy as one of the aggravating factors of the devastating economic consequences of the recent crisis (OECD/INFE, 2009). Given this scenario, it is not surprising that the OECD (2014, 2017, 2020) began to assess the financial capabilities of 15-year-old high school students. In addition, in recent years, national strategies for financial education have emerged in both developed and developing countries (OECD/INFE, 2013, 2015a, 2015b; OECD, 2013, 2016a, 2016b).

There are many previous works that have analyzed the effect of individuals’ financial knowledge on their socioeconomic status. For example, Lusardi and Mitchell (2005, 2007), O’Rand (2011), Van Rooij et al. (2012), Santos and Abreu (2013), Lusardi et al. (2014), Fisch et al. (2016), and Clark et al. (2017) declared that financial knowledge contributes positively to post-retirement well-being. Meanwhile, Lusardi and De Bassa Scheresberg (2013) showed that greater financial knowledge implies lower cost when borrowing. Likewise, Murendo and Mutsonziwa (2017) postulated that financial knowledge contributes to saving.

In fact, Lusardi (2008), Bucher-Koenen and Ziegelmeyer (2011), van Rooij et al. (2011), and De Bassa Scheresberg (2013) agreed that the most financially educated people obtain better economic and financial outcomes. Previously, Martin (2007) noted the positive implications of individuals’ financial knowledge for their retirement planning, homeownership, credit use, saving, and their financial outcomes in general. In other words, financial knowledge is important to avoid financial fragility (Lusardi et al., 2011; Lusardi & De Bassa Scheresberg, 2013; De Bassa Scheresberg et al., 2014a, 2014b; Lusardi & De Bassa Scheresberg, 2017).

In addition, previous literature offers numerous findings about the effect of financial knowledge on both the accumulation and the distribution of wealth (e.g., Bannier & Schwarz, 2018; Bernheim et al., 2001; Lusardi et al., 2017; van Rooij et al., 2011, 2012). However, the effect of financial knowledge on income inequality is a less explored issue, even though the OECD (2008) pointed out that being financially literate “is instrumental strengthening social fairness”. Also, the OECD (2013) suggested that national strategies for financial education could contribute to lower income inequality.

Nevertheless, there is not much empirical evidence on the effect of financial knowledge on income inequality, and the existing evidence does not build consensus. Campara et al. (2016) showed that financial knowledge has implications for poverty alleviation and, thus, for reducing income inequality. Likewise, Lo Prete (2018) established that financial knowledge is negatively and significantly linked with income inequality. But Kurihara (2013) concluded that financial knowledge does not affect income inequality.

Curiously, the latter two works (Kurihara, 2013; Lo Prete, 2018) use the same database, provided by the World Competitiveness Center (WCC). Specifically, this database contains two variables with which financial knowledge can be measured. The first is called “education in finance,” while the second is “economic literacy among the population.” The WCC uses these two variables along with many others to measure the competitiveness of countries. The problem is that both variables are created from surveys aimed at senior business leaders about their perception of individuals’ financial knowledge. Therefore, these two variables are biased because they are subjective (a perception).

In this paper, we contribute to building consensus around the influence of financial knowledge on income inequality by using the FKI as one of the explanatory variables of the Gini Index. An additional novelty is that, in addition to using the original version of the FKI, we use a squared version of this variable. In this way, we aim to find out whether there is a non-linear relationship between financial knowledge and income inequality. This issue has not yet been analyzed in the related literature. If this non-linear (or U-shaped) relationship exists, we could even determine in what range of values financial knowledge reduces/increases income inequality and if outside this range the relationship is inverse.
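To illustrate how such a range could be located, the following minimal sketch assumes a specification in which both the FKI and its square enter the same equation. This is an assumption made for illustration only, and the coefficient values are hypothetical. Under that assumption, the marginal effect of the FKI on the Gini Index is the linear coefficient plus twice the quadratic coefficient times the FKI, and it changes sign at the turning point.

```python
def marginal_effect(fki: float, beta_linear: float, beta_squared: float) -> float:
    """Derivative of beta_linear*FKI + beta_squared*FKI**2 with respect to FKI."""
    return beta_linear + 2.0 * beta_squared * fki


def turning_point(beta_linear: float, beta_squared: float) -> float:
    """FKI value at which the marginal effect on the Gini Index changes sign."""
    if beta_squared == 0:
        raise ValueError("no quadratic term: the relationship is linear")
    return -beta_linear / (2.0 * beta_squared)


# Hypothetical coefficients, NOT estimates from this paper.
b_linear, b_squared = -0.30, 0.25
fki_star = turning_point(b_linear, b_squared)

# Below the turning point, more financial knowledge lowers the Gini Index;
# above it, the sign of the effect reverses (U-shaped relationship).
effect_below = marginal_effect(0.5, b_linear, b_squared)
effect_above = marginal_effect(0.7, b_linear, b_squared)
```

Since the FKI ranges from 0 to 1, a turning point outside that interval would mean that the relationship is monotonic over the observable range.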

In addition to the FKI (and its quadratic version), we use other control variables that allow us to guarantee the robustness and reliability of our results and, consequently, of our conclusions. It is precisely the introduction of additional explanatory variables to the FKI that will also allow us to shed light on some issues not yet sufficiently agreed upon in the literature related to income inequality.

Empirical Analysis

To achieve our aim, we estimate panel data models in which the Gini Index acts as the dependent variable and the FKI as an explanatory variable, in addition to other explanatory variables. These additional control variables are introduced to provide robustness and reliability to our model, although they also help to shed light on some issues not yet sufficiently explored in the literature related to income inequality. In any case, we also carry out different pre- and post-estimation analyses that allow us to ensure that our results, and thus also our conclusions, are valid and reliable.

Why do we use panel data estimators instead of cross-sectional (or time-series) estimators? First, the FKI is longitudinal. Therefore, it makes sense to consider using panel data estimators, especially when individual effects are important (as we check in the “Estimations” section). Second, empirical evidence shows that using microeconomic variables (extracted from surveys) could present certain disadvantages, such as subjectivity biases or the difficulty of measuring income inequality because individuals choose a range of annual income rather than report their exact annual income (see the “Introduction” and “Literature Background” sections).
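The difficulty with income ranges can be illustrated with a minimal sketch. The incomes and brackets below are hypothetical, and replacing each income with the midpoint of its bracket is only one of several possible conventions. The Gini coefficient is computed with the standard formula for sorted data.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 means perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def to_midpoint(income, brackets):
    """Replace an exact income with the midpoint of the bracket it falls in."""
    for low, high in brackets:
        if low <= income < high:
            return (low + high) / 2.0
    raise ValueError(f"income {income} is outside every bracket")


# Hypothetical respondents and survey brackets.
exact = [12_000, 18_000, 19_500, 41_000, 95_000]
brackets = [(0, 20_000), (20_000, 50_000), (50_000, 100_000)]
bracketed = [to_midpoint(x, brackets) for x in exact]

g_exact = gini(exact)          # uses the exact incomes
g_bracketed = gini(bracketed)  # uses only the information a bracketed survey keeps
```

The two values differ because all variation within a bracket is lost, which is the measurement problem described above.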

Another disadvantage of micro-databases is attrition. That is, if the same survey is carried out periodically (which favors the use of longitudinal data), it is likely that individuals interviewed one year decide not to participate the next year (or refuse to answer certain questions). Also, panel data estimators tend to be more efficient than cross-sectional (or time-series) estimators. Specifically, panel data provide “more informative data, more variability, less collinearity among variables, more degrees of freedom and more efficiency” (Gujarati & Porter, 2009).

In addition, panel data estimators can be static or dynamic. We use two static estimators (FGLS and PCSE) and one dynamic estimator (Roodman, 2006, 2009). Dynamic estimators emerged to deal with possible endogeneity without having to replace the variables suspected of causing it (which implies losing information without any guarantee that the problem is really solved). Roodman’s estimator has some advantages over the dynamic estimators normally used (Arellano & Bond, 1991; Arellano & Bover, 1995; Blundell & Bond, 1998; among others). Specifically, Roodman’s estimator, in addition to estimating with instruments in differences, incorporates instruments in levels. This implies a lower loss of information compared to the rest of the dynamic estimators.

Furthermore, while most dynamic estimators use lags of the dependent variable as explanatory variables, Roodman’s estimator allows us not to do so. Thus, this estimator is more efficient, as well as more innovative and attractive, than the usual dynamic estimators, provided that the conditions for its validity are met. What are these conditions? On the one hand, there should be no second-order autocorrelation. On the other hand, there should be no problems of over-identification of the instruments (that is, using more instruments than are necessary, which implies that the estimate is suboptimal).

To check the first condition, Roodman’s estimator runs the Arellano-Bond test, whose null hypothesis is “there is no second-order autocorrelation.” To check the second condition, we must first ensure that the number of instruments is less than the number of groups. Roodman’s estimator also runs the Hansen test, whose null hypothesis is “identification restrictions are valid.” It is important to note that identification problems can be avoided by using an adequate individuals-to-periods ratio. To achieve this, grouping periods by biennium, triennium, lustrum, etc. is advisable (Mileva, 2007; Roodman, 2009). Accordingly, we group the data for our period (2008–2014) by biennium.
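As a rough illustration of how grouping by biennium shortens the time dimension, the sketch below keeps every second year of a hypothetical annual panel. Whether the observations are averaged within two-year windows or taken every second year is not specified here; the sketch assumes the latter, and the panel values and function names are invented for the example.

```python
def biennial_years(start: int, end: int) -> list:
    """Every second year from start to end, inclusive."""
    return list(range(start, end + 1, 2))


def keep_biennial(panel: dict, start: int = 2008, end: int = 2014) -> dict:
    """Keep only the (country, year) observations that fall on a biennial year."""
    keep = set(biennial_years(start, end))
    return {key: value for key, value in panel.items() if key[1] in keep}


# Hypothetical annual panel: {(country, year): FKI value}
panel = {(country, year): 0.5
         for country in ("A", "B", "C")
         for year in range(2008, 2015)}

reduced = keep_biennial(panel)
periods = sorted({year for _, year in reduced})
```

With seven annual observations per country reduced to four periods, the ratio of individuals to periods rises, which is the purpose of the grouping.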

Data

The FKI is the main variable in our analysis. It is an index that measures the financial knowledge that exists in different countries over different years. Therefore, it is a longitudinal index (combining cross-sectional and time-series data). Its values range from 0 to 1. An FKI equal to 0 means that there is an absolute lack of financial knowledge in the country. The opposite is true when the FKI is equal to 1.

Precisely because the FKI is not constructed from survey data, it allows us to analyze the relationship of financial knowledge with variables that go beyond people’s individual characteristics (such as gender, age, educational attainment, occupation, etc.) or the composition of their portfolio. Thus, it allows a macroeconomic turn in the analysis of financial knowledge. In any case, this FKI enables us to analyze whether there is any relationship (linear and/or non-linear) between financial knowledge and income inequality in the context of several countries for a given period.

Regarding the latter, the sample used by Oliver-Márquez et al. (2021) to construct their FKI addresses 63 countries during 1999–2014. This, together with the characteristics of the models to be estimated (especially Roodman) described in the previous section, leads us not to use each of the years that make up that period. In addition to all this, it should be considered that mixing periods prior to the 2008 crisis with periods after it (in which income inequality underwent abrupt changes) could call into question the quality, robustness, and reliability of the estimation results. For all these reasons, we use the period 2008–2014 per biennium. Table 7 in the appendix shows the values (in descending order) of the FKI for each of the countries in the sample for each of the years we use.

As for the rest of the explanatory variables, some of them are introduced to gain consistency in our estimations, since their results are expected a priori due to the high consensus that exists in previous literature. This is the case of economic growth, financial development, and tax burden (Rodríguez-Pose & Tselios, 2009; Faustino & Vali, 2013; Kus, 2012; Muinelo-Gallo & Roca-Sagalés, 2011; Seven & Coskin, 2016). We also use the Gini Index lagged 16 periods (the maximum we can lag without incurring a lot of missing data) to analyze whether income inequality is a dynamic phenomenon and to clarify some aspects of Kuznets’ and Piketty’s theses.
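Constructing such a lagged regressor can be sketched as follows. The sketch assumes that one period equals one year and that the data are stored as a dictionary keyed by country and year; both are assumptions for illustration, and the series is hypothetical.

```python
def lag_series(series: dict, lag: int = 16) -> dict:
    """Map each (country, year) to the value observed `lag` years earlier.

    Observations without a value `lag` years back are set to None,
    which is how missing data appear when the lag is too long.
    """
    return {(country, year): series.get((country, year - lag))
            for (country, year) in series}


# Hypothetical net Gini series for one country, 1990 to 2014.
gini_by_year = {("A", year): 30.0 + 0.1 * (year - 1990)
                for year in range(1990, 2015)}

lagged = lag_series(gini_by_year)
missing = sum(value is None for value in lagged.values())
```

In this toy series, every year before 2006 has no value 16 years earlier, which shows why a longer lag quickly increases the amount of missing data.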

The possible redistributive effect of variables such as inflation and over-aging is still under discussion in the related literature (Deaton & Paxson, 1994; Faustino & Vali, 2013; Monnin, 2014; Onafowora & Owoye, 2017; Sarel, 1997), which is why they are included in this paper. There is more consensus on the relationship between education and income inequality (De Gregorio & Lee, 2002; Rodríguez-Pose & Tselios, 2009, 2011). However, we focus on the difference between the number of years that a country’s law requires its citizens to study and the number of years they actually do so. The result obtained will allow us to draw conclusions about the governmental decisions that could be taken in this respect when the objective is to gain in economic equity.

Likewise, although Naveed and Wang (2018) evidenced the relationship between belonging to a certain religious group and income inequality, they did not manage to reach conclusive results about the influence that being an atheist has on income inequality, which is why we consider the variable “atheism.” Also, to prevent the effect of financial knowledge on income inequality from being influenced by the different development level of each country, we introduce a human development variable. But, to avoid multicollinearity problems, instead of considering the HDI value for each country, we construct an ordinal qualitative variable according to the specifications shown in Table 1.

For similar reasons, we introduce a variable for institutional quality. Otherwise, it could be argued that the relationship between financial knowledge and income inequality really reflects the presence of well-functioning institutions in countries where financial knowledge is higher.

Table 1 shows the definition and statistical-descriptive summary of each one of the variables that we use in our analysis.

Estimations

The conclusions we provide in this paper are supported by the results we obtain in our estimations. For this reason, we must ensure that our results are valid and reliable. To do so, we rely on the pre- and post-estimation analyses that we address in this section.

Verifying the assumption of normally distributed disturbances “is not essential if our objective is estimation only” (Gujarati & Porter, 2009). Even so, the Anderson–Darling test generates a p-value greater than 0.05. So, we can conclude that there is not enough statistical evidence to reject the null hypothesis (the disturbances follow a normal distribution) and, therefore, we assume that the normality assumption is met.

Confirming whether there are outliers is a more decisive issue because they could distort the values of the estimated coefficients. For that, we calculate Cook’s Distance for each of the observations that make up our sample. We did not find values greater than 1, which a priori means that there are no outliers (Hamilton, 2013). However, if we follow a stricter criterion (that is, one that depends on the number of observations, n, in the sample; see Hamilton, 2013), we find four points where \(D_i > 4/n\). In other words, there are four outliers.
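The two criteria can be compared with a minimal sketch. It takes Cook’s Distances that have already been computed and returns the positions of the observations that exceed either the conventional cut-off of 1 or the stricter cut-off of 4/n. The distances below are hypothetical and the function name is illustrative.

```python
def flag_outliers(cooks_d, strict=True):
    """Indices of observations whose Cook's Distance exceeds the cut-off.

    strict=True  uses the sample-size-dependent cut-off 4/n.
    strict=False uses the conventional cut-off of 1.
    """
    n = len(cooks_d)
    if n == 0:
        return []
    threshold = 4.0 / n if strict else 1.0
    return [i for i, d in enumerate(cooks_d) if d > threshold]


# Hypothetical distances for a sample of 10 observations (cut-off 4/10 = 0.4).
cooks_d = [0.02, 0.05, 0.45, 0.01, 0.10, 0.03, 0.60, 0.04, 0.02, 0.08]

conventional = flag_outliers(cooks_d, strict=False)
stricter = flag_outliers(cooks_d, strict=True)
```

As in the analysis above, a sample can show no outliers under the conventional rule and several under the stricter one, because 4/n shrinks as the sample grows.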

Related literature (Draper & Smith, 1998; Gujarati & Porter, 2009) tells us that outliers should only be rejected when they are the result of human carelessness (e.g., erroneous observations or registration errors). This is not the case here. Even so, we make all our estimations with and without outliers. Thus, we can find out whether the outliers are truly influential points that could distort the estimation. However, as will be seen later, they do not substantially affect our conclusions.

Multicollinearity can be detected through different methods. One of them is the Tolerance Index (1 − R²), whose value is around 0.20 in all our estimations. So, there are no signs that linear combinations of the explanatory variables are distorting the estimation of the coefficients, according to related literature (Kleinbaum et al., 1988; Gujarati & Porter, 2009). The inverse of the tolerance is known as the Variance Inflation Factor (VIF), which shows whether the variance of an estimator is inflated because of multicollinearity. Also, according to the same literature, since the VIF corresponding to each variable is less than 5 (and, therefore, so is the mean VIF), we can confirm that there is no multicollinearity that prevents the optimal estimation of the coefficients. Tables 2 and 3 show the VIF values.
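The link between the two indicators can be made explicit with a short sketch. Here R² denotes the coefficient of determination from regressing one explanatory variable on all the others; the values used are hypothetical and the function names are illustrative.

```python
def tolerance(r_squared: float) -> float:
    """Tolerance Index: share of a regressor's variance not explained by the others."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    return 1.0 - r_squared


def vif(r_squared: float) -> float:
    """Variance Inflation Factor, the inverse of the tolerance."""
    return 1.0 / tolerance(r_squared)


# Hypothetical auxiliary R-squared values for three regressors.
aux_r2 = {"x1": 0.50, "x2": 0.75, "x3": 0.80}
vifs = {name: vif(r2) for name, r2 in aux_r2.items()}
mean_vif = sum(vifs.values()) / len(vifs)
```

Note that a tolerance of 0.20 corresponds to a VIF of 5, so the two thresholds mentioned above describe the same boundary.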

More exhaustively, it is convenient to build a correlation matrix of all the variables involved to find out the degree of collinearity between them. If the correlation coefficient between two variables is significant and greater than 0.50, then there is collinearity. When this occurs between one of the explanatory variables and the explained one, it is good news. But when it occurs between two explanatory variables, it must be given special consideration in the rest of the analysis because it is an indication of endogeneity that we must solve (see Hamilton, 2013).

Table 4 shows the pairwise correlation values. The values marked with an asterisk are those for which there is enough statistical evidence to reject the null hypothesis that the individual correlation equals zero. But the truly important values are the shaded ones because they are greater than 0.50 and, therefore, could make it difficult to estimate the regression coefficients optimally. It is also important to note that the values in Table 4 have undergone the Sidak correction to avoid the multiple-comparison fallacy (Sidak, 1967; Hamilton, 2013).
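The Sidak correction can be sketched as follows. The adjusted p-value is 1 − (1 − p)^m, where m is the number of comparisons. The sketch assumes that m equals the number of distinct variable pairs in the correlation matrix; the number of variables and the p-value are hypothetical.

```python
def n_pairs(k: int) -> int:
    """Number of distinct pairs among k variables."""
    return k * (k - 1) // 2


def sidak_adjust(p: float, m: int) -> float:
    """Sidak-adjusted p-value for one of m comparisons."""
    if not 0.0 <= p <= 1.0 or m < 1:
        raise ValueError("p must lie in [0, 1] and m must be at least 1")
    return 1.0 - (1.0 - p) ** m


# Hypothetical matrix with 12 variables and an unadjusted p-value of 0.001.
m = n_pairs(12)
p_adjusted = sidak_adjust(0.001, m)
still_significant = p_adjusted < 0.05
```

A correlation that looks clearly significant on its own can lose significance once the number of simultaneous comparisons is taken into account.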

Table 1 Definition and descriptive-statistical summary of the variables we use in our empirical analysis
Table 2 VIF values
Table 3 VIF values (replacing FKI by FKI-squared)
Table 4 Correlations matrix

Heteroskedasticity is another issue we must address, and several tests can be used to diagnose it. First, the Ramsey test yields p-values above 0.05 for all estimations. Therefore, there is insufficient statistical evidence to reject the null hypothesis that the model has no omitted variables, i.e., we can affirm that there are no heteroskedasticity problems due to the omission of variables. Second, the Breusch-Pagan-Godfrey test generates p-values below 0.05. Thus, there is enough statistical evidence to reject the null hypothesis that the estimated variance of the disturbances depends on the values of the explanatory variables. Broadly, we can affirm that there are no heteroskedasticity problems of this kind. Third, we use the White test to verify whether the variance of the disturbances is constant (null hypothesis). Since the p-values are less than 0.05, there is enough statistical evidence to reject the null hypothesis and, therefore, we cannot affirm that the variances are homogeneous (that is, there is heteroskedasticity). Heteroskedasticity problems are controlled using robust standard errors.

Additionally, when working with panel data, we need to know whether individual effects are important. In this way, we can know whether using the Pooled Ordinary Least Squares (POLS) estimator is enough or whether, on the contrary, we should use other estimators such as Feasible Generalized Least Squares (FGLS) or Panel Corrected Standard Errors (PCSE), among others. To know whether individual effects are important, we use the Lagrange Multiplier Test for Random Effects, whose null hypothesis is that individual effects are not important. The p-values are below 0.05 in all our estimations, so there is enough statistical evidence to reject the null hypothesis. So, we must consider that these individual effects are important. Consequently, it is necessary to use estimators such as FGLS or PCSE.

Once we have determined that individual effects are important, we must decide whether they should be treated as random or fixed. For that, we use the Hausman test, whose null hypothesis is that there are no systematic differences between the coefficients. Since the p-values are below 0.05, there is enough statistical evidence to reject the null hypothesis and, therefore, to consider that there are systematic differences between the coefficients (that is, these individual effects must be treated as fixed). When the chosen estimator is that of fixed effects, it is necessary to check for heteroskedasticity problems. Therefore, we use the modified Wald test, whose null hypothesis is that the model is homoskedastic. Since the p-values are below 0.05, there is enough statistical evidence to reject the null hypothesis (that is, there are heteroskedasticity problems).

It is also important to check for autocorrelation problems. For this, we use the Wooldridge test, whose null hypothesis is that there is no first-order autocorrelation. Since the p-values are below 0.05, there is enough statistical evidence to reject the null hypothesis. That is, we assume that there are autocorrelation problems. Therefore, there are problems of both heteroskedasticity and autocorrelation. The FGLS and PCSE estimators have the advantage that they can solve both problems together. According to Beck and Katz (1995), when the number of individuals (n) exceeds the number of periods (t), PCSE is more consistent than FGLS. Although n > t in our sample, we use both estimators to underline the robustness of the models.

However, as noted above, FGLS and PCSE are static panel data estimators. These have the disadvantage of not being able to solve the possible endogeneity problems that the correlation matrix (see Table 4) warns us about. Also, we have explained above in detail the advantages that dynamic panel data estimators have over traditional methods (that is, using instrumental variables representative of those that could cause endogeneity) in solving possible endogeneity problems. Among all dynamic estimators, we highlight Roodman’s (2006, 2009). This estimator differs from other dynamic estimators (e.g., Arellano & Bond, 1991; Arellano & Bover, 1995; Blundell & Bond, 1998, among others) in that, in addition to estimating with instruments in differences, it incorporates instruments in levels. This implies less loss of information. Likewise, the Roodman estimator does not insert lags of the dependent variable as independent variables. All of this makes this estimator more innovative, more efficient, and more attractive than the others, provided that the conditions for its validity and robustness are met.

There are essentially two such conditions. First, there should be no second-order autocorrelation. Second, there should be no problems of over-identification of the instruments (that is, using more instruments than are necessary, which implies that the estimate is suboptimal). Regarding the first condition, the Arellano-Bond test generates p-values > 0.05 in all our estimations (see Tables 5 and 6), which means that there is not enough statistical evidence to reject the null hypothesis (there is no second-order autocorrelation). Therefore, we assume that there are no second-order autocorrelation problems.

Table 5 Empirical results
Table 6 Empirical results (replacing FKI by FKI-squared)

Regarding the second condition (no over-identification), it can be checked in at least two ways. First, the number of instruments must be less than the number of groups (in this case, countries). We meet this condition because there are 52 instruments and 63 groups. Second, the Hansen test yields p-values between 0.05 and 0.8 (Roodman, 2009), so there is not enough statistical evidence to reject the null hypothesis (the over-identification restrictions are valid). We therefore conclude that there are no over-identification problems.
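The validity conditions discussed in this section can be collected in a small checklist, sketched below. The instrument and group counts are those reported above, whereas the two p-values are hypothetical placeholders rather than results from Tables 5 and 6. The function name and the default thresholds (0.05 and 0.8) are illustrative choices that mirror the text.

```python
def gmm_validity_checks(n_instruments, n_groups, ar2_p, hansen_p,
                        alpha=0.05, hansen_upper=0.8):
    """Return each validity condition of the dynamic estimator as a boolean."""
    return {
        # Fewer instruments than groups (countries).
        "instruments_below_groups": n_instruments < n_groups,
        # Fail to reject "no second-order autocorrelation" (Arellano-Bond test).
        "no_second_order_autocorrelation": ar2_p > alpha,
        # Hansen p-value inside the range regarded as acceptable in the text.
        "hansen_in_range": alpha < hansen_p < hansen_upper,
    }


# 52 instruments and 63 groups come from the text; the p-values are hypothetical.
checks = gmm_validity_checks(n_instruments=52, n_groups=63,
                             ar2_p=0.30, hansen_p=0.40)
all_ok = all(checks.values())
```

Keeping the conditions separate rather than returning a single flag makes it easy to see which one fails when an estimation is not valid.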

Below are the specifications of our estimated models:

$$\begin{aligned}{GINI}_{it}= & \ \alpha + {\beta }_{1}{GINIt16}_{it}+{\beta }_{2}{FKI}_{it}+ {\beta }_{3}{EGR}_{it}+{\beta }_{4}{FD}_{it}+{\beta }_{5}{OAR}_{it}+{\beta }_{6}{UED}_{it}+ {\beta }_{7}{TB}_{it} \\ &+ {\beta }_{8}{INF}_{it}+ {\gamma }_{1}{ATH}_{it}+ {\gamma }_{2}HDL+ {\gamma }_{3}IQ+ {\eta }_{i}+{\rho }_{t}+{\mu }_{it}\end{aligned}$$

where:

  • GINI: net Gini Index

  • GINIt16: net Gini Index lagged 16 periods

  • FKI: financial knowledge index

  • EGR: economic growth rate

  • FD: private credit by deposit money banks as percentage of GDP

  • OAR: over-aging ratio

  • UED: under-education

  • TB: tax burden as percentage of GDP

  • INF: inflation

  • ATH: atheism

  • HDL: human development level

  • IQ: institutional quality

  • \({\eta }_{i}\): unobserved individual effects of each country, constant over time

  • \({\rho }_{t}\): unobserved time effects, which vary over time but are identical across countries

  • \({\mu }_{it}\): error term
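
For the estimations in which the FKI is replaced by its square (Table 6), the specification would read as follows, assuming that every other term is left unchanged (only the replacement itself is stated here):

$$\begin{aligned}{GINI}_{it}= & \ \alpha + {\beta }_{1}{GINIt16}_{it}+{\beta }_{2}{FKI}_{it}^{2}+ {\beta }_{3}{EGR}_{it}+{\beta }_{4}{FD}_{it}+{\beta }_{5}{OAR}_{it}+{\beta }_{6}{UED}_{it}+ {\beta }_{7}{TB}_{it} \\ &+ {\beta }_{8}{INF}_{it}+ {\gamma }_{1}{ATH}_{it}+ {\gamma }_{2}HDL+ {\gamma }_{3}IQ+ {\eta }_{i}+{\rho }_{t}+{\mu }_{it}\end{aligned}$$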

Results and Discussion

Here, we present and discuss our empirical results. Tables 5 and 6 provide our results, as well as the values of the parameters that allow us to affirm that our results (and, therefore, our conclusions) are valid and reliable. All this information is reported both with and without outliers.

First, the estimated coefficient of the constant term is positive and significant. This reveals that income inequality is itself a constantly increasing phenomenon. The coefficient of the net Gini Index lagged 16 periods is also positive and significant, which reinforces this fact: income inequality in previous years leads to income inequality in later years. In turn, this makes us wonder to what extent the crisis influenced the dynamics of income inequality. It seems that it was not tragic enough to break this dynamic, at least not as much as the two World Wars were (Piketty, 2013). This dynamic clashes with Kuznets’ (1955) optimistic view, while it fits Piketty’s (2013) perspective. Thus, converging forces are necessary.

The estimated coefficient of the economic growth rate is positive and significant. This allows us to affirm that increases in the economic growth rate lead to greater income inequality, in line with Rodríguez-Pose and Tselios (2009) and Faustino and Vali (2013), among others. Similarly, the coefficient of private credit by deposit money banks is also positive and significant. This suggests that greater financial development is associated with greater income inequality. This result is consistent with Rodríguez-Pose and Tselios (2009), Kus (2012), and Seven and Coskin (2016).

These last two results are not particularly surprising. In fact, both variables are introduced in the model to lend it greater credibility, precisely because their results are expected a priori. In the same way, the coefficient of the tax burden is negative and significant. This ratifies a clear consensus in the economic discipline: taxes are an effective instrument to correct income inequality (Muinelo-Gallo & Roca-Sagalés, 2011).

The effect of inflation on income inequality has traditionally been more debated. On the one hand, inflation could have a redistributive impact (Monnin, 2014) because it could reduce the power of creditors over debtors, especially when low- and middle-income households are heavily indebted and inflation occurs almost unexpectedly. On the other hand, increases in prices could be associated with stronger economic growth, which could increase income inequality (Faustino & Vali, 2013). Probably because of the clash of these two forces, the estimated coefficient of this variable does not yield a conclusive result, as was also the case for Sarel (1997).

Like that of inflation, the relationship between over-aging and income inequality is a very controversial issue. Deaton and Paxson (1994) as well as Onafowora and Owoye (2017) consider that aging generates inequality. However, there is empirical evidence of its redistributive effect, especially when the sample is a panel (Rodríguez-Pose & Tselios, 2009). Our results are along these lines, but they are not significant and, therefore, not conclusive. Thus, we cannot confirm that a lower frequency of deaths (or over-aging) could cause redistributive effects, as suggested by Piketty (2013).

Our results for “atheism” are negative and significant when we estimate by FGLS and PCSE, but not when we estimate by GMM-Roodman with outliers. They do become significant when we suppress the outliers. However, according to Draper and Smith (1998), since the presence of these outliers is not due to human error, it would be hasty to base our conclusions on the estimation without outliers when the results differ once they are considered. Thus, we cannot really corroborate that the prevalence of an atheist population in a country is negatively associated with income inequality and thus complete the work of Naveed and Wang (2018).

Our results for “under-education” are positive and significant. This suggests that in those countries where people are generally educated for more years than the law requires, income inequality could be lower. Thus, people should complete at least the compulsory years of education. This result is in line with De Gregorio and Lee (2002), Rodríguez-Pose and Tselios (2009, 2011), and Piketty (2013), and confirms their findings for a larger and more recent sample.

Regarding the level of human development, our results suggest that the higher the level of human development, the lower the income inequality. These results are consistent with others found in related literature (Ortega et al., 2016; Larionova & Variamova, 2015) and complement other works (Amate-Fortes et al., 2017). Moreover, controlling for this variable in our estimations indicates that the relationship between financial knowledge and income inequality is not conditioned by differences in human development across the countries in the sample.

It is for similar reasons that we control for institutional quality. Our results in this regard are negative and significant. Because this variable takes higher values the lower the institutional quality of the country, these results suggest a positive association between institutional quality and income inequality. This result is not so surprising when compared with previous findings. For example, Blancheton and Chhron (2021) point out that, in the short run, advances in institutional quality generate income inequality (and our sample does not cover a long period). Likewise, Josifidis et al. (2017) attribute high economic inequality to institutions being unable to channel current changes towards a more equitable redistribution of income in a globalized context. To some extent, these results also suggest that the possible redistributive effect of financial literacy would not be influenced by the quality of institutions.

Finally, and answering our main question, the estimated coefficients for the FKI are negative and significant. This suggests that financial knowledge could reduce income inequality. Thus, we contribute to building consensus around Campara et al. (2016) and Lo Prete (2018) and against Kurihara (2013). The results obtained for FKI-squared allow us to go even further: they indicate a nonlinear relationship between these two variables. In other words, financial knowledge reduces income inequality within a given range of values, whereas values outside this range would have the opposite effect (increases in income inequality). This range (0.2–0.6) can be observed by plotting the values of the FKI against those of the Net Gini Index on a scatter plot (Fig. 1), as follows:

Fig. 1. Scatter plot: Net Gini Index (NGI) and Financial Knowledge Index (FKI)

Therefore, our results allow us to point out that increases in financial knowledge have income-redistributive implications. However, these redistributive effects occur only from a certain level of financial knowledge (when starting from relatively low levels) and up to a point (relatively high levels) beyond which they may disappear or even be reversed.
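The reversal described above can be made explicit with a short derivation. This is only a sketch under a stated assumption: it takes the estimated equation to be quadratic in the FKI, and the symbols $\beta_0$, $\beta_1$, $\beta_2$, $\gamma$, $X_{it}$, and $\varepsilon_{it}$ are illustrative notation, not necessarily the notation or the exact specification used in the estimations. No coefficient values are assumed.

```latex
% Assumed quadratic specification (illustrative notation only):
% NGI = Net Gini Index, FKI = Financial Knowledge Index, X = control variables
\begin{align}
NGI_{it} &= \beta_0 + \beta_1\, FKI_{it} + \beta_2\, FKI_{it}^{2} + \gamma' X_{it} + \varepsilon_{it} \\
% Marginal effect of financial knowledge on income inequality
\frac{\partial NGI_{it}}{\partial FKI_{it}} &= \beta_1 + 2\beta_2\, FKI_{it} \\
% Turning point: the FKI level at which the marginal effect changes sign
FKI^{*} &= -\frac{\beta_1}{2\beta_2}
\end{align}
```

If, for example, $\beta_1 < 0$ and $\beta_2 > 0$, the marginal effect is negative for values of the FKI below $FKI^{*}$ (inequality falls as financial knowledge rises) and positive above it, which would correspond to the reversal at relatively high levels of financial knowledge. A single squared term yields only this one turning point, so under this assumption the lower bound of the range is something observed in the scatter plot and not implied by the expression itself.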

Conclusions

Financial knowledge and income inequality are two issues that have recently become very important worldwide. On the one hand, most governments are designing and implementing national strategies for financial education, believing that they will improve their citizens’ quality of life. On the other hand, income inequality is the tenth sustainable development goal, and it is in the crosshairs of important international organizations. Nevertheless, there are hardly any empirical works that test the effect of financial knowledge on income inequality. Some of them even reach different conclusions using the same database.

This scarce empirical evidence and the lack of consensus on this issue are probably due to the drawbacks of financial knowledge indexes based on surveys of individuals: firstly, because they do not allow for the analysis of (macroeconomic) issues that go beyond individual characteristics or the composition of individual portfolios (microeconomic), and secondly, because surveys lead to problems such as attrition or subjectivity bias. For this reason, we use a new longitudinal financial index that overcomes these survey-related problems. With this index, we aim to answer whether financial knowledge influences income inequality in an international context over several years. For that purpose, we use panel data estimators that are rare in the literature on financial knowledge.

An additional novelty is that we do not limit our analysis to the existence of a linear relationship between financial knowledge and income inequality, but also analyze whether this relationship is nonlinear. Our results allow us to conclude that financial knowledge has income-redistributive implications. Specifically, when starting from low levels of financial knowledge, an increase in financial knowledge contributes to reducing income inequality, but above a certain threshold, this effect may disappear or be reversed. We also contribute to the literature on the effect that other variables, such as institutional quality or under-education, have on income inequality. On the basis of these results, we could recommend policy actions aimed at rethinking the role of institutions as well as fostering the improvement of financial knowledge through the education system where it is insufficient.