
Gini-PLS Regressions

  • Original Article
  • Journal of Quantitative Economics

Abstract

Data contamination and excessive correlation between regressors (multicollinearity) are standard and major problems in econometrics. Two techniques solve these problems separately: the Gini regression handles the former, and the PLS (partial least squares) regression the latter. Gini-PLS regressions are proposed to treat extreme values and multicollinearity simultaneously.


Notes

  1. Observations can be eliminated block by block instead of one by one; see Tenenhaus (1998, p. 77).

  2. The rank vector is obtained by replacing the values of y by their ranks (the smallest value of y being ranked 1 and the highest being ranked n).

  3. There are two G-correlation coefficients (which are not symmetric), \(\Gamma _{xy}:=\text {cov}(x,R(y))/\text {cov}(x,R(x))\) and \(\Gamma _{yx}:=\text {cov}(y,R(x))/\text {cov}(y,R(y))\), just as there are two coginis, \(\text {cov}(x,R(y))\) and \(\text {cov}(y,R(x))\). The two G-correlations are equal if the distributions are exchangeable up to a linear transformation. Note that in the remainder, only one cogini is used: \(\text {cog}(x,y)=\text {cov}(x,R(y))\); a small numerical sketch of these quantities is given after these notes.

  4. If the condition of linearity is relaxed, the parametric and non-parametric Gini techniques are not necessarily equivalent.

  5. Note that the weight vector \(w_j\) may also be derived from the minimization of the Gini index of the residuals, i.e. by the parametric Gini regression whenever the link between y and \(x_j\) is not linear.

  6. The rank vector is homogeneous of degree zero in \(x_j\), \(R(x_j)=R(\lambda x_j)\) for \(\lambda >0\), as well as translation invariant, \(R(x_j)=R(x_j+a_j)\) with \(a_j=(a,a,\ldots ,a)\in \mathbb {R}^{n}\). Hence the standardization of the variables \(x_j\) enables the data to be purged of measurement errors of the following form \(\tilde{x}_j = \lambda x_j + a\), as in PLS1 and Gini2-PLS1.

  7. We set for simplicity that \(\text {cov}(x_j+u,y)-\text {cov}(x_j,y) \approx \frac{\partial \text {cov}(x_j,y)}{\partial x_j}\). On this basis we compute the derivative of the weight \(w_{1j}\) and we deduce the variation of \(VIP_{1j}\). Note that if the weight is negative the converse is obtained: \(\text {if} \ w_{1j}^{PLS1} < 0 \ \text {and if} \ \text {cov}(x_j,u) \lessgtr 0 \ \Longrightarrow \ \tilde{VIP}_{1j} \gtrless VIP_{1j}.\)

  8. Note that if the weight is negative: \(\text {if} \ w_{1j}^{Gini2} < 0 \ \text {and if} \ \text {cov}(x_j,u) \lessgtr 0 \ \Longrightarrow \ \tilde{VIP}_{1j} \lessgtr VIP_{1j}.\)

  9. For the sake of simplicity, we only present the results for the first component \(t_1\); the results are similar for \(t_2\). Note that in all figures, the maximum value on the abscissa is 1, that is, \(1 \times 10^{4}\).

  10. Note that a negative sign of a Gini correlation does not necessarily indicate a negative correlation between the two variables, see Yitzhaki (2003, p. 293).

  11. It is important to note that Gini-PLS regressions do not aim at detecting outliers; rather, they make it possible to deal with outliers without removing them from the sample.
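
The footnotes above rely on a few Gini building blocks: the rank vector \(R(\cdot )\) (note 2), the cogini \(\text {cog}(x,y)=\text {cov}(x,R(y))\) and the two G-correlations (note 3), and the invariance of ranks to positive scaling and translation (note 6). The following minimal sketch (ours, not the authors' code; the helper names R, cog and the gamma variables are purely illustrative) computes these quantities with numpy/scipy.

```python
# Illustrative sketch of the quantities defined in notes 2, 3 and 6 (not the paper's code).
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)

def R(v):
    """Rank vector: the smallest value is ranked 1 and the largest n (note 2)."""
    return rankdata(v)

def cog(a, b):
    """Cogini cog(a, b) = cov(a, R(b)) (note 3)."""
    return np.cov(a, R(b))[0, 1]

# The two (non-symmetric) G-correlations of note 3.
gamma_xy = cog(x, y) / np.cov(x, R(x))[0, 1]
gamma_yx = cog(y, x) / np.cov(y, R(y))[0, 1]
print(gamma_xy, gamma_yx)

# Invariance of the rank vector (note 6): R(x) = R(lambda * x + a) for lambda > 0.
assert np.allclose(R(x), R(3.5 * x + 7.0))
```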

References

  • Bastien, P., V. Esposito Vinzi, and M. Tenenhaus. 2005. PLS generalised linear regression. Computational Statistics and Data Analysis 48: 17–46.

  • Bry, X., C. Trottier, T. Verron, and F. Mortier. 2013. Supervised component generalized linear regression using a PLS-extension of the Fisher scoring algorithm. Journal of Multivariate Analysis 119: 47–60.


  • Choi, S.W. 2009. The effect of outliers on regression analysis: Regime type and foreign direct investment. Quarterly Journal of Political Science 4: 153–165.


  • Chung, D., and S. Keles. 2010. Sparse partial least squares classification for high dimensional data. Statistical Applications in Genetics and Molecular Biology 9 (1), Article 17.

  • Dixon, W.J. 1950. Analysis of extreme values. The Annals of Mathematical Statistics 21 (4): 488–506.


  • Durbin, J. 1954. Errors in variables. Review of the International Statistical Institute 22: 23–32.


  • John, G.H. 1995. Robust decision tree: Removing outliers from databases. In KDD-95 Proceeding, 174–179.

  • Olkin, I., and S. Yitzhaki. 1992. Gini regression analysis. International Statistical Review 60 (2): 185–196.


  • Planchon, V. 2005. Traitement des valeurs aberrantes: concepts actuels et tendances générales. Biotechnologie Agronomie Société et Environnement 9 (1): 185–196.


  • Russolillo, G. 2012. Non-Metric partial least squares. Electronic Journal of Statistics 6: 1641–1669.


  • Tenenhaus, M. 1998. La régression PLS : théorie et pratique. Paris: Technip.


  • Schechtman, E., and S. Yitzhaki. 1999. On the proper bounds of the Gini correlation. Economics Letters 63 (2): 133–138.


  • Schechtman, E., and S. Yitzhaki. 2003. A family of correlation coefficients based on extended Gini. Journal of Economic Inequality 1: 129–146.


  • Wold, S., C. Albano, W. J. Dunn III, K. Esbensen, S. Hellberg, E. Johansson, and H. Sjöström. 1983. Pattern recognition: Finding and using regularities in multivariate data. In Proc. UFOST Conf., Food Research and Data Analysis, ed. J. Martens. London: Applied Science Publishers.

  • Wold, S., H. Martens, and H. Wold. 1983. The multivariate calibration problem in chemistry solved by the PLS method. In Proc. Conf. Matrix Pencils, eds. A. Ruhe and B. Kågström, 286–293. Berlin: Springer.

  • Yitzhaki, S. 2003. Gini’s mean difference: A superior measure of variability for non-normal distributions. Metron 61 (2): 285–316.


  • Yitzhaki, S., and E. Schechtman. 2004. The Gini instrumental variable, or the ’double instrumental variable’ estimator. Metron 62 (3): 287–313.


  • Yitzhaki, S., and E. Schechtman. 2013. The Gini Methodology: A Primer on a Statistical Methodology. Berlin: Springer.



Author information


Corresponding author

Correspondence to Stéphane Mussard.

Additional information

This paper was presented at the AFSE conference in Marseille, June 2013. The authors would particularly like to thank Michel Simioni for very helpful comments on the simulation part of the paper. The authors also acknowledge Michel Tenenhaus for comments on the first draft of the paper and Ricco Rakotomalala for his advice and for sharing his Tanagra PLS code.

Appendices

Appendix A: Proof of Proposition 2

1. Properties (o)–(vi) of PLS1: See Tenenhaus (1998).

2. Properties (o)–(vi) of Gini1-PLS1.

\({(o) {t_1} \ \bot \ \cdots \ \bot {t_h}:}\)

The proof is by mathematical induction. We follow Tenenhaus (1998, p. 101) for PLS1, except that in our case \(\hat{U}_{(0)}:=R(X)\) [the residuals are obtained from the rank vectors, Eq. (7)]. First, let us show that \({t_1} \ \bot \ {t_2}\):

$$\begin{aligned} {t_1} \ \bot \ {t_2} \ \Longleftrightarrow \ t_1^{\intercal } t_2 = t_1^{\intercal } \underbrace{\hat{U}_{(1)}w_{2}}_{t_2} = 0 , \end{aligned}$$

since \(t_1^{\intercal } \hat{U}_{(1)} = \mathbf {0}\), where \(\mathbf {0}\) is the null (row) vector of size p. Suppose the following claim is true:

$$\begin{aligned} \mathbf{[h] :} {t_1} \ \bot \ {t_2} \ \bot \cdots \ \bot \ {t_h}. \end{aligned}$$

We have to show that [h+1] is true, i.e., that \(t_{h+1}\) is orthogonal to all components \(t_1,\ldots ,t_{h}\). Since the columns of \(\hat{U}_{(h)}\) are the residuals of the partial regressions on \(t_{h}\), we have \(t_{h}^{\intercal }\hat{U}_{(h)} = \mathbf {0}\), hence

$$\begin{aligned} t_{h}^{\intercal } t_{h+1} = t_{h}^{\intercal }\hat{U}_{(h)} w_{(h+1)} =0. \end{aligned}$$

According to Steps 2-h, the partial regressions imply, for all \(j=1,\ldots ,p\),

$$\begin{aligned} R(x_j) = \hat{\beta }_{1j}t_1 + \hat{u}_{(1)j} = \hat{\beta }_{1j}t_1 + \hat{\beta }_{2j}t_2 + \hat{u}_{(2)j} = \cdots = \sum _{r=1}^{h}\hat{\beta }_{rj}t_r + \hat{u}_{(h)j}. \end{aligned}$$
(11)

The relation (11) provides \(\hat{U}_{(h)} = \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal }\), where \(t_h\hat{\beta }_{(h)}^{\intercal }\) is the \(n\times p\) matrix containing \(\hat{\beta }_{hj}t_h\) in columns, for all \(j=1,\ldots ,p\). Since [h] implies that \(t_{h-1}^{\intercal }\hat{U}_{(h-1)}=\mathbf {0}\) and that \(t_{h-1}^{\intercal } t_{h} = 0\), we get

$$\begin{aligned} t_{h-1}^{\intercal } t_{h+1}&= t_{h-1}^{\intercal }\hat{U}_{(h)} w_{(h+1)} \\&= t_{h-1}^{\intercal }\left( \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} \\&=\left( t_{h-1}^{\intercal }\hat{U}_{(h-1)} - t_{h-1}^{\intercal }t_h\hat{\beta }_{(h)}^{\intercal }\right) w_{(h+1)} = 0 \ . \end{aligned}$$

Using [h], we find

$$\begin{aligned} t_{h-2}^{\intercal } t_{h+1}&= t_{h-2}^{\intercal }\left( \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} \\&= t_{h-2}^{\intercal }\left( \hat{U}_{(h-2)} - t_{h-1}\hat{\beta }_{(h-1)}^{\intercal } - t_h\hat{\beta }_{(h)}^{\intercal }\right) w_{(h+1)} \\&= \left( t_{h-2}^{\intercal }\hat{U}_{(h-2)} - t_{h-2}^{\intercal }t_{h-1}\hat{\beta }_{(h-1)}^{\intercal } - t_{h-2}^{\intercal }t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} = 0 \ . \end{aligned}$$

Finally, [h] yields

$$\begin{aligned} t_{1}^{\intercal } t_{h+1}&= t_{1}^{\intercal }\left( \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} \\&= \left( t_{1}^{\intercal }\hat{U}_{(1)} - t_{1}^{\intercal }\sum _{r=2}^{h}t_{r}\hat{\beta }_{(r)}^{\intercal } \right) w_{(h+1)} = 0 \ . \end{aligned}$$
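
Property (o) rests only on the deflation step: each residual matrix \(\hat{U}_{(h)}\) is orthogonal to \(t_h\) by construction of the partial regressions, so every later component built from it is orthogonal to \(t_h\) as well. The following minimal numerical check of this mechanism is ours, not the authors' code: the weight vectors are arbitrary unit vectors rather than the cogini-based weights of Gini1-PLS1, since the orthogonality argument does not depend on how \(w_h\) is chosen, and the starting point is \(\hat{U}_{(0)}=R(X)\) with \(t_1=Xw_1\) as in the proof.

```python
# Sketch (assumptions: arbitrary unit weights, NOT the paper's cogini-based weights).
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(1)
n, p, H = 100, 6, 4
X = rng.normal(size=(n, p))
RX = np.apply_along_axis(rankdata, 0, X)      # column-wise rank vectors R(x_j)

def deflate(U, t):
    """No-intercept OLS of each column of U on t; return the residual matrix."""
    beta = U.T @ t / (t @ t)                  # beta_h = U_(h-1)^T t_h / (t_h^T t_h)
    return U - np.outer(t, beta)              # U_(h) = U_(h-1) - t_h beta_h^T, Eq. (14)

w = rng.normal(size=p); w /= np.linalg.norm(w)
t = X @ w                                     # t_1 = X w_1, as in the Gini1-PLS1 case
T, U = [], RX
for _ in range(H):
    T.append(t)
    U = deflate(U, t)                         # residuals orthogonal to the current component
    w = rng.normal(size=p); w /= np.linalg.norm(w)
    t = U @ w                                 # t_(h+1) = U_(h) w_(h+1)

T = np.column_stack(T)
off_diag = T.T @ T - np.diag(np.diag(T.T @ T))
print(np.abs(off_diag).max())                 # numerically zero: components are mutually orthogonal
```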

(i) \({ w_\ell ^\intercal \hat{\beta }_{\ell } = 1, \ \forall \ell \in \{2,\ldots ,h\}:}\)

Let \(\hat{\beta }_h\) be the column vector whose elements are \(\hat{\beta }_{hj}\) for all \(j=1,\ldots ,p\). For \(h>1\), the components \(t_h\) are given by \(w_h^{\intercal }\hat{U}_{(h-1)}^{\intercal }=t^{\intercal }_h\), and so

$$\begin{aligned} w_{(h)}^{\intercal } \hat{\beta }_{h} = w_h^{\intercal }\frac{\hat{U}_{(h-1)}^{\intercal }t_h}{t_h^{\intercal }t_h}. \end{aligned}$$
(12)

For \(h=1\), we have \(w_1^{\intercal }X^{\intercal }=t^{\intercal }_1\), and so \(w_{1}^{\intercal } \hat{\beta }_{1}=w_1^{\intercal }\frac{R(X)^{\intercal } t_1}{t_1^{\intercal }t_1}\ne 1\) if \(R(X)\ne X\). For \(h > 1\), we get \(w_h^{\intercal }\hat{U}^{\intercal }_{(h-1)}=t^{\intercal }_h\). From expression (11), we have

$$\begin{aligned} \hat{\beta }_{h} = \frac{\hat{U}^{\intercal }_{(h-1)} t_h}{t_h^{\intercal }t_h}, \end{aligned}$$
(13)

and so \(w_{h}^{\intercal } \hat{\beta }_{h} = w_h^{\intercal }\frac{\hat{U}^{\intercal }_{(h-1)} t_h}{t_h^{\intercal }t_h} =\frac{t_h^{\intercal } t_h}{t_h^{\intercal }t_h} = 1\).

(ii) \({ w_h^{\intercal } \hat{U}^{\intercal }_{(\ell )} = \mathbf {0}, \ \forall \ell \geqslant h > 1:}\)

Relation (11), \(R(x_j)=\hat{\beta }_{1j}t_1 + \hat{\beta }_{2j}t_2 + \cdots + \hat{u}_{(h-1)j}\), yields

$$\begin{aligned} \hat{U}_{(h-1)} - t_h\hat{\beta }_{h}^{\intercal } = \hat{U}_{(h)}. \end{aligned}$$
(14)

For \(h=\ell =1\), we get \(R(X) \equiv \hat{U}_{(0)}=t_1\hat{\beta }_{1}^{\intercal }+\hat{U}_{(1)}\), and so

$$\begin{aligned} w_1^{\intercal } \hat{U}^{\intercal }_{(1)} = w_1^{\intercal }\left( R(X)^{\intercal }- \hat{\beta }_1 t_1^\intercal \right) = w_1^{\intercal }R(X)^{\intercal } - \left( w_1^{\intercal }\hat{\beta }_1\right) t_1^\intercal \ne \mathbf {0} \end{aligned}$$

in general, since \(t_1=Xw_1\) implies \(w_1^{\intercal }R(X)^{\intercal }\ne t_1^{\intercal }\) and \(w_1^{\intercal }\hat{\beta }_1 \ne 1\) whenever \(R(X)\ne X\) [see (i)].

For \(h=\ell >1\), using (i), we deduce from (14) that

$$\begin{aligned} w_h^{\intercal } \hat{U}_{(h)}^{\intercal }&= w_h^{\intercal }\left( \hat{U}^{\intercal }_{(h-1)} - \hat{\beta }_h t_h^\intercal \right) \\&= w_h^{\intercal }\hat{U}^{\intercal }_{(h-1)} - w_h^{\intercal }\hat{\beta }_h t_h^\intercal \\&= t_h^\intercal - t_h^\intercal = \mathbf {0}. \end{aligned}$$

For \(\ell> h > 1\), expression (13) and the case \(\ell =h\) yield, proceeding recursively on \(\ell \),

$$\begin{aligned} w_h^{\intercal } \hat{U}_{(\ell +1)}^{\intercal }&= w_h^{\intercal }\left( \hat{U}_{(\ell )}- t_{\ell +1} \hat{\beta }_{\ell +1}^\intercal \right) ^{\intercal } \\&= w_h^{\intercal } \hat{U}_{(\ell )}^{\intercal } - w_h^{\intercal } \frac{\hat{U}_{(\ell )}^{\intercal }t_{\ell +1}}{t_{\ell +1}^{\intercal }t_{\ell +1}}t_{\ell +1}^{\intercal } \\&= w_h^{\intercal } \hat{U}_{(\ell )}^{\intercal } - w_h^{\intercal } \hat{U}_{(\ell )}^{\intercal }t_{\ell +1}\left( t_{\ell +1}^{\intercal } t_{\ell +1}\right) ^{-1} t_{\ell +1}^{\intercal } \\&=\mathbf {0}. \end{aligned}$$

(iii) \({w_h^{\intercal } \hat{\beta }_\ell = 0, \ \forall \ell> h > 1:}\)

Due to relations (i) and (13), we get

$$\begin{aligned} w_h^{\intercal } \hat{\beta }_\ell = w_h^{\intercal } \frac{\hat{U}_{(\ell -1)}^{\intercal } t_\ell }{t_\ell ^ {\intercal } t_\ell }. \end{aligned}$$

For \(\ell> h > 1\), relation (ii) provides the result.

(iv) \({ w_h^{\intercal } w_\ell = 0, \ \forall \ell> h > 1:}\)

Relation (ii) yields \(w_h^{\intercal }\hat{U}_{(\ell -1)}^{\intercal }= \mathbf {0}\), for all \(\ell> h > 1\). Thus

$$\begin{aligned} w_h^{\intercal } w_\ell = w_h^{\intercal } \frac{\hat{U}_{(\ell -1)}^{\intercal } \hat{\varepsilon }_{\ell -1}}{\left( \sum _{j=1}^p \text {cog}^2 \left( \hat{u}_{(\ell -1)j}, \hat{\varepsilon }_{\ell -1} \right) \right) ^{\frac{1}{2}}} = w_h^{\intercal } \frac{\hat{U}_{(\ell -1)}^{\intercal } \hat{\varepsilon }_{\ell -1}}{\left\| \hat{U}_{(\ell -1)}^{\intercal } \hat{\varepsilon }_{\ell -1} \right\| } =0. \end{aligned}$$

(v) \({ t_h^{\intercal }\hat{U}_\ell = \mathbf {0}, \ \forall \ell \geqslant h \geqslant 1:}\)

Equations (14) and (11) imply

$$\begin{aligned} t_h^{\intercal } \hat{U}_{(\ell )}&= t_h^{\intercal } \left( \hat{U}_{(\ell -1)} - t_\ell \hat{\beta }_\ell ^{\intercal } \right) \\&= t_h^{\intercal } \left( R(X) - \sum _{r=1}^{\ell } t_r \hat{\beta }_r ^{\intercal } \right) \\&= t_h^{\intercal } \hat{U}_{(h)} = \mathbf {0}, \end{aligned}$$

since \(t_h^{\intercal }t_r = 0\) for all \(r\ne h\) by (o), and \(t_h^{\intercal }\hat{U}_{(h)} = t_h^{\intercal }\hat{U}_{(h-1)} - t_h^{\intercal }t_h\hat{\beta }_h^{\intercal } = \mathbf {0}\) by (13).

(vi) \({ \hat{U}_{(h)} \ne \hat{U}_{(0)} \prod _{\ell =1}^{h} \left( \mathbb {I} - w_\ell \beta _\ell ^{\intercal }\right) , \ \forall h \geqslant 1:}\)

Let \(h = 1\), since \(\hat{U}_{(0)}\equiv R(X)\), by Eq. (14) we have

$$\begin{aligned} \hat{U}_{(1)}&= R(X) - t_1 \hat{\beta }_1^{\intercal } \\&= R(X) - X w_1 \hat{\beta }_1^{\intercal } \\&\ne R(X)\left( \mathbb {I} - w_1 \hat{\beta }_1^{\intercal }\right) , \ \text {if} \ X\ne R(X) \\&\ne R(X) \prod _{\ell =1}^{h} \left( \mathbb {I} - w_\ell \hat{\beta }_\ell ^{\intercal }\right) . \end{aligned}$$
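
The contrast in property (i) between \(h=1\) and \(h>1\) can also be checked numerically. In the sketch below (ours, not the authors' code; the weights are arbitrary unit vectors, not the cogini-based ones), \(w_1^{\intercal }\hat{\beta }_1\ne 1\) because \(t_1=Xw_1\) while \(\hat{\beta }_1\) is computed from \(R(X)\), whereas \(w_2^{\intercal }\hat{\beta }_2=1\) exactly since \(t_2=\hat{U}_{(1)}w_2\).

```python
# Sketch of property (i) for a Gini1-PLS1-type start (arbitrary weights, not the paper's).
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(3)
n, p = 80, 5
X = rng.normal(size=(n, p))
RX = np.apply_along_axis(rankdata, 0, X)      # U_(0) = R(X)

w1 = rng.normal(size=p); w1 /= np.linalg.norm(w1)
t1 = X @ w1                                   # t_1 = X w_1
beta1 = RX.T @ t1 / (t1 @ t1)                 # beta_1 = R(X)^T t_1 / (t_1^T t_1)
print(w1 @ beta1)                             # generally different from 1

U1 = RX - np.outer(t1, beta1)                 # deflation, Eq. (14)
w2 = rng.normal(size=p); w2 /= np.linalg.norm(w2)
t2 = U1 @ w2                                  # t_2 = U_(1) w_2
beta2 = U1.T @ t2 / (t2 @ t2)                 # beta_2 = U_(1)^T t_2 / (t_2^T t_2), Eq. (13)
print(w2 @ beta2)                             # equals 1 up to machine precision
```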

3. Properties (o)–(vi) of Gini2-PLS1.

All properties (o)–(v) are obtained in the same manner as in the Gini1-PLS1 case. As to property (vi), let \(h = 1\), since \(\hat{U}_{(0)}\equiv X\):

$$\begin{aligned} \hat{U}_{(1)}&= X - t_1 \hat{\beta }_1^{\intercal } \\&= X - X w_1 \hat{\beta }_1^{\intercal } \\&= X\left( \mathbb {I} - w_1 \hat{\beta }_1^{\intercal }\right) \\&= X \prod _{\ell =1}^{h} \left( \mathbb {I} - w_\ell \hat{\beta }_\ell ^{\intercal }\right) . \end{aligned}$$
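
Property (vi) marks the difference between the two schemes: when the deflation starts from \(X\) itself (the Gini2-PLS1 case above), the first residual matrix factorizes exactly as \(X(\mathbb {I}-w_1\hat{\beta }_1^{\intercal })\), while with \(\hat{U}_{(0)}=R(X)\) and \(t_1=Xw_1\) (the Gini1-PLS1 case) it does not. A minimal numerical illustration (ours, not the authors' code; arbitrary unit weights):

```python
# Sketch of property (vi): exact factorization when U_(0) = X, failure when U_(0) = R(X).
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(2)
n, p = 50, 4
X = rng.normal(size=(n, p))
RX = np.apply_along_axis(rankdata, 0, X)

w1 = rng.normal(size=p); w1 /= np.linalg.norm(w1)
t1 = X @ w1                                   # t_1 = X w_1 in both cases

# Gini2-PLS1-type start: deflate X itself.
beta1 = X.T @ t1 / (t1 @ t1)
U1_gini2 = X - np.outer(t1, beta1)
print(np.allclose(U1_gini2, X @ (np.eye(p) - np.outer(w1, beta1))))     # True

# Gini1-PLS1-type start: deflate R(X) with the same component t_1 = X w_1.
beta1_r = RX.T @ t1 / (t1 @ t1)
U1_gini1 = RX - np.outer(t1, beta1_r)
print(np.allclose(U1_gini1, RX @ (np.eye(p) - np.outer(w1, beta1_r))))  # False in general
```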

Appendix B

See Tables 13, 14, 15 and 16.

Table 13 Descriptive statistics of the database
Table 14 Database of cars data
Table 15 Detecting outliers (before contamination)
Table 16 Predictions with \(t_1\) and \(t_2\) (before contamination)


About this article


Cite this article

Mussard, S., Souissi-Benrejab, F. Gini-PLS Regressions. J. Quant. Econ. 17, 477–512 (2019). https://doi.org/10.1007/s40953-018-0132-9

