Gini-PLS Regressions

Mussard, Stéphane; Souissi-Benrejab, Fattouma

doi:10.1007/s40953-018-0132-9

Original Article
Published: 16 April 2018

Volume 17, pages 477–512, (2019)
Journal of Quantitative Economics

Stéphane Mussard^1,2,3,4 &
Fattouma Souissi-Benrejab^5,6

Abstract

Data contamination and excessive correlations between regressors (multicollinearity) are standard and major problems in econometrics. Two techniques address them separately: the Gini regression for the former, and the PLS (partial least squares) regression for the latter. Gini-PLS regressions are proposed in order to treat extreme values and multicollinearity simultaneously.

Notes

Observations can be eliminated block by block instead of one by one, see Tenenhaus (1998), p. 77.
The rank vector is obtained by replacing the values of y by their ranks (the smallest value of y being ranked 1 and the highest n).
There are two G-correlation coefficients (which are not symmetric) $\Gamma _{xy}:=\text {cov}(x,R(y))/\text {cov}(x,R(x))$ and $\Gamma _{yx}:=\text {cov}(y,R(x))/\text {cov}(y,R(y))$ in the same manner as the coginis $\text {cov}(x,R(y))$ and $\text {cov}(y,R(x))$. The two G-correlations are equal if the distributions are exchangeable up to a linear transformation. Note that in the remainder, only one cogini is used: $\text {cog}(x,y)=\text {cov}(x,R(y))$.
If the condition of linearity is relaxed, the parametric and non-parametric Gini techniques are not necessarily equivalent.
Note that the weight vector $w_j$ may also be derived from the minimization of the Gini index of the residuals, i.e. by the parametric Gini regression whenever the link between y and $x_j$ is not linear.
The rank vector is homogeneous of degree zero in $x_j$, $R(x_j)=R(\lambda x_j)$ for $\lambda >0$, as well as translation invariant, $R(x_j)=R(x_j+a_j)$ with $a_j=(a,a,\ldots ,a)\in \mathbb {R}^{n}$. Hence the standardization of the variables $x_j$ enables the data to be purged of measurement errors of the following form $\tilde{x}_j = \lambda x_j + a$, as in PLS1 and Gini2-PLS1.
We set for simplicity that $\text {cov}(x_j+u,y)-\text {cov}(x_j,y) \approx \frac{\partial \text {cov}(x_j,y)}{\partial x_j}$. On this basis we compute the derivative of the weight $w_{1j}$ and we deduce the variation of $VIP_{1j}$. Note that if the weight is negative the converse is obtained: $\text {if} \ w_{1j}^{PLS1} < 0 \ \text {and if} \ \text {cov}(x_j,u) \lessgtr 0 \ \Longrightarrow \ \tilde{VIP}_{1j} \gtrless VIP_{1j}.$
Note that if the weight is negative: $\text {if} \ w_{1j}^{Gini2} < 0 \ \text {and if} \ \text {cov}(x_j,u) \lessgtr 0 \ \Longrightarrow \ \tilde{VIP}_{1j} \lessgtr VIP_{1j}.$
For the sake of simplicity, we only present the results for the first component $t_1$. The results are similar for $t_2$. Note that in all figures, the maximum value in the abscissa is 1, that is, $1 \times 10^{4}$.
Note that the negative signs of the Gini correlations cannot systematically assess the negative correlation between two variables, see Yitzhaki (2003, p.293).
It is important to note that Gini-PLS regressions do not aim at detecting outliers. They allow for dealing with outliers without withdrawing them from the sample.
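
The rank vector, the cogini and the two G-correlation coefficients defined in the notes above can be illustrated numerically. The following is a minimal sketch, not the authors' code: the function names (`rank_vector`, `cogini`, `g_correlation`), the simulated data and the use of the 1/n covariance are assumptions made for illustration, and ties in the data are not handled.

```python
import numpy as np

def rank_vector(v):
    """Replace each value by its rank (1 = smallest, n = largest); assumes no ties."""
    v = np.asarray(v, dtype=float)
    return np.argsort(np.argsort(v)) + 1.0

def cogini(x, y):
    """cog(x, y) = cov(x, R(y)), computed with the 1/n covariance."""
    return np.cov(x, rank_vector(y), bias=True)[0, 1]

def g_correlation(x, y):
    """Gamma_xy = cov(x, R(y)) / cov(x, R(x))."""
    return cogini(x, y) / cogini(x, x)

# Hypothetical data: a skewed, nonlinear link between x and y.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.exp(x) + rng.normal(scale=0.5, size=200)

gamma_xy = g_correlation(x, y)
gamma_yx = g_correlation(y, x)
print("Gamma_xy:", gamma_xy, "Gamma_yx:", gamma_yx)  # not symmetric in general

# Ranks are unchanged by x -> lambda * x + a with lambda > 0.
same_ranks = np.array_equal(rank_vector(x), rank_vector(3.0 * x + 7.0))
print("ranks invariant:", same_ranks)
```

Because the ranks of `x` are unchanged by a positive affine transformation, any quantity that depends on `x` only through `R(x)`, such as `g_correlation(y, x)`, is unaffected by measurement errors of that form.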

References

Bastien, P., V. Esposito Vinzi, and M. Tenenhaus. 2005. PLS generalised linear regression. Computational Statistics and Data Analysis 48: 17–46.
Bry, X., C. Trottier, T. Verron, and F. Mortier. 2013. Supervised component generalized linear regression using a PLS-extension of the Fisher scoring algorithm. Journal of Multivariate Analysis 119: 47–60.
Choi, S.W. 2009. The effect of outliers on regression analysis: Regime type and foreign direct investment. Quarterly Journal of Political Science 4: 153–165.
Chung, D., and S. Keles. 2010. Sparse partial least squares classification for high dimensional data. Statistical Applications in Genetics and Molecular Biology 9 (1), article 17.
Dixon, W.J. 1950. Analysis of extreme values. The Annals of Mathematical Statistics 21 (4): 488–506.
Durbin, J. 1954. Errors in variables. Review of the International Statistical Institute 22: 23–32.
John, G.H. 1995. Robust decision tree: Removing outliers from databases. In KDD-95 Proceeding, 174–179.
Olkin, I., and S. Yitzhaki. 1992. Gini regression analysis. International Statistical Review 60 (2): 185–196.
Planchon, V. 2005. Traitement des valeurs aberrantes: concepts actuels et tendances générales. Biotechnologie Agronomie Société et Environnement 9 (1): 185–196.
Russolillo, G. 2012. Non-Metric partial least squares. Electronic Journal of Statistics 6: 1641–1669.
Tenenhaus, M. 1998. La régression PLS théorie et pratique. Paris: Technip.
Schechtman, E., and S. Yitzhaki. 1999. On the proper bounds of the Gini correlation. Economics Letters 63 (2): 133–138.
Schechtman, E., and S. Yitzhaki. 2003. A family of correlation coefficients based on extended Gini. Journal of Economic Inequality 1: 129–146.
Wold, S., C. Albano, W. J. Dunn III, K. Esbensen, S. Hellberg, E. Johansson, and H. Sjöström. 1983. Pattern recognition: Finding and using regularities in multivariate data. In Proc. UFOST Conf., Food Research and Data Analysis, ed. J. Martens. Applied Science Publications: London.
Wold, S., H. Martens, and H. Wold. 1983. The multivariate calibration problem in chemistry solved by the PLS method. In Proc. Conf. Matrix Pencils, eds. A. Ruhe, and B. Kagstroem, 286–293. Berlin: Springer.
Yitzhaki, S. 2003. Gini’s mean difference: A superior measure of variability for non-normal distributions. Metron 61 (2): 285–316.
Yitzhaki, S., and E. Schechtman. 2004. The Gini instrumental variable, or the ‘double instrumental variable’ estimator. Metron 62 (3): 287–313.
Yitzhaki, S., and E. Schechtman. 2013. The Gini Methodology: A Primer on a Statistical Methodology. Berlin: Springer.

Author information

Authors and Affiliations

Chrome Université de Nîmes, Nîmes, France
Stéphane Mussard
MRE University of Montpellier, Montpellier, France
Stéphane Mussard
GrEdi University of Sherbrooke, Quebec, Canada
Stéphane Mussard
Liser Luxembourg, Esch-sur-Alzette, Luxembourg
Stéphane Mussard
Université Montpellier 1, UMR5474 LAMETA, 34000, Montpellier, France
Fattouma Souissi-Benrejab
Faculté d’Economie, Av. Raymond Dugrand, Site de Richter C.S. 79606, 34960, Montpellier Cedex, France
Fattouma Souissi-Benrejab

Corresponding author

Correspondence to Stéphane Mussard.

Additional information

This paper was presented at the AFSE conference in Marseille, June 2013. The authors would particularly like to thank Michel Simioni for very helpful comments about the simulation part of the paper. The authors also acknowledge Michel Tenenhaus for comments on the first draft of the paper and Ricco Rakotomalala for his advice and for sharing his Tanagra PLS codes.

Appendices

Appendix A: Proof of Proposition 2

1. Properties (o)–(vi) of PLS1: See Tenenhaus (1998).

2. Properties (o)–(vi) of Gini1-PLS1.

${(o) {t_1} \ \bot \ \cdots \ \bot {t_h}:}$

The proof is by mathematical induction. We follow Tenenhaus (1998, p. 101) for PLS1, except that in our case, $\hat{U}_{(0)}:=R(X)$ [the residuals are obtained from the rank vectors, Eq. (7)]. First, let us show that ${t_1} \ \bot \ {t_2}$:

$$\begin{aligned} {t_1} \ \bot \ {t_2} \ \Longleftrightarrow \ t_1^{\intercal } t_2 = t_1^{\intercal } \underbrace{\hat{U}_{(1)}w_{2}}_{t_2} = 0 , \end{aligned}$$

since $t_1^{\intercal } \hat{U}_{(1)} = \mathbf {0}$, where $\mathbf {0}$ is the null (row) vector of size p. Suppose the following claim is true:

$$\begin{aligned} \mathbf{[h] :} {t_1} \ \bot \ {t_2} \ \bot \cdots \ \bot \ {t_h}. \end{aligned}$$

We have to show that [h+1] is true, i.e., $t_{h+1}$ is orthogonal to all components $t_1,\ldots ,t_{h}$. The relation [h] implies that $t_{h}^{\intercal }\hat{U}_{(h)} = \mathbf {0}$, hence

$$\begin{aligned} t_{h}^{\intercal } t_{h+1} = t_{h}^{\intercal }\hat{U}_{(h)} w_{(h+1)} =0. \end{aligned}$$

According to Steps 2-h, the partial regressions imply, for all $j=1,\ldots ,p$,

$$\begin{aligned} R(x_j) = \hat{\beta }_{1j}t_1 + \hat{u}_{(1)j} = \hat{\beta }_{1j}t_1 + \hat{\beta }_{2j}t_2 + \hat{u}_{(2)j} = \cdots = \sum _{r=1}^{h}\hat{\beta }_{rj}t_r + \hat{u}_{(h)j}. \end{aligned}$$

(11)

The relation (11) provides $\hat{U}_{(h)} = \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal }$, where $t_h\hat{\beta }_{(h)}^{\intercal }$ is the $n\times p$ matrix containing $\hat{\beta }_{hj}t_h$ in columns, for all $j=1,\ldots ,p$. Since [h] implies that $t_{h-1}^{\intercal }\hat{U}_{(h-1)}=\mathbf {0}$ and that $t_{h-1}^{\intercal } t_{h} = 0$, we get

$$\begin{aligned} t_{h-1}^{\intercal } t_{h+1}&= t_{h-1}^{\intercal }\hat{U}_{(h)} w_{(h+1)} \\&= t_{h-1}^{\intercal }\left( \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} \\&=\left( t_{h-1}^{\intercal }\hat{U}_{(h-1)} - t_{h-1}^{\intercal }t_h\hat{\beta }_{(h)}^{\intercal }\right) w_{(h+1)} = 0 \ . \end{aligned}$$

Using [h], we find

$$\begin{aligned} t_{h-2}^{\intercal } t_{h+1}&= t_{h-2}^{\intercal }\left( \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} \\&= t_{h-2}^{\intercal }\left( \hat{U}_{(h-2)} - t_{h-1}\hat{\beta }_{(h-1)}^{\intercal } - t_h\hat{\beta }_{(h)}^{\intercal }\right) w_{(h+1)} \\&= \left( t_{h-2}^{\intercal }\hat{U}_{(h-2)} - t_{h-2}^{\intercal }t_{h-1}\hat{\beta }_{(h-1)}^{\intercal } - t_{h-2}^{\intercal }t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} = 0 \ . \end{aligned}$$

Finally, [h] yields

$$\begin{aligned} t_{1}^{\intercal } t_{h+1}&= t_{1}^{\intercal }\left( \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal } \right) w_{(h+1)} \\&= \left( t_{1}^{\intercal }\hat{U}_{(1)} - t_{1}^{\intercal }\sum _{r=2}^{h}t_{r}\hat{\beta }_{(r)}^{\intercal } \right) w_{(h+1)} = 0 \ . \end{aligned}$$
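
The induction above uses only the deflation step $\hat{U}_{(h)} = \hat{U}_{(h-1)} - t_h\hat{\beta }_{(h)}^{\intercal }$ and not the particular value of the weights. The following minimal sketch checks this numerically under stated assumptions: the weight vectors are arbitrary random draws (hypothetical, not the Gini-PLS weights), the data are simulated, the variables are centred, and the helper names `rank_matrix` and `deflated_components` are illustrative only.

```python
import numpy as np

def rank_matrix(X):
    """Column-wise ranks (1 = smallest), then centred; assumes no ties."""
    R = np.argsort(np.argsort(X, axis=0), axis=0) + 1.0
    return R - R.mean(axis=0)

def deflated_components(X, weights):
    """Build t_1 = X w_1 and t_h = U_(h-1) w_h for h > 1, starting from U_(0) = R(X)."""
    Xc = X - X.mean(axis=0)
    U = rank_matrix(X)
    T = []
    for h, w in enumerate(weights, start=1):
        t = (Xc if h == 1 else U) @ w
        beta = U.T @ t / (t @ t)     # partial regressions of the columns of U_(h-1) on t_h
        U = U - np.outer(t, beta)    # U_(h) = U_(h-1) - t_h beta_h'
        T.append(t)
    return np.column_stack(T)

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
weights = [rng.normal(size=5) for _ in range(4)]  # arbitrary, hypothetical weights

T = deflated_components(X, weights)
Tn = T / np.linalg.norm(T, axis=0)                # unit-length components
max_off = np.max(np.abs(Tn.T @ Tn - np.eye(T.shape[1])))
print("largest off-diagonal inner product:", max_off)
```

The off-diagonal inner products are zero up to rounding error, whatever weights are drawn, which is the content of property (o).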

(i) ${ w_\ell ^\intercal \hat{\beta }_{\ell } = 1, \ \forall \ell \in \{2,\ldots ,h\}:}$

Let $\hat{\beta }_h$ be the column vector whose elements are $\hat{\beta }_{hj}$ for all $j=1,\ldots ,p$. The components $t_h$ are given by $w_h^{\intercal }\hat{U}_{(h-1)}^{\intercal }=t^{\intercal }_h$, and so

$$\begin{aligned} w_{h}^{\intercal } \hat{\beta }_{h} = w_h^{\intercal }\frac{\hat{U}_{(h-1)}^{\intercal }t_h}{t_h^{\intercal }t_h}. \end{aligned}$$

(12)

For $h=1$, we have $w_1^{\intercal }X^{\intercal }=t^{\intercal }_1$, and so $w_{1}^{\intercal } \hat{\beta }_{1}=w_1^{\intercal }\frac{R(X)^{\intercal } t_1}{t_1^{\intercal }t_1}\ne 1$ if $R(X)\ne X$. For $h > 1$, we get $w_h^{\intercal }\hat{U}^{\intercal }_{(h-1)}=t^{\intercal }_h$. From expression (11), we have

$$\begin{aligned} \hat{\beta }_{h} = \frac{\hat{U}^{\intercal }_{(h-1)} t_h}{t_h^{\intercal }t_h}, \end{aligned}$$

(13)

and so $w_{h}^{\intercal } \hat{\beta }_{h} = w_h^{\intercal }\frac{\hat{U}^{\intercal }_{(h-1)} t_h}{t_h^{\intercal }t_h} =\frac{t_h^{\intercal } t_h}{t_h^{\intercal }t_h} = 1$.

(ii) ${ w_h^{\intercal } \hat{U}^{\intercal }_{(\ell )} = \mathbf {0}, \ \forall \ell \geqslant h > 1:}$

Relation (11), $R(x_j)=\hat{\beta }_{1j}t_1 + \hat{\beta }_{2j}t_2 + \cdots + \hat{\beta }_{hj}t_h + \hat{u}_{(h)j}$, yields

$$\begin{aligned} \hat{U}_{(h-1)} - t_h\hat{\beta }_{h}^{\intercal } = \hat{U}_{(h)}. \end{aligned}$$

(14)

For $h=\ell =1$, we get $R(X) \equiv \hat{U}_{(0)}=t_1\hat{\beta }_{1}^{\intercal }+\hat{U}_{(1)}$, and so

$$\begin{aligned} w_1^{\intercal } \hat{U}^{\intercal }_{(1)} = w_1^{\intercal }\left( R(X)^{\intercal }- \hat{\beta }_1 t_1^\intercal \right) = w_1^{\intercal }\left( \hat{\beta }_1 t_1^\intercal +\hat{U}_{(1)}^\intercal - \hat{\beta }_1 t_1^\intercal \right) = w_1^{\intercal }\hat{U}_{(1)}^\intercal \ne \mathbf {0}. \end{aligned}$$

For $h=\ell >1$, using (i), we deduce from (14) that

$$\begin{aligned} w_h^{\intercal } \hat{U}_{(h)}^{\intercal }&= w_h^{\intercal }\left( \hat{U}^{\intercal }_{(h-1)} - \hat{\beta }_h t_h^\intercal \right) \\&= w_h^{\intercal }\hat{U}^{\intercal }_{(h-1)} - w_h^{\intercal }\hat{\beta }_h t_h^\intercal \\&= t_h^\intercal - t_h^\intercal = \mathbf {0}. \end{aligned}$$

For all $\ell \geqslant h > 1$, suppose that $w_h^{\intercal }\hat{U}_{(\ell )}^{\intercal }=\mathbf {0}$, which holds for $\ell =h$ by the previous step. Expressions (14) and (13) then yield

$$\begin{aligned} w_h^{\intercal } \hat{U}_{(\ell +1)}^{\intercal }&= w_h^{\intercal }\left( \hat{U}_{(\ell )}- t_{\ell +1} \hat{\beta }_{\ell +1}^\intercal \right) ^{\intercal } \\&= w_h^{\intercal } \hat{U}_{(\ell )}^{\intercal } - w_h^{\intercal } \frac{\hat{U}_{(\ell )}^{\intercal }t_{\ell +1}}{t_{\ell +1}^{\intercal }t_{\ell +1}}t_{\ell +1}^{\intercal } \\&= w_h^{\intercal } \hat{U}_{(\ell )}^{\intercal } \left( \mathbb {I} - t_{\ell +1}(t_{\ell +1}^{\intercal }t_{\ell +1})^{-1} t_{\ell +1}^{\intercal }\right) \\&=\mathbf {0}, \end{aligned}$$

where $\mathbb {I}$ denotes here the $n\times n$ identity matrix, so that the property also holds for $\ell +1$.

(iii) ${w_h^{\intercal } \hat{\beta }_\ell = 0, \ \forall \ell> h > 1:}$

Due to relation (13), we get

$$\begin{aligned} w_h^{\intercal } \hat{\beta }_\ell = w_h^{\intercal } \frac{\hat{U}_{(\ell -1)}^{\intercal } t_\ell }{t_\ell ^ {\intercal } t_\ell }. \end{aligned}$$

For $\ell> h > 1$, that is $\ell -1 \geqslant h > 1$, relation (ii) provides the result.

(iv) ${ w_h^{\intercal } w_\ell = 0, \ \forall \ell> h > 1:}$

Relation (ii) yields $w_h^{\intercal }\hat{U}_{(\ell -1)}^{\intercal }= \mathbf {0}$, for all $\ell> h > 1$. Thus

$$\begin{aligned} w_h^{\intercal } w_\ell = w_h^{\intercal } \frac{\hat{U}_{(\ell -1)}^{\intercal } \hat{\varepsilon }_{\ell -1}}{\left( \sum _{j=1}^p \text {cog}^2 \left( \hat{u}_{(\ell -1)j}^{\intercal }, \hat{\varepsilon }_{\ell -1} \right) \right) ^{\frac{1}{2}}} = w_h^{\intercal } \frac{\hat{U}_{(\ell -1)}^{\intercal } \hat{\varepsilon }_{\ell -1}}{\left\| \hat{U}_{(\ell -1)}^{\intercal } \hat{\varepsilon }_{\ell -1} \right\| } =0. \end{aligned}$$

(v) ${ t_h^{\intercal }\hat{U}_{(\ell )} = \mathbf {0}, \ \forall \ell \geqslant h \geqslant 1:}$

Equations (14) and (11) imply, since by (o) $t_h^{\intercal }t_r=0$ for all $r\ne h$,

$$\begin{aligned} t_h^{\intercal } \hat{U}_{(\ell )}&= t_h^{\intercal } \left( \hat{U}_{(\ell -1)} - t_\ell \hat{\beta }_\ell ^{\intercal } \right) \\&= t_h^{\intercal } \left( R(X) - \sum _{r=1}^{\ell } t_r \hat{\beta }_r^{\intercal } \right) \\&= t_h^{\intercal } \left( R(X) - \sum _{r=1}^{h} t_r \hat{\beta }_r^{\intercal } \right) \\&= t_h^{\intercal } \hat{U}_{(h)} = \mathbf {0}. \end{aligned}$$
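
Properties (i) to (v) can be checked numerically on simulated data. The sketch below is an illustration under assumptions, not the authors' algorithm: the weights are taken as $w_h \propto \hat{U}_{(h-1)}^{\intercal }\hat{\varepsilon }_{h-1}$, which is one reading of the expression used in (iv), with $\hat{\varepsilon }_{h-1}$ the least-squares residual of the centred $y$ on $t_1,\ldots ,t_{h-1}$; the function names and the data are hypothetical, and the precise definitions are those of the main text of the paper.

```python
import numpy as np

def centred_ranks(X):
    """Column-wise ranks (1 = smallest), centred; assumes no ties."""
    R = np.argsort(np.argsort(X, axis=0), axis=0) + 1.0
    return R - R.mean(axis=0)

def gini1_pls1_sketch(X, y, n_comp):
    """Return dicts W, B, T (keys 1..n_comp) and Us (keys 0..n_comp)."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    U, eps = centred_ranks(X), yc.copy()
    W, B, T, Us = {}, {}, {}, {0: U}
    for h in range(1, n_comp + 1):
        w = U.T @ eps                    # assumed weight: U_(h-1)' eps_(h-1)
        w = w / np.linalg.norm(w)
        t = (Xc if h == 1 else U) @ w    # t_1 = X w_1, t_h = U_(h-1) w_h
        beta = U.T @ t / (t @ t)         # beta_h = U_(h-1)' t_h / t_h' t_h
        U = U - np.outer(t, beta)        # deflation, Eq. (14)
        W[h], B[h], T[h], Us[h] = w, beta, t, U
        Tm = np.column_stack([T[r] for r in range(1, h + 1)])
        eps = yc - Tm @ np.linalg.lstsq(Tm, yc, rcond=None)[0]
    return W, B, T, Us

def max_violations(W, B, T, Us):
    """Largest (scaled) departure from each of the properties (i)-(v)."""
    H, norm = len(W), np.linalg.norm
    pairs = [(h, l) for h in range(2, H + 1) for l in range(h + 1, H + 1)]
    return {
        "i": max(abs(W[h] @ B[h] - 1.0) for h in range(2, H + 1)),
        "ii": max(norm(W[h] @ Us[l].T) / norm(Us[l])
                  for h in range(2, H + 1) for l in range(h, H + 1)),
        "iii": max(abs(W[h] @ B[l]) / norm(B[l]) for h, l in pairs),
        "iv": max(abs(W[h] @ W[l]) for h, l in pairs),
        "v": max(norm(T[h] @ Us[l]) / (norm(T[h]) * norm(Us[l]))
                 for h in range(1, H + 1) for l in range(h, H + 1)),
    }

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 6))
y = X @ rng.normal(size=6) + rng.normal(size=80)

W, B, T, Us = gini1_pls1_sketch(X, y, n_comp=4)
viol = max_violations(W, B, T, Us)
first_step = W[1] @ B[1]   # property (i) is not expected to hold for h = 1
print(viol)
print("w_1' beta_1 =", first_step)
```

All five departures are at rounding-error level for $h>1$, while $w_1^{\intercal }\hat{\beta }_1$ differs from 1 because the first component is built from $X$ and the deflation starts from $R(X)$.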

(vi) ${ \hat{U}_{(h)} \ne \hat{U}_{(0)} \prod _{\ell =1}^{h} \left( \mathbb {I} - w_\ell \beta _\ell ^{\intercal }\right) , \ \forall h \geqslant 1:}$

Let $h = 1$, since $\hat{U}_{(0)}\equiv R(X)$, by Eq. (14) we have

$$\begin{aligned} \hat{U}_{(1)}&= R(X) - t_1 \hat{\beta }_1^{\intercal } \\&= R(X) - X w_1 \beta _1^{\intercal } \\&\ne R(X)\left( \mathbb {I} - w_1 \beta _1^{\intercal }\right) , \ \text {if} \ X\ne R(X) \\&\ne R(X) \prod _{\ell =1}^{h} \left( \mathbb {I} - w_\ell \beta _\ell ^{\intercal }\right) . \end{aligned}$$

3. Properties (o)–(vi) of Gini2-PLS1.

All properties (o)–(v) are obtained in the same manner as in the Gini1-PLS1 case. As to property (vi), let $h = 1$, since $\hat{U}_{(0)}\equiv X$:

$$\begin{aligned} \hat{U}_{(1)}&= X - t_1 \hat{\beta }_1^{\intercal } \\&= X - X w_1 \beta _1^{\intercal } \\&= X\left( \mathbb {I} - w_1 \beta _1^{\intercal }\right) \\&= X \prod _{\ell =1}^{h} \left( \mathbb {I} - w_\ell \beta _\ell ^{\intercal }\right) . \end{aligned}$$
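
The contrast in property (vi) between the two starting matrices can be seen at the first step $h=1$. The following minimal sketch uses simulated, centred data and an arbitrary unit weight vector `w1` (a hypothetical choice, since the product identity at $h=1$ does not depend on how $w_1$ is obtained); the helper `first_deflation` is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 40, 3
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)
RX = np.argsort(np.argsort(X, axis=0), axis=0) + 1.0   # column-wise ranks, no ties
RX = RX - RX.mean(axis=0)

w1 = rng.normal(size=p)
w1 = w1 / np.linalg.norm(w1)   # arbitrary, hypothetical unit weight vector
t1 = X @ w1                    # first component, t_1 = X w_1
I_p = np.eye(p)

def first_deflation(U0):
    """Return U_(1) = U_(0) - t_1 beta_1' and the product form U_(0) (I - w_1 beta_1')."""
    beta1 = U0.T @ t1 / (t1 @ t1)
    U1 = U0 - np.outer(t1, beta1)
    product_form = U0 @ (I_p - np.outer(w1, beta1))
    return U1, product_form

U1_g2, prod_g2 = first_deflation(X)    # U_(0) = X, as in Gini2-PLS1
U1_g1, prod_g1 = first_deflation(RX)   # U_(0) = R(X), as in Gini1-PLS1

gap_g2 = np.max(np.abs(U1_g2 - prod_g2))
gap_g1 = np.max(np.abs(U1_g1 - prod_g1))
print("U_(0) = X:    max gap", gap_g2)   # rounding error only
print("U_(0) = R(X): max gap", gap_g1)   # clearly non-zero
```

With $\hat{U}_{(0)}=X$ the deflated matrix coincides with the product form, whereas with $\hat{U}_{(0)}=R(X)$ the two differ by $(Xw_1 - R(X)w_1)\hat{\beta }_1^{\intercal }$.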

Appendix B

See Tables 13, 14, 15 and 16.

Table 13 Descriptive statistics of the database

Table 14 Database of cars data

Table 15 Detecting outliers (before contamination)

Table 16 Predictions with $t_1$ and $t_2$ (before contamination)

About this article

Cite this article

Mussard, S., Souissi-Benrejab, F. Gini-PLS Regressions. J. Quant. Econ. 17, 477–512 (2019). https://doi.org/10.1007/s40953-018-0132-9

Published: 16 April 2018
Issue Date: 01 September 2019
DOI: https://doi.org/10.1007/s40953-018-0132-9

JEL Classification

C3
C8
