Biased Adjusted Poisson Ridge Estimators-Method and Application

Månsson and Shukur (Econ Model 28:1475–1481, 2011) proposed a Poisson ridge regression estimator (PRRE) to reduce the negative effects of multicollinearity. However, a weakness of the PRRE is its relatively large bias. Therefore, as a remedy, Türkan and Özel (J Appl Stat 43:1892–1905, 2016) examined the performance of almost unbiased ridge estimators for the Poisson regression model. These estimators not only reduce the consequences of multicollinearity but also decrease the bias of the PRRE and thus perform more efficiently. The aim of this paper is twofold: first, to derive the mean square error properties of the Modified Almost Unbiased PRRE (MAUPRRE) and the Almost Unbiased PRRE (AUPRRE) and then propose new ridge estimators for the MAUPRRE and AUPRRE; second, to compare the performance of the MAUPRRE with the AUPRRE, PRRE and maximum likelihood estimator. Using both a simulation study and a real-world dataset from the Swedish football league, it is shown that one of the proposed estimators, MAUPRRE ($\hat{k}_{q4}$), performed better than the rest in the presence of high to strong (0.80–0.99) multicollinearity. Electronic supplementary material: The online version of this article (10.1007/s40995-020-00974-5) contains supplementary material, which is available to authorized users.


Introduction
The Poisson regression model (PRM) is a special form of the generalized linear model and is used when the dependent variable consists of counts, i.e., nonnegative integers. A PRM adopts a Poisson distribution for the dependent variable and assumes that the log of its expected value can be modeled by a linear combination of unknown parameters. The model is commonly applied to counts such as the number of occurrences of an event per unit of time. These counts must be independent, so that one event does not make another event more or less likely. Instead, the probability of a count per unit of time is related to independent variables such as the time of day. Examples of likely Poisson processes are the number of infected patients per day at a clinic, a country's number of bankruptcies per year and the number of vehicles per hour passing through a freeway toll. The maximum likelihood estimator (MLE) is used to estimate the unknown regression coefficients of the PRM. This estimator is considered to be the best estimator for the PRM, and as long as under- or overdispersion is not present in the data set, this is a standard model for these types of count problems. However, in the presence of multicollinearity, the mean square error (MSE) of the MLE becomes unstable, with high variances of the regression coefficients, and inference based on the MLE may not be reliable. Other consequences of multicollinearity are wider confidence intervals and decreased statistical power, which result in increased probabilities of type II errors in hypothesis tests on the parameters. In addition, the uncertainty of the estimated coefficients is higher because multicollinearity increases the coefficient variances.
Many biased estimation techniques have been proposed for linear regression models to reduce the effects of multicollinearity, such as the ridge regression estimator by Hoerl and Kennard (1970) and the Liu estimator by Liu (1993). Moreover, Nomura (1988) developed an almost unbiased ridge estimator for the linear regression model, which comes at the cost of a very low bias but is substantially more efficient than ordinary ridge regression under certain conditions. Månsson and Shukur (2011) proposed a Poisson ridge regression estimator (PRRE) to reduce the effects of problems associated with multicollinear data. Kibria et al. (2015) proposed a number of biasing parameters, and Asar and Genç (2018) suggested a two-parameter biased estimator in the PRM to adjust for multicollinearity. Türkan and Özel (2016) developed the Almost Unbiased PRRE (AUPRRE) and the Modified AUPRRE (MAUPRRE). Kaçıranlar and Dawoud (2018) examined the performance of Poisson and negative binomial ridge predictors. Algamal and Alanaz (2018) proposed different methods to estimate the value of the ridge parameter (k) for the PRRE. Rashad and Algamal (2019) proposed a new ridge regression approach in the PRM to reduce the issue of collinearity between explanatory variables, and recently Qasim et al. (2019) proposed a Liu-type regression estimator for the PRM. Türkan and Özel (2016) did not discuss the MSE properties of the AUPRRE and MAUPRRE, nor did they derive the optimal value of the ridge parameter (k). Indeed, no published research appears to be available regarding the MSE properties of the AUPRRE and MAUPRRE and their optimal ridge estimators for the PRM.
The main contribution of this paper is twofold. The first is to derive the MSE properties of the MAUPRRE and AUPRRE. The second is to compare, by simulations and by an empirical application, the performance of the MAUPRRE with the AUPRRE, PRRE and MLE in terms of MSE and bias. In addition, we introduce new methods for estimating the value of the ridge parameter (k) for the AUPRRE and MAUPRRE, and the performance of the proposed ridge estimators is compared with the existing estimators by considering different factors in the simulation study. Furthermore, the intuitive and empirical relevance of the MAUPRRE and AUPRRE is exhibited by an estimation on a real-world dataset, where we systematically investigate which estimator best remedies the effects of multicollinearity. In this empirical application, we model the number of goals scored at away games as a function of the quality of the teams, measured by bookmaker odds. With this approach, it is demonstrated that the standard errors and the estimated MSEs of the proposed estimators decrease substantially compared to the existing estimators in the presence of multicollinearity. Hence, the precision of the estimated parameters is increased, which is one of the main objectives of demonstrating the method in an empirical situation.
The rest of the article is organized as follows: in Sect. 2, we define the model of interest and the MLE, PRRE, AUPRRE and MAUPRRE. The MSE properties are derived in Sect. 3. In Sect. 4, the optimal value of the ridge parameter is derived, and we propose new ridge estimators for estimating the value of the ridge parameter k for the AUPRRE and MAUPRRE. The Monte Carlo simulation and its results are presented in Sect. 5. In Sect. 6, the advantages of our proposed ridge estimators are illustrated by using them to analyze an empirical dataset based on the Swedish football league. Finally, concluding remarks are given in Sect. 7.

Methodology
This section presents the model of interest and the characteristics of the different estimators.

The Poisson Regression Model
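The PRM is applicable only when the dependent variable consists of count data. Suppose $y_i$ is the dependent variable and follows a Poisson distribution with parameter $\mu_i$, denoted $P(\mu_i)$, with probability mass function

$$ f(y_i \mid \mu_i) = \frac{\exp(-\mu_i)\,\mu_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots \tag{1} $$

The PRM is commonly developed by using the canonical link function, such that $\mu_i = \exp(x_i \beta)$, where $x_i$ is the $i$th row of $X$, which is an $(n \times q)$ data matrix with $q$ nonstochastic explanatory variables, and $\beta$ is a $q \times 1$ vector of the unknown regression coefficients. The log-likelihood function is defined as

$$ l(\beta) = \sum_{i=1}^{n} \left[ y_i x_i \beta - \exp(x_i \beta) - \log(y_i!) \right]. \tag{2} $$

The traditional MLE is used to estimate the unknown regression coefficients for the PRM. The MLE is obtained by setting the first-order derivative of Eq. (2) with respect to $\beta$ to zero:

$$ S(\beta) = \frac{\partial l(\beta)}{\partial \beta} = \sum_{i=1}^{n} \left[ y_i - \exp(x_i \beta) \right] x_i^t = 0, \tag{3} $$

where $S(\beta)$ is the score function. Since Eq. (3) is nonlinear in $\beta$, we estimate the unknown coefficients through iterative weighted least squares. Let $\beta^{(m)}$ be the estimate of $\beta$ after $m$ iterations, which may be written as

$$ \beta^{(m)} = \beta^{(m-1)} + I^{-1}\!\left(\beta^{(m-1)}\right) S\!\left(\beta^{(m-1)}\right), \tag{4} $$

where $I(\beta) = -E\left[\partial^2 l(\beta) / \partial \beta\, \partial \beta^t\right]$ is the Fisher information matrix. At convergence of Eq. (4), the MLE is found by applying the following iterative weighted least squares method

$$ \hat{\beta}_{MLE} = \left(X^t \hat{W} X\right)^{-1} X^t \hat{W} z^{*}, \tag{5} $$

where $\hat{W} = \mathrm{diag}(\hat{\mu}_i)$ and $z^{*}$, with $i$th element $z_i^{*} = \log(\hat{\mu}_i) + (y_i - \hat{\mu}_i)/\hat{\mu}_i$, is the adjusted response variable. Both $\hat{W}$ and $z^{*}$ are evaluated by Fisher's scoring iterative procedure (see, e.g., Hardin et al. 2007).
In order to obtain the MSEs of the parameters, we consider $\alpha = Q^t \beta$, $Z = XQ$ and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_q) = Z^t \hat{W} Z$, where $Q$ is the orthogonal matrix whose columns are the eigenvectors of $X^t \hat{W} X$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_q > 0$ are the eigenvalues of the matrix $X^t \hat{W} X$. The $\hat{\beta}_{MLE}$ can be written as

$$ \hat{\beta}_{MLE} = Q \Lambda^{-1} Z^t \hat{W} z^{*}. $$

The covariance matrix of the $\hat{\beta}_{MLE}$ is defined as

$$ \mathrm{Cov}\left(\hat{\beta}_{MLE}\right) = \left(X^t \hat{W} X\right)^{-1} = Q \Lambda^{-1} Q^t. \tag{6} $$

In addition, the scalar MSE of the $\hat{\beta}_{MLE}$ is defined as

$$ \mathrm{MSE}\left(\hat{\beta}_{MLE}\right) = \mathrm{tr}\left[Q \Lambda^{-1} Q^t\right] = \sum_{j=1}^{q} \frac{1}{\lambda_j}, \tag{7} $$

where $\lambda_j$ is the $j$th eigenvalue of the $Z^t \hat{W} Z$ matrix.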
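The iterative weighted least squares procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `poisson_mle_iwls`, the zero starting value, the stopping rule and the simulated data are all assumptions made here.

```python
import numpy as np

def poisson_mle_iwls(X, y, tol=1e-10, max_iter=100):
    """Fit a log-link Poisson regression by iterative weighted least squares."""
    beta = np.zeros(X.shape[1])          # assumed starting value
    for _ in range(max_iter):
        eta = X @ beta
        mu = np.exp(eta)                 # fitted means, also the diagonal of W
        z = eta + (y - mu) / mu          # adjusted response
        XtW = X.T * mu                   # X^t W without forming the n x n matrix
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Hypothetical data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta_true = np.array([0.5, -0.3, 0.2])
y = rng.poisson(np.exp(X @ beta_true))

beta_hat = poisson_mle_iwls(X, y)
score = X.T @ (y - np.exp(X @ beta_hat))  # score function, close to zero at the MLE
```

At convergence the score vector should vanish up to numerical error, which is a convenient check that the weighted least squares step and the likelihood equations agree.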

The Poisson Ridge Regression Estimator
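It can be easily seen that the MSE of the MLE becomes inflated when the explanatory variables are linearly correlated, because some of the eigenvalues will be small and $Z^t \hat{W} Z$ is ill-conditioned. To reduce the effects of multicollinearity, Månsson and Shukur (2011) proposed the PRRE, which can be defined as

$$ \hat{\beta}_{PRRE} = \left(X^t \hat{W} X + k I_q\right)^{-1} X^t \hat{W} X\, \hat{\beta}_{MLE}. \tag{8} $$

The $\hat{\beta}_{PRRE}$ can be written as

$$ \hat{\beta}_{PRRE} = Q \Lambda_k^{-1} \Lambda Q^t \hat{\beta}_{MLE}, $$

where $\Lambda_k = \mathrm{diag}(\lambda_1 + k, \lambda_2 + k, \ldots, \lambda_q + k)$ and $k$ ($k > 0$) is the ridge parameter. The bias, covariance matrix and MSE of the $\hat{\beta}_{PRRE}$ are, respectively, defined as

$$ \mathrm{Bias}\left(\hat{\beta}_{PRRE}\right) = -k\, Q \Lambda_k^{-1} \alpha, \tag{9} $$

$$ \mathrm{Cov}\left(\hat{\beta}_{PRRE}\right) = Q \Lambda_k^{-1} \Lambda \Lambda_k^{-1} Q^t, \tag{10} $$

$$ \mathrm{MSE}\left(\hat{\beta}_{PRRE}\right) = Q \Lambda_k^{-1} \Lambda \Lambda_k^{-1} Q^t + k^2 Q \Lambda_k^{-1} \alpha \alpha^t \Lambda_k^{-1} Q^t, \tag{11} $$

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_q) = Z^t \hat{W} Z$, $Z = XQ$, and $Q$ is the orthogonal matrix whose columns are the eigenvectors of $X^t \hat{W} X$. The scalar MSE of the PRRE is obtained by applying the tr(.) operator on Eq. (11), which can be defined as

$$ \mathrm{MSE}\left(\hat{\beta}_{PRRE}\right) = \sum_{j=1}^{q} \frac{\lambda_j}{(\lambda_j + k)^2} + k^2 \sum_{j=1}^{q} \frac{\alpha_j^2}{(\lambda_j + k)^2}, \tag{12} $$

where $\alpha = \gamma^t \hat{\beta}_{MLE}$, $\gamma$ is the matrix of eigenvectors of $X^t \hat{W} X$ (the matrix $Q$ above) and $k$ is the ridge parameter of the PRRE.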
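As a hedged illustration of the ridge estimator and its scalar MSE, the sketch below assumes the standard Månsson and Shukur form $(X^t \hat{W} X + kI)^{-1} X^t \hat{W} X \hat{\beta}_{MLE}$ and the usual variance-plus-squared-bias decomposition in the eigenvalues $\lambda_j$. The function names and the numeric inputs are hypothetical.

```python
import numpy as np

def prre(X, mu_hat, beta_mle, k):
    """Poisson ridge estimate (X^t W X + k I)^{-1} X^t W X beta_mle, W = diag(mu_hat)."""
    S = (X.T * mu_hat) @ X
    return np.linalg.solve(S + k * np.eye(S.shape[0]), S @ beta_mle)

def prre_scalar_mse(lam, alpha, k):
    """Scalar MSE of the PRRE: variance term plus squared-bias term."""
    variance = np.sum(lam / (lam + k) ** 2)
    squared_bias = k ** 2 * np.sum(alpha ** 2 / (lam + k) ** 2)
    return variance + squared_bias

# Hypothetical eigenvalues with one near-zero value, mimicking multicollinearity.
lam = np.array([10.0, 1.0, 0.01])
alpha = np.array([0.6, 0.6, 0.5])
mse_mle = prre_scalar_mse(lam, alpha, 0.0)    # k = 0 recovers the MLE, sum(1 / lam)
mse_ridge = prre_scalar_mse(lam, alpha, 0.5)  # a small k trades bias for variance
```

With a near-zero eigenvalue the term $1/\lambda_j$ dominates the MSE of the MLE, so even a small $k$ reduces the scalar MSE substantially in this toy setting.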

Almost Unbiased Poisson Ridge Regression Estimator
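The PRRE overcomes the problem of multicollinearity, but this estimator has a large bias. Therefore, Türkan and Özel (2016) proposed the AUPRRE. This estimator can not only remedy the problem of multicollinearity but also reduce the bias as compared to the PRRE and MLE. Before explaining the full AUPRRE, we first define the almost unbiased estimator in Definition 2.3.1.

Definition 2.3.1 (Xu and Yang 2011) Suppose $\hat{\beta}$ is a biased estimator of the parameter $\beta$ and the bias vector of $\hat{\beta}$ is $\mathrm{Bias}(\hat{\beta}) = E(\hat{\beta}) - \beta = R\beta$, so that $E(\hat{\beta} - R\beta) = \beta$. Then $\tilde{\beta} = \hat{\beta} - R\hat{\beta} = (I - R)\hat{\beta}$ is called the almost unbiased estimator based on the biased estimator $\hat{\beta}$.
Below, we define the AUPRRE based on the PRRE. According to Definition 2.3.1, we define the following

$$ \hat{\beta}_{AUPRRE} = \hat{\beta}_{PRRE} + k \left(X^t \hat{W} X + k I_q\right)^{-1} \hat{\beta}_{PRRE}. \tag{13} $$

The above expression can be written as

$$ \hat{\beta}_{AUPRRE} = \left[ I_q - k^2 \left(X^t \hat{W} X + k I_q\right)^{-2} \right] \hat{\beta}_{MLE}. $$

The bias, covariance matrix and MSE of the $\hat{\beta}_{AUPRRE}$ are defined, respectively, as follows, with $\Lambda_k = \Lambda + k I_q$:

$$ \mathrm{Bias}\left(\hat{\beta}_{AUPRRE}\right) = -k^2 Q \Lambda_k^{-2} \alpha, \tag{14} $$

$$ \mathrm{Cov}\left(\hat{\beta}_{AUPRRE}\right) = Q \left(I_q - k^2 \Lambda_k^{-2}\right) \Lambda^{-1} \left(I_q - k^2 \Lambda_k^{-2}\right) Q^t, \tag{15} $$

$$ \mathrm{MSE}\left(\hat{\beta}_{AUPRRE}\right) = Q \left(I_q - k^2 \Lambda_k^{-2}\right) \Lambda^{-1} \left(I_q - k^2 \Lambda_k^{-2}\right) Q^t + k^4 Q \Lambda_k^{-2} \alpha \alpha^t \Lambda_k^{-2} Q^t. \tag{16} $$

The scalar MSE of the AUPRRE is obtained by applying the tr(.) operator on Eq. (16), which can be stated as

$$ \mathrm{MSE}\left(\hat{\beta}_{AUPRRE}\right) = \sum_{j=1}^{q} \frac{\lambda_j (\lambda_j + 2k)^2}{(\lambda_j + k)^4} + k^4 \sum_{j=1}^{q} \frac{\alpha_j^2}{(\lambda_j + k)^4}. \tag{17} $$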
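The bias correction behind the AUPRRE can be checked numerically. The sketch below assumes the estimator has the form $[I - k^2 (X^t \hat{W} X + kI)^{-2}] \hat{\beta}_{MLE}$, which is what applying the almost unbiased correction to the ridge estimator gives. The function names and the small matrix `S`, which stands in for $X^t \hat{W} X$, are hypothetical.

```python
import numpy as np

def auprre(S, beta_mle, k):
    """Almost unbiased ridge estimate [I - k^2 (S + kI)^{-2}] beta_mle, S = X^t W X."""
    q = S.shape[0]
    A = np.linalg.inv(S + k * np.eye(q))
    return (np.eye(q) - k ** 2 * A @ A) @ beta_mle

def auprre_via_prre(S, beta_mle, k):
    """Same estimate built as the bias-corrected PRRE: (I + kA) beta_prre."""
    q = S.shape[0]
    A = np.linalg.inv(S + k * np.eye(q))
    beta_prre = A @ S @ beta_mle
    return beta_prre + k * A @ beta_prre

def auprre_scalar_mse(lam, alpha, k):
    """Scalar MSE in canonical form: variance term plus squared-bias term."""
    variance = np.sum(lam * (lam + 2 * k) ** 2 / (lam + k) ** 4)
    squared_bias = k ** 4 * np.sum(alpha ** 2 / (lam + k) ** 4)
    return variance + squared_bias

# Hypothetical ill-conditioned matrix (eigenvalues 0.1 and 3.9) and coefficients.
S = np.array([[2.0, 1.9], [1.9, 2.0]])
beta = np.array([0.6, 0.8])
k = 0.5
direct = auprre(S, beta, k)
corrected = auprre_via_prre(S, beta, k)
```

Both routes give the same vector, because $(I + kA)(I - kA) = I - k^2 A^2$ with $A = (X^t \hat{W} X + kI)^{-1}$.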

Modified Almost Unbiased Poisson Ridge Regression Estimator
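Türkan and Özel (2016) proposed a modified jackknifed ridge estimator, or MAUPRRE, for the PRM by following the work of Singh et al. (1986). The MAUPRRE is defined as

$$ \hat{\beta}_{MAUPRRE} = \left[ I_q - k^2 \left(X^t \hat{W} X + k I_q\right)^{-2} \right] \left[ I_q - k \left(X^t \hat{W} X + k I_q\right)^{-1} \right] \hat{\beta}_{MLE}. \tag{18} $$

The $\hat{\beta}_{MAUPRRE}$ can be written as

$$ \hat{\beta}_{MAUPRRE} = \left[ I_q - k^2 \left(X^t \hat{W} X + k I_q\right)^{-2} \right] \hat{\beta}_{PRRE}. $$

The bias, variance, MMSE and scalar MSE of the $\hat{\beta}_{MAUPRRE}$ are defined as follows, with $\Lambda_k = \Lambda + k I_q$ and $\Phi = \left(I_q - k^2 \Lambda_k^{-2}\right)\left(I_q - k \Lambda_k^{-1}\right)$:

$$ \mathrm{Bias}\left(\hat{\beta}_{MAUPRRE}\right) = -k\, Q \Lambda_k^{-1} \left[ I_q + k \Lambda_k^{-1} - k^2 \Lambda_k^{-2} \right] \alpha, \tag{19} $$

$$ \mathrm{Cov}\left(\hat{\beta}_{MAUPRRE}\right) = Q \Phi \Lambda^{-1} \Phi^t Q^t, \tag{20} $$

$$ \mathrm{MSE}\left(\hat{\beta}_{MAUPRRE}\right) = Q \Phi \Lambda^{-1} \Phi^t Q^t + \mathrm{Bias}\left(\hat{\beta}_{MAUPRRE}\right) \mathrm{Bias}\left(\hat{\beta}_{MAUPRRE}\right)^t. \tag{21} $$

The scalar MSE of the MAUPRRE is obtained by applying the tr(.) operator on Eq. (21), which can be stated as

$$ \mathrm{MSE}\left(\hat{\beta}_{MAUPRRE}\right) = \sum_{j=1}^{q} \frac{\lambda_j^3 (\lambda_j + 2k)^2}{(\lambda_j + k)^6} + k^2 \sum_{j=1}^{q} \frac{\alpha_j^2 \left(\lambda_j^2 + 3 k \lambda_j + k^2\right)^2}{(\lambda_j + k)^6}. \tag{22} $$

Mean Square Error Properties of the Estimators
In this section, we derive the MSE properties of the AUPRRE and MAUPRRE for the PRM. We also compare the AUPRRE and MAUPRRE with existing estimators such as the MLE and PRRE, and we show the superiority of the AUPRRE and MAUPRRE under different conditions. The performance of $\hat{\beta}_{MLE}$, $\hat{\beta}_{PRRE}$, $\hat{\beta}_{AUPRRE}$ and $\hat{\beta}_{MAUPRRE}$ is theoretically judged by using the MSE and the bias criteria. Therefore, we define Lemma 3.1 for comparison purposes.

Proof By using Eqs. (9) and (14), we have

$$ \Delta_1 = \left\| \mathrm{Bias}\left(\hat{\beta}_{PRRE}\right) \right\|^2 - \left\| \mathrm{Bias}\left(\hat{\beta}_{AUPRRE}\right) \right\|^2 = \sum_{j=1}^{q} \frac{k^2 \alpha_j^2 \lambda_j (\lambda_j + 2k)}{(\lambda_j + k)^4}, $$

which is positive for $k > 0$.

$\ldots, q$, then the $\hat{\beta}_{AUPRRE}$ is superior to the $\hat{\beta}_{PRRE}$ for the PRM in terms of the scalar MSE.

Proof From Eqs. (12) and (17), we have

$$ \Delta_2 = \mathrm{MSE}\left(\hat{\beta}_{PRRE}\right) - \mathrm{MSE}\left(\hat{\beta}_{AUPRRE}\right) = \sum_{j=1}^{q} \frac{k \lambda_j \left\{ 2 (k \alpha_j)^2 + k \lambda_j \alpha_j^2 - 3k - 2\lambda_j \right\}}{(\lambda_j + k)^4}. $$

$\Delta_2$ is positive for $k > 0$ if and only if $\left\{ 2 (k \alpha_j)^2 + k \lambda_j \alpha_j^2 - 3k - 2\lambda_j \right\} > 0$, and this expression is a quadratic function of $k$ which has the following roots

$$ k_{1,2} = \frac{\left(3 - \lambda_j \alpha_j^2\right) \pm \sqrt{\left(3 - \lambda_j \alpha_j^2\right)^2 + 16 \lambda_j \alpha_j^2}}{4 \alpha_j^2}. $$

Thus, the AUPRRE is superior to the PRRE in the sense of scalar MSE for the PRM.

Proof From Eqs. (7) and (17), we have

$$ \Delta_3 = \mathrm{MSE}\left(\hat{\beta}_{MLE}\right) - \mathrm{MSE}\left(\hat{\beta}_{AUPRRE}\right) = \sum_{j=1}^{q} \frac{k^2 \left\{ 2\lambda_j^2 + 4 k \lambda_j + k^2 \left(1 - \lambda_j \alpha_j^2\right) \right\}}{\lambda_j (\lambda_j + k)^4}. $$

$\Delta_3$ is positive if and only if $2\lambda_j^2 + 4 k \lambda_j + k^2 \left(1 - \lambda_j \alpha_j^2\right) > 0$.

Proof From Eqs. (6) and (21), the difference between $\mathrm{MSE}(\hat{\beta}_{MLE})$ and $\mathrm{MSE}(\hat{\beta}_{MAUPRRE})$ is obtained. By Lemma 3.1, the proof is completed.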
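The quadratic condition in the PRRE versus AUPRRE comparison can be verified numerically for a single component. The sketch below assumes the per-component scalar MSE forms $(\lambda + k^2\alpha^2)/(\lambda + k)^2$ for the PRRE and $\{\lambda(\lambda + 2k)^2 + k^4\alpha^2\}/(\lambda + k)^4$ for the AUPRRE. The function names and the values $\lambda = 2$, $\alpha = 1$ are hypothetical.

```python
import numpy as np

def mse_gap(lam, alpha, k):
    """Per-component scalar MSE of the PRRE minus that of the AUPRRE."""
    prre = (lam + k ** 2 * alpha ** 2) / (lam + k) ** 2
    auprre = (lam * (lam + 2 * k) ** 2 + k ** 4 * alpha ** 2) / (lam + k) ** 4
    return prre - auprre

def quadratic(lam, alpha, k):
    """The expression 2(k alpha)^2 + k lam alpha^2 - 3k - 2 lam from the proof."""
    return 2 * (k * alpha) ** 2 + k * lam * alpha ** 2 - 3 * k - 2 * lam

def positive_root(lam, alpha):
    """Positive root in k of the quadratic; beyond it the AUPRRE has the smaller MSE."""
    b = lam * alpha ** 2 - 3
    return (-b + np.sqrt(b ** 2 + 16 * lam * alpha ** 2)) / (4 * alpha ** 2)

lam, alpha = 2.0, 1.0
k_star = positive_root(lam, alpha)   # about 1.686 for these inputs
gap_below = mse_gap(lam, alpha, 1.0)  # k below the root: PRRE has the smaller MSE
gap_above = mse_gap(lam, alpha, 3.0)  # k above the root: AUPRRE has the smaller MSE
```

The MSE gap equals $k\lambda$ times the quadratic divided by $(\lambda + k)^4$, so its sign changes exactly at the positive root.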

Proposed Ridge Estimators
It is a complicated challenge for practitioners to select an optimal value of k. Therefore, we propose new ridge estimators ($\hat{k}_{q1}$–$\hat{k}_{q4}$) for the AUPRRE and MAUPRRE. We also use the ridge estimator $\hat{k}_{TO}$ suggested by Türkan and Özel (2016) for the PRM. Moreover, the performance of $\hat{k}_{q1}$–$\hat{k}_{q4}$ is compared with that of $\hat{k}_{TO}$ in the sense of MSE in the simulation and empirical application sections. In order to obtain an optimal value of k for the AUPRRE, differentiating $\mathrm{MSE}(\hat{\beta}_{AUPRRE})$ with respect to k yields

$$ \frac{\partial\, \mathrm{MSE}\left(\hat{\beta}_{AUPRRE}\right)}{\partial k} = \sum_{j=1}^{q} \frac{4 k \lambda_j \left( k^2 \alpha_j^2 - 2k - \lambda_j \right)}{(\lambda_j + k)^5}. $$

Setting $\partial\, \mathrm{MSE}(\hat{\beta}_{AUPRRE}) / \partial k = 0$ and solving the resulting function for k, we have the following optimal value of k

$$ k_j = \frac{1 + \sqrt{1 + \lambda_j \alpha_j^2}}{\alpha_j^2}. $$

Türkan and Özel (2016) concluded that $\hat{k}_{TO}$ performs rather well. This estimator is defined in terms of $\hat{\alpha}_j^2$, where $\hat{\alpha}_j$ is the $j$th ($j = 1, 2, \ldots, q$) element of $\gamma^t \hat{\beta}_{MLE}$ and $\gamma$ is the matrix of eigenvectors of $X^t \hat{W} X$. The following ridge estimators are proposed for the AUPRRE and MAUPRRE based on the optimal value of k derived above.
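To make the optimization step concrete, the sketch below assumes the per-component AUPRRE scalar MSE $\{\lambda(\lambda + 2k)^2 + k^4\alpha^2\}/(\lambda + k)^4$. Under that assumption the first-order condition reduces to $\alpha^2 k^2 - 2k - \lambda = 0$, whose positive root is computed by `optimal_k`. The function names and input values are hypothetical, and the proposed estimators $\hat{k}_{q1}$–$\hat{k}_{q4}$ are not reproduced here.

```python
import numpy as np

def auprre_component_mse(lam, alpha, k):
    """Assumed per-component scalar MSE of the almost unbiased ridge estimator."""
    return (lam * (lam + 2 * k) ** 2 + k ** 4 * alpha ** 2) / (lam + k) ** 4

def optimal_k(lam, alpha):
    """Positive root of alpha^2 k^2 - 2k - lam = 0, the first-order condition."""
    return (1 + np.sqrt(1 + lam * alpha ** 2)) / alpha ** 2

lam, alpha = 2.0, 1.0
k_opt = optimal_k(lam, alpha)

# Central-difference slope of the MSE at k_opt; it should be numerically zero.
h = 1e-5
slope = (auprre_component_mse(lam, alpha, k_opt + h)
         - auprre_component_mse(lam, alpha, k_opt - h)) / (2 * h)
```

In practice $\lambda_j$ and $\alpha_j$ are replaced by estimates, and the $q$ component-wise values have to be combined into a single $k$, which is where different ridge estimators differ.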

The Monte Carlo Simulations
The Monte Carlo simulation study is designed to demonstrate the performance of the estimators. The performance of the proposed estimators is compared with that of the existing estimators in the sense of MSE and bias under different conditions, which are given in Table 1. The dependent variable of the PRM is obtained from the $P(\mu_i)$ distribution, where

$$ \mu_i = \exp(x_i \beta), \quad i = 1, 2, \ldots, n. $$

We selected the parametric values of $\beta$ under the assumption that $\sum_{j=1}^{q} \beta_j^2 = 1$, which is a standard restriction in simulation studies. The correlated explanatory variables are generated as

$$ x_{ij} = \left(1 - \rho^2\right)^{1/2} z_{ij} + \rho\, z_{i,q+1}, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, q, $$

where $\rho^2$ is the correlation between the explanatory variables and $z_{ij}$ represents independent standard normal pseudo-random numbers. Other factors are also varied in the simulation study, such as the number of explanatory variables ($q = 3, 6$), the multicollinearity level ($\rho = 0.80, 0.90, 0.95, 0.99$) and the sample size. However, the sample sizes need to be increased with the number of explanatory variables to attain convergence of the iterative weighted least squares algorithm. In order to evaluate the performance of the proposed estimators, the MSE and absolute bias are considered as performance criteria. The MSE and absolute bias are defined as

$$ \mathrm{MSE}\left(\hat{\beta}\right) = \frac{1}{R} \sum_{r=1}^{R} \left(\hat{\beta}_r - \beta\right)^t \left(\hat{\beta}_r - \beta\right), \qquad \mathrm{Bias}\left(\hat{\beta}\right) = \frac{1}{R} \sum_{r=1}^{R} \left| \hat{\beta}_r - \beta \right|, $$

where R = 5000 is the total number of replications and $\hat{\beta}_r$ is the estimate of $\beta$ in the $r$th replication obtained from the MLE, PRRE, AUPRRE and MAUPRRE.
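The data-generating step of such a simulation might look like the sketch below. It assumes the commonly used scheme $x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,q+1}$, equal coefficients scaled so that $\sum_j \beta_j^2 = 1$, and no intercept. The function names, the seed and the chosen $n$, $q$ and $\rho$ are hypothetical.

```python
import numpy as np

def generate_design(n, q, rho, rng):
    """Regressors sharing a common normal factor; pairwise correlation is rho**2."""
    z = rng.standard_normal((n, q + 1))
    return np.sqrt(1 - rho ** 2) * z[:, :q] + rho * z[:, [q]]

def simulate_counts(X, beta, rng):
    """Draw Poisson responses with mean exp(X beta)."""
    return rng.poisson(np.exp(X @ beta))

def simulated_mse(estimates, beta):
    """Average squared distance between replicated estimates (rows) and beta."""
    err = np.asarray(estimates) - beta
    return np.mean(np.sum(err ** 2, axis=1))

rng = np.random.default_rng(2020)
q, rho = 3, 0.95
beta = np.ones(q) / np.sqrt(q)           # satisfies sum(beta**2) == 1
X = generate_design(20000, q, rho, rng)  # large n only to check the correlation
y = simulate_counts(X, beta, rng)
corr = np.corrcoef(X, rowvar=False)
```

A full study would loop over replications, fit each estimator to `(X, y)` and pass the stacked estimates to `simulated_mse`.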

Results and Discussion
In this subsection, we discuss the simulated MSE and bias of the estimators. The simulated results are shown in Tables 2, 3, 4 and 5. The performance of the estimators is inspected by changing different factors such as the sample size, multicollinearity level and the number of explanatory variables. From Tables 2 and 3, it is clear that the MSE of all the estimators decreases as the sample size increases, while the MSE increases when the degree of correlation increases. However, the MLE has a larger MSE than the PRRE, AUPRRE and MAUPRRE. Table 2 reveals that the estimators behave differently with respect to multicollinearity levels, and it is seen that the performance of the proposed $\hat{k}_{q4}$ is better than that of the other estimators. The performance of $\hat{k}_{q1}$–$\hat{k}_{q3}$ is not superior to that of $\hat{k}_{TO}$ when $\rho \le 0.95$ and $q = 3$. In the presence of high but imperfect multicollinearity, the proposed ridge estimators $\hat{k}_{q1}$–$\hat{k}_{q4}$ are superior to the MLE and $\hat{k}_{TO}$. From Table 3, the severe multicollinearity level ($\rho = 0.99$) does not show as substantial an effect on the performance of the MAUPRRE with $\hat{k}_{q4}$ as it shows for the other estimators. Increasing the number of explanatory variables for a given $\rho$ and $n$ leads to an increase in the MSE. When $q = 6$ and $n = 25$, the performance of the MLE is very poor. The performance of MAUPRRE ($\hat{k}_{q4}$) is superior to the MLE, AUPRRE and $\hat{k}_{TO}$ ($\hat{k}_{TO}$ suggested by Türkan and Özel 2016). The simulated absolute bias values of the PRRE, AUPRRE and MAUPRRE are given in Tables 4 and 5. $\hat{k}_{q4}$ gives the minimum bias compared to the other estimators. Moreover, the performance of the MAUPRRE is satisfactory in the sense of having the smallest bias (almost unbiased) when we use $\hat{k}_{q4}$ in the MAUPRRE. As the sample size and the number of explanatory variables increase, the absolute bias of the estimators decreases. However, the multicollinearity level has a negative effect on the performance of the estimators.
Overall, as expected, the estimated MSE and bias of the estimators increase with the multicollinearity level, but the effects of multicollinearity are least problematic when using our new MAUPRRE ($\hat{k}_{q4}$). The AUPRRE ($\hat{k}_{q4}$) provides the minimum bias when the sample size is small and $q = 3$. For $q = 6$, as $n \to \infty$ and $\rho \to 0.99$, the performance of MAUPRRE ($\hat{k}_{q4}$) is superior to the other estimators in the sense of absolute bias. Finally, the simulation results show that the greatest benefit of applying the MAUPRRE is obtained when the ridge estimator $\hat{k}_{q4}$ is used.

Application: Swedish Football League 2019
For the purpose of illustrating the empirical relevance of the proposed methods, we analyze Swedish football data. The proposed and existing estimation methods are illustrated using a dataset on the performance of Swedish football teams in the top Swedish league (Allsvenskan) during 2019. This dataset includes $n = 242$ observations on one dependent and six explanatory variables. These variables are defined as: number of full-time away-team goals ($y$), Pinnacle home win odds ($x_1$), Pinnacle draw odds ($x_2$), Pinnacle away win odds ($x_3$), maximum market home win odds ($x_4$), maximum market draw odds ($x_5$) and maximum market away win odds ($x_6$). The effects of the regressors ($x_1$ to $x_6$) on the dependent variable are analyzed using the PRM. The distribution of the dependent variable is illustrated in Fig. 1, which indicates that the PRM fits well. A Chi-square ($\chi^2$) goodness-of-fit test confirms that the response variable is well suited to the PRM (p value = 0.15). The correlation matrix of the regressors is exhibited in Table 6. Table 6 shows severe correlation among $x_1$, $x_3$, $x_4$ and $x_6$. Furthermore, the condition number, which is the ratio of the maximum to the minimum eigenvalue, is 1766 > 1000, which indicates a severe multicollinearity problem in this dataset.
We present the coefficients and the standard errors of the estimators in Table 7. The MSE and bias values of the estimators are illustrated in Fig. 2a-c. Theoretical MSE values of the $\hat{\beta}_{MLE}$, $\hat{\beta}_{PRRE}$, $\hat{\beta}_{AUPRRE}$ and $\hat{\beta}_{MAUPRRE}$ are calculated using Eqs. (6), (11), (16) and (21), respectively. The simulation results revealed that the ridge estimator $\hat{k}_{q4}$ is efficient and exhibits the minimum MSE compared to the other estimators. Therefore, we use $\hat{k}_{q4}$ in the $\hat{\beta}_{PRRE}$, $\hat{\beta}_{AUPRRE}$ and $\hat{\beta}_{MAUPRRE}$ for estimation of the PRM.

Table 6 Correlation matrix of the regressors
        x1       x2       x3       x4       x5       x6
x1    1.000
x2    0.077    1.000
x3   -0.563    0.708    1.000
x4    0.997    0.098   -0.548    1.000
x5    0.034    0.988    0.755    0.054    1.000
x6   -0.539    0.738    0.990   -0.524    0.783    1.000

For comparison purposes, we also use $\hat{k}_{TO}$ from Türkan and Özel (2016). The estimated coefficients change, and the estimated standard errors of the $\hat{\beta}_{MAUPRRE}$ are smaller than those of $\hat{\beta}_{AUPRRE}$, $\hat{\beta}_{PRRE}$ and $\hat{\beta}_{MLE}$. It is evident from Table 7, based on the high standard errors, that the MLE does not estimate the coefficients very precisely in the presence of multicollinearity. On the other hand, the proposed estimation method $\hat{\beta}_{MAUPRRE}(\hat{k}_{q4})$ estimates the coefficients rather precisely. The PRRE provides smaller standard errors than the AUPRRE and MLE. The AUPRRE gives higher standard errors of the parameters, since the AUPRRE provides the minimum squared bias and MSE among the estimators under certain conditions. The AUPRRE shrinks the bias, and therefore, it is named an almost unbiased estimator due to its minimized bias. One can easily see that in the presence of multicollinearity the MLE exhibits the wrong sign of the slope parameters $\hat{\beta}_3$ and $\hat{\beta}_6$. However, biased estimation methods may change the sign of the slope parameters. For instance, theoretically, Pinnacle away win odds and maximum market away win odds have negative effects on the number of full-time away-team goals, while the MLE shows a negative effect.
Meanwhile, the proposed method shows a positive effect, and it is considered a good approach to tackle the problem of multicollinearity. Hence, the advantage of the proposed method over the MLE in this empirical application is easily understood.
We also plot the squared bias and MSE values using Eqs. (11), (16) and (21) against different values of k to show the performance of the estimators under different conditions. In Fig. 2a, we plot the squared bias values of the PRRE and AUPRRE for values of the ridge parameter k between 0 and 1. It is seen that the AUPRRE always has a smaller bias than the PRRE, and these results satisfy Theorem 3. The MSE values of the estimators are shown in Fig. 2c. It is found that the MSE of the biased estimators equals that of the MLE when $k = 0$. As the value of k increases, the MAUPRRE attains the minimum MSE compared to the AUPRRE, PRRE and MLE. Therefore, we can conclude that the performance of the PRRE, AUPRRE and MAUPRRE is a function of the values of the ridge estimators. Overall, we recommend that practitioners apply the MAUPRRE with the ridge estimator $\hat{k}_{q4}$, since this estimator gives the lowest standard errors and MSE in the presence of multicollinearity.
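The condition number used as a multicollinearity diagnostic here is simply the ratio of the largest to the smallest eigenvalue. The sketch below shows the computation on a hypothetical 2 x 2 correlation matrix. The text does not state which matrix the authors decomposed, so the function name and input are assumptions for illustration only.

```python
import numpy as np

def condition_number(M):
    """Ratio of the largest to the smallest eigenvalue of a symmetric matrix."""
    eig = np.linalg.eigvalsh(M)
    return eig.max() / eig.min()

# Hypothetical correlation matrix of two regressors with correlation 0.9;
# its eigenvalues are 1.9 and 0.1.
R = np.array([[1.0, 0.9], [0.9, 1.0]])
kappa = condition_number(R)
```

As the pairwise correlation approaches 1, the smallest eigenvalue approaches 0 and the ratio grows without bound, which is why values above 1000 are read as severe multicollinearity.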

Conclusions
In this paper, we derive the MSE properties of the AUPRRE and MAUPRRE to show their superiority over the existing estimators in the presence of multicollinearity. We also derive the optimal ridge parameter k by minimizing the MSE of the AUPRRE and suggest new ridge estimators. These estimators of the ridge parameter k are based on the proposed optimal value of k, and we demonstrate their superiority over the existing estimators. The proposed estimators are compared with the AUPRRE, PRRE and MLE by means of Monte Carlo simulations. The comparison shows that the MAUPRRE with the ridge estimator $\hat{k}_{q4}$ has a smaller MSE than the MLE, PRRE and AUPRRE. Moreover, the empirical relevance and appealing properties of the proposed estimator are demonstrated by applying our approach to a collinear real-world dataset. In conclusion, both empirically and in simulations, our MAUPRRE ($\hat{k}_{q4}$) approach exhibits the lowest MSE of all competing estimators in the presence of multicollinearity.
Acknowledgements Authors are dedicating this article to those who lost their lives during COVID-19. We are thankful to the Editor and two anonymous referees for their valuable and constructive suggestions/comments, which certainly improved the presentation and quality of the paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Funding Open access funding provided by Jönköping University.