Parameter estimation of some Kumaraswamy-G type distributions
Abstract
Since Kum-G distributions have two additional parameters, parameter estimation becomes an interesting problem in itself. In this study, we consider parameter estimation for the Kum-Weibull, Kum-Pareto and Kum-Power distributions using the maximum likelihood and maximum spacing methods. These three distributions are important in reliability and other applications. The Kum-Pareto and Kum-Power distributions have parameter-dependent boundaries, which makes parameter estimation more interesting. We performed simulations in R for each of these distributions, estimating the parameters with both the maximum likelihood and the maximum spacing method. In addition, an application of these distribution families to real data, modeling wind speed at a particular location in Turkey, is discussed.
Keywords
Kumaraswamy distribution, Maximum likelihood, Maximum spacing, Parameter estimation, Simulation
Introduction
In recent years, many papers have generalized the Kumaraswamy distribution [11] by replacing x in its distribution function with the distribution function of a known model such as the normal, Weibull, Pareto, and others (see, for example, [2, 9, 12]). Based on the Kumaraswamy distribution, Cordeiro and Castro [6] introduced a new generalized family of distributions, denoted in this paper by Kum-G, and discussed its basic statistical properties and its application to a real data set.
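For orientation, the construction can be written out as follows. This is a sketch in the standard notation of the Kum-G literature rather than a quotation from this paper: \(G\) and \(g\) denote the baseline distribution and density functions with parameter vector \(\boldsymbol{\theta}\), and \(a, b > 0\) are the two additional shape parameters.

\[
F_{\mathrm{Kum}}(x) = 1 - \left( 1 - x^{a} \right)^{b}, \qquad 0 < x < 1,
\]

\[
F(x) = 1 - \left\{ 1 - G(x;\boldsymbol{\theta})^{a} \right\}^{b}, \qquad
f(x) = a\, b\, g(x;\boldsymbol{\theta})\, G(x;\boldsymbol{\theta})^{a-1} \left\{ 1 - G(x;\boldsymbol{\theta})^{a} \right\}^{b-1}.
\]

Setting \(a = b = 1\) gives \(F = G\), so the baseline distribution is always a member of its Kum-G family.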
In recent years, many authors have studied applications and parameter estimation of special Kum-G distributions. For example, Cordeiro et al. [9] investigated the Kum-Weibull model and its application to failure data. Tamandi and Nadarajah [16] discussed parameter estimation of the Kum-Weibull, Kum-Normal and Kum-Inverse Gaussian families.
Since Kum-G distributions have two additional parameters, parameter estimation becomes an interesting problem in itself. The maximum likelihood (ML) method is one of the preferred methods for estimating the parameters of Kum-G distributions. Tamandi and Nadarajah [16] also considered the maximum spacing (MSP) method and compared it with the ML method for estimating the parameters of some Kum-G distributions.
It is known that in situations such as mixtures of distributions and distributions with a parameter-dependent lower bound, where the ML method leads to inconsistent estimators, the MSP estimator is consistent; see [13]. Motivated by this fact, it is natural to consider the MSP estimator for the Kum-Pareto and Kum-Power distributions.
In this study, we consider parameter estimation of the Kum-Weibull [6], Kum-Pareto [2] and Kum-Power [12] families of distributions using the ML and MSP methods. Although some studies exist for the Kum-Weibull and Kum-Pareto distributions, there is only one paper dealing with the Kum-Power family of distributions. We performed simulations for each of the considered families of distributions. For the calculations we used the R software [14]. In particular, the parameters in the simulations were estimated with the optim function in R using the Nelder–Mead method. The parameter estimates for the Weibull distribution were obtained with the fitdistr function in R.
The literature shows that wind speed can be modeled by various distributions such as the Weibull, Rayleigh, gamma, lognormal, beta, Burr, and inverse Gaussian distributions, among others [17]. For example, Chang [3] compared the performance of six numerical methods for estimating Weibull parameters in wind energy applications. He concluded that the maximum likelihood, modified maximum likelihood and moment methods perform better in simulation tests. In this study, we consider modeling wind speed with the following generalized families of distributions: Kum-Weibull and Kum-Power. We note that the Kum-Weibull family, for example, includes the Weibull and Rayleigh distributions. The flexibility provided by the two additional parameters of the Kum-G family is expected to improve the modeling results. The parameter estimates for the real data were obtained with the ga function [15], a genetic algorithm implemented in R.
Kumaraswamy distributions considered
Table 1 Some Kum-W special cases (– denotes a free parameter)
Distribution | \(\lambda\) | c | a | b |
---|---|---|---|---|
Kum-exponential | – | 1 | – | – |
Kum-Rayleigh | – | 2 | – | – |
Exponentiated-Weibull | – | – | – | 1 |
Exponentiated-Rayleigh | – | 2 | – | 1 |
Exponentiated-exponential | – | 1 | – | 1 |
Weibull | – | – | 1 | 1 |
Rayleigh | – | 2 | 1 | 1 |
Exponential | – | 1 | 1 | 1 |
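The special cases of the Kum-W family can be checked numerically. The following is a minimal Python sketch (the paper itself works in R), assuming the baseline Weibull parametrization \(G(x) = 1 - \exp\{-(\lambda x)^c\}\); that parametrization and all function names are illustrative assumptions, not taken from the paper.

```python
import math

def weibull_cdf(x, lam, c):
    """Baseline Weibull cdf G(x) = 1 - exp(-(lam*x)^c); assumed parametrization."""
    return 1.0 - math.exp(-((lam * x) ** c)) if x > 0 else 0.0

def kum_g_cdf(g_value, a, b):
    """Kum-G cdf F = 1 - (1 - G^a)^b, evaluated at a baseline cdf value G in [0, 1]."""
    return 1.0 - (1.0 - g_value ** a) ** b

def kum_weibull_cdf(x, lam, c, a, b):
    """Kum-W cdf obtained by plugging the Weibull cdf into the Kum-G construction."""
    return kum_g_cdf(weibull_cdf(x, lam, c), a, b)

# With a = b = 1 the Kum-W cdf collapses to the Weibull cdf;
# fixing c = 2 as well gives the Rayleigh case, and c = 1 the exponential case.
x, lam = 1.3, 0.5
print(kum_weibull_cdf(x, lam, 2.0, 1.0, 1.0), weibull_cdf(x, lam, 2.0))
```

The same wrapper works for any baseline: only `weibull_cdf` would need to be swapped for the Pareto or power-function cdf.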
Parameter estimation
The ML method is one of the most widely used parameter estimation methods in statistics. However, ML estimation may lead to inconsistent estimates, especially when the boundary of the support depends on the parameters. Ranneby [13] showed that in such cases the maximum spacing method is more reliable than the ML method. Ekström [7, 8] showed that MSP estimators may give better results than ML estimators for small samples. Also, Cheng and Amin [4] showed that in unbounded likelihood problems, such as estimation of the three-parameter Weibull distribution, the MSP method produces consistent and asymptotically efficient estimators. Recently, Tamandi and Nadarajah [16] investigated parameter estimation of some Kum-G distributions using the ML and MSP methods.
In this paper, we consider parameter estimation of the Kum-Weibull, Kum-Pareto and Kum-Power distributions using the ML and MSP methods. We note that the Kum-Par and Kum-Pow distributions both have parameter-dependent boundaries. We therefore hope that this study will contribute to parameter estimation in Kum-G distributions.
Since, by definition, the Kum-G distributions add two shape parameters to the family of \(G(x,{{\varvec {\theta }}})\) distributions, parameter estimation becomes an interesting problem. The two additional parameters a and b provide more flexibility in modeling and applications. However, this flexibility also causes some major problems in parameter estimation. One of the main problems is that the support of the distribution may differ considerably for different parameter values. Thus classical hill-climbing approaches such as Newton–Raphson, as well as methods such as Nelder–Mead, may fail to give consistent (or any) results for Kum-G distributions.
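The support problem can be made concrete with the Kum-Pareto case. The sketch below is in Python (the paper uses R) and assumes the usual Pareto baseline \(G(x) = 1 - (\beta/x)^k\) for \(x \ge \beta\); the function name and the choice to return minus infinity outside the support are illustrative assumptions. Any candidate \(\beta\) at or above the sample minimum makes the log-likelihood undefined, which is the kind of region a simplex or gradient search can step into.

```python
import math

def kum_pareto_loglik(data, beta, k, a, b):
    """Kum-Pareto log-likelihood, assuming baseline G(x) = 1 - (beta/x)^k on x >= beta.

    Returns -inf when a parameter is non-positive or when an observation lies
    at or below beta (the boundary point is excluded to avoid log(0)).
    """
    if min(beta, k, a, b) <= 0 or min(data) <= beta:
        return -math.inf
    total = 0.0
    for x in data:
        G = 1.0 - (beta / x) ** k               # baseline cdf
        if not 0.0 < G < 1.0:                   # numerical guard near the boundaries
            return -math.inf
        g = k * beta ** k / x ** (k + 1)        # baseline pdf
        total += (math.log(a) + math.log(b) + math.log(g)
                  + (a - 1.0) * math.log(G)
                  + (b - 1.0) * math.log(1.0 - G ** a))
    return total

data = [1.2, 1.5, 2.0, 3.1]                     # made-up values for illustration
print(kum_pareto_loglik(data, 1.0, 2.0, 1.5, 0.8))   # beta below the sample minimum
print(kum_pareto_loglik(data, 1.3, 2.0, 1.5, 0.8))   # beta above it: outside the support
```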
Maximum likelihood method
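As a reminder of what is being maximized, the log-likelihood of a Kum-G model can be sketched as follows. The expression assumes the standard Kum-G density \(f(x) = a\,b\,g(x)\,G(x)^{a-1}\{1 - G(x)^{a}\}^{b-1}\) and an i.i.d. sample \(x_1, \ldots, x_n\); the notation is ours and is not quoted from the paper.

\[
\ell(a, b, \boldsymbol{\theta}) = n \log a + n \log b
+ \sum_{i=1}^{n} \log g(x_i;\boldsymbol{\theta})
+ (a-1) \sum_{i=1}^{n} \log G(x_i;\boldsymbol{\theta})
+ (b-1) \sum_{i=1}^{n} \log \left\{ 1 - G(x_i;\boldsymbol{\theta})^{a} \right\}.
\]

The ML estimates maximize \(\ell\) over \((a, b, \boldsymbol{\theta})\), subject to every \(x_i\) lying inside the support of \(G(\cdot;\boldsymbol{\theta})\).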
Maximum spacing method
The MSP method was introduced by Cheng and Amin [4] as an alternative to the ML method. Ranneby [13] derived the MSP method from an approximation of the Kullback–Leibler divergence (KLD). Cheng and Amin [4] showed that in unbounded likelihood problems, such as estimation of the three-parameter gamma, lognormal or Weibull distributions, the MSP method produces consistent and asymptotically efficient estimators. In situations such as mixtures of distributions and distributions with a parameter-dependent lower bound, where the ML method leads to inconsistent estimators, the MSP estimator is consistent; see [13]. Even in other situations, Ekström [8] showed that MSP estimators have better properties than ML estimators for small samples, and that MSP estimators are L1-consistent for any unimodal pdf without any additional conditions. According to [13], the MSP method also works better than the ML method for multivariate data. MSP estimators share the desirable properties of ML estimators, such as consistency, asymptotic normality, efficiency and invariance under one-to-one transformations. For a detailed survey of the MSP method, the reader is referred to [8].
MSP estimators also have some disadvantages. First, they are sensitive to closely spaced observations, and especially to ties. They are also sensitive to secondary clustering: one example is when a set of observations is thought to come from a single normal distribution, but in fact comes from a mixture of normals with different means [5].
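The MSP criterion and its sensitivity to ties can be illustrated with a short Python sketch (the paper's computations were done in R). It assumes the common form of the criterion, the mean of the log spacings \(D_i = F(x_{(i)}) - F(x_{(i-1)})\) with \(F(x_{(0)}) = 0\) and \(F(x_{(n+1)}) = 1\); the function name and the handling of zero spacings are illustrative choices.

```python
import math

def msp_objective(data, cdf):
    """Mean log-spacing (1/(n+1)) * sum_i log(F(x_(i)) - F(x_(i-1))).

    `cdf` is the candidate distribution function with its parameters already
    fixed; the MSP estimate is the parameter value that maximizes this quantity.
    """
    u = [0.0] + [cdf(x) for x in sorted(data)] + [1.0]
    total = 0.0
    for lower, upper in zip(u, u[1:]):
        spacing = upper - lower
        if spacing <= 0.0:
            # Ties (or points outside the support) give a zero spacing,
            # so the criterion degenerates to minus infinity.
            return -math.inf
        total += math.log(spacing)
    return total / (len(data) + 1)

uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
print(msp_objective([0.25, 0.5, 0.75], uniform_cdf))   # evenly spaced sample
print(msp_objective([0.5, 0.5, 0.75], uniform_cdf))    # a tie collapses the criterion
```

A single tie drives the criterion to minus infinity for every parameter value, which is one way to see why closely spaced observations are problematic for the unmodified method.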
Simulation results
Simulation is a powerful tool that is used in many areas of science. For example, some recent simulation studies can be found in [1, 18]. Abbasbandy and Shivanian [1] used numerical simulation based on a meshless technique to study the biological population model. Vajargah and Shoghi [18] used the quasi-Monte Carlo method to predict the total index of a stock market and value at risk.
To assess the performance of the ML and MSP estimators, we conducted a small simulation study for the Kum-W, Kum-Par and Kum-Pow distributions. These three Kum-G distributions have different characteristics and are important in reliability problems and applications. The Kum-Par and Kum-Pow distributions both have parameter-dependent boundaries, which may have important implications for parameter estimation. We used 1000 runs in each simulation to compare the estimators, with a sample size of \(n=25\).
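The two building blocks of such a simulation, drawing Kum-G samples and summarizing estimates by bias and MSE, can be sketched in Python as follows (the paper used R). The sampler inverts \(F(x) = 1 - \{1 - G(x)^a\}^b\) and assumes the Weibull baseline \(G(x) = 1 - \exp\{-(\lambda x)^c\}\); the parametrization, function names and seed are assumptions made for illustration only.

```python
import math
import random

def kum_weibull_sample(n, lam, c, a, b, rng):
    """Draw n Kum-W values by inverse transform sampling (assumed parametrization)."""
    sample = []
    for _ in range(n):
        u = rng.random()
        # Invert the Kum-G layer: baseline cdf value G = (1 - (1-u)^(1/b))^(1/a)
        g = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)
        g = min(g, 1.0 - 2.0 ** -53)            # guard against rounding to exactly 1
        # Invert the Weibull baseline: x = (-log(1 - G))^(1/c) / lam
        sample.append((-math.log(1.0 - g)) ** (1.0 / c) / lam)
    return sample

def bias_and_mse(estimates, true_value):
    """Empirical bias and mean squared error of repeated estimates of one parameter."""
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, mse

rng = random.Random(2016)                       # fixed seed, for reproducibility only
one_run = kum_weibull_sample(25, lam=1.0, c=2.0, a=0.5, b=0.5, rng=rng)
print(len(one_run), min(one_run) >= 0.0)
```

In a full study, each of the 1000 runs would draw one such sample of size 25, fit the model by ML and by MSP, and pass the resulting estimates of each parameter to `bias_and_mse`.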
Table 2 Bias and MSE for the Kum-W distribution, sample size \(n=25\) (1000 runs)
Method | a | b | Bias\((\hat{\lambda })\) | MSE\((\hat{\lambda })\) | Bias\((\hat{c})\) | MSE\((\hat{c})\) | Bias\((\hat{a})\) | MSE\((\hat{a})\) | Bias\((\hat{b})\) | MSE\((\hat{b})\) |
---|---|---|---|---|---|---|---|---|---|---|
MLE | 0.5 | 0.5 | 0.0622 | 0.0377 | −0.0134 | 0.0639 | −0.0319 | 0.0855 | 0.0587 | 0.0958 |
MSP | | | −0.0057 | 0.0865 | 0.0284 | 0.0969 | 0.0334 | 0.0283 | 0.0427 | 0.0298 |
MLE | 0.5 | 1.0 | 0.1059 | 0.0412 | 0.0578 | 0.0306 | −0.4037 | 0.2029 | 0.3809 | 0.1666 |
MSP | | | 0.0123 | 0.0829 | 0.0123 | 0.0924 | 0.0547 | 0.0327 | 0.0544 | 0.1003 |
MLE | 0.5 | 2.5 | 0.1357 | 0.0542 | 0.0764 | 0.0330 | −0.3916 | 0.1640 | 0.6279 | 0.4740 |
MSP | | | 0.0658 | 0.0992 | 0.0161 | 0.0791 | 0.1449 | 0.0722 | 0.0484 | 0.5480 |
MLE | 2 | 0.5 | 0.0016 | 0.0203 | 0.0113 | 0.0238 | 0.1850 | 0.135 | 0.0708 | 0.0089 |
MSP | | | 0.0273 | 0.0843 | 0.0850 | 0.1250 | 0.0536 | 0.389 | 0.0200 | 0.0232 |
MLE | 2 | 1.0 | −0.0891 | 0.1176 | 0.1390 | 0.1206 | 0.1730 | 0.240 | 0.1142 | 0.1038 |
MSP | | | 0.0400 | 0.0824 | 0.1140 | 0.1460 | 0.0968 | 0.365 | −0.0017 | 0.0838 |
MLE | 2 | 2.5 | 0.2786 | 0.3257 | 0.3100 | 0.2091 | −0.2550 | 0.702 | 0.4131 | 0.3369 |
MSP | | | 0.1288 | 0.1416 | 0.1160 | 0.1370 | 0.0431 | 0.349 | 0.0315 | 0.5240 |
MLE | 10 | 0.5 | 0.3982 | 0.3028 | −0.926 | 1.230 | 0.6820 | 2.65 | 0.5470 | 0.3457 |
MSP | | | 0.2270 | 0.3490 | 0.598 | 0.599 | 0.0335 | 8.01 | 0.0084 | 0.0210 |
MLE | 10 | 1.0 | 0.0042 | 0.0425 | 0.579 | 0.677 | 0.4750 | 2.30 | 0.1280 | 0.0246 |
MSP | | | 0.1640 | 0.2290 | 0.668 | 0.775 | 0.2492 | 8.41 | 0.0121 | 0.0857 |
MLE | 10 | 2.5 | 0.0485 | 0.0800 | 0.672 | 0.725 | 0.3150 | 2.61 | 0.3200 | 0.1363 |
MSP | | | 0.2970 | 0.3330 | 0.565 | 0.634 | 0.2283 | 8.68 | −0.0023 | 0.4780 |
Table 3 Bias and MSE for the Kum-Par distribution, sample size \(n=25\) (1000 runs)
Method | a | b | Bias\((\hat{\beta })\) | MSE\((\hat{\beta })\) | Bias\((\hat{k})\) | MSE\((\hat{k})\) | Bias\((\hat{a})\) | MSE\((\hat{a})\) | Bias\((\hat{b})\) | MSE\((\hat{b})\) |
---|---|---|---|---|---|---|---|---|---|---|
MLE | 0.5 | 0.5 | −0.396 | 0.199 | 0.0081 | 0.0856 | 0.0284 | 0.0312 | 0.0546 | 0.0411 |
MSP | | | −0.412 | 0.204 | 0.0220 | 0.0883 | 0.0284 | 0.0264 | 0.0993 | 0.0394 |
MLE | 0.5 | 1.0 | −0.425 | 0.218 | 0.0120 | 0.0914 | 0.0231 | 0.0273 | 0.1191 | 0.1239 |
MSP | | | −0.418 | 0.204 | 0.0297 | 0.0814 | 0.0261 | 0.0250 | 0.0803 | 0.1005 |
MLE | 0.5 | 2.5 | −0.597 | 0.418 | 0.1606 | 0.1176 | −0.0356 | 0.0404 | 0.2870 | 0.6307 |
MSP | | | −0.560 | 0.350 | 0.2024 | 0.1297 | 0.0633 | 0.0371 | 0.0982 | 0.5622 |
MLE | 2 | 0.5 | −0.314 | 0.155 | −0.0032 | 0.0891 | 0.1058 | 0.351 | 0.0322 | 0.0356 |
MSP | | | −0.323 | 0.169 | 0.0071 | 0.0772 | −0.0416 | 0.335 | 0.1124 | 0.0476 |
MLE | 2 | 1.0 | −0.296 | 0.151 | 0.0146 | 0.0807 | 0.0303 | 0.338 | 0.0583 | 0.1124 |
MSP | | | −0.316 | 0.167 | 0.0900 | 0.1103 | 0.0059 | 0.334 | 0.0396 | 0.0760 |
MLE | 2 | 2.5 | −0.369 | 0.213 | 0.1837 | 0.1597 | 0.0447 | 0.350 | 0.0393 | 0.5444 |
MSP | | | −0.383 | 0.219 | 0.1971 | 0.1631 | 0.0200 | 0.342 | 0.0402 | 0.5619 |
MLE | 10 | 0.5 | 0.429 | 0.423 | −0.0019 | 0.0892 | −0.1095 | 8.240 | 0.0871 | 0.1170 |
MSP | | | −0.195 | 0.270 | 0.3750 | 0.4190 | 0.0129 | 8.260 | 0.4494 | 0.4548 |
MLE | 10 | 1.0 | 0.437 | 0.463 | 0.0227 | 0.1005 | 0.0875 | 8.080 | 0.1025 | 0.1850 |
MSP | | | −0.214 | 0.278 | 0.7960 | 1.0200 | −0.1422 | 8.610 | 0.0462 | 0.0912 |
MLE | 10 | 2.5 | 0.207 | 0.419 | 0.3083 | 0.4534 | −0.1044 | 8.440 | −0.0056 | 0.5170 |
MSP | | | −0.191 | 0.303 | 0.8350 | 1.0790 | 0.1349 | 8.220 | 0.0134 | 0.5236 |
Table 4 Bias and MSE for the Kum-Pow distribution, sample size \(n=25\) (1000 runs)
Method | a | b | Bias\((\hat{\alpha })\) | MSE\((\hat{\alpha })\) | Bias\((\hat{\beta })\) | MSE\((\hat{\beta })\) | Bias\((\hat{a})\) | MSE\((\hat{a})\) | Bias\((\hat{b})\) | MSE\((\hat{b})\) |
---|---|---|---|---|---|---|---|---|---|---|
MLE | 0.5 | 0.5 | 0.0063 | 0.1050 | 0.558 | 0.390 | 0.2170 | 0.163 | 0.1290 | 0.1460 |
MSP | | | 0.7600 | 1.0700 | 0.530 | 0.366 | 0.3690 | 0.195 | 0.0600 | 0.0575 |
MLE | 0.5 | 1.0 | 0.0600 | 0.208 | 0.605 | 0.476 | 0.1197 | 0.322 | 0.1380 | 0.3990 |
MSP | | | 0.7610 | 1.100 | 0.485 | 0.317 | 0.3870 | 0.201 | 0.0421 | 0.1215 |
MLE | 0.5 | 2.5 | 0.0691 | 0.299 | 0.548 | 0.472 | 0.0585 | 0.488 | 0.2000 | 0.9570 |
MSP | | | 0.7140 | 1.010 | 0.485 | 0.318 | 0.3880 | 0.204 | 0.0423 | 0.5609 |
MLE | 2 | 0.5 | 0.122 | 0.101 | 0.653 | 0.495 | 0.2350 | 0.426 | −0.1374 | 0.0731 |
MSP | | | 0.814 | 1.180 | 0.884 | 0.876 | 0.0353 | 0.319 | 0.0020 | 0.0236 |
MLE | 2 | 1.0 | 0.194 | 0.156 | 0.634 | 0.503 | 0.1590 | 0.566 | −0.1248 | 0.3205 |
MSP | | | 0.878 | 1.240 | 0.655 | 0.515 | 0.0871 | 0.317 | 0.0229 | 0.1142 |
MLE | 2 | 2.5 | 0.212 | 0.268 | 0.530 | 0.432 | 0.1180 | 0.594 | 0.0573 | 1.0092 |
MSP | | | 0.976 | 1.430 | 0.497 | 0.336 | 0.1634 | 0.293 | 0.0949 | 0.5837 |
MLE | 10 | 0.5 | 0.399 | 0.418 | 0.753 | 0.681 | −0.0829 | 8.470 | −0.0141 | 0.0232 |
MSP | | | 0.876 | 1.260 | 1.400 | 2.140 | 0.0422 | 8.670 | −0.0067 | 0.0202 |
MLE | 10 | 1.0 | 0.482 | 0.447 | 0.691 | 0.574 | 0.0283 | 7.980 | −0.0921 | 0.1209 |
MSP | | | 0.885 | 1.250 | 1.320 | 1.930 | −0.1088 | 8.180 | 0.0081 | 0.0848 |
MLE | 10 | 2.5 | 0.620 | 0.596 | 0.656 | 0.550 | 0.0286 | 8.270 | −0.1156 | 0.6505 |
MSP | | | 1.040 | 1.510 | 1.180 | 1.610 | −0.1045 | 8.030 | −0.0001 | 0.5635 |
From Table 4 it can be observed that the ML estimates, in general, outperform the MSP estimates. This can be explained by the fact that closely spaced observations are much more likely to occur for the Kum-Pow distribution, and MSP is known to be sensitive to closely spaced observations.
It should be noted that estimating all four parameters of a Kum-G family may result in inconsistent estimates. In addition, the estimates depend strongly on the initial values, which may also lead to convergence problems. For this reason, when applying these families of distributions to real data, we preferred to use genetic algorithms for estimating the parameters.
Application to real data
Wind energy is an important alternative to conventional energy resources. Consequently, many studies model wind characteristics such as wind speed in order to estimate the potential for energy generation. Distributions such as the Weibull, Rayleigh, gamma, lognormal, beta, Burr, and inverse Gaussian are used in modeling wind speed frequencies [17]. As noted before, the two additional parameters of the Kum-G families may provide more flexibility in modeling. For example, the Kum-Weibull family of distributions includes the Weibull and Rayleigh distributions as special cases. Motivated by this fact, the Kum-Weibull, Kum-Pareto and Kum-Power families of distributions are applied to model wind speed frequencies for a particular location, Cide, in Turkey. The data are daily average wind speed measurements at this location for January 2016 and were obtained from the Turkish State Meteorological Service.
Table 5 Parameter estimates fitted to wind data
Model | \(\hat{a}\) | \(\hat{b}\) | \(\hat{\theta }_1^*\) | \(\hat{\theta }_2^*\) | \(\chi ^2\) statistic | p value |
---|---|---|---|---|---|---|
Weibull (MLE) | 1 | 1 | 4.366 | 2.774 | 0.09872 | 0.8943 |
Kum-W (MSP) | 4.814 | 9.211 | 0.482 | 4.993 | 0.078944 | 0.982 |
Kum-Pow (MSP) | 1.599 | 7.974 | 1.311 | 2.721 | 0.073136 | 0.9921 |
Conclusion
Tamandi and Nadarajah [16] considered parameter estimation of the Kum-Weibull, Kum-Normal and Kum-Inverse Gaussian distributions. They stated that for these distributions, in general, the MSP method results in smaller bias and MSE for small sample sizes. It should be noted that these distributions have no parameter-dependent boundaries, that is, the domain of the random variable is independent of the parameters. In this study, we considered three Kum-G distributions, all with different characteristics. The Kum-Par and Kum-Pow distributions both have parameter-dependent bounds and may model different distributions. In addition, we applied these families of distributions to model real wind speed measurements.
The computations in the simulations and in the application to real data have shown that the MSP method, in general, outperforms the ML method, the Kum-Pow simulations (Table 4) being an exception. We have also seen that in the ML method the initial parameter values may cause the algorithms to stop before reaching any feasible parameter estimate. Thus, in general, the ML approach is sensitive to initial values, which leads to convergence problems. In contrast, the MSP method, in general, seems to give more consistent results. Therefore, to model wind speed we preferred to use genetic algorithms with the MSP approach to obtain parameter estimates for the Kum-W and Kum-Pow families of distributions.
References
- 1. Abbasbandy, S., Shivanian, E.: Numerical simulation based on meshless technique to study the biological population model. Math. Sci. 10, 123–130 (2016)
- 2. Bourguignon, M., Silva, R.B., Zea, L.M., Cordeiro, G.M.: The Kumaraswamy Pareto distribution. J. Stat. Theory Appl. 12, 129–144 (2013)
- 3. Chang, T.P.: Performance comparison of six methods in estimating Weibull parameters for wind energy application. Appl. Energy 88, 272–282 (2011)
- 4. Cheng, R.C.H., Amin, N.: Estimating parameters in continuous univariate distributions with a shifted origin. J. R. Stat. Soc. B 45, 394–403 (1983)
- 5. Cheng, R.C.H., Stephens, M.A.: A goodness-of-fit test using Moran’s statistic with estimated parameters. Biometrika 76, 386–392 (1989)
- 6. Cordeiro, G.M., Castro, M.: A new family of generalized distributions. J. Stat. Comput. Simul. 81, 883–898 (2011)
- 7. Ekström, M.: Consistency of generalized maximum spacing estimates. Scand. J. Stat. 28, 343–354 (2001)
- 8. Ekström, M.: Alternatives to maximum likelihood estimation based on spacing and the Kullback–Leibler divergence. J. Stat. Plan. Inference 138, 1778–1791 (2008)
- 9. Cordeiro, G.M., Ortega, E.M.M., Nadarajah, S.: The Kumaraswamy Weibull distribution with application to failure data. J. Frankl. Inst. 347, 1399–1429 (2010)
- 10. Jones, M.C.: A beta-type distribution with some tractability advantages. Stat. Methodol. 6, 70–81 (2009)
- 11. Kumaraswamy, P.: Generalized probability density-function for double-bounded random-processes. J. Hydrol. 46, 79–88 (1980)
- 12. Oguntunde, P.E., Odetunmibi, O., Okagbue, H.I.: The Kumaraswamy-power distribution: a generalization of the power distribution. Int. J. Math. Anal. 9, 637–645 (2015)
- 13. Ranneby, B.: The maximum spacing method. An estimation method related to the maximum likelihood method. Scand. J. Stat. 11, 93–112 (1984)
- 14. R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2016). https://www.R-project.org/
- 15. Scrucca, L.: GA: A package for genetic algorithms in R. J. Stat. Softw. 53, 1–37 (2013). http://www.jstatsoft.org/v53/i04/
- 16. Tamandi, M., Nadarajah, S.: On the estimation of parameters of Kumaraswamy-G distributions. Commun. Stat. Simul. Comput. (2014). doi:10.1080/03610918.2014.957840
- 17. Vaishali, S., Gupta, S., Nema, R.: Comparative analysis of wind speed probability distributions for wind power assessment of four sites. Turk. J. Electr. Eng. Comput. Sci. 24, 4724–4735 (2016)
- 18. Vajargah, K.F., Shoghi, M.: Simulation of stochastic differential equation of geometric Brownian motion by quasi-Monte Carlo method and its application in prediction of total index of stock market and value at risk. Math. Sci. 9, 115–125 (2015)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.