Penalty Function Methods

Choi, ByoungSeon

doi:10.1007/978-1-4613-9745-8_3

Part of the book series: Springer Series in Statistics (SSS)

Abstract

Since the early 1970s, several estimation-type identification procedures have been proposed. They choose the orders k and i that minimize

$$P(k,i) = \ln \breve{\sigma}_{k,i}^{2} + (k + i)\frac{C(T)}{T},$$

where $\breve{\sigma}_{k,i}^{2}$ is an estimate of the white noise variance obtained by fitting the ARMA(k, i) model to the observations. Because $\breve{\sigma}_{k,i}^{2}$ decreases as the orders increase, minimizing it alone is not a good criterion for choosing the orders. As the orders increase, the bias of the estimated model decreases while its variance increases, so we should compromise between the two. For this purpose we add the penalty term, (k + i)C(T)/T, to the model selection criterion. The penalty function identification methods are regarded as objective.
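The selection rule in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: T is taken to be the number of observations, the variance estimates in `sigma2_table` are made-up numbers rather than results from any fitted model, and the function names are hypothetical. The two penalty choices shown, C(T) = 2 and C(T) = ln T, correspond to the usual forms of the AIC and the BIC.

```python
import math


def penalty_criterion(sigma2, k, i, T, C):
    """Return P(k, i) = ln(sigma2) + (k + i) * C(T) / T."""
    return math.log(sigma2) + (k + i) * C(T) / T


def select_orders(sigma2_table, T, C):
    """Return the candidate (k, i) that minimizes P(k, i).

    sigma2_table maps each candidate (k, i) to a white noise variance
    estimate obtained by fitting an ARMA(k, i) model (assumed to be given).
    """
    return min(
        sigma2_table,
        key=lambda ki: penalty_criterion(sigma2_table[ki], ki[0], ki[1], T, C),
    )


def aic_penalty(T):
    """C(T) = 2, the AIC-type penalty."""
    return 2.0


def bic_penalty(T):
    """C(T) = ln T, the BIC-type penalty."""
    return math.log(T)


# Hypothetical inputs: sample size and variance estimates (illustrative only).
T = 200
sigma2_table = {
    (0, 0): 2.000,
    (1, 0): 1.200,
    (0, 1): 1.350,
    (1, 1): 1.000,
    (2, 1): 0.995,
    (2, 2): 0.990,
}

# Minimizing the variance estimate alone favors the largest model.
unpenalized_choice = min(sigma2_table, key=sigma2_table.get)
aic_choice = select_orders(sigma2_table, T, aic_penalty)
bic_choice = select_orders(sigma2_table, T, bic_penalty)

print("variance only:", unpenalized_choice)
print("C(T) = 2:", aic_choice)
print("C(T) = ln T:", bic_choice)
```

With these illustrative numbers, minimizing the variance estimate alone picks the largest candidate model, whereas both penalized criteria pick (1, 1), because the small drop in variance beyond that point does not offset the added penalty.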

Additional References

For the choice of upper bounds on the AR and MA orders, readers may refer to An, Chen, and Hannan (1982), Hannan and Rissanen (1982), Hannan and Kavalieris (1984b, 1986a), Poskitt (1987), Hannan and Deistler (1988), and the references therein.
The FPE procedure has been used for statistical modeling beyond AR order determination. McClave (1975) utilized Hocking and Leslie's (1967) subset regression technique with the FPE for AR model identification. The FPE procedure was studied for vector AR processes by Akaike (1971), Reinsel (1980, 1983), and Jones (1976), and for more general stochastic processes by Baillie (1979b) and Toyooka (1982). Hsiao (1979) used it for Granger causality tests. Akaike (1969b, 1970b), Gersch and Sharpe (1973), and Jones (1974) used the AR model and the FPE criterion to estimate spectral densities. Their numerical examples have shown that the MFPEE and the YW estimates of AR coefficients result in reasonable spectral density estimates. However, Marple (1980) presented some examples to the contrary.
For more details about the asymptotic mean square error of a multi-step ahead predictor, readers may refer to Box and Jenkins (1976, p. 267), Bloomfield (1972), Bhansali (1978), Schmidt (1974), Janacek (1975), Yamamoto (1976, 1981), Baillie (1979a), Davies and Newbold (1980), Reinsel (1980), Shibata (1980), Ledolter and Abraham (1981), Fuller and Hasza (1981), Newton and Pagano (1983), Fotopoulos and Ray (1983), and the references therein.
Readers who are interested in Sanov’s theorem may refer to Bahadur and Zabell (1979), Vincze (1982), Deuschel and Stroock (1989), and the references therein.
The conditional probability characterization of the Kullback-Leibler information number has been discussed by Vasicek (1980), Csiszár, Cover, and Choi (1987), and Choi (1991b).
The AIC has been used to select optimal models in many fields of statistics. Akaike (1971-1983), Gersch and Kitagawa (1983), and others utilized it to determine the orders of ARMA processes. Kitagawa (1981) applied the AIC to model fitting for nonstationary time series. Kozin and Nakajima (1980) used the AIC for time-varying AR models. Gabr and Subba Rao (1981) applied it to bilinear time series models. Jones (1974), Sakai (1981), Quinn (1980b, 1988), and Paulsen and Tjostheim (1985) proposed using the AIC for determining the orders of vector AR processes. It was also used in factor analysis by Akaike (1972b, 1975) and Tong (1975a), in regression analysis by Sawa (1978) and Shibata (1981a, 1984), in the analysis of Markov processes by Tong (1975b), in the analysis of distributed lag models by Tong (1976), in the analysis of covariance by Akaike (1977a), in signal processing analysis by Tong (1975a, 1977) and Findley (1984), and for determining the histogram width by Taylor (1987). Other possible applications have been suggested by Akaike (1973a, 1977a) and Sugiura (1978).
For more details of the asymptotic distribution of the MAICE, refer to Hannan and Deistler (1988, Section 5.6). Sakai (1981), Paulsen and Tjostheim (1985), and Quinn (1988) derived the asymptotic distribution of the MAICE for vector AR processes.
Some illustrative examples of the CAT procedure were given by Parzen (1979a, 1979b, 1980a) and Parzen and Pagano (1979). The CAT for vector AR processes has been proposed by Parzen (1977) and Parzen and Newton (1980).
For E. J. Hannan’s opinion about the AIC, refer to Hannan and Quinn (1979, p. 195) and Hannan (1980b, p. 1072; 1982, p. 411).
For the HQC method, readers may also refer to Heyde and Scott (1973) and Bai, Subramanyam, and Zhao (1988). Quinn (1980b) has generalized the MHQCE to vector AR models and has shown its strong consistency.
For more details about the consistency problem of the penalty function methods, refer to Hannan (1981), Hannan and Deistler (1988, Section 5.4), An and Chen (1986), and the references therein. Potscher (1990) has studied the estimate of r = max(p, q) given by the first “local” minimum of the BIC under the assumption k = i.
Rissanen (1986b) has derived Rissanen’s lower bound using coding theory. Kabaila (1987) has shown that under some fairly strong restrictions it can be derived via the Cramer-Rao lower bound or the Fisher bound on asymptotic variances for the case of Gaussian AR processes. Also, refer to Hannan, McDougall, and Poskitt (1989).
There have been some recent advances in using cross-validation procedures in time series analysis. Some applications have been considered by Geisser and Eddy (1979), Bessler and Binkley (1980), and Hjorth and Holmqvist (1981). Hurvich and Beltrao (1990) have presented the cross-validated log-likelihood criterion, which can be viewed as a cross-validatory generalization of the AIC. Also, refer to Hurvich and Zeger (1990). Stoica, Eykhoff, Janssen, and Soderstrom (1986) have presented another cross-validation method, which yields asymptotically the same result as the BIC procedure. Also, refer to Jong (1988).
Tjostheim and Paulsen (1985) have applied the penalty function identification methods to a particular nonstationary AR process, where the variance of the innovation process depends on time.

Author information

Authors and Affiliations

Department of Applied Statistics, Yonsei University, 120-749, Seoul, Korea
ByoungSeon Choi

About this chapter

Cite this chapter

Choi, B. (1992). Penalty Function Methods. In: ARMA Model Identification. Springer Series in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-9745-8_3

DOI: https://doi.org/10.1007/978-1-4613-9745-8_3
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4613-9747-2
Online ISBN: 978-1-4613-9745-8
eBook Packages: Springer Book Archive
