Abstract
A risk function based on the mean square error of prediction is a widely used measure of the goodness of a candidate model in model selection. A modified \(C_p\) criterion, referred to as the \(MC_p\) criterion, is an unbiased estimator of this risk function. The original \(MC_p\) criterion was proposed by Fujikoshi and Satoh (1997) for multivariate linear regression models, and many authors have since proposed \(MC_p\) criteria for various candidate models. The purpose of this paper is to propose an \(MC_p\) criterion for a wide class of candidate models that includes the results of previous studies as special cases.
References
Fujikoshi, Y., Satoh, K.: Modified AIC and \(C_p\) in multivariate linear regression. Biometrika 84, 707–716 (1997). https://doi.org/10.1093/biomet/84.3.707
Harville, D.A.: Matrix Algebra from a Statistician’s Perspective. Springer, New York (1997). https://doi.org/10.1007/b98818
Mallows, C.L.: Some comments on \(C_p\). Technometrics 15, 661–675 (1973). https://doi.org/10.2307/1267380
Mallows, C.L.: More comments on \(C_p\). Technometrics 37, 362–372 (1995). https://doi.org/10.2307/1269729
Potthoff, R.F., Roy, S.N.: A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313–326 (1964). https://doi.org/10.2307/2334137
Satoh, K., Kobayashi, M., Fujikoshi, Y.: Variable selection for the growth curve model. J. Multivar. Anal. 60, 277–292 (1997). https://doi.org/10.1006/jmva.1996.1658
Sparks, R.S., Coutsourides, D., Troskie, L.: The multivariate \(C_p\). Comm. Stat. Theor. Meth. 12, 1775–1793 (1983). https://doi.org/10.1080/03610928308828569
Yanagihara, H., Satoh, K.: An unbiased \(C_p\) criterion for multivariate ridge regression. J. Multivar. Anal. 101, 1226–1238 (2010). https://doi.org/10.1016/j.jmva.2009.09.017
Yanagihara, H., Nagai, I., Satoh, K.: A bias-corrected \(C_p\) criterion for optimizing ridge parameters in multivariate generalized ridge regression. Jpn. J. Appl. Stat. 38, 151–172 (2009) (in Japanese). https://doi.org/10.5023/jappstat.38.151
Acknowledgments
The authors wish to thank two reviewers for their helpful comments. This research was supported by JSPS Bilateral Program Grant Number JPJSBP 120219927 and JSPS KAKENHI Grant Number 20H04151.
Appendix: Mathematical Details
1.1 A.1 The Proof of Lemma 1
Let \({\boldsymbol{\mathcal E}}\) be an \(n\times p\) random matrix defined by
Notice that
where \(\hat{{\boldsymbol{Y}}}\) is given by (4), and \(\mathcal L=\text{ tr }\{({\boldsymbol{Y}}_\textrm{F}-{\boldsymbol{M}}{\boldsymbol{Y}}){\boldsymbol{\varOmega }}^2({\boldsymbol{M}}{\boldsymbol{Y}}-\hat{{\boldsymbol{Y}}})'\}\). Since \({\boldsymbol{Y}}_\textrm{F}\) is independent of \({\boldsymbol{Y}}\) and has the same distribution as \({\boldsymbol{Y}}\), and since \({\boldsymbol{M}}\) satisfies (2), we have
Notice that \({\boldsymbol{Y}}_\textrm{F}\), \({\boldsymbol{M}}{\boldsymbol{\mathcal E}}\), and \({\boldsymbol{G}}\) in (3) are mutually independent, and
The above equations and (2) imply
where \(\mathcal L_{1}=\text{ tr }({\boldsymbol{\mathcal E}}'{\boldsymbol{H}}{\boldsymbol{\mathcal E}}{\boldsymbol{\varOmega }}{\boldsymbol{G}}'{\boldsymbol{\varOmega }}^{-1})\). Since \({\boldsymbol{\mathcal E}}'{\boldsymbol{H}}{\boldsymbol{\mathcal E}}\) is independent of \({\boldsymbol{G}}\) and \(\text{ tr }({\boldsymbol{G}})=q\), the following equation is obtained.
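As a sketch of how this expectation is presumably obtained (assuming, per the definition in (A.1), that the rows of \({\boldsymbol{\mathcal E}}\) are i.i.d. \(\mathcal N_p({\boldsymbol{0}},{\boldsymbol{I}}_p)\), so that \(E[{\boldsymbol{\mathcal E}}'{\boldsymbol{H}}{\boldsymbol{\mathcal E}}]=\text{tr}({\boldsymbol{H}}){\boldsymbol{I}}_p\)):

```latex
E[\mathcal L_1]
  = \text{tr}\!\left( E[\boldsymbol{\mathcal E}'\boldsymbol{H}\boldsymbol{\mathcal E}]\,
      \boldsymbol{\varOmega}\, E[\boldsymbol{G}']\, \boldsymbol{\varOmega}^{-1} \right)
  = \text{tr}(\boldsymbol{H})\, E[\text{tr}(\boldsymbol{G}')]
  = q\,\text{tr}(\boldsymbol{H}),
```

where the first equality uses the independence of \({\boldsymbol{\mathcal E}}'{\boldsymbol{H}}{\boldsymbol{\mathcal E}}\) and \({\boldsymbol{G}}\), and the last uses \(\text{tr}({\boldsymbol{G}})=q\).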
From (A.2), (A.3), (A.4), and (A.5), Lemma 1 is proved.
1.2 A.2 The Proof of Lemma 2
Notice that \(\text{ tr }\{({\boldsymbol{Y}}-{\boldsymbol{M}}{\boldsymbol{Y}}){\boldsymbol{S}}^{-1}({\boldsymbol{M}}{\boldsymbol{Y}}-\hat{{\boldsymbol{Y}}})'\}=0\) because of
It follows from this result and a calculation similar to the one used to find (A.2) that
It is easy to see from the definition of \({\boldsymbol{S}}\) in (3) that
Substituting (A.7) into (A.6) yields
By (7), (A.8), and the definition of the bias, Lemma 2 is proved.
1.3 A.3 The Proof of Lemma 3
Let \({\boldsymbol{W}}\) be a \(p\times p\) random matrix defined by
where \({\boldsymbol{\varOmega }}\) is given by (A.1). Then, \({\boldsymbol{W}}\sim \mathcal W_p(n-m,{\boldsymbol{I}}_p)\). Let \({\boldsymbol{Q}}_2\) be a \(p\times (p-q)\) matrix satisfying \({\boldsymbol{Q}}_2'{\boldsymbol{Q}}_2={\boldsymbol{I}}_{p-q}\) and \({\boldsymbol{Q}}_1'{\boldsymbol{Q}}_2={\boldsymbol{O}}_{q,p-q}\), where \({\boldsymbol{Q}}_1\) is given by (10), and let \({\boldsymbol{Q}}\) be the \(p\times p\) orthogonal matrix defined by \({\boldsymbol{Q}}=({\boldsymbol{Q}}_1,{\boldsymbol{Q}}_2)\). Then, the three matrices \({\boldsymbol{C}}_1\), \({\boldsymbol{C}}_2\), and \({\boldsymbol{C}}_3\) given in (9) can be rewritten as
Let \({\boldsymbol{V}}\) be the \(p\times p\) random matrix defined by \({\boldsymbol{V}}={\boldsymbol{Q}}'{\boldsymbol{W}}{\boldsymbol{Q}}\), and let \({\boldsymbol{Z}}_1\) and \({\boldsymbol{Z}}_2\) be independent \((n-m)\times q\) and \((n-m)\times (p-q)\) random matrices, respectively, distributed as \(({\boldsymbol{Z}}_1,{\boldsymbol{Z}}_2)\sim \mathcal N_{(n-m)\times p}({\boldsymbol{O}}_{n-m,p},{\boldsymbol{I}}_{(n-m)p})\). Since \({\boldsymbol{V}}\sim \mathcal W_p(n-m,{\boldsymbol{I}}_p)\), the blocks of the partitioned \({\boldsymbol{V}}\) can be expressed in terms of \({\boldsymbol{Z}}_1\) and \({\boldsymbol{Z}}_2\) as
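The block decomposition referred to here is presumably the standard Wishart representation \({\boldsymbol{V}}={\boldsymbol{Z}}'{\boldsymbol{Z}}\) with \({\boldsymbol{Z}}=({\boldsymbol{Z}}_1,{\boldsymbol{Z}}_2)\):

```latex
\boldsymbol{V}
  = \begin{pmatrix} \boldsymbol{V}_{11} & \boldsymbol{V}_{12} \\
                    \boldsymbol{V}_{12}' & \boldsymbol{V}_{22} \end{pmatrix}
  = \begin{pmatrix} \boldsymbol{Z}_1'\boldsymbol{Z}_1 & \boldsymbol{Z}_1'\boldsymbol{Z}_2 \\
                    \boldsymbol{Z}_2'\boldsymbol{Z}_1 & \boldsymbol{Z}_2'\boldsymbol{Z}_2 \end{pmatrix}.
```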
Hence, it follows from \({\boldsymbol{V}}^{-1}={\boldsymbol{Q}}'{\boldsymbol{W}}^{-1}{\boldsymbol{Q}}\) and the general formula for the inverse of a partitioned matrix (e.g., Theorem 8.5.11 in [2]) that
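For reference, the partitioned-inverse formula invoked here gives, for a nonsingular symmetric \({\boldsymbol{V}}\) with nonsingular block \({\boldsymbol{V}}_{22}\) and Schur complement \({\boldsymbol{V}}_{11\cdot 2}\),

```latex
\boldsymbol{V}^{-1}
  = \begin{pmatrix}
      \boldsymbol{V}_{11\cdot 2}^{-1}
        & -\boldsymbol{V}_{11\cdot 2}^{-1}\boldsymbol{V}_{12}\boldsymbol{V}_{22}^{-1} \\
      -\boldsymbol{V}_{22}^{-1}\boldsymbol{V}_{12}'\boldsymbol{V}_{11\cdot 2}^{-1}
        & \boldsymbol{V}_{22}^{-1}
          + \boldsymbol{V}_{22}^{-1}\boldsymbol{V}_{12}'\boldsymbol{V}_{11\cdot 2}^{-1}
            \boldsymbol{V}_{12}\boldsymbol{V}_{22}^{-1}
    \end{pmatrix}.
```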
where \({\boldsymbol{V}}_{11\cdot 2}={\boldsymbol{V}}_{11}-{\boldsymbol{V}}_{12}{\boldsymbol{V}}_{22}^{-1}{\boldsymbol{V}}_{12}'\). Substituting (A.11) into (A.10) yields
By using the independence of \({\boldsymbol{Z}}_1\) and \({\boldsymbol{Z}}_2\), together with standard formulas for expectations under the matrix normal and Wishart distributions, we have
where \(c_1\) and \(c_2\) are given by (11). It follows from the result \(E[{\boldsymbol{Z}}_1{\boldsymbol{Z}}_1']=q{\boldsymbol{I}}_{n-m}\) and the independence of \({\boldsymbol{Z}}_1\) and \({\boldsymbol{Z}}_2\) that
where \(c_3\) is given by (11). By using the above expectations, (A.12), and \({\boldsymbol{Q}}_2{\boldsymbol{Q}}_2'={\boldsymbol{I}}_p-{\boldsymbol{Q}}_1{\boldsymbol{Q}}_1'\), we have
Therefore, Lemma 3 is proved.
1.4 A.4 The Proof of Lemma 4
Let \({\boldsymbol{U}}_0\), \({\boldsymbol{U}}_1\), and \({\boldsymbol{U}}_2\) be \(p\times p\) symmetric random matrices defined by
where \({\boldsymbol{\varOmega }}\) is given by (A.1). It follows from the definitions of \({\boldsymbol{C}}_1\), \({\boldsymbol{C}}_2\), and \({\boldsymbol{C}}_3\) in (9) that
By using these results, the definitions of d and \(\hat{d}\) in (5), and the assumptions of \({\boldsymbol{M}}\) and \({\boldsymbol{H}}\) in (2), \(d({\boldsymbol{M}}{\boldsymbol{Y}},\hat{{\boldsymbol{Y}}})\) and \(\hat{d}({\boldsymbol{M}}{\boldsymbol{Y}},\hat{{\boldsymbol{Y}}})\) in (8) can be rewritten as
where \({\boldsymbol{W}}\) is given by (A.9). Notice that each of \({\boldsymbol{U}}_0\), \({\boldsymbol{U}}_1\), and \({\boldsymbol{U}}_2\) is independent of \({\boldsymbol{S}}\) because \({\boldsymbol{M}}({\boldsymbol{I}}_n-{\boldsymbol{M}})={\boldsymbol{O}}_{n,n}\) and \({\boldsymbol{H}}({\boldsymbol{I}}_n-{\boldsymbol{M}})={\boldsymbol{O}}_{n,n}\), and that \({\boldsymbol{C}}_1\), \({\boldsymbol{C}}_2\), and \({\boldsymbol{C}}_3\) are random matrices whose only source of randomness is \({\boldsymbol{S}}\). These facts imply that
where \({\boldsymbol{\varDelta }}_0=E[{\boldsymbol{U}}_0]\), \({\boldsymbol{\varDelta }}_1=E[{\boldsymbol{U}}_1]\), and \({\boldsymbol{\varDelta }}_2=E[{\boldsymbol{U}}_2]\). Notice that \((n-m)E[{\boldsymbol{W}}^{-1}]=c_1{\boldsymbol{I}}_p\), where \(c_1\) is given by (11). By using this result and Lemma 3, we have
where \(c_2\) and \(c_3\) are given by (11). Let \(L=\text{ tr }\{{\boldsymbol{U}}_1((n-m){\boldsymbol{W}}^{-1}-{\boldsymbol{C}}_2)\}\). It is easy to see that \(c_2^{-1}=c_1^{-1}+c_2^{-1}c_3\) and \(E[L]=c_1\text{ tr }({\boldsymbol{\varDelta }}_1)-\text{ tr }({\boldsymbol{\varDelta }}_1 E[{\boldsymbol{C}}_2])\). Hence, we can derive
A simple calculation shows that \(c_2^{-1}c_3=q/(n-m)\) and \(L=\text{ tr }({\boldsymbol{R}})\), where \({\boldsymbol{R}}\) is given by (12). Using these results, (A.13), and (A.14) yields
Consequently, Lemma 4 is proved.
1.5 A.5 The Proof of Theorem 1
From Lemmas 1 and 4, an unbiased estimator of \(R_p\) in (6) can be given by
where \(\hat{D}\) is given by (13). Notice that \(\{1-(p+1)/(n-m)\}(n-m)p=(n-m)p-p(p+1)\). Hence, this identity and (A.8) imply that
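The identity invoked here is a one-line expansion:

```latex
\left\{1-\frac{p+1}{n-m}\right\}(n-m)p
  = (n-m)p - \frac{(p+1)(n-m)p}{n-m}
  = (n-m)p - p(p+1).
```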
Substituting (A.16) into (A.15) yields that an unbiased estimator \(\hat{R}_p\) coincides with \(MC_p\) in (14).
1.6 A.6 The Proof of the Equation in (15)
When \({\boldsymbol{H}}={\boldsymbol{M}}\), \(\hat{d}({\boldsymbol{M}}{\boldsymbol{Y}},\hat{{\boldsymbol{Y}}})\) can be rewritten as
Since \({\boldsymbol{P}}={\boldsymbol{S}}^{1/2}({\boldsymbol{I}}_p-{\boldsymbol{G}}){\boldsymbol{S}}^{-1/2}\) is a symmetric idempotent matrix, we have
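Indeed, writing out \({\boldsymbol{P}}^2\) shows that the idempotency of \({\boldsymbol{P}}\) reduces to that of \({\boldsymbol{I}}_p-{\boldsymbol{G}}\), which the claimed idempotency of \({\boldsymbol{P}}\) presupposes via the definition of \({\boldsymbol{G}}\) in (3):

```latex
\boldsymbol{P}^2
  = \boldsymbol{S}^{1/2}(\boldsymbol{I}_p-\boldsymbol{G})\boldsymbol{S}^{-1/2}
    \boldsymbol{S}^{1/2}(\boldsymbol{I}_p-\boldsymbol{G})\boldsymbol{S}^{-1/2}
  = \boldsymbol{S}^{1/2}(\boldsymbol{I}_p-\boldsymbol{G})^2\boldsymbol{S}^{-1/2}
  = \boldsymbol{P}.
```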
This implies that \(\hat{d}({\boldsymbol{M}}{\boldsymbol{Y}},\hat{{\boldsymbol{Y}}})=\text{ tr }\{{\boldsymbol{Y}}'{\boldsymbol{M}}{\boldsymbol{Y}}({\boldsymbol{I}}_p-{\boldsymbol{G}}){\boldsymbol{S}}^{-1}\}=\text{ tr }({\boldsymbol{R}})\). From this result and (A.8), \(\hat{d}({\boldsymbol{Y}},\hat{{\boldsymbol{Y}}})=(n-m)p+\text{ tr }({\boldsymbol{R}})\) is derived.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Yanagihara, H., Nagai, I., Fukui, K., Hijikawa, Y. (2023). Modified \(C_p\) Criterion in Widely Applicable Models. In: Czarnowski, I., Howlett, R., Jain, L.C. (eds) Intelligent Decision Technologies. KESIDT 2023. Smart Innovation, Systems and Technologies, vol 352. Springer, Singapore. https://doi.org/10.1007/978-981-99-2969-6_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2968-9
Online ISBN: 978-981-99-2969-6
eBook Packages: Intelligent Technologies and Robotics