Bayesian estimation for misclassification rate in linear discriminant analysis

  • Original Paper
  • Japanese Journal of Statistics and Data Science

Abstract

We consider discriminant analysis in the case of two multivariate normal populations with common covariance matrices. Estimation of misclassification rates plays an important role in discriminant analysis. In this paper, we consider Bayesian estimation for the misclassification rate associated with a population linear discriminant, referred to as the optimal error rate. The optimal error rate is an unavoidable error, which is different from the misclassification rate associated with the sample linear discriminant. We derive an explicit expression of the Bayes estimator of the optimal error rate. In general, the expression is somewhat complicated. However, the estimator is simply expressed under some conditions. In addition, approximations for the Bayes estimator are suggested based on the approximate posterior predictive distribution of the population linear discriminant. The performance of the suggested approximations is investigated by simulation. Finally, we apply the obtained results to estimate the optimal error rate for Fisher’s iris data.

Availability of data and material

Not applicable.

Code availability

R Version 1.1.456.

References

  • Abramowitz, M., & Stegun, I. A. (1965). Handbook of mathematical functions: with formulas, graphs, and mathematical tables. Dover Publications.

  • Barndorff-Nielsen, O., Kent, J., & Sørensen, M. (1982). Normal variance-mean mixtures and \(z\) distributions. International Statistical Review, 50(2), 145–159.

  • Bodnar, T., Mazur, S., Muhinyuza, S., & Parolya, N. (2018). On the product of a singular Wishart matrix and a singular Gaussian vector in high dimension. Theory of Probability and Mathematical Statistics, 99, 37–50.

  • Bodnar, T., Mazur, S., Ngailo, E., & Parolya, N. (2020). Discriminant analysis in small and large dimensions. Theory of Probability and Mathematical Statistics, 100, 21–41. https://doi.org/10.1090/tpms/1096

  • Bodnar, T., Mazur, S., & Okhrin, Y. (2014). Distribution of the product of singular Wishart matrix and normal vector. Theory of Probability and Mathematical Statistics, 91, 1–15.

  • Bodnar, T., Mazur, S., & Podgórski, K. (2016). Singular inverse Wishart distribution and its application to portfolio theory. Journal of Multivariate Analysis, 143, 314–326.

  • Bodnar, T., & Okhrin, Y. (2008). Properties of the singular, inverse and generalized inverse partitioned Wishart distributions. Journal of Multivariate Analysis, 99, 2389–2405.

  • Bowker, A. (1961). Representation of Hotelling’s \(T^2\) and Anderson’s classification statistics \(W\) in terms of simple statistics. In H. Solomon (Ed.), Studies in item analysis and prediction (pp. 285–292). Stanford University Press.

  • Dalton, L. A., & Dougherty, E. R. (2011). Bayesian minimum mean-square error estimation for classification error-part II: The Bayesian MMSE error estimator for linear classification of Gaussian distributions. IEEE Transactions on Signal Processing, 59(1), 130–144. https://doi.org/10.1109/TSP.2010.2084573

  • Díaz-García, J. A., Gutiérrez-Jáimez, R., & Mardia, K. V. (1997). Wishart and pseudo-Wishart distributions and some applications to shape theory. Journal of Multivariate Analysis, 63, 73–87.

  • Fatti, L. P. (1983). The random-effects model in discriminant analysis. Journal of the American Statistical Association, 78(383), 679–687. https://doi.org/10.1080/01621459.1983.10478029

  • Geisser, S. (1967). Estimation associated with linear discriminants. The Annals of Mathematical Statistics, 38(3), 807–817.

  • Geisser, S. (1982). Bayesian discrimination. In P. R. Krishnaiah, & L. N. Kanal (Eds.), Handbook of statistics: Classification, pattern recognition, and reduction of dimensionality (Vol. 2, pp. 101–120). North-Holland Publishing Company.

  • Gradshteyn, I. S., & Ryzhik, I. M. (2007). Table of integrals, series, and products. Elsevier.

  • Hall, P. (1992). The bootstrap and Edgeworth expansion. Springer.

  • Javed, F., Loperfido, N., & Mazur, S. (2021). Edgeworth expansions for multivariate random sums. Econometrics and Statistics. https://doi.org/10.1016/j.ecosta.2021.04.005

  • Muirhead, R. J. (1982). Aspects of multivariate statistical theory. Wiley.

  • Okamoto, M. (1963). An asymptotic expansion for the distribution of the linear discriminant function. Annals of Mathematical Statistics, 34, 1286–1301.

  • Siotani, M., & Wang, R. H. (1977). Asymptotic expansions for error rates and comparison of the \(W\)-procedure and the \(Z\)-procedure in discriminant analysis. In P. R. Krishnaiah (Ed.), Multivariate Analysis IV: Proceedings of the Fourth International Symposium on Multivariate Analysis (pp. 523–545). Elsevier/North Holland.

  • Srivastava, M. S. (2003). Singular Wishart and multivariate beta distributions. The Annals of Statistics, 31(5), 1537–1560.

Acknowledgements

We wish to express our appreciation to two anonymous referees for their insightful comments on our paper. The comments have helped us significantly improve the paper.

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Koshiro Yonenaga.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Theorem 1.

Proof

We only show the proof of Theorem 1 (a), because this proof can be applied to the proof of Theorem 1 (b) in an obvious way. When a new observation is from \(\pi _1\), the conditional distribution of U given \(\varvec{\mu }_1\), \(\varvec{\mu }_2\) and \(\varvec{\varSigma }\) is \(U | \varvec{\mu }_1, \varvec{\mu }_2, \varvec{\varSigma } , \mathbf{y} \in \pi _1 \sim N ( \varDelta ^2/2, \varDelta ^2)\) with \(\varDelta ^2 = (\varvec{\mu }_1 - \varvec{\mu }_2)' \varvec{\varSigma } ^{ - 1 } (\varvec{\mu }_1 - \varvec{\mu }_2)\). Define \( v = \varDelta ^2/c \) with \(c = N_1^{ - 1 } + N_2^{ - 1 }\). Since

$$\begin{aligned} \frac{ 1 }{ \sqrt{ c } }\varvec{\varSigma }^{-1/2}( \varvec{\mu }_1 - \varvec{\mu }_2) | \varvec{\varSigma } \sim N_p\left( \frac{ 1 }{ \sqrt{c} } \varvec{\varSigma } ^{-1/2}( \bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2), \varvec{I}_p \right) , \end{aligned}$$

the conditional distribution of \(v | \varvec{\varSigma } \) is the non-central \(\chi ^2\) distribution with non-centrality parameter \( \delta = (\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 )' \varvec{\varSigma } ^{ - 1 } (\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 )/c. \) This is expressed as

$$\begin{aligned} v | \varvec{\varSigma } \sim \chi ^2_p \left( \frac{Q}{nc}z \right) , \end{aligned}$$

where

$$\begin{aligned} Q = (\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 )' \varvec{S}^{ - 1 } (\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 ),~ z = \frac{(\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 )' \varvec{\varSigma } ^{ - 1 } (\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2) }{(\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 )' (n\varvec{S})^{ - 1 } (\bar{ \mathbf{x} }_1 - \bar{ \mathbf{x} }_2 )}. \end{aligned}$$

Since \(z\) is distributed as \(\chi ^2_m\) by Theorem 3.2.8 of Muirhead (1982), the unconditional density of \(U\) is given by

$$\begin{aligned} f ( u| \mathbf{y} \in \pi _1 )&{=} \int _0^\infty \int _0^\infty \frac{ 1 }{ \sqrt{ c v } } \phi \left( \frac{ u - cv/2 }{ \sqrt{ c v } } \right) g_p( v ) \exp \left( - \frac{ Q }{ 2nc } z \right) {_0}F_1 \left( \frac{ p }{ 2 } ; \frac{ Q }{ 4nc } v z \right) \\&\quad \times g_m( z ) \mathrm{{d}}v \mathrm{{d}}z. \end{aligned}$$

Using Lemma 1.3.3 of Muirhead (1982), the desired result follows immediately. \(\square \)
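The three-stage hierarchy used in this proof can be simulated directly. The sketch below is illustrative only: the function names and parameter values are hypothetical, and it assumes \(m = n\), which is what the factor \((nc/(nc+Q))^{n/2}\) in (12) suggests. Under that assumption the simulated mean of \(U\) should be close to \((cp + Q)/2\), the value of \(\frac{c}{2} W(1)\) obtained in the proof of Theorem 2.

```python
import math
import random


def sample_u(p, n, c, Q, rng):
    """Draw one U from the hierarchy z -> v -> U (illustrative; assumes m = n)."""
    # z ~ chi^2_n, i.e. a Gamma variable with shape n/2 and scale 2
    z = rng.gammavariate(n / 2.0, 2.0)
    # v | z ~ noncentral chi^2_p with noncentrality delta = Q z / (n c)
    delta = Q * z / (n * c)
    v = (rng.gauss(0.0, 1.0) + math.sqrt(delta)) ** 2
    v += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p - 1))
    # U | v ~ N(c v / 2, c v) when the new observation comes from pi_1
    return rng.gauss(c * v / 2.0, math.sqrt(c * v))


def simulate_mean(p, n, c, Q, draws=20000, seed=0):
    """Monte Carlo estimate of E(U) with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return sum(sample_u(p, n, c, Q, rng) for _ in range(draws)) / draws


# Hypothetical inputs: p = 2, n = 10, c = 0.2, Q = 1
print("simulated mean:", simulate_mean(2, 10, 0.2, 1.0))
print("(c p + Q) / 2 :", (0.2 * 2 + 1.0) / 2.0)
```

The noncentral \(\chi^2\) draw uses the convention of the proof, in which the mean of \(v\) given \(z\) is \(p + \delta\).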

Proof of Theorem 2.

Proof

We derive the first four moments of U using the expression of density given in Theorem 1. If we define

$$\begin{aligned} W( k )&= \left( \frac{ nc }{ nc + Q } \right) ^{ n / 2 } \int _0^\infty \frac{ v^{ p/2 + k - 1 } e^{ - v/2 } }{ 2^{ p/2 } \varGamma ( p/2 ) } {_1}F_1 \left( \frac{ n }{ 2 } ; \frac{ p }{ 2 } ; \frac{ Q }{ 2 ( nc + Q ) } v \right) \mathrm{{d}}v, \end{aligned}$$
(12)

for a non-negative number k, then the first four moments of U can be represented by

$$\begin{aligned} \mathrm {E}(U)&= \frac{ c }{ 2 } W( 1 ), \nonumber \\ \mathrm {E}(U^2)&= \frac{ c^2 }{ 4 } W( 2 ) + c W( 1 ), \nonumber \\ \mathrm {E}(U^3)&= \frac{ c^3 }{ 8 } W( 3 ) + \frac{ 3c^2 }{ 2 } W( 2 ), \nonumber \\ \mathrm {E}(U^4)&= \frac{ c^4 }{ 16 } W( 4 ) + \frac{ 3c^3 }{ 2 } W( 3 ) + 3c^2 W( 2 ). \end{aligned}$$
(13)
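As a reminder of where (13) comes from: conditional on \(v\), \(U\) is normal with mean \(\mu = cv/2\) and variance \(\sigma ^2 = cv\), so its raw moments are those of a normal variable, and \(W(k)\) is the expectation of \(v^k\) under the mixing density in (12). Written out (a worked step, not part of the original proof):

$$\begin{aligned} \mathrm {E}(U \mid v)&= \mu = \frac{ c }{ 2 } v, \\ \mathrm {E}(U^2 \mid v)&= \mu ^2 + \sigma ^2 = \frac{ c^2 }{ 4 } v^2 + c v, \\ \mathrm {E}(U^3 \mid v)&= \mu ^3 + 3 \mu \sigma ^2 = \frac{ c^3 }{ 8 } v^3 + \frac{ 3c^2 }{ 2 } v^2, \\ \mathrm {E}(U^4 \mid v)&= \mu ^4 + 6 \mu ^2 \sigma ^2 + 3 \sigma ^4 = \frac{ c^4 }{ 16 } v^4 + \frac{ 3c^3 }{ 2 } v^3 + 3 c^2 v^2. \end{aligned}$$

Replacing each \(v^k\) by \(W(k)\) gives the four lines of (13).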

Hence, the central moments of U are expressed as

$$\begin{aligned} \mathrm {V}(U)&= \frac{ c^2 }{ 4 } ( W( 2 ) - W( 1 )^2 ) + c W( 1 ), \end{aligned}$$
(14)
$$\begin{aligned} \mathrm {E}(U - \mathrm {E}(U) )^3&= \frac{ c^3 }{ 8 } W( 3 ) + \frac{ 3c^2 }{ 2 } W( 2 ) - \frac{ 3c^3 }{ 8 } W( 2 )W( 1 ) \nonumber \\&\quad - \frac{ 3c^2 }{ 2 } W( 1 )^2 + \frac{ c^3 }{ 4 } W( 1 )^3, \end{aligned}$$
(15)
$$\begin{aligned} \mathrm {E}(U - \mathrm {E}(U) )^4&= \frac{ c^4 }{ 16 } W( 4 ) + \frac{ 3c^3 }{ 2 } W( 3 ) + 3c^2 W( 2 ) \nonumber \\&\quad - \frac{ c^4 }{ 4 } W( 3 )W( 1 ) - 3c^3 W( 2 )W( 1 ) + \frac{ 3c^4 }{ 8 } W( 2 )W( 1 )^2 \nonumber \\&\quad + \frac{ 3c^3 }{ 2 } W( 1 )^3 - \frac{ 3c^4 }{ 16 } W( 1 )^4. \end{aligned}$$
(16)

Applying Lemma 1.3.3 of Muirhead (1982) to (12),

$$\begin{aligned} W(k)&= \left( \frac{ nc }{ nc + Q } \right) ^{ n / 2 } \frac{ \varGamma ( p/2 + k ) }{ \varGamma ( p/2 ) } 2^k {_2}F_1 \left( \frac{ n }{ 2 }, \frac{ p }{ 2 } + k ; \frac{ p }{ 2 } ; \frac{ Q }{ nc + Q } \right) . \end{aligned}$$
(17)

Using equality

$$\begin{aligned} {_2}F_1 (a, b ; c ; z) = ( 1 - z )^{ -a} {_2}F_1 \left( a, c - b ; c ; \frac{ z }{ z - 1 } \right) \end{aligned}$$

in (17),

$$\begin{aligned} W(k)&= \frac{ \varGamma ( p/2 + k ) }{ \varGamma ( p/2 ) } 2^k {_2}F_1 \left( \frac{ n }{ 2 }, - k ; \frac{ p }{ 2 } ; -\frac{ Q }{ nc } \right) \nonumber \\&= \frac{ \varGamma ( p/2 + k ) }{ \varGamma ( p/2 ) } 2^k \sum _{r = 0}^k \frac{ ( n/2 )_r ( -k )_r }{ ( p/2 )_r } \frac{ 1 }{ r! } \left( - \frac{ Q }{ nc } \right) ^r. \end{aligned}$$
(18)

If we put \(k=\)1, 2, 3 and 4 into (18), then

$$\begin{aligned} W( 1 )&= p \left( 1 + \frac{ Q }{ pc } \right) , \\ W( 2 )&= p( p + 2 ) \left[ 1 + \frac{ 2Q }{ pc } + \frac{ n( n + 2 ) }{ p( p + 2 ) } \left( \frac{ Q }{ nc } \right) ^2 \right] , \\ W( 3 )&= p( p + 2 )( p + 4 ) \left[ 1 + \frac{ 3Q }{ pc } + \frac{ 3n( n + 2 ) }{ p( p + 2 ) } \left( \frac{ Q }{ nc } \right) ^2 + \frac{ n( n + 2 )( n + 4 ) }{ p( p + 2 )( p + 4 ) } \left( \frac{ Q }{ nc } \right) ^3 \right] , \\ W( 4 )&= p( p + 2 )( p + 4 )( p + 6 ) \left[ 1 + \frac{ 4Q }{ pc } + \frac{ 6n( n + 2 ) }{ p( p + 2 ) } \left( \frac{ Q }{ nc } \right) ^2 \right. \\&\quad \left. +4 \frac{ n( n + 2 )( n + 4 ) }{ p( p + 2 )( p + 4 ) } \left( \frac{ Q }{ nc } \right) ^3 + \frac{ n( n + 2 )( n + 4 )( n + 6 ) }{ p( p + 2 )( p + 4 )( p + 6 ) } \left( \frac{ Q }{ nc } \right) ^4 \right] . \end{aligned}$$

If we substitute these into (13), (14), (15) and (16) and rearrange, the mean, variance, skewness and kurtosis of U are obtained. \(\square \)
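The closed forms for \(W(1), \ldots , W(4)\) can be checked against the terminating sum in (18) with exact rational arithmetic. The following is a minimal sketch; the function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import factorial


def rising(a, r):
    """Pochhammer symbol (a)_r = a (a + 1) ... (a + r - 1)."""
    out = Fraction(1)
    for j in range(r):
        out *= a + j
    return out


def W(k, p, n, c, Q):
    """W(k) from the terminating sum in (18), in exact arithmetic."""
    half_p, half_n = Fraction(p, 2), Fraction(n, 2)
    lead = rising(half_p, k) * 2 ** k        # Gamma(p/2 + k) / Gamma(p/2) * 2^k
    x = -Fraction(Q) / (n * Fraction(c))     # argument -Q / (n c)
    total = sum(
        rising(half_n, r) * rising(Fraction(-k), r) / rising(half_p, r)
        * x ** r / factorial(r)
        for r in range(k + 1)
    )
    return lead * total


def mean_and_variance(p, n, c, Q):
    """E(U) from (13) and V(U) from (14)."""
    c = Fraction(c)
    w1, w2 = W(1, p, n, c, Q), W(2, p, n, c, Q)
    return c / 2 * w1, c ** 2 / 4 * (w2 - w1 ** 2) + c * w1


# Hypothetical inputs: p = 3, n = 12, c = 1/5, Q = 3/2
print(mean_and_variance(3, 12, Fraction(1, 5), Fraction(3, 2)))
```

Because every quantity is a `Fraction`, agreement with the displayed formulas is exact rather than approximate.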

Proof of Theorem 3.

Proof

We only show the proof of Theorem 3\(\mathrm {(a)}\), because this proof can be applied to the proof of Theorem 3\(\mathrm {(b)}\) in an obvious way. If we integrate the posterior predictive density of U given in Theorem 1\(\mathrm {(a)}\) from \(-\infty \) to x, then

$$\begin{aligned} F( x )&= \left( \frac{ nc }{ nc + Q } \right) ^{n/2} \int \limits _0^\infty \varPhi \left( \frac{ x - cv / 2 }{ \sqrt{ cv } } \right) g_p(v){_1}F_1 \left( \frac{ n }{ 2 } ; \frac{ p }{ 2 } ; \frac{ Q }{ 2 ( nc + Q ) } v \right) \mathrm{{d}}v. \end{aligned}$$
(19)

If we apply an identity

$$\begin{aligned} {_1}F_1 ( a ; b ; x ) = e^x {_1}F_1 \left( b - a ; b ; -x \right) \end{aligned}$$

to (19), then

$$\begin{aligned} F( x )&= \left( \frac{ nc }{ nc + Q } \right) ^{n/2} \int _0^\infty \varPhi \left( \frac{ x - cv / 2 }{ \sqrt{ cv } } \right) g_p(v) \exp \left( \frac{ Q }{ 2( nc + Q ) } v \right) \\&\quad \times \sum _{k=0}^{ \infty } \frac{ \left( \frac{ p-n }{ 2 } \right) _k }{ \left( \frac{ p }{ 2 } \right) _k } \frac{ v^k }{ k! } \left( \frac{ - Q }{ 2( nc + Q ) } \right) ^k \mathrm{{d}}v \\&= \sum _{k=0}^{ \infty } \frac{ \left( \frac{ p-n }{ 2 } \right) _k }{ k! \left( \frac{ p }{ 2 } \right) _k } \left( \frac{ - Q }{ 2( nc + Q ) } \right) ^k \left( \frac{ nc }{ nc + Q } \right) ^{n/2} \\&\frac{ 1 }{ 2^{ \frac{ p }{ 2 } } \varGamma \left( \frac{ p }{ 2 } \right) } \int _0^{\infty } \varPhi \left( \frac{ x - cv / 2 }{ \sqrt{ cv } } \right) v^{ \frac{ p }{ 2 } + k - 1 } \exp \left( -\frac{ 1 }{ 2 } \left( 1 - \frac{ Q }{ nc + Q } \right) v \right) \mathrm{{d}}v. \end{aligned}$$

If we change the variable to \(nc ( nc + Q )^{ -1 } v = v^*\), then

$$\begin{aligned} F( x )&= \sum _{k=0}^{ \infty } \frac{ \left( \frac{ p-n }{ 2 } \right) _k }{ k! \left( \frac{ p }{ 2 } \right) _k } \left( \frac{ - Q }{ 2( nc + Q ) } \right) ^k \left( \frac{ nc }{ nc + Q } \right) ^{ \frac{ n - p }{ 2 } - k} \nonumber \\&\quad \times \frac{ 1 }{ 2^{ \frac{ p }{ 2 } } \varGamma \left( \frac{ p }{ 2 } \right) } \int _0^{\infty } \varPhi \left( \frac{ x - ( c + Q/n )v^* / 2 }{ \sqrt{ ( c + Q/n ) v^* } } \right) ( v^* )^{ \frac{ p }{ 2 } + k - 1 } \exp \left( -\frac{ v^* }{ 2 } \right) \mathrm{{d}}v^*. \end{aligned}$$
(20)

If we apply equalities

$$\begin{aligned} (x)_n&= \frac{ \varGamma ( x + n ) }{ \varGamma ( x ) },~ (- x )_n = ( - 1)^n ( x - n + 1)_n \end{aligned}$$

to (20), then

$$\begin{aligned} F( x )&= \left( \frac{ nc }{ nc + Q } \right) ^{ - 1 } \sum _{k=0}^{ \infty } \frac{ \varGamma \left( \frac{ n - p }{ 2 } + 1 \right) }{ k! \varGamma \left( \frac{ n - p }{ 2 } - k + 1 \right) } \left( 1 - \frac{ nc }{ nc + Q } \right) ^k \left( \frac{ nc }{ nc + Q } \right) ^{\frac{ n - p }{ 2 } - k + 1} M(x), \end{aligned}$$

with

$$\begin{aligned} M(x)&= \int _0^{\infty } \varPhi \left( \frac{ x - ( c + Q/n )v / 2 }{ \sqrt{ ( c + Q/n ) v } } \right) g_{p+2k}( v ) \mathrm{{d}}v, \end{aligned}$$

where \(g_{ p + 2k } (v)\) is the \(\chi ^2\) density of degrees of freedom \(p + 2k\). To calculate F(x), we need to compute M(x). First, if we put \(x = 0\) and \(\alpha = c + Q/n\) in M(x), then

$$\begin{aligned} M(0)&= \int _0^\infty \varPhi \left( - \frac{ \sqrt{ \alpha v } }{ 2 } \right) g_{p + 2k}( v ) \mathrm{{d}}v \\&= \frac{ 1 }{ \sqrt{ 2 \pi } } \frac{ 1 }{ 2^ { \frac{ p }{ 2 } + k} \varGamma ( \frac{ p }{ 2 } + k ) } \int _0^\infty \int _{ -\infty }^{ - \sqrt{ \alpha v }/2 } v^{ \frac{ p }{ 2 } + k - 1 } \exp \left( - \frac{ z^2 }{ 2 } - \frac{ v }{ 2 } \right) \mathrm{{d}}z\mathrm{{d}}v \\&= \frac{ 1 }{ \sqrt{2 \pi } } \frac{ 1 }{ 2^ { \frac{ p }{ 2 } + k} \varGamma ( \frac{ p }{ 2 } + k ) } \int _0^\infty \int _{ \sqrt{ \alpha v }/2 }^{ \infty } v^{ \frac{ p }{ 2 } + k - 1 } \exp \left( - \frac{ z^2 }{ 2 } - \frac{ v }{ 2 } \right) \mathrm{{d}}z\mathrm{{d}}v. \end{aligned}$$

If we make the change of variable \(y = \sqrt{ v }\), then

$$\begin{aligned} M(0)&= \sqrt{ \frac{ 2 }{ \pi } } \frac{ 1 }{ 2^ { \frac{ p }{ 2 } + k} \varGamma ( \frac{ p }{ 2 } + k ) } \int _0^\infty \int _{ \sqrt{ \alpha }y/2 }^{ \infty } y^{ p + 2k - 1 } \exp \left( - \frac{ z^2 }{ 2 } - \frac{ y^2 }{ 2 } \right) \mathrm{{d}}z\mathrm{{d}}y. \end{aligned}$$
(21)

The integral (21) can be evaluated by following the proof of Lemma D.1 of Dalton and Dougherty (2011). Next, we consider the case \(x\ne 0\). Let \(\phi ( \cdot )\) be the standard normal density and \(G_s (\cdot )\) be the CDF of a \(\chi ^2\) variable with \(s\) degrees of freedom. If we apply integration by parts to \(M(x)\), then

$$\begin{aligned} M(x)&= \left[ \varPhi \left( \frac{ x - \alpha v / 2 }{ \sqrt{ \alpha v } } \right) G_{ p + 2k }( v ) \right] _0^\infty \\&\quad + \frac{ 1 }{ 2 }\int _0^\infty \phi \left( \frac{ x - \alpha v / 2 }{ \sqrt{ \alpha v } } \right) \left( \frac{ x }{ \sqrt{ \alpha } } v^{ - 3/2 } + \frac{ \sqrt{ \alpha } }{ 2 } v^{- 1/2 } \right) G_{ p + 2k }( v ) \mathrm{{d}}v \\&= \frac{ 1 }{ 2 }\int _0^\infty \phi \left( \frac{ x - \alpha v / 2 }{ \sqrt{ \alpha v } } \right) \left( \frac{ x }{ \sqrt{ \alpha } } v^{ - 3/2 } + \frac{ \sqrt{ \alpha } }{ 2 } v^{- 1/2 } \right) G_{ p + 2k }( v ) \mathrm{{d}}v \\&= \frac{ e^{ x/2 } }{ 2 \sqrt{ 2\pi } } \left[ \frac{ x }{ \sqrt{ \alpha } } \int _0^\infty v^{ -3/2 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) G_{ p + 2k }( v ) \mathrm{{d}}v \right. \\&\quad + \left. \frac{ \sqrt{ \alpha } }{ 2 } \int _0^\infty v^{ -1/2 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) G_{ p + 2k }( v ) \mathrm{{d}}v \right] . \\ \end{aligned}$$

If we define

$$\begin{aligned}&N( t ) = \int _0^\infty v^{ -t/2 } \exp \left( - \frac{ x^2 }{ 2\alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) G_{ p + 2k }( v ) \mathrm{{d}}v, \end{aligned}$$
(22)

then,

$$\begin{aligned} M(x)&= \frac{ e^{ x/2 } }{ 2 \sqrt{ 2\pi } } \left[ \frac{ x }{ \sqrt{ \alpha } } N( 3 ) + \frac{ \sqrt{ \alpha } }{ 2 } N( 1 ) \right] . \end{aligned}$$
(23)

Equation (22) can be evaluated using the following expressions for \(G_{ p + 2k }( v )\) given in the Appendix of Fatti (1983):

$$\begin{aligned} G_{ p + 2k }( v ) = {\left\{ \begin{array}{ll} \displaystyle 1 - \exp \left( - \frac{ v }{ 2 } \right) \sum _{ i = 0 }^{ p/2 + k - 1 } \frac{ 1 }{ i! } \left( \frac{ v }{ 2 } \right) ^i, &{} p\text { is even, } \\ \displaystyle 2 \varPhi ( \sqrt{ v } ) - 1 - \exp \left( - \frac{ v }{ 2 } \right) \sum _{ i = 0 }^{ ( p - 1 )/2 + k - 1 } \frac{ 1 }{ \varGamma ( i + 3/2 ) } \left( \frac{ v }{ 2 } \right) ^{ i + 1/2 } &{} \ p\text { is odd}. \end{array}\right. } \end{aligned}$$
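The two finite-sum expressions for the \(\chi ^2\) CDF are easy to check numerically. The sketch below implements them with the standard library only and compares them with a power series for the regularized incomplete gamma function; the function names are hypothetical, and `df` plays the role of \(p + 2k\).

```python
import math


def chi2_cdf_closed(v, df):
    """Chi-square CDF G_df(v) via the finite sums quoted from Fatti (1983)."""
    if v <= 0.0:
        return 0.0
    half = v / 2.0
    if df % 2 == 0:
        # even d.f.: 1 - exp(-v/2) * sum_{i=0}^{df/2 - 1} (v/2)^i / i!
        tail = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return 1.0 - math.exp(-half) * tail
    # odd d.f.: 2 Phi(sqrt(v)) - 1 equals erf(sqrt(v/2)); subtract the finite sum
    corr = sum(half ** (i + 0.5) / math.gamma(i + 1.5)
               for i in range((df - 1) // 2))
    return math.erf(math.sqrt(half)) - math.exp(-half) * corr


def chi2_cdf_series(v, df, terms=200):
    """Independent check: P(df/2, v/2) = x^a e^{-x} sum_j x^j / Gamma(a + j + 1)."""
    a, x = df / 2.0, v / 2.0
    term = 1.0 / math.gamma(a + 1.0)
    total = term
    for j in range(1, terms):
        term *= x / (a + j)
        total += term
    return math.exp(-x + a * math.log(x)) * total


for df in (1, 2, 3, 4):
    print(df, chi2_cdf_closed(2.0, df), chi2_cdf_series(2.0, df))
```

For `df = 1` the finite sum is empty and the odd branch reduces to \(2 \varPhi ( \sqrt{ v } ) - 1\).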

When p is even, (22) becomes

$$\begin{aligned} N( t )&= 2 \left( \frac{ 2 | x | }{ \alpha } \right) ^{ 1 - t/2 } K_{ 1 - t/2 } \left( \frac{ | x | }{ 2 } \right) \nonumber \\&\quad - \sum _{ i = 0 }^{ p/2 + k - 1 } \frac{ 2 }{ 2^i i! } \left( \frac{ 2 | x | }{ \sqrt{ \alpha } \sqrt{ \alpha + 4 } } \right) ^{ i + 1 - t/2 } K_{ i + 1 - t/2 } \left( \frac{ | x | }{ 2 } \sqrt{ 1 + \frac{ 4 }{ \alpha } } \right) , \end{aligned}$$
(24)

where we applied the equality

$$\begin{aligned}&\int _0^\infty v^{ s - 1 } e^{ - t_1 v^{ - 1 } - t_2 v } \mathrm{{d}}v = 2 \left( \frac{ t_1 }{ t_2 } \right) ^{s/2} K_{s} \left( 2\sqrt{t_1 t_2} \right) , \end{aligned}$$
(25)

where \(t_1>0\) and \(t_2>0\). This equality is given by equation 9 of 3.471 in Gradshteyn and Ryzhik (2007). If we put (24) into (23), then

$$\begin{aligned} M(x)&= \frac{ e^{ x/2 } }{ 2\sqrt{ 2 \pi } } \left\{ \frac{ x }{ \sqrt{ \alpha } } \left[ 2 \left( \frac{ 2 | x | }{ \alpha } \right) ^{ -1/2 } K_{ -1/2 } \left( \frac{ | x | }{ 2 } \right) \right. \right. \nonumber \\&\quad \left. - \sum _{ i = 0 }^{ p/2 + k - 1 } \frac{ 2 }{ 2^i i! } \left( \frac{ 2 | x | }{ \sqrt{ \alpha } \sqrt{ \alpha + 4 } } \right) ^{ i - 1/2 } K_{ i - 1/2 } \left( \frac{ | x | }{ 2 } \sqrt{ 1 + \frac{ 4 }{ \alpha } } \right) \right] \nonumber \\&\quad + \frac{ \sqrt{ \alpha } }{ 2 } \left[ 2 \left( \frac{ 2 | x | }{ \alpha } \right) ^{ 1/2 } K_{ 1/2 } \left( \frac{ | x | }{ 2 } \right) \right. \nonumber \\&\quad \left. \left. - \sum _{ i = 0 }^{ p/2 + k - 1 } \frac{ 2 }{ 2^i i! } \left( \frac{ 2 | x | }{ \sqrt{ \alpha } \sqrt{ \alpha + 4 } } \right) ^{ i + 1/2 } K_{ i + 1/2 } \left( \frac{ | x | }{ 2 } \sqrt{ 1 + \frac{ 4 }{ \alpha } } \right) \right] \right\} . \end{aligned}$$
(26)

If we use the equality

$$\begin{aligned} K_{1/2} ( x ) = K_{-1/2} ( x ) = \sqrt{ \frac{ \pi }{ 2x } } \exp ( - x ) \end{aligned}$$

in (26), then the desired result for even p and \(x\ne 0\) can be obtained. When p is odd, Eq. (22) becomes

$$\begin{aligned} N( t )&= \int _0^\infty v^{ -\frac{t}{2} } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) \nonumber \\&\quad \times \left[ 2 \varPhi ( \sqrt{ v } ) - 1 - \exp \left( - \frac{ v }{ 2 } \right) \sum _{ i = 0 }^{ \frac{p - 1}{2} + k - 1 } \frac{ 1 }{ \varGamma ( i + \frac{3}{2} ) } \left( \frac{ v }{ 2 } \right) ^{ i + \frac{1}{2} } \right] \mathrm{{d}}v \nonumber \\&= \int _0^\infty v^{ -\frac{t}{2} } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) \nonumber \\&\quad \times \left[ 2 \int _0^{ \sqrt{v} } \phi (z) dz - \exp \left( - \frac{ v }{ 2 } \right) \sum _{ i = 0 }^{ \frac{p - 1}{2} + k - 1 } \frac{ 1 }{ \varGamma ( i + \frac{3}{2} ) } \left( \frac{ v }{ 2 } \right) ^{ i + \frac{1}{2} } \right] \mathrm{{d}}v. \end{aligned}$$
(27)

Since the error function is defined as

$$\begin{aligned} \mathrm{erf}( z ) = \frac{ 2 }{ \sqrt{ \pi } } \int _0^z e^{ -t^2 } \mathrm{{d}}t, \end{aligned}$$

we can express Eq. (27) as

$$\begin{aligned} N(t)&= \int _0^\infty v^{ -\frac{t}{2} } \exp \left( - \frac{ x^2 }{ 2\alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) \nonumber \\&\quad \times \left[ \mathrm{erf}\left( \sqrt{ \frac{ v }{ 2 } } \right) - \exp \left( - \frac{ v }{ 2 } \right) \sum _{ i = 0 }^{ \frac{p-1}{2} + k - 1 } \frac{ 1 }{ \varGamma ( i + \frac{3}{2} ) } \left( \frac{ v }{ 2 } \right) ^{ i + \frac{1}{2} } \right] \mathrm{{d}}v. \end{aligned}$$
(28)

Using the equality

$$\begin{aligned} \mathrm{erf}( z ) = \frac{ 2 z e^{ - z^2 } }{ \sqrt{ \pi } } {_1} F_1 \left( 1; \frac{ 3 }{ 2 } ; z^2 \right) , \end{aligned}$$

Eq. (28) becomes

$$\begin{aligned} N(t)&= \int _0^\infty v^{ -\frac{t}{2} } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) \nonumber \\&\quad \times \left[ \sqrt{ \frac{ 2 }{ \pi } } \sqrt{ v } \exp \left( - \frac{ v }{ 2 } \right) {_1}F_1 \left( 1 ; \frac{ 3 }{ 2 } ; \frac{ v }{ 2 } \right) -\quad \exp \left( - \frac{ v }{ 2 } \right) \sum _{ i = 0 }^{ \frac{p-1}{2} + k - 1 } \frac{ 1 }{ \varGamma ( i + \frac{3}{2} ) } \left( \frac{ v }{ 2 } \right) ^{ i + \frac{1}{2} } \right] \mathrm{{d}}v \nonumber \\&=\sqrt{ \frac{ 2 }{ \pi } }\int _0^\infty v^{ (1 - t)/2 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha + 4 }{ 8 } v \right) {_1}F_1 \left( 1 ; \frac{ 3 }{ 2 } ; \frac{ v }{ 2 } \right) \mathrm{{d}}v \nonumber \\&\quad - \sum _{ i = 0 }^{ \frac{ p - 1 }{2} + k - 1 } \frac{ 1 }{ 2^{ i + 1/2 } }\frac{ 1 }{ \varGamma ( i + 3/2 ) } \int _0^\infty v^{ (1 - t)/2 + i } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha + 4 }{ 8 } v \right) \mathrm{{d}}v. \end{aligned}$$
(29)
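The step from (28) to (29) rests on the error-function identity with \(z = \sqrt{ v/2 }\). A quick numerical check, summing the \({_1}F_1\) series directly, is sketched below; the helper names are hypothetical.

```python
import math


def hyp1f1(a, b, x, terms=200):
    """Confluent hypergeometric 1F1(a; b; x) by direct summation of its series."""
    term, total = 1.0, 1.0
    for j in range(terms):
        term *= (a + j) / (b + j) * x / (j + 1)
        total += term
    return total


def erf_via_1f1(v):
    """sqrt(2/pi) sqrt(v) exp(-v/2) 1F1(1; 3/2; v/2), the bracketed term in (29)."""
    return (math.sqrt(2.0 / math.pi) * math.sqrt(v) * math.exp(-v / 2.0)
            * hyp1f1(1.0, 1.5, v / 2.0))


for v in (0.1, 1.0, 4.0, 9.0):
    # The two columns should agree: erf(sqrt(v/2)) on the left, the 1F1 form on the right
    print(v, math.erf(math.sqrt(v / 2.0)), erf_via_1f1(v))
```

Direct summation is adequate here because all terms are positive for \(v > 0\); for large arguments a library routine would be the safer choice.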

The second term of (29) can be computed as

$$\begin{aligned} \sum _{ i = 0 }^{ \frac{ p - 1 }{2} + k - 1 } \frac{ 2 }{ 2^{ i + 1/2 } }\frac{ 1 }{ \varGamma ( i + 3/2 ) } \left( \frac{ 2 | x | }{ \sqrt{\alpha }\sqrt{ \alpha + 4 } } \right) ^{ \frac{ 1 - t }{ 2 } + 1 + i } K_{ \frac{ 1 - t }{ 2 } + 1 + i } \left( \frac{ | x | }{ 2 } \sqrt{ 1 + \frac{ 4 }{ \alpha } } \right) \end{aligned}$$
(30)

using equality (25), while the first term of (29) can be written as

$$\begin{aligned}&\sqrt{ \frac{ 2 }{ \pi } }\int _0^\infty v^{ (1 - t)/2 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha + 4 }{ 8 } v \right) \int _0^\infty e^{ -w } {_0}F_1 \left( \frac{ 3 }{ 2 } ; \frac{ vw }{ 2 } \right) \mathrm{{d}}w \mathrm{{d}}v \nonumber \\&\quad =2 \int _0^\infty v^{ - t/2 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) \nonumber \\&\qquad \times \int _0^\infty \frac{ v^{ 3/2 - 1}e^{-v/2} }{ 2^{3/2} \varGamma ( 3/2 ) }e^{ -2w/2 } {_0}F_1 \left( \frac{ 3 }{ 2 } ; \frac{ 2w }{ 4 }v \right) \mathrm{{d}}w \mathrm{{d}}v. \end{aligned}$$
(31)

Since the inner integrand in (31) is the noncentral chi-squared density with 3 degrees of freedom and noncentrality parameter 2w, applying Corollary 1.3.5 of Muirhead (1982) and integrating over w shows that (31) becomes

$$\begin{aligned}&2\sum _{i = 0}^\infty \int _0^\infty v^{ - t/2 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha }{ 8 } v \right) g_{3 + 2i}(v)\mathrm{{d}}v \nonumber \\&\quad = \sum _{i = 0}^\infty \frac{ 1 }{ 2^{1/2 + i}\varGamma ( 3/2 + i )} \int _0^\infty v^{ ( 3 + 2i - t)/2 - 1 } \exp \left( - \frac{ x^2 }{ 2 \alpha }\frac{ 1 }{ v } - \frac{ \alpha + 4 }{ 8 } v \right) \mathrm{{d}}v \nonumber \\&\quad =\sum _{i = 0}^\infty \frac{ 2 }{ 2^{1/2 + i}\varGamma ( 3/2 + i )} \left( \frac{ 2 | x | }{ \sqrt{ \alpha }\sqrt{ \alpha + 4 } } \right) ^{ \frac{ 1 - t }{ 2 } + 1 + i } K_{ \frac{ 1 - t }{ 2 } + 1 + i } \left( \frac{ | x | }{ 2 } \sqrt{ 1 + \frac{ 4 }{ \alpha } } \right) . \end{aligned}$$
(32)

Putting (30) and (32) into (29),

$$\begin{aligned} N( t )&= \sum _{i = ( p - 1 )/2 + k }^\infty \frac{ 2 }{ 2^{1/2 + i}\varGamma ( 3/2 + i )} \left( \frac{ 2 | x | }{ \sqrt{ \alpha }\sqrt{ \alpha + 4 } } \right) ^{ \frac{ 1 - t }{ 2 } + 1 + i } \nonumber \\&\quad K_{ \frac{ 1 - t }{ 2 } + 1 + i } \left( \frac{ | x | }{ 2 } \sqrt{ 1 + \frac{ 4 }{ \alpha } }\right) . \end{aligned}$$
(33)

If we put (33) into (23), then we can obtain the result for odd p and \(x \ne 0\). \(\square \)
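For even \(p\), the expression for \(M(x)\) obtained from (23) and (24) involves only Bessel functions of half-integer order, which have elementary closed forms. The sketch below evaluates it and cross-checks it against direct Simpson quadrature of the defining integral. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, `m` stands for \(p/2 + k\), and the quadrature settings are ad hoc.

```python
import math


def bessel_k_half(order_twice, z):
    """K_nu(z) for nu = order_twice / 2 with order_twice odd (finite-sum form)."""
    j = (abs(order_twice) - 1) // 2           # K_{-nu} = K_nu with nu = j + 1/2
    s = sum(
        math.factorial(j + r) / (math.factorial(r) * math.factorial(j - r))
        / (2.0 * z) ** r
        for r in range(j + 1)
    )
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z) * s


def N_even(t, x, alpha, m):
    """N(t) of (24) for chi-square d.f. 2m (p even), t in {1, 3}, x != 0."""
    ax = abs(x)
    beta = math.sqrt(1.0 + 4.0 / alpha)
    b = 2.0 * ax / (math.sqrt(alpha) * math.sqrt(alpha + 4.0))
    two_nu = 2 - t                             # twice the order 1 - t/2
    out = 2.0 * (2.0 * ax / alpha) ** (two_nu / 2.0) * bessel_k_half(two_nu, ax / 2.0)
    for i in range(m):
        order = two_nu + 2 * i                 # twice the order i + 1 - t/2
        coef = 2.0 / (2 ** i * math.factorial(i))
        out -= coef * b ** (order / 2.0) * bessel_k_half(order, ax * beta / 2.0)
    return out


def M_closed(x, alpha, m):
    """M(x) from (23), with N(1) and N(3) evaluated by (24)."""
    pref = math.exp(x / 2.0) / (2.0 * math.sqrt(2.0 * math.pi))
    return pref * (x / math.sqrt(alpha) * N_even(3, x, alpha, m)
                   + math.sqrt(alpha) / 2.0 * N_even(1, x, alpha, m))


def M_quadrature(x, alpha, m, upper=80.0, steps=40000):
    """Composite Simpson rule for M(x) = int Phi(.) g_{2m}(v) dv (cross-check only)."""
    def integrand(v):
        z = (x - alpha * v / 2.0) / math.sqrt(alpha * v)
        Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        g = v ** (m - 1) * math.exp(-v / 2.0) / (2.0 ** m * math.factorial(m - 1))
        return Phi * g
    lo = 1e-12                                 # avoid dividing by zero at v = 0
    h = (upper - lo) / steps
    total = integrand(lo) + integrand(upper)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(lo + j * h)
    return total * h / 3.0


# Hypothetical inputs: x = 0.7, alpha = c + Q/n = 1.3, p/2 + k = 3
print(M_closed(0.7, 1.3, 3), M_quadrature(0.7, 1.3, 3))
```

When \(p + 2k = 2\) and \(x < 0\), the closed form collapses to \(\frac{ \beta - 1 }{ 2 \beta } e^{ x ( 1 + \beta ) / 2 }\) with \(\beta = \sqrt{ 1 + 4/\alpha }\), which gives a second, quadrature-free check.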

Proof

Since \(n-p\) is a non-negative even number, \((p-n)/2\) is a non-positive integer, so the \({_1}F_1\) series on the right-hand side below terminates:

$$\begin{aligned}&{_1}F_1 \left( \frac{ n }{ 2 } ; \frac{ p }{ 2 } ; \frac{ Q }{ 2 ( nc + Q ) } v \right) = \exp \left( \frac{ Q }{ 2 ( nc + Q ) } v \right) {_1}F_1 \left( \frac{ p - n }{ 2 } ; \frac{ p }{ 2 } ; - \frac{ Q }{ 2 ( nc + Q ) } v \right) \\&\quad = \exp \left( \frac{ Q }{ 2 ( nc + Q ) } v \right) \sum _{k=0}^{ | p - n |/2 } \frac{ \left( \frac{ p-n }{ 2 } \right) _k }{ \left( \frac{ p }{ 2 } \right) _k } \frac{ v^k }{ k! } \left( \frac{ - Q }{ 2( nc + Q ) } \right) ^k. \end{aligned}$$

Thus, F(x) is represented by

$$\begin{aligned} F(x) =&\sum _{k=0}^{ |p - n|/2 } \frac{ \left( \frac{ p-n }{ 2 } \right) _k }{ k! \left( \frac{ p }{ 2 } \right) _k } \left( \frac{ - Q }{ 2( nc + Q ) } \right) ^k \left( \frac{ nc }{ nc + Q } \right) ^{n/2} \\&\frac{ 1 }{ 2^{ \frac{ p }{ 2 } } \varGamma \left( \frac{ p }{ 2 } \right) } \int _0^{\infty } \varPhi \left( \frac{ x - cv / 2 }{ \sqrt{ cv } } \right) v^{ \frac{ p }{ 2 } + k - 1 } \exp \left( -\frac{ 1 }{ 2 } \left( 1 - \frac{ Q }{ nc + Q } \right) v \right) \mathrm{{d}}v. \end{aligned}$$

The rest of the proof is the same as that of Theorem 3\(\mathrm {(a)}\). In addition, the finite representation for \(K_{i \pm 1/2}\) is found in 8.468 and 8.469 of Gradshteyn and Ryzhik (2007). \(\square \)
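When \((n-p)/2\) is a non-negative integer, carrying the finite sum through the same change of variable as in the proof of Theorem 3 leaves coefficients of \(M(x)\) that are binomial probabilities with success probability \(Q/(nc+Q)\), so \(F(x)\) is a finite mixture of the \(M(x)\) terms. The sketch below checks this in exact arithmetic; the function names and inputs are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def rising(a, r):
    """Pochhammer symbol (a)_r."""
    out = Fraction(1)
    for j in range(r):
        out *= a + j
    return out


def mixture_weights(p, n, c, Q):
    """Coefficients of M(x) in the finite sum, for (n - p)/2 a non-negative integer."""
    N = (n - p) // 2
    rho = Fraction(n) * c / (n * c + Q)        # rho = n c / (n c + Q)
    # Pochhammer form from the series: ((p-n)/2)_k / k! * (-(1-rho))^k * rho^(N-k)
    return [
        rising(Fraction(p - n, 2), k) / factorial(k) * (-(1 - rho)) ** k * rho ** (N - k)
        for k in range(N + 1)
    ]


def binomial_pmf(N, rho):
    """Binomial(N, 1 - rho) probabilities, for comparison."""
    return [comb(N, k) * (1 - rho) ** k * rho ** (N - k) for k in range(N + 1)]


# Hypothetical inputs: p = 4, n = 10, c = 1/4, Q = 3/2, so rho = 5/8 and N = 3
w = mixture_weights(4, 10, Fraction(1, 4), Fraction(3, 2))
print(w, sum(w))
```

The weights are non-negative and sum to one, which is a convenient sanity check on any implementation of the finite representation.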

Proof

Since the mean and variance of \(X/f\) are given by \( \mathrm { E }( X/f ) = \delta \) and \( \mathrm { V }( X/f ) = \gamma / f \) with \(\gamma = \varOmega + 2 \delta ^2\), the conditional distribution of the standardized variable \(S_f = \sqrt{ f } ( X/f - \delta ) / \sqrt{ \gamma }\) given \(W\) is

$$\begin{aligned} S_f \sim N \left( - \frac{ \sqrt{ f } \delta }{ \sqrt{ \gamma } } + \frac{ \delta }{ g \sqrt{ f \gamma } } (gW), (gW) \right) , \end{aligned}$$

where \(W \sim \chi ^2_f\) and \(g = \varOmega / (f \gamma )\). From Definition 2.1 of Barndorff-Nielsen et al. (1982), the distribution of \(S_f\) is the normal variance-mean mixture with position \(- \sqrt{ f } \delta / \sqrt{ \gamma }\), drift \(\delta / (g \sqrt{ f \gamma }),\) structure matrix 1, and mixing distribution \(gW {\mathop {=}\limits ^{\mathrm {d}}} g \chi ^2_f\). Hence, from equation (2.2) of Barndorff-Nielsen et al. (1982), the characteristic function of \(S_f\) is given by

$$\begin{aligned} {\hat{g}}( \theta ) = \exp \left( - \frac{ i \theta \sqrt{ f } \delta }{ \sqrt{ \gamma } } \right) \left( 1 - \frac{ 2 i \theta \delta }{ \sqrt{ f \gamma } } + \frac{ \theta ^2 \varOmega }{ f \gamma } \right) ^ { -f / 2 }. \end{aligned}$$

The cumulant generating function of \(S_f\) becomes

$$\begin{aligned} K( \theta )&= - \frac{ i \theta \sqrt{ f } \delta }{ \sqrt{ \gamma } } - \frac{ f }{ 2 } \log \left( 1 - \frac{ 2 i \theta \delta }{ \sqrt{ f \gamma } } + \frac{ \theta ^2 \varOmega }{ f \gamma } \right) \\&= - \frac{ i \theta \sqrt{ f } \delta }{ \sqrt{ \gamma } } - \frac{ f }{ 2 } \sum _{ r = 1 }^\infty \frac{ ( -1)^{ r + 1 } }{ r } \left( - \frac{ 2 i \theta \delta }{ \sqrt{ f \gamma } } + \frac{ \theta ^2 \varOmega }{ f \gamma } \right) ^r \\&= - \frac{ i \theta \sqrt{ f } \delta }{ \sqrt{ \gamma } } - \frac{ 1 }{ 2 } \sum _{ r = 1 }^\infty \frac{ ( -1)^{ r + 1 } }{ r } \sum _{ s = 0 }^r \left( {\begin{array}{c} r \\ s \end{array}}\right) \left( - \frac{ 2i \delta }{ \sqrt{ \gamma } } \right) ^{ r - s } \left( \frac{ \varOmega }{ \gamma } \right) ^s \left( \frac{ \theta }{ \sqrt{ f } } \right) ^{ r + s } \left( \frac{ 1 }{ \sqrt{ f } } \right) ^{ - 2 }. \end{aligned}$$

For \(l \ge 2\), the lth derivative of \(K( \theta )\) is given by

$$\begin{aligned} K^{ ( l ) } ( \theta )&= - \frac{ 1 }{ 2 } \sum _{ r = 1 }^\infty \frac{ ( - 1 )^{ r + 1 } }{ r } \sum _{ s = 0 }^r \left( {\begin{array}{c} r \\ s \end{array}}\right) \left( - \frac{ 2i \delta }{ \sqrt{ \gamma } } \right) ^{ r - s } \left( \frac{ \varOmega }{ \gamma } \right) ^s \nonumber \\&\quad \times ( r + s )( r + s - 1 ) \cdots ( r + s - l + 1 ) \left( \frac{ \theta }{ \sqrt{ f } } \right) ^{ r + s - l } \left( \frac{ 1 }{ \sqrt{ f } } \right) ^{ l - 2} . \end{aligned}$$
(34)

To compute the lth cumulant, we express (34) as

$$\begin{aligned} K^{ ( l ) } ( \theta )= & {} \sum _{ r + s = l, s\le r } \left[ - \frac{ 1 }{ 2 } \frac{ ( - 1 )^{ r + 1 } }{ r } \left( {\begin{array}{c} r \\ s \end{array}}\right) \left( - \frac{ 2i \delta }{ \sqrt{ \gamma } } \right) ^{ r - s } \left( \frac{ \varOmega }{ \gamma } \right) ^s \right. \nonumber \\&\left. \times ( r + s )( r + s - 1 ) \cdots ( r + s - l + 1 ) \left( \frac{ \theta }{ \sqrt{ f } } \right) ^{ r + s - l } \left( \frac{ 1 }{ \sqrt{ f } } \right) ^{ l - 2} \right] \nonumber \\&+ \sum _{ r + s \ne l, s\le r } \left[ - \frac{ 1 }{ 2 } \frac{ ( - 1 )^{ r + 1 } }{ r } \left( {\begin{array}{c} r \\ s \end{array}}\right) \left( - \frac{ 2i \delta }{ \sqrt{ \gamma } } \right) ^{ r - s } \left( \frac{ \varOmega }{ \gamma } \right) ^s \right. \nonumber \\&\left. \times ( r + s )( r + s - 1 ) \cdots ( r + s - l + 1 ) \left( \frac{ \theta }{ \sqrt{ f } } \right) ^{ r + s - l } \left( \frac{ 1 }{ \sqrt{ f } } \right) ^{ l - 2} \right] .\nonumber \\ \end{aligned}$$
(35)

If we put \(l=3\) in (35), then the first summation becomes

$$\begin{aligned}&- \frac{ 1 }{ 2 } \left[ \frac{ ( - 1 )^3 }{ 2 } \left( {\begin{array}{c} 2 \\ 1 \end{array}}\right) \left( - \frac{ 2i \delta }{ \sqrt{ \gamma } } \right) \left( \frac{ \varOmega }{ \gamma } \right) 3! \frac{ 1 }{ \sqrt{ f } } +\frac{ ( - 1 )^4 }{ 3 } \left( {\begin{array}{c} 3 \\ 0 \end{array}}\right) \left( - \frac{ 2i \delta }{ \sqrt{ \gamma } } \right) ^3 3! \frac{ 1 }{ \sqrt{ f } } \right] \\&\quad = -i \frac{ 2 \delta ( 3 \varOmega + 4\delta ^2 ) }{ \gamma ^{3/2} }\frac{ 1 }{ \sqrt{ f } }. \end{aligned}$$

Hence, the third cumulant of \(S_f\) is given by

$$\begin{aligned} \frac{ 2 \delta ( 3 \varOmega + 4\delta ^2 ) }{ ( 2\delta ^2 + \varOmega )^{3/2} } \frac{ 1 }{ \sqrt{ f } }. \end{aligned}$$

The fourth cumulant of \(S_f\) can be obtained by following the procedures to obtain the third cumulant of \(S_f\). The characteristic function of \(S_f\) is represented by

$$\begin{aligned} {\hat{g}}( \theta ) = \exp \left\{ -\frac{ 1 }{ 2 } \theta ^2 + \frac{ 1 }{ \sqrt{ f } } \frac{ 1 }{ 3! } {\tilde{\kappa }}_3 ( i \theta )^3 + \frac{ 1 }{ f } \frac{ 1 }{ 4! } {\tilde{\kappa }}_4 ( i \theta )^4 + \cdots \right\} , \end{aligned}$$
(36)

where

$$\begin{aligned} {\tilde{\kappa }}_3 = \frac{ 2 \delta ( 3 \varOmega + 4\delta ^2 ) }{ ( 2\delta ^2 + \varOmega )^{3/2} } \text { and } {\tilde{\kappa }}_4 = \frac{ 6 ( \varOmega ^2 + 8\varOmega \delta ^2 + 8 \delta ^4 ) }{ ( 2\delta ^2 + \varOmega )^2 }. \end{aligned}$$

Since Eq. (36) has the same form as equation (2.7) of Hall (1992), the Edgeworth expansion of \(S_f\) follows immediately. \(\square \)
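The cumulants \({\tilde{\kappa }}_3\) and \({\tilde{\kappa }}_4\) can be reproduced by expanding the cumulant generating function as a truncated power series. The sketch below assumes the model \(X \mid W \sim N( \delta W, \varOmega W )\) with \(W \sim \chi ^2_f\), which is inferred from the conditional law of \(S_f\) shown above rather than quoted from the paper; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

DEG = 4  # keep powers of s up to s^4


def mul(a, b):
    """Product of two polynomials in s (coefficient lists), truncated at degree DEG."""
    out = [Fraction(0)] * (DEG + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= DEG:
                out[i + j] += ai * bj
    return out


def cumulants_of_X(f, delta, omega):
    """Cumulants 1..4 of X under the assumed model X | W ~ N(delta W, omega W).

    The cumulant generating function is -(f/2) log(1 - u) with
    u = 2 delta s + omega s^2, expanded as (f/2) * sum_r u^r / r.
    """
    u = [Fraction(0), 2 * Fraction(delta), Fraction(omega)] + [Fraction(0)] * (DEG - 2)
    series = [Fraction(0)] * (DEG + 1)
    power = [Fraction(1)] + [Fraction(0)] * DEG
    for r in range(1, DEG + 1):
        power = mul(power, u)                  # u^r, truncated
        for d in range(DEG + 1):
            series[d] += Fraction(f, 2) * power[d] / r
    # l-th cumulant = l! times the coefficient of s^l
    return [series[l] * factorial(l) for l in range(1, DEG + 1)]


# Hypothetical inputs: f = 7, delta = 1/3, Omega = 5/2
print(cumulants_of_X(7, Fraction(1, 3), Fraction(5, 2)))
```

Dividing the third and fourth cumulants of \(X\) by \(( f \gamma )^{3/2}\) and \(( f \gamma )^2\) gives \({\tilde{\kappa }}_3 / \sqrt{ f }\) and \({\tilde{\kappa }}_4 / f\), the cumulants of \(S_f\) that appear in (36).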

Cite this article

Yonenaga, K., Suzukawa, A. Bayesian estimation for misclassification rate in linear discriminant analysis. Jpn J Stat Data Sci 4, 861–885 (2021). https://doi.org/10.1007/s42081-021-00139-7
