Simultaneous Predictive Gaussian Classifiers

Journal of Classification

Abstract

The Gaussian distribution has been ubiquitous in the theory and practice of statistical classification for several decades. Despite early proposals motivating the use of predictive inference to design a classifier, this approach has received relatively little attention apart from certain specific applications, such as speech recognition, where its optimality has been widely acknowledged. Here we examine the statistical properties of different inductive classification rules under a generic Gaussian model and demonstrate the optimality of classifying multiple samples simultaneously under an attractive loss function. We show that the simpler independent classification of samples leads asymptotically to the same optimal rule as the simultaneous classifier as the amount of training data increases, provided the dimensionality of the feature space is bounded in an appropriate manner. Numerical investigations suggest that the simultaneous predictive classifier can achieve higher classification accuracy than the independent rule in the low-dimensional case, whereas the simultaneous approach suffers more from noise as the dimensionality increases.
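To make the distinction between the two rules concrete, the following is a minimal sketch rather than the model studied in the article. It assumes a univariate Gaussian with known noise variance `s2`, a conjugate normal prior N(`m0`, `t0`) on each class mean, a uniform prior over labelings, and a rule that simply picks the highest-scoring labeling; all function and parameter names are hypothetical. The independent rule scores each test sample on its own against the training data, while the simultaneous rule scores a joint labeling of all test samples, so samples assigned to the same class inform each other.

```python
import itertools
import math


def normal_logpdf(x, mean, var):
    """Log density of N(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def posterior(samples, m0, t0, s2):
    """Posterior mean and variance of a class mean (assumed known noise variance s2)."""
    precision = 1.0 / t0 + len(samples) / s2
    mean = (m0 / t0 + sum(samples) / s2) / precision
    return mean, 1.0 / precision


def joint_log_predictive(new, train, m0, t0, s2):
    """log p(new | train) by the chain rule, updating the posterior after each sample."""
    seen = list(train)
    total = 0.0
    for x in new:
        mean, var = posterior(seen, m0, t0, s2)
        total += normal_logpdf(x, mean, var + s2)
        seen.append(x)
    return total


def independent_rule(test, train_by_class, m0=0.0, t0=100.0, s2=1.0):
    """Label each test sample separately by its own predictive density."""
    labels = []
    for x in test:
        scores = [joint_log_predictive([x], tr, m0, t0, s2) for tr in train_by_class]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


def simultaneous_rule(test, train_by_class, m0=0.0, t0=100.0, s2=1.0):
    """Label all test samples jointly by exhaustive search over labelings."""
    k = len(train_by_class)
    best, best_score = None, -math.inf
    for labeling in itertools.product(range(k), repeat=len(test)):
        score = 0.0
        for c in range(k):
            assigned = [x for x, lab in zip(test, labeling) if lab == c]
            score += joint_log_predictive(assigned, train_by_class[c], m0, t0, s2)
        if score > best_score:
            best, best_score = list(labeling), score
    return best


train = [[-2.1, -1.9, -2.0], [2.0, 2.2, 1.8]]
test = [-1.5, 1.7]
independent_labels = independent_rule(test, train)
simultaneous_labels = simultaneous_rule(test, train)
```

The exhaustive search grows as the number of classes raised to the number of test samples, so this form is only practical for a handful of test points; it is meant to show the structure of the two rules, not an efficient search strategy.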

Author information

Corresponding author

Correspondence to Jukka Corander.

About this article

Cite this article

Cui, Y., Sirén, J., Koski, T. et al. Simultaneous Predictive Gaussian Classifiers. J Classif 33, 73–102 (2016). https://doi.org/10.1007/s00357-016-9197-3

  • DOI: https://doi.org/10.1007/s00357-016-9197-3
