Nonlinear logistic discrimination via regularized radial basis functions for classifying high-dimensional data

Published in: Annals of the Institute of Statistical Mathematics

Abstract

A flexible nonparametric method is proposed for classifying high-dimensional data with a complex structure. The proposed method can be regarded as an extended version of linear logistic discriminant procedures, in which the linear predictor is replaced by a radial-basis-expansion predictor. Radial basis functions with a hyperparameter are used to take the information on covariates and class labels into account; this was nearly impossible within the previously proposed hybrid learning framework. A penalized maximum likelihood estimation procedure is employed to obtain stable parameter estimates. A crucial issue in the model-construction process is the choice of a suitable model from among the candidates; this issue is examined from information-theoretic and Bayesian viewpoints, and we employ the model evaluation criteria of Ando et al. (2002, Japanese Journal of Applied Statistics, 31, 123–139). The proposed method is applicable not only to high-dimensional data but also to the variable selection problem. Real data analysis and Monte Carlo experiments show that the proposed method performs well in classifying future observations in practical situations. The simulation results also show that the use of the hyperparameter in the basis functions improves prediction performance.
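The model structure described above (a logistic discriminant whose linear predictor is replaced by a Gaussian radial-basis expansion, fitted by penalized maximum likelihood) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the centers are taken as a random subsample of the training points, and the width hyperparameter and ridge penalty are fixed by hand rather than selected by the information-theoretic and Bayesian criteria the paper develops.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian radial-basis expansion plus an intercept column.

    `width` plays the role of the basis-function hyperparameter."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    B = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), B])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_penalized_logistic(B, y, lam=0.1, n_iter=25):
    """Penalized maximum likelihood for the logistic model via
    Newton-Raphson; the ridge penalty lam stabilizes the estimates."""
    w = np.zeros(B.shape[1])
    P = lam * np.eye(B.shape[1])
    for _ in range(n_iter):
        p = sigmoid(B @ w)
        g = B.T @ (y - p) - P @ w                    # penalized score
        H = (B * (p * (1 - p))[:, None]).T @ B + P   # penalized information
        w = w + np.linalg.solve(H, g)
    return w

# Toy two-class data with a nonlinear (circular) decision boundary.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(float)

# Centers: a random subsample of the training points (illustrative choice only).
centers = X[rng.choice(len(X), size=20, replace=False)]
B = rbf_design(X, centers, width=0.4)
w = fit_penalized_logistic(B, y)
acc = ((sigmoid(B @ w) > 0.5) == y.astype(bool)).mean()
```

A plain linear logistic model cannot separate this circular boundary, while the basis expansion makes the problem linear in the expanded features; the ridge penalty keeps the estimates stable even when the number of basis functions is large relative to the sample size.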

References

  • Akaike H. (1973). Information theory and an extension of the maximum likelihood principle. In Petrov B.N., Csaki F (eds). 2nd International Symposium on Information Theory. Budapest, Akademiai Kiado, pp. 267–281

  • Akaike H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19: 716–723

  • Alizadeh A.A., Eisen M.B., Davis R.E., Ma C., Lossos I.S., Rosenwald A., et al. (2000). Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503–511

  • Alpaydin E., Kaynak C. (1998). Cascading classifiers. Kybernetika 34, 369–374

  • Ando T. (2003). Kernel flexible discriminant analysis for classifying high-dimensional data with nonlinear structure and its applications (in Japanese). Proceedings of the Institute of Statistical Mathematics 51, 389–406

  • Ando T., Imoto S., Konishi S. (2001). Estimating nonlinear regression models based on radial basis function networks (in Japanese). Japanese Journal of Applied Statistics 30, 19–35

  • Ando T., Simauchi J., Konishi S. (2002). Nonlinear pattern recognition using radial basis function networks and its application (in Japanese). Japanese Journal of Applied Statistics 31, 123–139

  • Ando T., Imoto S., Konishi S. (2004). Adaptive learning machines for nonlinear classification and Bayesian information criteria. Bulletin of Informatics and Cybernetics 36, 147–162

  • Ando T., Imoto S., Konishi S. (2005). Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference (to appear).

  • Bishop C.M. (1995). Neural networks for pattern recognition. Oxford, Oxford University Press

  • Blake C.L., Merz C.J. (1998). UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html, University of California, Department of Information and Computer Sciences.

  • Breiman L., Friedman J.H., Olshen R.A., Stone C.J. (1984). Classification and regression trees. Belmont, CA, Wadsworth

  • Broomhead D.S., Lowe D. (1988). Multivariable functional interpolation and adaptive networks. Complex Systems 2, 321–335

  • Dudoit S., Fridlyand J., Speed T. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association 97, 77–87

  • Eilers P.H.C., Marx B.D. (1996). Flexible smoothing with B-splines and penalties (with discussion). Statistical Science 11, 89–121

  • Fujii T., Konishi S. (2006). Nonlinear regression modeling via regularized wavelets and smoothing parameter selection. Journal of Multivariate Analysis 97, 2023–2033

  • Girosi F., Jones M., Poggio T. (1995). Regularization theory and neural architectures. Neural Computation 7, 219–269

  • Green P.J., Silverman B.W. (1994). Nonparametric regression and generalized linear models. London, Chapman & Hall

  • Golub T.R., Slonim D.K., Tamayo P., Huard C., Gaasenbeek M., Mesirov J.P., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–537

  • Hastie T., Tibshirani R., Buja A. (1994). Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association 89, 1255–1270

  • Hastie T., Tibshirani R., Friedman J. (2001). The elements of statistical learning. New York, Springer

  • Hosmer D.W., Lemeshow S. (1989). Applied logistic regression. New York, Wiley

  • Imoto S., Konishi S. (2003). Selection of smoothing parameters in B-spline nonparametric regression models using information criteria. Annals of the Institute of Statistical Mathematics 55, 671–687

  • Karayiannis N.B., Mi G.W. (1997). Growing radial basis neural networks: merging supervised and unsupervised learning with network growth techniques. IEEE Transactions on Neural Networks 8, 1492–1506

  • Kass R.E., Tierney L., Kadane J.B. (1990). The validity of posterior expansions based on Laplace’s method. In Geisser S., Hodges J.S., Press S.J., Zellner A. (eds). Essays in honor of George Barnard. Amsterdam, North-Holland, pp. 473–488

  • Konishi S., Kitagawa G. (1996). Generalised information criteria in model selection. Biometrika 83, 875–890

  • Konishi S., Ando T., Imoto S. (2004). Bayesian information criteria and smoothing parameter selection in radial basis function networks. Biometrika 91, 27–43

  • Kullback S., Leibler R.A. (1951). On information and sufficiency. Annals of Mathematical Statistics 22, 79–86

  • MacQueen J. (1967). Some methods for classification and analysis of multivariate observations. In LeCam L.M., Neyman J. (eds). Proceeding of the fifth Berkeley symposium on mathematics, statistics, and probability. Berkeley, University of California Press, p. 281

  • Moody J., Darken C.J. (1989). Fast learning in networks of locally-tuned processing units. Neural Computation 1, 281–294

  • Nabney I.T. (2002). NETLAB algorithms for pattern recognition. UK, Springer

  • Nonaka Y., Konishi S. (2005). Nonlinear regression modeling using regularized local likelihood method. Annals of the Institute of Statistical Mathematics 57, 617–635

  • Ranganath S., Arun K. (1997). Face recognition using transform features and neural networks. Pattern Recognition 30, 1615–1622

  • Ripley B.D. (1994). Neural networks and related methods for classification. Journal of the Royal Statistical Society Series B 56, 409–456

  • Ripley B.D. (1996). Pattern recognition and neural networks. Cambridge, UK, Cambridge University Press

  • Sato T. (1996). On artificial neural networks as a statistical model (in Japanese). Proceedings of the Institute of Statistical Mathematics 44, 85–98

  • Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics 6, 461–464

  • Seber G.A.F. (1984). Multivariate observations. New York, Wiley

  • Shaffer A.L., Rosenwald A., Staudt L.M. (2002). Lymphoid malignancies: the dark side of B-cell differentiation. Nature Reviews Immunology 2, 920–933

  • Tierney L., Kadane J.B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association 81, 82–86

  • Tierney L., Kass R.E., Kadane J.B. (1989). Fully exponential Laplace approximations to expectations and variances of nonpositive functions. Journal of the American Statistical Association 84, 710–716

  • Troyanskaya O.G., Garber M.E., Brown P.O., Botstein D., Altman R.B. (2002). Nonparametric methods for identifying differentially expressed genes in microarray data. Bioinformatics 18, 1454–1461

  • Webb A. (1999). Statistical pattern recognition. London, Arnold

  • Xu L. (1998). RBF nets, mixture experts, and Bayesian ying-yang learning. Neurocomputing 19, 223–257

  • Xu L., Jordan M.I., Hinton G.E. (1995). An alternative model for mixtures of experts. In Cowan J.D., et al. (eds). Advances in Neural Information Processing Systems 7. Cambridge, MA, MIT Press, pp. 633–640

  • Zhou P., Levy N.B., Xie H., Qian L., Lee C.Y., Gascoyne R.D., et al. (2001). MCL1 transgenic mice exhibit a high incidence of B-cell lymphoma manifested as a spectrum of histologic subtypes. Blood 97, 3902–3909

Author information

Correspondence to Tomohiro Ando.

Cite this article

Ando, T., Konishi, S. Nonlinear logistic discrimination via regularized radial basis functions for classifying high-dimensional data. Ann Inst Stat Math 61, 331–353 (2009). https://doi.org/10.1007/s10463-007-0143-3
