Abstract
A functional classification methodology based on reproducing kernel Hilbert space (RKHS) theory is proposed for the discrimination of gene expression profiles. The parameter function that defines the functional logistic regression model is uniquely and consistently estimated by minimizing the penalized negative log-likelihood over an RKHS generated by a suitable wavelet basis. An iterative descent method, the gradient method, is applied to solve the corresponding minimization problem, i.e., to compute the functional estimate. Temporal gene expression data from the yeast cell cycle are classified with the proposed wavelet-RKHS-based discrimination methodology. A simulation study tests the performance of this classification methodology against other statistical discrimination procedures.
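To make the estimation step more concrete, the following is a minimal sketch of penalized functional logistic regression fitted by gradient descent in a wavelet coefficient domain. It is not the authors' implementation. Several ingredients are assumptions made only for illustration: the Haar basis, the scale-dependent penalty weights 2^(2sj) standing in for an RKHS norm, the fixed step size 1/L, the simulated sine and cosine curves, and all function names (`haar_matrix`, `level_weights`, `fit_wavelet_logistic`, `simulate_curves`). The integral of x(t)β(t) is approximated by an unscaled sum over the sampling grid.

```python
import numpy as np


def haar_matrix(n):
    """Orthonormal Haar transform matrix (n x n); n is assumed to be a power of two."""
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])                   # scaling function and coarser wavelets
    bottom = np.kron(np.eye(n // 2), [1.0, -1.0])  # finest-scale wavelets
    return np.vstack([top, bottom]) / np.sqrt(2.0)


def level_weights(n, s=1.0):
    """Illustrative penalty weights 2^(2*s*j), where j is the resolution level."""
    levels = np.zeros(n)
    levels[1:] = np.floor(np.log2(np.arange(1, n)))  # row k >= 1 belongs to level floor(log2 k)
    return 2.0 ** (2.0 * s * levels)


def penalized_nll(Z, y, theta, pen):
    """Mean negative Bernoulli log-likelihood plus a weighted quadratic penalty."""
    eta = Z @ theta
    return np.mean(np.logaddexp(0.0, eta) - y * eta) + np.sum(pen * theta ** 2)


def fit_wavelet_logistic(X, y, lam=1e-3, s=1.0, iters=500):
    """Gradient descent on the penalized negative log-likelihood."""
    n, T = X.shape
    W = haar_matrix(T)
    Z = np.hstack([np.ones((n, 1)), X @ W.T])  # intercept column + wavelet coefficients
    pen = lam * np.concatenate([[0.0], level_weights(T, s)])  # intercept is not penalized
    # Upper bound on the Lipschitz constant of the gradient; step 1/L gives monotone descent.
    L = 0.25 * np.linalg.norm(Z, 2) ** 2 / n + 2.0 * pen.max()
    theta = np.zeros(T + 1)
    losses = []
    for _ in range(iters):
        losses.append(penalized_nll(Z, y, theta, pen))
        p = np.exp(-np.logaddexp(0.0, -(Z @ theta)))  # numerically stable sigmoid
        grad = Z.T @ (p - y) / n + 2.0 * pen * theta
        theta -= grad / L
    beta = W.T @ theta[1:]  # parameter function evaluated on the sampling grid
    return theta[0], beta, losses


def simulate_curves(n_per_class=40, T=32, noise=0.3, seed=0):
    """Two classes of noisy curves (sine vs. cosine); purely synthetic data."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, T)
    means = [np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)]
    X = np.vstack([m + noise * rng.standard_normal((n_per_class, T)) for m in means])
    y = np.repeat([0.0, 1.0], n_per_class)
    return X, y


X, y = simulate_curves()
alpha, beta, losses = fit_wavelet_logistic(X, y)
pred = (alpha + X @ beta > 0).astype(float)
print("training accuracy:", np.mean(pred == y))
```

In this sketch a curve is assigned to class 1 when the estimated linear predictor is positive. The penalty weights grow with the resolution level, so fine-scale wavelet coefficients of the parameter function are shrunk more strongly than coarse-scale ones; the smoothing parameter `lam` would in practice be chosen by a data-driven criterion such as cross-validation.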
Cite this article
Rincón, M., Ruiz-Medina, M.D. Wavelet-RKHS-based functional statistical classification. Adv Data Anal Classif 6, 201–217 (2012). https://doi.org/10.1007/s11634-012-0112-4
DOI: https://doi.org/10.1007/s11634-012-0112-4
Keywords
- Functional data analysis
- Gene expression profiles
- Penalized logistic regression
- Reproducing kernel Hilbert space
- Wavelet decomposition
- Yeast cell cycle