Functional classwise principal component analysis: a classification framework for functional data analysis

Data Mining and Knowledge Discovery

Abstract

Functional data analysis has recently been applied successfully to high-dimensional data classification. In this paper, we present a classification framework based on functional data and classwise Principal Component Analysis (PCA). The proposed method can be applied to high-dimensional time series data, which typically suffer from the small sample size problem. It extracts a piecewise linear functional feature space and is particularly suitable for hard classification problems. The framework converts time series data into functional data, uses classwise functional PCA for feature extraction, and then classifies with a Bayesian linear classifier. We demonstrate the efficacy of the method on both synthetic data sets and real time series data from diverse fields, including neuroscience, food science, medical sciences and chemometrics.

Availability of data and materials

Available on request.

Code availability

https://github.com/ChatterjeeAvishek/FCPCA.

Notes

  1. A set \(A=\{f_1,f_2,\ldots , f_n\}\) is said to be linearly independent (LI) if the equation \(c_1f_1+c_2f_2+\ldots +c_nf_n=\textbf{0}\), where the \(c_i\) are scalars and \(\textbf{0}\) denotes the zero function, has only the trivial solution \(c_1=c_2=\ldots =c_n=0\) (Kreyszig 1991). A set that is not LI is called linearly dependent (LD). Hence an LD set of functions contains at least one function \(f_j\) that can be written as a linear combination of the other elements of the set. The Gram–Schmidt orthonormalization process is usually applied to an LI set. If it is applied to an LD set, then some intermediate function \(g_j\), for \(j\in \{1, \ldots ,n\}\), becomes the zero function; it cannot be normalized, and hence an orthonormal set cannot be obtained.
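The behaviour described in this note can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes functions sampled on a grid over [0, 1], approximates the L2 inner product with the trapezoidal rule, and uses the hypothetical helper names `inner` and `gram_schmidt`. The third function is built as a linear combination of the first two, so its Gram–Schmidt residual should vanish up to rounding error.

```python
import numpy as np

def inner(f, g, t):
    """Approximate the L2 inner product of two sampled functions (trapezoidal rule)."""
    h = f * g
    return float(np.sum((h[1:] + h[:-1]) * np.diff(t)) / 2.0)

def gram_schmidt(funcs, t, tol=1e-10):
    """Orthonormalize sampled functions.

    Returns the orthonormal functions found and the indices j whose
    residual g_j was numerically the zero function (assumed tolerance tol).
    """
    basis, dropped = [], []
    for j, f in enumerate(funcs):
        g = np.array(f, dtype=float)
        for e in basis:
            g = g - inner(g, e, t) * e  # remove the component along e
        norm = np.sqrt(inner(g, g, t))
        if norm < tol:
            dropped.append(j)           # g_j is (numerically) zero: f_j is dependent
        else:
            basis.append(g / norm)
    return basis, dropped

t = np.linspace(0.0, 1.0, 201)
f1 = np.ones_like(t)       # f_1(t) = 1
f2 = t                     # f_2(t) = t
f3 = 2.0 * f1 + 3.0 * f2   # linearly dependent on f_1 and f_2

basis, dropped = gram_schmidt([f1, f2, f3], t)
print(len(basis), dropped)
```

Here the dependent function is skipped rather than normalized, which is one common way to handle an LD input; the tolerance value is an assumption and would need tuning for noisy data.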

References

  • Acal C, Aguilera AM (2022) Basis expansion approaches for functional analysis of variance with repeated measures. Adv Data Anal Classif:1–31

  • Agrawal R, Faloutsos C, Swami A (1993) Efficient similarity search in sequence databases. In: International conference on foundations of data organization and algorithms. Springer, pp 69–84

  • Aguilera AM, Escabias M (2000) Principal component logistic regression. In: COMPSTAT. Springer, pp 175–180

  • Alpaydin E (2021) Machine learning. MIT Press, Cambridge

  • Bagnall A, Lines J, Hills J, Bostrom A (2015) Time-series classification with cote: the collective of transformation-based ensembles. IEEE Trans Knowl Data Eng 27(9):2522–2535

  • Bagnall A, Lines J, Bostrom A, Large J, Keogh E (2017) The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 31(3):606–660

  • Bagnall A, Flynn M, Large J, Lines J, Middlehurst M (2020) On the usage and performance of the hierarchical vote collective of transformation-based ensembles version 1.0 (hive-cote v1.0). In: International workshop on advanced analytics and learning on temporal data. Springer, pp 3–18

  • Bagnall A, Lines J, Vickers W, Keogh E (2018) The uea & ucr time series classification repository. http://www.timeseriesclassification.com

  • Belhumeur PN, Hespanha JP, Kriegman DJ (1996) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. In: European conference on computer vision. Springer, pp 43–58

  • Bishop CM (2006) Pattern recognition and machine learning. Springer, New York

  • Björck Å (1967) Solving linear least squares problems by gram-schmidt orthogonalization. BIT Numer Math 7(1):1–21

  • Bostrom A, Bagnall A (2017) Binary shapelet transform for multiclass time series classification, pp 24–46

  • Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev 60(2):223–311

  • Carmen Aguilera-Morillo M, Aguilera AM (2020) Multi-class classification of biomechanical data: a functional lda approach based on multi-class penalized functional pls. Stat Model 20(6):592–616

  • Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pp 785–794

  • Chiou J-M, Chen Y-T, Yang Y-F (2014) Multivariate functional principal component analysis: a normalization approach. Statistica Sinica:1571–1596

  • Das K, Nenadic Z (2009) An efficient discriminant-based solution for small sample size problem. Pattern Recogn 42(5):857–866

  • Dau HA, Bagnall A, Kamgar K, Yeh C-CM, Zhu Y, Gharghabi S, Ratanamahatana CA, Keogh E (2019) The ucr time series archive. IEEE/CAA J Automatica Sinica 6(6):1293–1305

  • Dempster A, Petitjean F, Webb GI (2020) Rocket: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Discov 34(5):1454–1495

  • Dempster A, Schmidt DF, Webb GI (2021) Minirocket: a very fast (almost) deterministic transform for time series classification. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, pp 248–257

  • Escabias M, Aguilera AM, Valderrama MJ (2004) Principal component estimation of functional logistic regression: discussion of two different approaches. J Nonparametric Stat 16(3–4):365–384

  • Fawaz HI, Forestier G, Weber J, Idoumghar L, Muller P-A (2019) Deep learning for time series classification: a review. Data Min Knowl Discov 33(4):917–963

  • Fawaz HI, Lucas B, Forestier G, Pelletier C, Schmidt DF, Weber J, Webb GI, Idoumghar L, Muller P-A, Petitjean F (2020) Inceptiontime: finding alexnet for time series classification. Data Min Knowl Discov 34(6):1936–1962

  • Ferraty F, Vieu P (2006) Nonparametric functional data analysis: theory and practice. Springer, New York

  • Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175

  • Garcia S, Herrera F (2008) An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J Mach Learn Res 9(12)

  • Ge H (1998) Iterative gram-schmidt orthonormalization for efficient parameter estimation. In: Proceedings of the 1998 IEEE international conference on acoustics, speech and signal processing, ICASSP’98 (Cat. No. 98CH36181). IEEE, vol 4, pp 2477–2480

  • Gertheiss J, Maity A, Staicu A-M (2013) Variable selection in generalized functional linear models. Stat 2(1):86–101

  • Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT press, Cambridge

  • Górecki T, Krzyśko M (2012) A kernel version of functional principal component analysis. Stat Trans New Ser 13(3):559–668

  • Hadjipantelis PZ, Müller H-G (2018) Functional data analysis for big data: a case study on california temperature trends. In: Handbook of big data analytics, pp 457–483

  • Hall P, Müller H-G, Wang J-L (2006) Properties of principal component methods for functional and longitudinal data analysis. Ann Stat:1493–1517

  • Hastie T, Buja A, Tibshirani R (1995) Penalized discriminant analysis. Ann Stat:73–102

  • Horváth L, Kokoszka P (2012) Inference for functional data with applications, vol 200. Springer, New York

  • Hsing T, Eubank R (2015) Theoretical foundations of functional data analysis, with an introduction to linear operators. Wiley, Chichester

  • Huang Y-W, Yu PS (1999) Adaptive query processing for time-series data. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 282–286

  • Huang X, Caron M, Hindson D (2001) A recursive gram-schmidt orthonormalization procedure and its application to communications. In: 2001 IEEE third workshop on signal processing advances in wireless communications (SPAWC’01). Workshop Proceedings (Cat. No. 01EX471). IEEE, pp 340–343

  • Izenman AJ (2008) Modern multivariate statistical techniques. Springer, New York

  • James GM, Hastie TJ (2001) Functional linear discriminant analysis for irregularly sampled curves. J R Stat Soc Ser B (Stat Methodol) 63(3):533–550

  • Joy AA, HasanMd AM, Sayeed A (2020) An improved class-wise principal component analysis based feature extraction framework for hyperspectral image classification. In: Proceedings of the international conference on computing advancements, pp 1–6

  • Kadri H, Preux P, Duflos E, Canu S (2011) Multiple functional regression with both discrete and continuous covariates. Recent Adv Funct Data Anal Relat Top. Springer, pp 189–195

  • Das K, Rizzuto DS, Nenadic Z (2009) Mental state estimation for brain-computer interfaces. IEEE Trans Biomed Eng 56(8):2114–2122

  • Korenberg M, Billings SA, Liu YP, McIlroy PJ (1988) Orthogonal parameter estimation algorithm for non-linear stochastic systems. Int J Control 48(1):193–210

  • Kreyszig E (1991) Introductory functional analysis with applications, vol 17. Wiley, New York

  • Kvam PH, Vidakovic B (2007) Nonparametric statistics with applications to science and engineering. Wiley, New Jersey

  • Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Min Knowl Discov 29(3):565–592

  • Liu Z-Y, Chiu K-C, Lei X (2003) Improved system for object detection and star/galaxy classification via local subspace analysis. Neural Netw 16(3–4):437–451

  • Liu R, Wang H, Wang S (2018) Functional variable selection via gram-schmidt orthogonalization for multiple functional linear regression. J Stat Comput Simul 88(18):3664–3680

  • Lucas B, Shifaz A, Pelletier C, O’Neill L, Zaidi N, Goethals B, Petitjean F, Webb GI (2019) Proximity forest: an effective and scalable distance-based classifier for time series. Data Min Knowl Discov 33(3):607–635

  • McCullagh P, Nelder JA (1989) Binary data. In: McCullagh P, Nelder JA (eds) Generalized linear models. Springer, New York, pp 98–148

  • Middlehurst M, Large J, Flynn M, Lines J, Bostrom A, Bagnall A (2021) Hive-cote 2.0: a new meta ensemble for time series classification. Mach Learn 110(11):3211–3243

  • Middlehurst M, Large J, Bagnall A (2020a) The canonical interval forest (cif) classifier for time series classification. In: 2020 IEEE international conference on big data (big data). IEEE, pp 188–195

  • Middlehurst M, Large J, Cawley G, Bagnall A (2020b) The temporal dictionary ensemble (tde) classifier for time series classification. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 660–676

  • Middlehurst M, Vickers W, Bagnall A (2019) Scalable dictionary classifiers for time series classification. In: International conference on intelligent data engineering and automated learning. Springer, pp 11–19

  • Min W, Ke L, He X (2004) Locality pursuit embedding. Pattern Recogn 37(4):781–788

  • Pascual-Marqui RD et al (2002) Standardized low-resolution brain electromagnetic tomography (sloreta): technical details. Methods Find Exp Clin Pharmacol 24(Suppl D):5–12

  • Pfisterer F, Beggel L, Sun X, Scheipl F, Bischl B (2019) Benchmarking time series classification–functional data vs machine learning approaches. Preprint arXiv:1911.07511

  • Preda C, Saporta G, Lévéder C (2007) Pls classification of functional data. Comput Stat 22(2):223–235

  • Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2013) Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Trans Knowl Discov from Data (TKDD) 7(3):1–31

  • Ramsay JO (2004) Functional data analysis. Encyclopedia Stat Sci 4

  • Ramsay JO, Hooker G, Graves S (2009) Introduction to functional data analysis. Springer, New York

  • Ramsay JO, Silverman BW (2006) Functional data analysis, 2nd edn. Springer, New York

  • Ramsay JO, Silverman BW (2007) Applied functional data analysis: methods and case studies. Springer, New York

  • Ramsay J, Silverman BW (2013) Functional data analysis, springer series in statistics. Springer, New York

  • Rossion B, Joyce CA, Cottrell GW, Tarr MJ (2003) Early lateralization and orientation tuning for face, word, and object processing in the visual cortex. Neuroimage 20(3):1609–1624

  • Roy TS, Giri B, Chowdhury AS, Mazumder S, Das K (2020) How our perception and confidence are altered using decision cues. Front Neurosci 13:1371

  • Ruiz AP, Flynn M, Large J, Middlehurst M, Bagnall A (2021) The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 35(2):401–449

  • Saito N, Coifman RR (1995) Local discriminant bases and their applications. J Math Imaging Vis 5(4):337–358

  • Schäfer P (2015) The boss is concerned with time series classification in the presence of noise. Data Min Knowl Discov 29(6):1505–1530

  • Schäfer P (2016) Scalable time series classification. Data Min Knowl Discov 30(5):1273–1298

  • Schäfer P, Leser U (2017) Fast and accurate time series classification with weasel. In: Proceedings of the 2017 ACM on conference on information and knowledge management, pp 637–646

  • Shifaz A, Pelletier C, Petitjean F, Webb GI (2020) Ts-chief: a scalable and accurate forest algorithm for time series classification. Data Min Knowl Discov 34(3):742–775

  • Shin H, Hsing T (2012) Linear prediction in functional data analysis. Stoch Process Appl 122(11):3680–3700

  • Stein ML (1999) Interpolation of spatial data: some theory for kriging. Springer, Chicago

  • Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9

  • Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86

  • Ullah S, Finch CF (2010) Functional data modelling approach for analysing and predicting trends in incidence rates-an application to falls injury. Osteoporosis Int 21(12):2125–2134

  • Ullah S, Finch CF (2013) Applications of functional data analysis: a systematic review. BMC Med Res Methodol 13(1):1–12

  • Wang J-L, Chiou J-M, Müller H-G (2016) Functional data analysis. Ann Rev Stat Appl 3:257–295

  • Wright MN, Ziegler A (2015) Ranger: a fast implementation of random forests for high dimensional data in c++ and r. Preprint arXiv:1508.04409

Acknowledgements

A. Chatterjee is supported by an INSPIRE fellowship from the Department of Science and Technology (DST), Government of India. We sincerely thank the anonymous reviewers for their insightful comments which have significantly improved the manuscript.

Funding

This work was supported by an INSPIRE fellowship from the Department of Science and Technology (DST), Government of India (INSPIRE Code: IF170367).

Author information

Contributions

AC and KD designed the research. AC and SM performed the research. AC wrote the manuscript. KD edited the manuscript and supervised the entire work.

Corresponding author

Correspondence to Koel Das.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Responsible editor: Michelangelo Ceci.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chatterjee, A., Mazumder, S. & Das, K. Functional classwise principal component analysis: a classification framework for functional data analysis. Data Min Knowl Disc 37, 552–594 (2023). https://doi.org/10.1007/s10618-022-00898-1

  • DOI: https://doi.org/10.1007/s10618-022-00898-1
