Multivariate functional clustering and its application to typhoon data

  • Original Paper, published in Behaviormetrika

Abstract

We propose a multivariate nonlinear mixed effects model for clustering multiple longitudinal data. The nonlinear mixed effects model has two advantages: it easily handles unbalanced data, which occur frequently in longitudinal studies, and it can take into account associations among longitudinal variables at a given time point. Joint modeling of multivariate longitudinal data, however, is computationally expensive because the model includes numerous parameters. To overcome this issue, we perform a pairwise fitting procedure based on a pseudo-likelihood function. The unknown parameters of each bivariate model are estimated by the maximum likelihood method via the EM algorithm, and the number of basis functions in the model is then selected by model selection criteria. After the model is estimated, a non-hierarchical clustering algorithm based on self-organizing maps is applied to the predicted coefficient vectors of the individual-specific random effect functions. We apply the proposed method to data on typhoons that occurred in Asia between 2000 and 2017.

Author information

Corresponding author

Correspondence to Toshihiro Misumi.

Additional information

Communicated by Gil González-Rodríguez.

Appendices

Appendix 1: EM-algorithm for the bivariate nonlinear mixed effects model

We give the EM algorithm for estimating the bivariate nonlinear mixed effects model proposed in Eq. (4).

Step 0 Initialize \(\hat{\varvec{\Sigma }}_{(\xi )}^{rs}=\varvec{I}_2\) and \(\hat{\varvec{\Delta }}_{(\xi )}^{rs}=\varvec{I}_{2p_r}\) for the iteration number \(\xi =0\).

Step 1 Set \(\xi =\xi +1\), then update \(\hat{\varvec{\alpha }}_{(\xi )}^{rs}\) and \(\hat{\varvec{b}}_{i(\xi )}^{rs}\) using

$$\begin{aligned} \hat{\varvec{\alpha }}_{(\xi )}^{rs}= & {} \left( \sum _{i=1}^{n}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}\varvec{X}_i^{rs}\right) ^{-1} \left( \sum _{i=1}^{n}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}\varvec{y}_i^{rs} \right) ,\\ \hat{\varvec{b}}_{i(\xi )}^{rs}= & {} \hat{\varvec{\Delta }}_{(\xi -1)}^{rs}(\varvec{Z}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}(\varvec{y}_i^{rs} - \varvec{X}_i^{rs}\hat{\varvec{\alpha }}_{(\xi )}^{rs}) \quad (i=1,2,\ldots , n), \end{aligned}$$

where \(\hat{\varvec{V}}_{i(\xi -1)}^{rs} = \varvec{Z}_i^{rs}\hat{\varvec{\Delta }}_{(\xi -1)}^{rs}(\varvec{Z}_i^{rs})^{\prime } + \hat{\varvec{\Sigma }}_{(\xi -1)}^{rs}\otimes \varvec{I}_{J_i}\).
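Step 1 can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `em_step1`, the list-of-arrays data layout, and the stacking of variable r above variable s in each response vector are hypothetical, and the error term of \(\hat{\varvec{V}}_i^{rs}\) is read as the Kronecker product of the \(2\times 2\) matrix \(\hat{\varvec{\Sigma }}^{rs}\) with \(\varvec{I}_{J_i}\).

```python
import numpy as np


def em_step1(y, X, Z, Delta, Sigma):
    """Step 1: update the fixed effects alpha and predict the random effects b_i.

    y, X, Z are lists over subjects i. y[i] has length 2*J_i (variable r
    stacked on variable s), X[i] is (2*J_i, q) and Z[i] is (2*J_i, m).
    Delta is the (m, m) random-effect covariance and Sigma the 2x2
    error covariance, both from the previous iteration.
    """
    V_inv = []
    for y_i, Z_i in zip(y, Z):
        J_i = y_i.shape[0] // 2
        # Assumption: the error term is read as Sigma (Kronecker) I_{J_i}.
        V_i = Z_i @ Delta @ Z_i.T + np.kron(Sigma, np.eye(J_i))
        V_inv.append(np.linalg.inv(V_i))

    # Generalized least squares normal equations for alpha.
    lhs = sum(X_i.T @ Vi @ X_i for X_i, Vi in zip(X, V_inv))
    rhs = sum(X_i.T @ Vi @ y_i for X_i, Vi, y_i in zip(X, V_inv, y))
    alpha = np.linalg.solve(lhs, rhs)

    # Predicted random effects given the updated alpha.
    b = [Delta @ Z_i.T @ Vi @ (y_i - X_i @ alpha)
         for y_i, X_i, Z_i, Vi in zip(y, X, Z, V_inv)]
    return alpha, b
```

With `Delta` set to the zero matrix and `Sigma` to the identity, every \(\hat{\varvec{V}}_i^{rs}\) is an identity matrix, so the update reduces to ordinary least squares and all predicted random effects vanish, which gives a quick sanity check.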

Step 2 Update \(\hat{\sigma }_{k(\xi )}^2\) \((k=r,s)\) and \(\hat{\varvec{\Delta }}_{(\xi )}^{rs}\) using the following conditional expectations:

$$\begin{aligned} \hat{\sigma }_{k(\xi )}^2= & {} \frac{1}{J}\sum _{i=1}^{n}\left\{ \hat{\varvec{\varepsilon }}_{ik(\xi )}^{\prime } \hat{\varvec{\varepsilon }}_{ik(\xi )} + \hat{\sigma }_{k(\xi -1)}^2\mathrm{tr}[\varvec{I}_{J_i}- \hat{\sigma }_{k(\xi -1)}^2(\varvec{I}_{J_i}|\varvec{0})\varvec{\Gamma }_{i(\xi -1)}^{rs}(\varvec{I}_{J_i}|\varvec{0})^{\prime }]\right\} \quad \ (k=r,s),\\ \hat{\varvec{\Delta }}_{(\xi )}^{rs}= & {} \frac{1}{n}\sum _{i=1}^{n}\left\{ \hat{\varvec{b}}_{i(\xi )}^{rs} \hat{\varvec{b}}_{i(\xi )}^{rs\prime } + \hat{\varvec{\Delta }}_{(\xi -1)}^{rs} - \hat{\varvec{\Delta }}_{(\xi -1)}^{rs}\varvec{Z}_i^{rs\prime }\varvec{\Gamma }_{i(\xi -1)}^{rs}\varvec{Z}_i^{rs}\hat{\varvec{\Delta }}_{(\xi -1)}^{rs} \right\} , \end{aligned}$$

where \(J=\sum _{i=1}^nJ_i\), \(\hat{\varvec{\varepsilon }}_{ik(\xi )} = \varvec{y}_{ik} - \varvec{X}_{ik}\hat{\varvec{\alpha }}_{k(\xi )} - \varvec{Z}_{ik}\hat{\varvec{b}}_{ik(\xi )}\), and \(\varvec{\Gamma }_{i(\xi -1)}^{rs} = (\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1} - (\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1} \varvec{X}_{i}^{rs} (\sum _{j=1}^{n}(\varvec{X}_j^{rs})^{\prime }(\hat{\varvec{V}}_{j(\xi -1)}^{rs})^{-1}\varvec{X}_{j}^{rs})^{-1}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1} \).
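The Step 2 updates can be sketched in the same way. Again this is a hedged illustration rather than the authors' code: `em_step2` and its argument layout are hypothetical, each response vector is assumed to stack the \(J_i\) observations of variable r above those of variable s, and the selector matrix is assumed to pick the rows belonging to variable k (the first block for r, the second for s). The trace is evaluated as \(J_i\) minus \(\hat{\sigma }_k^2\) times the trace of the corresponding diagonal block of \(\varvec{\Gamma }_i^{rs}\).

```python
import numpy as np


def em_step2(y, X, Z, alpha, b, Delta, sigma2):
    """Step 2: update (sigma_r^2, sigma_s^2) and Delta.

    alpha and b come from Step 1 of the current iteration; Delta and
    sigma2 (length-2 array) are the previous-iteration values.
    Each y[i] stacks the J_i observations of variable r above those of s.
    """
    n = len(y)
    J = sum(y_i.shape[0] // 2 for y_i in y)
    V_inv = []
    for y_i, Z_i in zip(y, Z):
        J_i = y_i.shape[0] // 2
        # Assumption: error covariance read as diag(sigma2) (Kronecker) I_{J_i}.
        V_i = Z_i @ Delta @ Z_i.T + np.kron(np.diag(sigma2), np.eye(J_i))
        V_inv.append(np.linalg.inv(V_i))
    A_inv = np.linalg.inv(sum(X_i.T @ Vi @ X_i for X_i, Vi in zip(X, V_inv)))

    new_sigma2 = np.zeros(2)
    new_Delta = np.zeros_like(Delta, dtype=float)
    for y_i, X_i, Z_i, b_i, Vi in zip(y, X, Z, b, V_inv):
        J_i = y_i.shape[0] // 2
        Gamma = Vi - Vi @ X_i @ A_inv @ X_i.T @ Vi
        resid = y_i - X_i @ alpha - Z_i @ b_i
        for k in range(2):
            # Assumption: the selector picks the J_i rows of variable k.
            blk = slice(k * J_i, (k + 1) * J_i)
            trace_term = J_i - sigma2[k] * np.trace(Gamma[blk, blk])
            new_sigma2[k] += resid[blk] @ resid[blk] + sigma2[k] * trace_term
        new_Delta += (np.outer(b_i, b_i) + Delta
                      - Delta @ Z_i.T @ Gamma @ Z_i @ Delta)
    return new_sigma2 / J, new_Delta / n
```

Alternating a Step 1 update with this function until the parameter estimates stop changing gives the full iteration; the stopping rule itself is not specified in the text and would have to be chosen by the user.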

Step 3 Repeat Steps 1 and 2 until convergence.

Appendix 2: algorithm of self-organizing maps

Consider classifying the vectors \(\tilde{\varvec{\gamma }}_i \in \mathbb {R}^d\) \((i = 1, \ldots , n)\) into C clusters. Let \(\{ \varvec{u}_c = (u_{c1}, u_{c2}, \ldots , u_{cd})^{\prime }; c = 1, \ldots , C \}\) be the reference vectors, one for each cluster, and let \(\{ \varvec{r}_c; c = 1, \ldots , C \}\) be the corresponding positions on the output layer, where the \(\varvec{r}_c\) are placed at equally spaced intervals in the two-dimensional space \([1, g_1] \times [1, g_2]\) with \(g_1,g_2 \in \mathbb {N}\) and \(C=g_1g_2\). Let t be the current iteration and \(T=10{,}000\) be the total number of iterations. Then the SOM algorithm for clustering \(\tilde{\varvec{\gamma }}_i\) is given as follows:

  1. Initialize the reference vectors \(\varvec{u}_k\) \((k = 1, \ldots , C)\) from a uniform distribution U(0, 1).

  2. Update \(\varvec{u}_k\) by applying the following steps for \(i = 1, \ldots , n\).

    (a) Define \(\tilde{c}\) as follows.

    $$\begin{aligned} \tilde{c} = \arg \min _{k} \left\{ \left\| \tilde{\varvec{\gamma }}_i - \varvec{u}_k \right\| _{2} \right\} . \end{aligned}$$

    (b) Let \(U(\varvec{r}_{\tilde{c}})\) be a neighborhood of \(\varvec{r}_{\tilde{c}}\) in \(\mathbb {R}^2 \) with radius \(\sigma (t)\), where \(\sigma (t)\) is a monotonically decreasing function of t. Let \(N_{\tilde{c}}(t)=\{k; \varvec{r}_k\in U(\varvec{r}_{\tilde{c}})\}\); then, for all \(k \in N_{\tilde{c}}(t)\), update the reference vector as follows.

    $$\begin{aligned} \varvec{u}_k \leftarrow \varvec{u}_k + h_{\tilde{c}k}(t) \left( \tilde{\varvec{\gamma }}_i - \varvec{u}_k \right) , \end{aligned}$$

    where \(h_{\tilde{c}k} (t)\) is a neighborhood function given by

    $$\begin{aligned} h_{\tilde{c}k}(t) = \alpha (t) \exp \left\{ -\frac{\Vert \varvec{r}_{\tilde{c}} - \varvec{r}_k\Vert ^2}{2 \left\{ \sigma (t) \right\} ^2 } \right\} . \end{aligned}$$

    The learning rate coefficient \(\alpha (t) \in (0, 1)\) is set to be a monotonically decreasing function of t. Here, \(\alpha (t) = 0.9(1 - t / T)\).

  3. Repeat step 2 T times.

  4. Let \(\hat{\varvec{u}}_{k}\) be the reference vectors obtained after T iterations. Assign \(\tilde{\varvec{\gamma }}_i\) \((i = 1, \ldots , n)\) to the \(\tilde{c}\)th cluster by

    $$\begin{aligned} \tilde{c} = \arg \min _{k} \left\{ \left\| \tilde{\varvec{\gamma }}_i - \hat{\varvec{u}}_{{k}} \right\| _{2} \right\} . \end{aligned}$$
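The SOM steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name `som_cluster` and the parameters `sigma0`, `sigma_end`, and `seed` are hypothetical. The text only requires \(\sigma (t)\) to be monotonically decreasing, so the linear radius schedule used here is an assumption, as is counting t from 0 to T − 1 so that \(\alpha (t)\) stays positive. Cluster indices are 0-based in the code, and because the reference vectors are initialized from U(0, 1), the inputs are assumed to be scaled to a comparable range.

```python
import numpy as np


def som_cluster(gamma, g1, g2, T=10_000, sigma0=None, sigma_end=0.5, seed=0):
    """Cluster the rows of gamma (n x d) with a g1 x g2 self-organizing map."""
    rng = np.random.default_rng(seed)
    n, d = gamma.shape
    C = g1 * g2
    # Output-layer positions r_c on the integer grid [1, g1] x [1, g2].
    r = np.array([(a, b) for a in range(1, g1 + 1) for b in range(1, g2 + 1)],
                 dtype=float)
    # Step 1: reference vectors drawn from U(0, 1).
    u = rng.uniform(0.0, 1.0, size=(C, d))
    if sigma0 is None:
        sigma0 = float(max(g1, g2))

    for t in range(T):                      # Step 3: repeat Step 2 T times
        frac = 1.0 - t / T
        alpha = 0.9 * frac                  # learning rate from the text
        # Assumption: linear radius schedule from sigma0 down towards sigma_end.
        sigma = sigma_end + (sigma0 - sigma_end) * frac
        for i in range(n):                  # Step 2
            # (a) best-matching unit for the i-th vector
            c = np.argmin(np.linalg.norm(gamma[i] - u, axis=1))
            # (b) update every unit within radius sigma of the winner
            dist2 = np.sum((r - r[c]) ** 2, axis=1)
            nbr = dist2 <= sigma ** 2
            h = alpha * np.exp(-dist2[nbr] / (2.0 * sigma ** 2))
            u[nbr] += h[:, None] * (gamma[i] - u[nbr])

    # Step 4: assign each vector to its nearest reference vector.
    dists = np.linalg.norm(gamma[:, None, :] - u[None, :, :], axis=2)
    return np.argmin(dists, axis=1), u
```

A call such as `labels, u = som_cluster(gamma, g1=2, g2=1, T=50)` on a small array of predicted coefficient vectors returns one cluster label per row together with the fitted reference vectors; a smaller `T` than the 10,000 iterations used in the text keeps such a trial run fast.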

About this article

Cite this article

Misumi, T., Matsui, H. & Konishi, S. Multivariate functional clustering and its application to typhoon data. Behaviormetrika 46, 163–175 (2019). https://doi.org/10.1007/s41237-018-0066-8

  • DOI: https://doi.org/10.1007/s41237-018-0066-8
