Multivariate functional clustering and its application to typhoon data

Misumi, Toshihiro; Matsui, Hidetoshi; Konishi, Sadanori

doi:10.1007/s41237-018-0066-8

Original Paper
Published: 08 September 2018

Volume 46, pages 163–175, (2019)

Behaviormetrika

Abstract

We propose a multivariate nonlinear mixed effects model for clustering multiple longitudinal data. The nonlinear mixed effects model has two advantages: it easily handles unbalanced data, which occur frequently in longitudinal studies, and it can take into account associations among longitudinal variables at a given time point. Joint modeling of multivariate longitudinal data, however, incurs a high computational cost because the model contains numerous parameters. To overcome this issue, we perform a pairwise fitting procedure based on a pseudo-likelihood function. The unknown parameters of each bivariate model are estimated by the maximum likelihood method with the EM algorithm, and the number of basis functions in the model is then selected by model selection criteria. After the model is estimated, a non-hierarchical clustering algorithm based on self-organizing maps is applied to the predicted coefficient vectors of the individual-specific random effect functions. We present the results of applying the proposed method to data on typhoons that occurred between 2000 and 2017 in Asia.

Author information

Authors and Affiliations

Department of Biostatistics, Yokohama City University School of Medicine, Yokohama, Japan
Toshihiro Misumi
Faculty of Data Science, Shiga University, Hikone, Japan
Hidetoshi Matsui
Department of Mathematics, Chuo University, Tokyo, Japan
Sadanori Konishi

Corresponding author

Correspondence to Toshihiro Misumi.

Additional information

Communicated by Gil González-Rodríguez.

Appendices

Appendix 1: EM-algorithm for the bivariate nonlinear mixed effects model

We give the EM algorithm for estimating the bivariate nonlinear mixed effects model proposed in Eq. (4).

Step 0 Initialize $\hat{\varvec{\Sigma }}_{(\xi )}^{rs}=\varvec{I}_2$ and $\hat{\varvec{\Delta }}_{(\xi )}^{rs}=\varvec{I}_{2p_r}$ for the iteration number $\xi =0$.

Step 1 Set $\xi =\xi +1$, then update $\hat{\varvec{\alpha }}_{(\xi )}^{rs}$ and $\hat{\varvec{b}}_{i(\xi )}^{rs}$ using

$$\begin{aligned} \hat{\varvec{\alpha }}_{(\xi )}^{rs}= & {} \left( \sum _{i=1}^{n}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}\varvec{X}_i^{rs}\right) ^{-1} \left( \sum _{i=1}^{n}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}\varvec{y}_i^{rs} \right) ,\\ \hat{\varvec{b}}_{i(\xi )}^{rs}= & {} \hat{\varvec{\Delta }}_{(\xi -1)}^{rs}(\varvec{Z}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}(\varvec{y}_i^{rs} - \varvec{X}_i^{rs}\hat{\varvec{\alpha }}_{(\xi )}^{rs}) \quad (i=1,2,\ldots , n), \end{aligned}$$

where $\hat{\varvec{V}}_{i(\xi -1)}^{rs} = \varvec{Z}_i^{rs}\hat{\varvec{\Delta }}_{(\xi -1)}^{rs}(\varvec{Z}_i^{rs})^{\prime } + \hat{\varvec{\Sigma }}_{(\xi -1)}^{rs}\otimes \varvec{I}_{J_i}$.

Step 2 Update $\hat{\sigma }_{k(\xi )}^2$ $(k=r,s)$ and $\hat{\varvec{\Delta }}_{(\xi )}^{rs}$ using the following conditional expectations,

$$\begin{aligned} \hat{\sigma }_{k(\xi )}^2= & {} \frac{1}{J}\sum _{i=1}^{n}\left\{ \hat{\varvec{\varepsilon }}_{ik(\xi )}^{\prime } \hat{\varvec{\varepsilon }}_{ik(\xi )} + \hat{\sigma }_{k(\xi -1)}^2\mathrm{tr}[\varvec{I}_{J_i}- \hat{\sigma }_{k(\xi -1)}^2(\varvec{I}_{J_i}|\varvec{0})\varvec{\Gamma }_{i(\xi -1)}^{rs}(\varvec{I}_{J_i}|\varvec{0})^{\prime }]\right\} \quad \ (k=r,s),\\ \hat{\varvec{\Delta }}_{(\xi )}^{rs}= & {} \frac{1}{n}\sum _{i=1}^{n}\left\{ \hat{\varvec{b}}_{i(\xi )}^{rs} \hat{\varvec{b}}_{i(\xi )}^{rs\prime } + \hat{\varvec{\Delta }}_{(\xi -1)}^{rs} - \hat{\varvec{\Delta }}_{(\xi -1)}^{rs}\varvec{Z}_i^{rs\prime }\varvec{\Gamma }_{i(\xi -1)}^{rs}\varvec{Z}_i^{rs}\hat{\varvec{\Delta }}_{(\xi -1)}^{rs} \right\} , \end{aligned}$$

where $J=\sum _{i=1}^nJ_i$, $\hat{\varvec{\varepsilon }}_{ik(\xi )} = \varvec{y}_{ik} - \varvec{X}_{ik}\hat{\varvec{\alpha }}_{k(\xi )} - \varvec{Z}_{ik}\hat{\varvec{b}}_{ik(\xi )}$ and $\varvec{\Gamma }_{i(\xi -1)}^{rs} = (\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1} - (\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1} \varvec{X}_{i}^{rs} (\sum _{i=1}^{n}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1}\varvec{X}_{i}^{rs})^{-1}(\varvec{X}_i^{rs})^{\prime }(\hat{\varvec{V}}_{i(\xi -1)}^{rs})^{-1} $.

Step 3 Repeat Steps 1 and 2 until convergence.
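
Steps 0 to 3 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `em_bivariate_lmm`, the stopping tolerance, and the toy data are hypothetical. The sketch assumes that the rows of each subject are stacked as variable $r$ followed by variable $s$, that the error covariance is $\mathrm{diag}(\sigma _r^2, \sigma _s^2)\otimes \varvec{I}_{J_i}$, and that the trace term for $k=s$ uses the second diagonal block of $\varvec{\Gamma }_i^{rs}$.

```python
import numpy as np

def em_bivariate_lmm(y, X, Z, n_iter=200, tol=1e-6):
    """EM sketch for one pair (r, s). y, X, Z are lists over subjects;
    subject i has 2*J_i rows: J_i rows for r, then J_i rows for s."""
    n = len(y)
    J = sum(len(yi) // 2 for yi in y)            # J = sum_i J_i
    sigma2 = np.ones(2)                          # Step 0: Sigma = I_2
    Delta = np.eye(Z[0].shape[1])                # Step 0: Delta = identity
    alpha = np.zeros(X[0].shape[1])
    for _ in range(n_iter):
        # V_i^{-1} from the previous iterate (Kronecker form is an assumption)
        Vinv = []
        for Zi in Z:
            Ji = Zi.shape[0] // 2
            Vi = Zi @ Delta @ Zi.T + np.kron(np.diag(sigma2), np.eye(Ji))
            Vinv.append(np.linalg.inv(Vi))
        # Step 1: generalized least squares for alpha, then predict b_i
        A = sum(Xi.T @ Wi @ Xi for Xi, Wi in zip(X, Vinv))
        c = sum(Xi.T @ Wi @ yi for Xi, Wi, yi in zip(X, Vinv, y))
        A_inv = np.linalg.inv(A)
        alpha_new = A_inv @ c
        b = [Delta @ Zi.T @ Wi @ (yi - Xi @ alpha_new)
             for yi, Xi, Zi, Wi in zip(y, X, Z, Vinv)]
        # Step 2: update the variance components
        sigma2_new = np.zeros(2)
        Delta_new = np.zeros_like(Delta)
        for yi, Xi, Zi, Wi, bi in zip(y, X, Z, Vinv, b):
            Ji = len(yi) // 2
            Gamma = Wi - Wi @ Xi @ A_inv @ Xi.T @ Wi
            resid = yi - Xi @ alpha_new - Zi @ bi
            for k in range(2):                   # k = 0 is r, k = 1 is s
                blk = slice(k * Ji, (k + 1) * Ji)
                tr = Ji - sigma2[k] * np.trace(Gamma[blk, blk])
                sigma2_new[k] += resid[blk] @ resid[blk] + sigma2[k] * tr
            Delta_new += (np.outer(bi, bi) + Delta
                          - Delta @ Zi.T @ Gamma @ Zi @ Delta)
        sigma2_new /= J
        Delta_new /= n
        # Step 3: stop when the parameters stabilize (tolerance is an assumption)
        change = max(np.abs(alpha_new - alpha).max(),
                     np.abs(sigma2_new - sigma2).max(),
                     np.abs(Delta_new - Delta).max())
        alpha, sigma2, Delta = alpha_new, sigma2_new, Delta_new
        if change < tol:
            break
    return alpha, b, sigma2, Delta

# Toy data (hypothetical): 40 subjects, unbalanced J_i, intercept + slope basis
rng = np.random.default_rng(0)
y, X, Z = [], [], []
for _ in range(40):
    Ji = int(rng.integers(5, 10))
    t = np.sort(rng.uniform(0.0, 1.0, Ji))
    B = np.column_stack([np.ones(Ji), t])
    D = np.kron(np.eye(2), B)                    # block-diagonal design for (r, s)
    bi = rng.normal(scale=0.5, size=4)
    yi = D @ np.array([1.0, 2.0, -1.0, 0.5]) + D @ bi
    yi = yi + rng.normal(scale=0.2, size=2 * Ji)
    y.append(yi); X.append(D); Z.append(D)

alpha_hat, b_hat, sigma2_hat, Delta_hat = em_bivariate_lmm(y, X, Z)
```

In this sketch, the list `b_hat` holds the predicted random-effect coefficient vectors, which are the kind of quantity the abstract describes as input to the clustering step.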

Appendix 2: algorithm of self-organizing maps

Consider classifying the vectors $\tilde{\varvec{\gamma }}_i \in \mathbb {R}^d$ $(i = 1, \ldots , n)$ into $C$ clusters. Let $\{ \varvec{u}_c = (u_{c1}, u_{c2}, \ldots , u_{cd})^{\prime }; c = 1, \ldots , C \}$ be the reference vector for the $c$th cluster and $\{ \varvec{r}_c; c = 1, \ldots , C \}$ be an output layer for $\varvec{u}_c$, where the $\varvec{r}_c$ are placed at equally spaced intervals in the two-dimensional space $[1, g_1] \times [1, g_2]$ with $g_1,g_2 \in \mathbb {N}$ and $C=g_1g_2$. Let $t$ be the current iteration and $T=10{,}000$ be the total number of iterations. Then the SOM algorithm for clustering $\tilde{\varvec{\gamma }}_i$ is given as follows:

1. Initialize the reference vectors $\varvec{u}_k$ $(k = 1, \ldots , C)$ from a uniform distribution $U(0, 1)$.
2. Update $\varvec{u}_k$ by applying the following steps for $i = 1, \ldots , n$.
(a) Define $\tilde{c}$ as follows.
$$\begin{aligned} \tilde{c} = \arg \min _{k} \left\{ \left\| \tilde{\varvec{\gamma }}_i - \varvec{u}_k \right\| _{2} \right\} . \end{aligned}$$
(b) Let $U(\varvec{r}_{\tilde{c}})$ be a neighborhood of $\varvec{r}_{\tilde{c}}$ in $\mathbb {R}^2$ with radius $\sigma (t)$, where $\sigma (t)$ is a monotonically decreasing function of $t$. Let $N_{\tilde{c}}(t)=\{k; \varvec{r}_k\in U(\varvec{r}_{\tilde{c}})\}$; then, for all $k \in N_{\tilde{c}}(t)$, update the reference vector as follows.
$$\begin{aligned} \varvec{u}_k \leftarrow \varvec{u}_k + h_{\tilde{c}k}(t) \left( \tilde{\varvec{\gamma }}_i - \varvec{u}_k \right) , \end{aligned}$$
where $h_{\tilde{c}k} (t)$ is a neighborhood function given by
$$\begin{aligned} h_{\tilde{c}k}(t) = \alpha (t) \exp \left\{ -\frac{\Vert \varvec{r}_{\tilde{c}} - \varvec{r}_k\Vert ^2}{2 \left\{ \sigma (t) \right\} ^2 } \right\} . \end{aligned}$$
The learning rate coefficient $\alpha (t) \in (0, 1)$ is set to be a monotonically decreasing function of $t$. Here, $\alpha (t) = 0.9(1 - t / T)$.
3. Repeat step 2 $T$ times.
4. Let $\hat{\varvec{u}}_{k}$ be the reference vector obtained after $T$ iterations. Assign $\tilde{\varvec{\gamma }}_i$ $(i = 1, \ldots , n)$ to the $\tilde{c}$th cluster by
$$\begin{aligned} \tilde{c} = \arg \min _{k} \left\{ \left\| \tilde{\varvec{\gamma }}_i - \hat{\varvec{u}}_{{k}} \right\| _{2} \right\} . \end{aligned}$$
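
The four steps above can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' code: the function name `som_cluster` and the toy data are hypothetical, and the radius schedule $\sigma (t)$ (linear decay with a floor of 0.5) is an assumption, since the appendix only requires it to be monotonically decreasing. The demo uses a much smaller $T$ than the $T=10{,}000$ stated above to keep the run short.

```python
import numpy as np

def som_cluster(gamma, g1, g2, T=200, seed=0):
    """Cluster the rows of gamma (n x d) on a g1 x g2 output grid."""
    rng = np.random.default_rng(seed)
    n, d = gamma.shape
    # Output-layer positions r_k, equally spaced on [1, g1] x [1, g2]
    r = np.array([(a, b) for a in range(1, g1 + 1)
                  for b in range(1, g2 + 1)], dtype=float)
    # Step 1: reference vectors drawn from U(0, 1)
    u = rng.uniform(0.0, 1.0, size=(g1 * g2, d))
    sigma0 = max(g1, g2) / 2.0                   # assumed initial radius
    for t in range(T):                           # Step 3: repeat step 2 T times
        alpha = 0.9 * (1.0 - t / T)              # learning rate from the text
        sigma = max(sigma0 * (1.0 - t / T), 0.5) # assumed radius schedule
        for x in gamma:                          # Step 2: loop over i = 1..n
            # (a) index of the closest reference vector
            c = np.argmin(np.linalg.norm(x - u, axis=1))
            # (b) Gaussian neighborhood weights on the output grid
            dist2 = np.sum((r - r[c]) ** 2, axis=1)
            h = alpha * np.exp(-dist2 / (2.0 * sigma ** 2))
            h[dist2 > sigma ** 2] = 0.0          # update only units inside the radius
            u += h[:, None] * (x - u)
    # Step 4: assign each vector to its closest reference vector
    labels = np.array([np.argmin(np.linalg.norm(x - u, axis=1))
                       for x in gamma])
    return labels, u

# Toy data (hypothetical): two well-separated groups in R^3
rng = np.random.default_rng(1)
group_a = 0.2 + 0.03 * rng.standard_normal((30, 3))
group_b = 0.8 + 0.03 * rng.standard_normal((30, 3))
gamma = np.vstack([group_a, group_b])

labels, u_hat = som_cluster(gamma, g1=2, g2=1, T=50)
```

Because the reference vectors start in $(0, 1)$, rescaling the input vectors to a comparable range before calling such a routine is a reasonable precaution; the appendix does not state whether the authors did so.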

About this article

Cite this article

Misumi, T., Matsui, H. & Konishi, S. Multivariate functional clustering and its application to typhoon data. Behaviormetrika 46, 163–175 (2019). https://doi.org/10.1007/s41237-018-0066-8

Received: 15 March 2018
Accepted: 01 September 2018
Published: 08 September 2018
Issue Date: 01 April 2019
DOI: https://doi.org/10.1007/s41237-018-0066-8
