Clustering on the Torus

  • Original Article
  • Journal of Statistical Theory and Practice

Abstract

Several probability distributions for circular data are discussed by Rao (Linear statistical inference and its applications, 2nd edn. Wiley, New York, 1973) in his classic book. This paper introduces model-based clustering methods for bivariate circular, or toroidal, data. A mixture model approach based on the joint distribution of the two dependent circular variables is proposed. In particular, two types of mixture model are constructed, one based on the marginal specification and the other on the conditional specification. The convergence properties of the Expectation-Maximization (EM) algorithm are studied for the members of the regular exponential family used in our models. Cluster properties, such as the optimum number of clusters and homogeneity, are also discussed. A real-life application to gene data illustrates the proposed approaches, and the two models are compared on this example. Clustering methods for observations on the torus do not seem to be available in the literature, and this paper is possibly the first attempt in that direction.
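
To make the idea of EM-based mixture clustering on the torus concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not spell out the marginal or conditional specifications, so the sketch assumes a deliberate simplification in which each mixture component treats the two angles as independent von Mises variables. The function names `vm_logpdf` and `em_torus`, the `init_means` parameter, and the closed-form approximation used for the concentration update are all assumptions made for illustration.

```python
import numpy as np


def vm_logpdf(x, mu, kappa):
    """Log density of a von Mises(mu, kappa) distribution on the circle."""
    return kappa * np.cos(x - mu) - np.log(2.0 * np.pi * np.i0(kappa))


def em_torus(data, k, init_means=None, n_iter=100, seed=0):
    """EM for a k-component mixture on the torus (illustrative only).

    ASSUMPTION: within a component the two angles are independent
    von Mises variables; the paper's models allow dependence.

    data: (n, 2) array of angles in radians.
    Returns (weights, means, kappas, responsibilities).
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    rng = np.random.default_rng(seed)
    if init_means is None:
        # Start the component means at randomly chosen observations.
        mu = data[rng.choice(n, size=k, replace=False)].copy()
    else:
        mu = np.asarray(init_means, dtype=float).copy()
    kappa = np.ones((k, 2))
    w = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior membership probabilities, computed in log space.
        logp = np.log(w) + vm_logpdf(
            data[:, None, :], mu[None], kappa[None]
        ).sum(axis=2)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: weighted circular means and concentrations.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        c = resp.T @ np.cos(data)  # (k, 2) weighted cosine sums
        s = resp.T @ np.sin(data)  # (k, 2) weighted sine sums
        mu = np.arctan2(s, c)
        r = np.clip(np.sqrt(c**2 + s**2) / nk[:, None], 1e-6, 1.0 - 1e-6)
        # Closed-form approximation to the von Mises concentration MLE,
        # capped so that np.i0 stays finite.
        kappa = np.minimum(r * (2.0 - r**2) / (1.0 - r**2), 500.0)

    return w, mu, kappa, resp
```

Hard cluster labels follow from `resp.argmax(axis=1)`. Working with cosines and sines of the angles, rather than the raw angles, is what lets the sketch handle clusters that straddle the ±π boundary. Choosing the number of clusters and capturing dependence between the two angles, both of which the paper addresses, are outside the scope of this sketch.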

Availability of data and material

The data on the peak expression times or phases (from a microarray experiment) have been obtained from [13] as mentioned above.

References

  1. Banfield JD, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3):803–821

  2. Frey BJ, Dueck D (2007) Clustering by passing messages between data points. Science 315:972–976

  3. Cheeseman P, Stutz J (1996) Bayesian classification (AutoClass): theory and results. Adv Knowl Discov Data Min 153–180

  4. Qiu D, Tamhane A (2007) A comparative study of K-means algorithm and the normal mixture model for clustering: univariate case. J Stat Plan Inf 137:3722–3740

  5. Everitt BS, Landau S, Leese M, Stahl D (2011) Cluster analysis. Wiley Series in Probability and Statistics, UK

  6. Fraley C, Raftery AE (1998) How many clusters? Which clustering method? Answers via model-based cluster analysis. Technical Report No. 329, Department of Statistics, University of Washington

  7. Fisher NI, Lee AJ (1992) Regression models for an angular response. Biometrics 48(3):665–677

  8. Jammalamadaka SR, SenGupta A (2001) Topics in circular statistics. World Scientific Publishers, New Jersey

  9. Johnson RA, Wehrly TE (1978) Some angular-linear distributions and related regression models. J Am Stat Assoc 73:602–606

  10. Johnson RA, Wichern DW (2007) Applied multivariate statistical analysis. Pearson Prentice Hall, New Jersey

  11. Kim S, SenGupta A (2017) Multivariate multiple circular regression. J Stat Comput Simul 87:1277–1291

  12. Kim S, SenGupta A (2018) Clustering methods for spherical data: an overview and a new generalization. In: Proceedings of the 2nd Pacific Rim Statistical Conference for Production Engineering—big data, Production Engineering and Statistics. ICSA Book Series in Statistics, pp 155–164

  13. Liu D, Peddada SD, Li L, Weinberg CR (2006) Phase analysis of circadian-related genes in two tissues. BMC Bioinform 7:87. https://doi.org/10.1186/1471-2105-7-87

  14. McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley, New York

  15. Rao CR (1973) Linear statistical inference and its applications, 2nd edn. Wiley, New York

  16. Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er MJ, Ding W, Lin CT (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681

  17. SenGupta A (2004) On the construction of probability distributions for directional data. Bull Cal Math Soc 96(2):139–154

  18. SenGupta A, Arnold BC, Kim S (2013) Inverse circular-circular regression. J Multivar Anal 119:200–208

  19. SenGupta A, Chattopadhyay AK, Roy M (2020) Model based clustering for cylindrical data. Invited Paper. In: Ghosh I, Balakrishnan N, Ng HKT (eds) Advances in Statistics—Theory and Applications—Honoring the Contributions of Barry C. Arnold in Statistical Science. Springer, New York. In Press

  20. SenGupta A, Ugwuowo F (2011) A classification method for directional data with application to the human skull. Commun Stat Theory Methods 40:457–466

  21. Singh H, Hnizdo V, Demchuk E (2002) Probabilistic model for two dependent circular variables. Biometrika 89(3):719–723

  22. Wallace CS, Dowe DL (1994) Intrinsic classification by MML - the Snob program. In: Proceedings of the 7th Australian Joint Conference on Artificial Intelligence, pp 37–44

  23. Wu JCF (1983) On the convergence properties of the EM algorithm. Ann Stat 11(1):95–103

  24. Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678

  25. Xu R, Wunsch DC (2010) Clustering algorithms in biomedical research: a review. IEEE Rev Biomed Eng 3:120–154

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Moumita Roy.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Code availability

The code may be made available on request from the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Celebrating the Centenary of Professor C. R. Rao”, guest edited by Ravi Khattree, Sreenivasa Rao Jammalamadaka, and M. B. Rao.

Cite this article

SenGupta, A., Roy, M. Clustering on the Torus. J Stat Theory Pract 15, 58 (2021). https://doi.org/10.1007/s42519-021-00178-z
