Clustering of eclipsing binary light curves through functional principal component analysis

  • Original Article
  • Published in: Astrophysics and Space Science

Abstract

In this paper, we revisit the problem of clustering the 1318 new variable stars found in the Milky Way. Our recent work distinguished these stars on the basis of their light curves, which are univariate series of stellar brightness observed at discrete time points. The present work proposes a new approach that treats these discrete series as continuous curves over time by transforming them into functional data. Functional principal component analysis is then performed on these functional light curves. Clustering based on the significant functional principal components reveals two distinct groups of eclipsing binaries, and the results are consistent with, and superior to, our previous ones. The method is thereby established as a powerful new light-curve-based classifier: a simple clustering algorithm applied to only the first few relevant functional principal components is enough to uncover the true clusters. At the same time, the noise in the data, which is carried by the higher-order functional principal components, is discarded. The suggested method is therefore well suited to clustering large light-curve data sets, a conclusion that our simulation study also supports.
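The three steps named in the abstract (discrete series to smooth curves, functional principal component analysis, clustering on the leading scores) can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific choice in it is an assumption made for illustration: the Fourier basis with six harmonics, the 95% variance cutoff, the plain 2-means routine, and the synthetic phase-folded curves, which merely imitate one group with narrow eclipses and one with broad, continuous variation.

```python
import numpy as np

def fourier_basis(t, n_harmonics):
    """Design matrix [1, cos(2*pi*k*t), sin(2*pi*k*t)] for phases t in [0, 1)."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(2 * np.pi * k * t))
        cols.append(np.sin(2 * np.pi * k * t))
    return np.column_stack(cols)

def smooth_curve(phase, flux, grid, n_harmonics=6):
    """Least-squares basis fit of one discrete light curve, evaluated on a common grid."""
    coef, *_ = np.linalg.lstsq(fourier_basis(phase, n_harmonics), flux, rcond=None)
    return fourier_basis(grid, n_harmonics) @ coef

def fpca_scores(curves, var_explained=0.95):
    """PCA of the discretized smooth curves (one per row); keep the leading components."""
    centered = curves - curves.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    n_keep = int(np.searchsorted(ratio, var_explained)) + 1
    return u[:, :n_keep] * s[:n_keep]

def two_means(x, n_iter=50):
    """Plain k-means with k = 2, initialised at the extremes of the first score."""
    centers = x[[np.argmin(x[:, 0]), np.argmax(x[:, 0])]].astype(float)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(2):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = x[labels == j].mean(axis=0)
    return labels

# Synthetic data (hypothetical, for illustration only): 60 phase-folded curves.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 100, endpoint=False)
truth, smooth = [], []
for i in range(60):
    group = i % 2
    phase = np.sort(rng.uniform(0.0, 1.0, 80))  # irregular sampling
    depth = rng.uniform(0.8, 1.2)
    if group == 0:  # narrow primary and secondary eclipses
        d0 = np.minimum(phase, 1.0 - phase)  # circular distance to phase 0
        dips = 0.5 * np.exp(-0.5 * (d0 / 0.04) ** 2)
        dips += 0.3 * np.exp(-0.5 * ((phase - 0.5) / 0.04) ** 2)
        flux = 1.0 - depth * dips
    else:  # broad, continuous brightness variation
        flux = 1.0 - depth * 0.25 * (1.0 + np.cos(4 * np.pi * phase))
    flux = flux + rng.normal(0.0, 0.02, phase.size)  # observational noise
    smooth.append(smooth_curve(phase, flux, grid))
    truth.append(group)

truth = np.array(truth)
scores = fpca_scores(np.vstack(smooth))
labels = two_means(scores)
# Cluster labels are arbitrary, so compare up to a swap of 0 and 1.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
```

Here the clustering sees only the retained scores, so the components beyond the variance cutoff, which mostly carry noise in this toy setting, never enter the distance computation. A spline basis with a roughness penalty, or a different clustering algorithm, could be substituted without changing the overall structure.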

Data Availability

All data analyzed and generated during this study are referenced in this published article.

Acknowledgements

The authors thank the editors for encouraging the present work on astrostatistics, and one anonymous reviewer whose intriguing queries helped the authors present the results more convincingly.

Author information

Corresponding author

Correspondence to Soumita Modak.

Ethics declarations

Conflict of Interest

No potential conflict of interest was reported by the authors.

About this article

Cite this article

Modak, S., Chattopadhyay, T. & Chattopadhyay, A.K. Clustering of eclipsing binary light curves through functional principal component analysis. Astrophys Space Sci 367, 19 (2022). https://doi.org/10.1007/s10509-022-04050-9

  • DOI: https://doi.org/10.1007/s10509-022-04050-9
