Abstract
Classifying X-ray sources into classes (such as extragalactic sources or background stars) is an essential task in astronomy. Typically, one of the classes corresponds to extragalactic radiation, whose photon emission behaviour is well characterized by a homogeneous Poisson process. We propose using normalized versions of the Wasserstein and Zolotarev distances to quantify how far the distribution of photon interarrival times deviates from the exponential class. Our main motivation is the analysis of a massive dataset from X-ray astronomy obtained by the Chandra Orion Ultradeep Project (COUP). This project yielded a large catalog of 1616 X-ray cosmic sources in the Orion Nebula region, with their series of photon arrival times and associated energies. We consider the plug-in estimators of these metrics, determine their asymptotic distributions, and illustrate their finite-sample performance with a Monte Carlo study. We estimate these metrics for each COUP source from three different classes. We conclude that our proposal provides a striking amount of information on the nature of the photon-emitting sources. Further, these variables can identify X-ray sources that were previously miscatalogued. In particular, we show that some sources previously classified as extragalactic emissions have a much higher probability of being young stars in the Orion Nebula.
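To make the idea of a plug-in distance to the exponential class concrete, the following is a minimal Python sketch, not the authors' estimator. The abstract does not state the exact normalization, so two choices here are assumptions: the reference distribution is the exponential with mean equal to the sample mean of the interarrival times, and the Wasserstein-1 distance is divided by that sample mean to make it scale-free. The function names `normalized_w1_to_exponential` and `_quantile_antiderivative` are hypothetical.

```python
import math
import random


def _quantile_antiderivative(u, mean):
    """Antiderivative in u of the exponential quantile function -mean*log(1-u)."""
    v = 1.0 - u
    return mean * ((v * math.log(v) if v > 0.0 else 0.0) + u)


def normalized_w1_to_exponential(arrival_times):
    """Wasserstein-1 distance between the empirical distribution of the
    interarrival times and Exp(mean = sample mean), divided by the sample mean.

    ASSUMPTION: both the fitted reference and the normalization are
    illustrative choices, not taken from the paper.
    """
    t = sorted(arrival_times)
    gaps = sorted(b - a for a, b in zip(t, t[1:]))
    n = len(gaps)
    if n < 1:
        raise ValueError("need at least two arrival times")
    mean = sum(gaps) / n
    if mean <= 0.0:
        raise ValueError("interarrival times must have a positive mean")

    # W1 = integral over (0, 1) of |empirical quantile - exponential quantile|.
    # The empirical quantile equals gaps[i] on (i/n, (i+1)/n), so each piece
    # is integrated exactly, splitting at the point c where the curves cross.
    total = 0.0
    for i, x in enumerate(gaps):
        a, b = i / n, (i + 1) / n
        c = min(max(1.0 - math.exp(-x / mean), a), b)
        total += (x * (2.0 * c - a - b)
                  - 2.0 * _quantile_antiderivative(c, mean)
                  + _quantile_antiderivative(a, mean)
                  + _quantile_antiderivative(b, mean))
    return total / mean


if __name__ == "__main__":
    random.seed(0)
    clock, photons = 0.0, []
    for _ in range(2000):  # simulated homogeneous Poisson arrivals
        clock += random.expovariate(1.0)
        photons.append(clock)
    print(normalized_w1_to_exponential(photons))
```

Under these assumptions the statistic is invariant to rescaling of time, is close to zero for a homogeneous Poisson source, and equals 2/e (about 0.736) for perfectly regular arrivals, so larger values indicate stronger departure from exponential interarrival times.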
Acknowledgements
The authors are grateful to three reviewers and the associate editor for their insightful comments which have improved the presentation of the paper.
Additional information
Research by AB and JC was supported by the Spanish MEyC Grants MTM2013-44045-P and MTM2016-78751-P. KG acknowledges the support from the Chandra ACIS Team contract SV4-74018 (G. Garmire and L. Townsley, PIs), issued by the Chandra X-ray Center, which is operated by the Smithsonian Astrophysical Observatory on behalf of NASA under Contract NAS8-03060.
Cite this article
Baíllo, A., Cárcamo, J. & Getman, K. New distance measures for classifying X-ray astronomy data into stellar classes. Adv Data Anal Classif 13, 531–557 (2019). https://doi.org/10.1007/s11634-018-0309-2
DOI: https://doi.org/10.1007/s11634-018-0309-2
Keywords
- Classification
- X-ray astronomy
- Wasserstein distance
- Zolotarev metric
- Photon interarrival time
- Exponential distribution