Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case

  • Massimo Brescia
  • Stefano Cavuoti
  • Valeria Amaro
  • Giuseppe Riccio
  • Giuseppe Angora
  • Civita Vellucci
  • Giuseppe Longo
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 822)


Astronomy has entered the big data era, and machine learning methods have found widespread use in a large variety of astronomical applications, as demonstrated by the recent sharp increase in the number of publications that adopt this approach. The use of machine learning methods, however, is still far from trivial, and many problems remain to be solved. Using the evaluation of photometric redshifts as a case study, we outline the main problems and some ongoing efforts to solve them.


Keywords: Big data, Astroinformatics, Photometric redshifts



MB acknowledges the INAF PRIN-SKA 2017 program and the funding from MIUR Premiale 2016: MITIC. MB and GL acknowledge the H2020-MSCA-ITN-2016 SUNDIAL (SUrvey Network for Deep Imaging Analysis and Learning), financed within the Call H2020-EU.1.3.1.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. INAF - Osservatorio Astronomico di Capodimonte, Napoli, Italy
  2. Università degli Studi Federico II - Dipartimento di Fisica “E. Pancini”, Napoli, Italy
  3. INFN - Napoli Unit, Napoli, Italy
