Photometric redshift estimation using ExtraTreesRegressor: Galaxies and quasars from low to very high redshifts

  • Original Article
  • Astrophysics and Space Science

Abstract

Although photometric redshift estimation using machine learning (ML) methods has been gaining popularity, almost all previous work has focused on estimating the redshift, \(z\), for \(z < 1\). A few projects employing state-of-the-art deep learning techniques have covered a wider range of redshift values. However, the need of deep learning for a large dataset has limited that approach to redshifts below 4, because samples at higher redshifts are sparse in the available data. The main challenge is therefore to train a model on a highly imbalanced dataset. This paper proposes a method for obtaining photometric redshifts of both galaxies and quasars that spans the entire redshift range (\(0 < z < 7\)) of the SDSS catalogue, using ExtraTreesRegressor, a conventional ML method. Data Releases 12, 13 and 14 are used to increase the number of training samples at high redshifts. The redshift values are first transformed to a logarithmic domain to counteract the effect of the imbalance. Besides the five photometric magnitudes (u-g-r-i-z) typically employed for such tasks, we use a total of 30 features, including morphological parameters, band overlap magnitudes and the mean magnitudes of adjacent filters; the least contributing features are then eliminated using recursive feature elimination. We propose a custom scoring metric, the redshift-weighted mean squared error, which penalizes errors at higher redshifts more heavily. Postprocessing methods are used to compensate for the bias of the model at high redshift values. A uniform test set is used to evaluate the performance in all subranges of the labels by splitting the entire range into seven bins of equal width. A large number of other ML models have also been used for comparison, and the proposed method proves superior to them on various performance metrics. The mean squared error obtained for the uniform test set is 0.66.
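The pipeline outlined in the abstract can be illustrated with a minimal Python sketch using scikit-learn. This is not the authors' implementation: the abstract does not state the exact logarithmic transform, the form of the redshift weighting, or any hyperparameters, so the `log1p`/`expm1` transform, the `(1 + z)` weights, the function names, and the synthetic data below are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.feature_selection import RFE


def redshift_weighted_mse(z_true, z_pred):
    """Squared error averaged with weights (1 + z_true).

    The (1 + z) weighting is an assumed form; the paper's exact
    definition is not given in the abstract.
    """
    z_true = np.asarray(z_true, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    weights = 1.0 + z_true
    return float(np.sum(weights * (z_true - z_pred) ** 2) / np.sum(weights))


def binned_mse(z_true, z_pred, z_max=7.0, n_bins=7):
    """Plain MSE inside each of n_bins equal-width bins over [0, z_max]."""
    z_true = np.asarray(z_true, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    edges = np.linspace(0.0, z_max, n_bins + 1)
    # Digitizing against the interior edges yields indices 0..n_bins-1.
    idx = np.digitize(z_true, edges[1:-1])
    per_bin = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            per_bin.append(float(np.mean((z_true[mask] - z_pred[mask]) ** 2)))
        else:
            per_bin.append(float("nan"))  # empty bin
    return edges, per_bin


def fit_log_redshift_model(X, z, n_features, seed=0):
    """Fit ExtraTreesRegressor on log(1 + z) inside recursive feature elimination."""
    y = np.log1p(z)  # assumed logarithmic transform of the labels
    base = ExtraTreesRegressor(n_estimators=50, random_state=seed)
    selector = RFE(base, n_features_to_select=n_features, step=1)
    selector.fit(X, y)
    return selector


def predict_redshift(selector, X):
    """Predict in the log domain and map back to redshift."""
    return np.expm1(selector.predict(X))


# Synthetic stand-in for photometric features (NOT SDSS data):
# two informative columns and three pure-noise columns.
rng = np.random.default_rng(0)
z_demo = rng.uniform(0.0, 7.0, size=400)
X_demo = np.column_stack([
    z_demo + rng.normal(0.0, 0.1, size=400),
    np.log1p(z_demo) + rng.normal(0.0, 0.05, size=400),
    rng.normal(size=400),
    rng.normal(size=400),
    rng.normal(size=400),
])

model = fit_log_redshift_model(X_demo[:300], z_demo[:300], n_features=2)
z_hat = predict_redshift(model, X_demo[300:])
edges, per_bin = binned_mse(z_demo[300:], z_hat)
score = redshift_weighted_mse(z_demo[300:], z_hat)
```

Here `model.support_` marks which columns survive the elimination, `per_bin` reports the error separately for each of the seven equal-width redshift bins, and `score` is a single number that counts high-redshift errors more heavily than low-redshift ones.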

Notes

  1. Archive used to download the data: https://skyserver.sdss.org/dr14/en/tools/search/sql.aspx.

Author information

Corresponding author

Correspondence to Moonzarin Reza.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding this paper.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Reza, M., Haque, M.A. Photometric redshift estimation using ExtraTreesRegressor: Galaxies and quasars from low to very high redshifts. Astrophys Space Sci 365, 50 (2020). https://doi.org/10.1007/s10509-020-03758-w

  • DOI: https://doi.org/10.1007/s10509-020-03758-w
