Region-income-based prioritisation of Sustainable Development Goals by Gradient Boosting Machine

  • Original Article
  • Sustainability Science

Abstract

The Sustainable Development Goals (SDGs) seek to address complex global challenges and cover aspects of social development, environmental protection, and economic growth. However, the holistic and complicated nature of the goals has made their attainment difficult. Achieving all goals by 2030 is over-optimistic given countries’ limited budgets and the economic and social disruption caused by the COVID-19 pandemic. To have the most profound impact on SDG achievement, prioritising and improving co-beneficial goals is an effective solution. This study confirms that countries’ geographic location and income level have a significant relationship with overall SDG achievement. This article applies the Gradient Boosting Machine (GBM) algorithm to identify the top five SDGs that drive the overall SDG score. The results show that the influential SDGs vary for countries with a specific income level located in different regions. In Europe and Central Asia, SDG10 is among the most influential goals for high-income countries and SDG9 for upper-middle-income countries; SDG3 is among the most influential in low- and lower-middle-income countries of Sub-Saharan Africa, and SDG5 in upper-middle-income countries of Latin America and the Caribbean. This systematic and exploratory data-driven study generates new insights that confirm the uniqueness and non-linearity of the relationship between goals and overall SDG achievement.

Acknowledgments

This study is funded by the Australian Government Research Training Program Scholarship provided by the Australian Commonwealth Government and the University of Melbourne. The first author would also like to extend her thanks to Dr. Roozbeh Valavi, who offered his time and support throughout this research.

Author information

Corresponding author

Correspondence to Atie Asadikia.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Handled by Takanori Matsui, Osaka University, Japan.

Appendices

Appendix A

Hyperparameter setting

In the gbm.step function, apart from the dependent and independent variables, we set the regression family to gaussian and bag.fraction to 0.75 (recommended for small training samples). According to Ridgeway (2007), “shrinkage = 0.001 will almost certainly result in a model with better out-of-sample predictive performance than setting shrinkage = 0.01”. However, a lower learning rate \(lr\) (shrinkage) increases computation time. Since our dataset is not large, we ignored computing time (the difference is less than 1 min) and memory usage. Instead, we focused on setting hyperparameters that give the best predictive performance, that is, the lowest error without over-fitting our models. This helps us establish a more precise interpretation of the relationships between SDGs.

In GBM, the tree complexity \(tc\) and the learning rate \(lr\) determine the number of trees required for optimal prediction (Elith et al. 2008). To tune these hyperparameters (\(tc\), \(lr\)), we used the train function from the caret package (Kuhn 2021) with the cross-validation (cv) method, using R 3.6.3 (R Core Team 2021). The method is set to “gbm”, and the metric is the Root Mean Square Error (RMSE). We searched \(tc\) over the range 3–10, three options for \(lr\) (0.005, 0.001, 0.0001), and a number of trees from a minimum of 500 to a maximum of 15e3 in steps of 50.
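To make the size of this search space concrete, the following is a minimal Python sketch that enumerates the candidate combinations described above. It is not the authors' R code: the names `build_grid` and `best_by_rmse` are hypothetical helpers introduced here for illustration, and the selection rule simply picks the lowest RMSE, as the text states the metric was RMSE.

```python
from itertools import product

# Candidate values taken from the text above.
tree_complexities = list(range(3, 11))          # tc: 3 to 10 inclusive
learning_rates = [0.005, 0.001, 0.0001]         # lr (shrinkage)
tree_counts = list(range(500, 15_000 + 1, 50))  # 500 to 15e3 in steps of 50


def build_grid():
    """Return every (tc, lr, n_trees) combination as a dict (hypothetical helper)."""
    return [
        {"tc": tc, "lr": lr, "n_trees": n}
        for tc, lr, n in product(tree_complexities, learning_rates, tree_counts)
    ]


def best_by_rmse(results):
    """Given (params, rmse) pairs, return the params with the lowest RMSE."""
    return min(results, key=lambda pair: pair[1])[0]


grid = build_grid()
print(len(tree_counts), len(grid))  # 291 tree counts, 6984 combinations in total
```

The grid holds 8 × 3 × 291 = 6984 combinations, which illustrates why the search is only practical here because the dataset is small.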

Appendix B

Algorithm: Out-of-sample cross-validation over years, by the authors (figure a)

Appendix C

See Table 2.

Table 2 Number of trees, lr, and tc used to fit each model using the gbm.step function, along with the RMSE and MAE calculated from fivefold cross-validation
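For readers unfamiliar with the two error measures in the caption, the sketch below shows how RMSE and MAE are computed and how observations can be divided into five folds. It is an illustrative Python sketch only, not the authors' R workflow: `k_fold_indices` is a hypothetical helper that makes contiguous folds, whereas in practice the observations are usually shuffled first, and the numbers in the example are made up.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Made-up values, purely to show the calls.
actual = [60.0, 72.5, 81.0]
predicted = [61.0, 70.5, 81.0]
print(rmse(actual, predicted), mae(actual, predicted))
print(k_fold_indices(12))
```

Each fold is held out once while the model is fitted on the remaining four, and the two error measures are then computed on the held-out predictions.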

Appendix D

See Fig. 7.

Fig. 7 The results of model validation. The x-axis is the actual SDG Index, and the y-axis is the predicted SDG Index

About this article

Cite this article

Asadikia, A., Rajabifard, A. & Kalantari, M. Region-income-based prioritisation of Sustainable Development Goals by Gradient Boosting Machine. Sustain Sci 17, 1939–1957 (2022). https://doi.org/10.1007/s11625-022-01120-3

  • DOI: https://doi.org/10.1007/s11625-022-01120-3
