Abstract
State-of-the-art approaches for multi-target prediction, such as Regressor Chains, exploit interdependencies among the targets and model the outputs jointly by passing predictions along the chain from the first output to the last. These models are very useful in applications where targets are highly interdependent, but they cannot answer queries when the targets are not only mutually dependent but also subject to joint constraints over the output. Existing models are also unsuitable when certain target values are fixed or manually imputed prior to inference, because the flow of predictions cannot cascade backward from an already-imputed output. Here we address these limitations with a backward inference algorithm for Regressor Chains based on Metropolis-Hastings sampling. We evaluate the proposed approach with several metrics on both synthetic and real-world data, and show that it notably reduces errors compared to traditional marginal inference methods that overlook joint modeling. Furthermore, we show that the proposed method can provide useful insights into a problem in conservation science: predicting the distribution of potential natural vegetation.
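To make the idea of backward inference concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-link chain with Gaussian conditionals, where the second target y2 has been fixed in advance and the first target y1 must be inferred from it. All coefficients (A, B, C, S1, S2) and function names are illustrative assumptions, and the sketch does not cover the distributional constraints treated in the paper. A random-walk Metropolis-Hastings sampler draws y1 from a density proportional to p(y1 | x) p(y2 | x, y1).

```python
import math
import random


def log_normal_pdf(value, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - 0.5 * ((value - mean) / std) ** 2


# Hypothetical two-link chain with Gaussian conditionals (illustrative values):
#   y1 | x      ~ N(A * x, S1^2)
#   y2 | x, y1  ~ N(B * x + C * y1, S2^2)
A, B, C, S1, S2 = 1.0, 0.5, 2.0, 1.0, 0.5


def log_joint(y1, y2, x):
    """Log p(y1, y2 | x) obtained from the chain factorisation."""
    return log_normal_pdf(y1, A * x, S1) + log_normal_pdf(y2, B * x + C * y1, S2)


def sample_y1_given_y2(x, y2_fixed, n_samples=20000, burn_in=2000, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings over y1 with y2 held fixed."""
    rng = random.Random(seed)
    y1 = A * x  # start from the forward (marginal) prediction
    current = log_joint(y1, y2_fixed, x)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = y1 + rng.gauss(0.0, step)
        candidate = log_joint(proposal, y2_fixed, x)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            y1, current = proposal, candidate
        if i >= burn_in:
            samples.append(y1)
    return samples


def exact_posterior_mean(x, y2_fixed):
    """Closed-form mean of p(y1 | x, y2) for this linear-Gaussian toy chain."""
    precision = 1 / S1 ** 2 + C ** 2 / S2 ** 2
    return (A * x / S1 ** 2 + C * (y2_fixed - B * x) / S2 ** 2) / precision
```

In this toy setting the posterior is available in closed form, so `exact_posterior_mean` serves only as a check on the sampler. With x = 1 and y2 fixed at 4, the forward prediction for y1 is 1.0, whereas conditioning on y2 shifts the mean to 29/17 (about 1.71). That shift is the kind of information a forward-only chain cannot use.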
Acknowledgements
Research leading to these results was supported by the Research Council of Finland (grants no. 314803 and 341623 to IŽ).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Antonenko, E., Mechenich, M., Beigaitė, R., Žliobaitė, I., Read, J. (2024). Backward Inference in Probabilistic Regressor Chains with Distributional Constraints. In: Miliou, I., Piatkowski, N., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XXII. IDA 2024. Lecture Notes in Computer Science, vol 14642. Springer, Cham. https://doi.org/10.1007/978-3-031-58553-1_4
DOI: https://doi.org/10.1007/978-3-031-58553-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58555-5
Online ISBN: 978-3-031-58553-1
eBook Packages: Computer Science, Computer Science (R0)