Rule-based Bayesian regression

Botsas, Themistoklis; Mason, Lachlan R.; Pan, Indranil

doi:10.1007/s11222-022-10100-7

Published: 28 May 2022

Statistics and Computing, Volume 32, article number 44 (2022)

Abstract

We introduce a novel rule-based approach for handling regression problems. The new methodology carries elements from two frameworks: (i) it provides information about the uncertainty of the parameters of interest using Bayesian inference, and (ii) it allows the incorporation of expert knowledge through rule-based systems. Blending these two frameworks can be particularly beneficial in domains (e.g., engineering) where the importance of uncertainty quantification motivates a Bayesian approach, yet there is no simple way to incorporate researcher intuition into the model. We validate our models on synthetic applications: a simple linear regression problem and two more complex structures based on partial differential equations. We then illustrate their use through two cases derived from real data. Finally, we review the advantages of our methodology, which include the simplicity of the implementation, the uncertainty reduction due to the added information and, on some occasions, better point predictions. We also outline its limitations, mainly from the computational complexity perspective, such as the difficulty of choosing an appropriate algorithm and the added computational burden.
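
As a rough illustration of the idea summarised in the abstract, the sketch below fits a Bayesian linear regression with and without a soft penalty that encodes an expert rule. It is not the authors' implementation (their code is in the repository listed under Notes). The rule ("the slope should not exceed 2.5"), the penalty form `log_rule`, its `strength` parameter, the priors, the synthetic data and the random-walk Metropolis sampler are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): y = 1 + 2x + noise.
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=x.size)
SIGMA = 0.5  # noise standard deviation, assumed known here


def log_prior(theta):
    # Independent N(0, 10^2) priors on intercept and slope (an assumption).
    return -0.5 * np.sum((theta / 10.0) ** 2)


def log_likelihood(theta):
    # Gaussian likelihood, up to an additive constant.
    resid = y - (theta[0] + theta[1] * x)
    return -0.5 * np.sum((resid / SIGMA) ** 2)


def log_rule(theta, strength=50.0):
    # Hypothetical expert rule: "the slope should not exceed 2.5".
    # Violations are penalised quadratically; no penalty when satisfied.
    violation = max(0.0, theta[1] - 2.5)
    return -strength * violation ** 2


def log_posterior(theta, use_rule):
    lp = log_prior(theta) + log_likelihood(theta)
    return lp + log_rule(theta) if use_rule else lp


def metropolis(use_rule, n_steps=20000, step=0.15, seed=1):
    # Random-walk Metropolis sampler over (intercept, slope).
    local_rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    current = log_posterior(theta, use_rule)
    samples = np.empty((n_steps, 2))
    for i in range(n_steps):
        proposal = theta + local_rng.normal(0.0, step, size=2)
        candidate = log_posterior(proposal, use_rule)
        # Accept with probability min(1, exp(candidate - current)).
        if np.log(local_rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
        samples[i] = theta
    return samples[n_steps // 4:]  # discard the first quarter as burn-in


plain = metropolis(use_rule=False)
ruled = metropolis(use_rule=True)
print("slope sd without rule:", plain[:, 1].std())
print("slope sd with rule:   ", ruled[:, 1].std())
```

When the rule is informative for the data at hand, one would expect the slope's posterior spread to shrink in the second run, which is the kind of uncertainty reduction the abstract refers to. A hard constraint would instead return minus infinity for violations; the soft form used here keeps the log posterior finite everywhere.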

Notes

https://github.com/themisbo/Rule-based-Bayesian-regr.

Acknowledgements

This work was supported by Wave 1 of The UKRI Strategic Priorities Fund under the EPSRC Grant EP/T001569/1, particularly the Digital Twins for Complex Engineering Systems theme within that grant and The Alan Turing Institute. IP acknowledges funding from the Imperial College Research Fellowship scheme. We acknowledge Dr. Daya Shankar Pandey at University of Huddersfield, UK, who is a power plant expert and helped with the rule elicitation in Sect. 4.5.3.

Author information

Authors and Affiliations

The Alan Turing Institute, London, UK
Themistoklis Botsas, Lachlan R. Mason & Indranil Pan
Imperial College London, London, UK
Lachlan R. Mason & Indranil Pan

Corresponding author

Correspondence to Themistoklis Botsas.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Botsas, T., Mason, L.R. & Pan, I. Rule-based Bayesian regression. Stat Comput 32, 44 (2022). https://doi.org/10.1007/s11222-022-10100-7

Received: 01 August 2020
Accepted: 29 April 2022
Published: 28 May 2022
DOI: https://doi.org/10.1007/s11222-022-10100-7
