Fieller Stability Measure: a novel model-dependent backtesting approach

Bravo, Cristián; Maldonado, Sebastián

doi:10.1057/jors.2015.18

General Paper
Published: 15 April 2015

Volume 66, pages 1895–1905, (2015)
Journal of the Operational Research Society

Cristián Bravo¹ &
Sebastián Maldonado²

Abstract

Dataset shift is present in almost all real-world applications, since most of them operate in constantly changing environments. Detecting fractures in datasets in time allows models to be recalibrated before a significant decrease in their performance is observed. Since small changes are normal in most applications and do not justify the effort that a model recalibration requires, we are only interested in identifying those changes that are critical for the correct functioning of the model. In this work we propose a model-dependent backtesting strategy designed to identify significant changes in the covariates, relating a confidence zone of the change to a maximal deviance measure obtained from the coefficients of the model. Using logistic regression as the predictive approach, we performed experiments on simulated data and on a real-world credit scoring dataset. The results show that the proposed method performs better than traditional approaches, consistently identifying major changes in variables while taking into account important characteristics of the problem, such as sample sizes, variances, and uncertainty in the coefficients.
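
The abstract does not spell out how the confidence zone of the change is constructed. The name of the measure suggests Fieller's theorem for the confidence interval of a ratio of two means, so the following is only a minimal sketch under that assumption and not the authors' actual measure. The function name `fieller_interval`, the use of a normal critical value (a large-sample approximation), the independence of the two samples, and the toy data are all illustrative assumptions.

```python
import math
from statistics import NormalDist, mean, variance


def fieller_interval(sample_a, sample_b, confidence=0.95):
    """Fieller confidence interval for mean(sample_a) / mean(sample_b).

    Assumes independent samples and a normal critical value (large samples).
    Returns (lower, upper), or None when the denominator mean is not
    significantly different from zero, in which case the set is unbounded.
    """
    a, b = mean(sample_a), mean(sample_b)
    var_a = variance(sample_a) / len(sample_a)  # variance of the mean of a
    var_b = variance(sample_b) / len(sample_b)  # variance of the mean of b
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)

    # The confidence set contains every theta that satisfies
    #   (a - theta*b)^2 <= z^2 * (var_a + theta^2 * var_b),
    # which rearranges to qa*theta^2 - 2*qb*theta + qc <= 0.
    qa = b * b - z * z * var_b
    qb = a * b
    qc = a * a - z * z * var_a
    if qa <= 0:
        return None
    root = math.sqrt(qb * qb - qa * qc)
    return ((qb - root) / qa, (qb + root) / qa)


# Hypothetical covariate values in a reference sample and a recent sample.
reference = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.1]
recent = [1.1, 1.3, 1.0, 1.2, 1.2, 0.9, 1.4, 1.2]
interval = fieller_interval(recent, reference)
```

Unlike a simple difference of means, an interval of this kind reflects the sample sizes and variances of both samples, which are among the characteristics the abstract says the proposed method takes into account. How such an interval would be compared against a deviance bound derived from the model coefficients is described in the full article and is not reproduced here.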

Acknowledgements

The first author acknowledges the support of CONICYT Becas Chile PD-74140041. The second author was supported by CONICYT FONDECYT Initiation into Research 11121196. Both authors acknowledge the support of the Institute of Complex Engineering Systems (ICM: P-05-004-F, CONICYT: FBO16).

Author information

Authors and Affiliations

Universidad de Talca, Curicó, Chile
Cristián Bravo
Universidad de los Andes, Mons. Álvaro del Portillo 12455, Las Condes, Santiago, Chile
Sebastián Maldonado

Corresponding author

Correspondence to Cristián Bravo.

Appendix

Results for the experiments on real data

The following graphs present the results of the dataset shift tests (the proposed approach and SI) for the six remaining variables (Variables 4–9).

Figure A1

Figure A2

About this article

Cite this article

Bravo, C., Maldonado, S. Fieller Stability Measure: a novel model-dependent backtesting approach. J Oper Res Soc 66, 1895–1905 (2015). https://doi.org/10.1057/jors.2015.18

Received: 17 April 2014
Accepted: 05 March 2015
Published: 15 April 2015
Issue Date: 01 November 2015
DOI: https://doi.org/10.1057/jors.2015.18
