Abstract
Unbiased assessment of the predictivity of models learned by supervised machine learning (ML) methods requires evaluating the learned function on a reserved test set (not used by the learning algorithm). The quality of the assessment depends, naturally, on the properties of the test set and on the error statistic used to estimate the prediction error. In this work we tackle both issues, proposing a new predictivity criterion that carefully weights the individual observed errors to obtain a global error estimate, and using incremental experimental design methods to “optimally” select the test points on which the criterion is computed. Several incremental constructions are studied, including greedy packing (coffee-house design), support points and kernel herding techniques. Our results show that the incremental and weighted versions of the latter two, based on Maximum Mean Discrepancy (MMD) concepts, yield superior performance. An industrial test case provided by the historical French electricity supplier (EDF) illustrates the practical relevance of the methodology and shows that it is an efficient alternative to expensive cross-validation techniques.
References
Baudin, M., Dutfoy, A., Iooss, B., Popelin, A.-P.: OpenTURNS: an industrial software for uncertainty quantification in simulation. In: Ghanem, R., Higdon, D., Owhadi, H. (eds) Handbook of Uncertainty Quantification, pp. 2001–2038. Springer (2017)
Berlinet, A., Thomas-Agnan, C.: Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer (2004)
Borovicka, T., Jirina, M., Jr., Kordik, P., Jirina, M.: Selecting representative data sets. In: Karahoca, A. (ed) Advances in Data Mining, Knowledge Discovery and Applications, pp. 43–70. INTECH (2012)
Chen, W.Y., Barp, A., Briol, F.-X., Gorham, J., Girolami, M., Mackey, L., Oates, C.: Stein Point Markov Chain Monte Carlo. arXiv preprint. arXiv:1905.03673 (2019)
Chen, W.Y., Mackey, L., Gorham, J., Briol, F.-X., Oates, C.J.: Stein Points. Proc. ICML (2018). arXiv preprint arXiv:1803.10161v4
Chen, Y., Welling, M., Smola, A.: Super-samples from kernel herding. In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, pp. 109–116. AUAI Press (2010)
Chevalier, C., Bect, J., Ginsbourger, D., Picheny, V., Richet, Y., Vazquez, E.: Fast kriging-based stepwise uncertainty reduction with application to the identification of an excursion set. Technometrics 56, 455–465 (2014)
Crombecq, K., Laermans, E., Dhaene, T.: Efficient space-filling and non-collapsing sequential design strategies for simulation-based modelling. Eur. J. Oper. Res. 214, 683–696 (2011)
Da Veiga, S.: Global sensitivity analysis with dependence measures. J. Stat. Comput. Simul. 85, 1283–1305 (2015)
Da Veiga, S., Gamboa, F., Iooss, B., Prieur, C.: Basics and Trends in Sensitivity Analysis. Theory and Practice in R. SIAM (2021)
de Crécy, A., Bazin, P., Glaeser, H., Skorek, T., Joucla, J., Probst, P., Fujioka, K., Chung, B.D., Oh, D.Y., Kyncl, M., Pernica, R., Macek, J., Meca, R., Macian, R., D’Auria, F., Petruzzi, A., Batet, L., Perez, M., Reventos, F.: Uncertainty and sensitivity analysis of the LOFT L2–5 test: results of the BEMUSE programme. Nucl. Eng. Des. 238(12), 3561–3578 (2008)
Demay, C., Iooss, B., Le Gratiet, L., Marrel, A.: Model selection for Gaussian Process regression: an application with highlights on the model variance validation. Qual. Reliab. Eng. Int. J. 38, 1482–1500 (2022). https://doi.org/10.1002/qre.2973
Dubrule, O.: Cross validation of kriging in a unique neighborhood. J. Int. Assoc. Math. Geol. 15(6), 687–699 (1983)
ENIQ: Qualification of an AI/ML NDT system—Technical basis. NUGENIA, ENIQ Technical Report (2019)
Fang, K.-T., Li, R., Sudjianto, A.: Design and Modeling for Computer Experiments. Chapman & Hall/CRC (2006)
Geffraye, G., Antoni, O., Farvacque, M., Kadri, D., Lavialle, G., Rameau, B., Ruby, A.: CATHARE2 V2.5_2: a single version for various applications. Nucl. Eng. Des. 241, 4456–4463 (2011)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press (2016)
Gretton, A., Bousquet, O., Smola, A., Schölkopf, B.: Measuring statistical dependence with Hilbert-Schmidt norms. In Proceedings Algorithmic Learning Theory, pp. 63–77. Springer-Verlag (2005)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer (2009)
Hawkins, R., Paterson, C., Picardi, C., Jia, Y., Calinescu, R., Habli, I.: Guidance on the assurance of machine learning in autonomous systems (AMLAS). University of York, Assuring Autonomy International Programme (AAIP) (2021)
Iooss, B.: Sample selection from a given dataset to validate machine learning models. In Proceedings of 50th Meeting of the Italian Statistical Society (SIS2021), pp. 88–93. Pisa, Italy, June (2021)
Iooss, B., Boussouf, L., Feuillard, V., Marrel, A.: Numerical studies of the metamodel fitting and validation processes. Int. J. Adv. Syst. Measure. 3, 11–21 (2010)
Joseph, V.R., Vakayil, A.: SPlit: an optimal method for data splitting. Technometrics 64(2), 166–176 (2022)
Kennard, R.W., Stone, L.A.: Computer aided design of experiments. Technometrics 11, 137–148 (1969)
Kleijnen, J.P.C., Sargent, R.G.: A methodology for fitting and validating metamodels in simulation. Eur. J. Oper. Res. 120, 14–29 (2000)
Lemaire, M., Chateauneuf, A., Mitteau, J.-C.: Structural Reliability. Wiley (2009)
Li, W., Lu, L., Xie, X., Yang, M.: A novel extension algorithm for optimized Latin hypercube sampling. J. Stat. Comput. Simul. 87, 2549–2559 (2017)
Lorenzo, G., Zanocco, P., Giménez, M., Marquès, M., Iooss, B., Bolado-Lavin, R., Pierro, F., Galassi, G., D’Auria, F., Burgazzi, L.: Assessment of an isolation condenser of an integral reactor in view of uncertainties in engineering parameters. Sci. Technol. Nucl. Install. (2011). https://doi.org/10.1155/2011/827354
Mak, S., Joseph, V.R.: Support points. Ann. Stat. 46, 2562–2592 (2018)
Marrel, A., Chabridon, V.: Statistical developments for target and conditional sensitivity analysis: Application on safety studies for nuclear reactor. Reliab. Eng. Syst. Saf. 214, 107711 (2021)
Marrel, A., Iooss, B., Chabridon, V.: The ICSCREAM methodology: identification of penalizing configurations in computer experiments using screening and metamodel - Applications in thermal-hydraulics. Nucl. Sci. Eng. 196, 301–321 (2022). https://doi.org/10.1080/00295639.2021.1980362
Molnar, C.: Interpretable Machine Learning. github (2019)
Morris, M.D., Mitchell, T.J.: Exploratory designs for computational experiments. J. Stat. Planning Inference 43, 381–402 (1995)
Müller, W.G.: Collecting Spatial Data, 3rd edn. Springer (2007)
Nash, J., Sutcliffe, J.: River flow forecasting through conceptual models part I-A discussion of principles. J. Hydrol. 10(3), 282–290 (1970)
Nogales Gómez, A., Pronzato, L., Rendas, M.-J.: Incremental space-filling design based on coverings and spacings: improving upon low discrepancy sequences. J. Stat. Theory Pract. 15(4), 77 (2021)
Pronzato, L.: Performance analysis of greedy algorithms for minimising a maximum mean discrepancy. Statistics and Computing, to appear (2022), hal-03114891. arXiv:2101.07564
Pronzato, L., Müller, W.: Design of computer experiments: space filling and beyond. Stat. Comput. 22, 681–701 (2012)
Pronzato, L., Rendas, M.-J.: Validation design I: construction of validation designs via kernel herding. Preprint (2021), hal-03474805. arXiv:2112.05583
Pronzato, L., Zhigljavsky, A.A.: Bayesian quadrature and energy minimization for space-filling design. SIAM/ASA J. Uncertainty Quant. 8, 959–1011 (2020)
Qian, P.Z.G., Ai, M., Wu, C.F.J.: Construction of nested space-filling designs. Ann. Stat. 37, 3616–3643 (2009)
Qian, P.Z.G., Wu, C.F.J.: Sliced space filling designs. Biometrika 96, 945–956 (2009)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press (2006)
Santner, T., Williams, B., Notz, W.: The Design and Analysis of Computer Experiments. Springer (2003)
Sejdinovic, D., Sriperumbudur, B., Gretton, A., Fukumizu, K.: Equivalence of distance-based and RKHS-based statistics in hypothesis testing. Ann. Stat. 41(5), 2263–2291 (2013)
Shang, B., Apley, D.W.: Fully-sequential space-filling design algorithms for computer experiments. J. Qual. Technol. 53(2), 173–196 (2021)
Sheikholeslami, R., Razavi, S.: Progressive Latin hypercube sampling: an efficient approach for robust sampling-based analysis of environmental models. Environ. Model. Softw. 93, 109–126 (2017)
Smith, R.C.: Uncertainty Quantification. SIAM (2014)
Smola, A., Gretton, A., Song, L., Schölkopf, B.: A Hilbert space embedding for distributions. In International Conference on Algorithmic Learning Theory, pp. 13–31. Springer (2007)
Snee, R.D.: Validation of regression models: methods and examples. Technometrics 19, 415–428 (1977)
Sriperumbudur, B.K., Gretton, A., Fukumizu, K., Schölkopf, B., Lanckriet, G.R.: Hilbert space embeddings and metrics on probability measures. J. Mach. Learn. Res. 11, 1517–1561 (2010)
Székely, G.J., Rizzo, M.L.: Testing for equal distributions in high dimension. InterStat 5, 1–6 (2004)
Székely, G.J., Rizzo, M.L.: Energy statistics: a class of statistics based on distances. J. Stat. Planning Inference 143, 1249–1272 (2013)
Teymur, O., Gorham, J., Riabiz, M., Oates, C.J.: Optimal quantisation of probability measures using maximum mean discrepancy. In International Conference on Artificial Intelligence and Statistics, pp. 1027–1035 (2021). arXiv preprint arXiv:2010.07064v1
Wold, S., Sjöström, M., Eriksson, L.: PLS-regression: a basic tool of chemometrics. Chemometr. Intell. Lab. Syst. 58(2), 109–130 (2001)
Xu, Y., Goodacre, R.: On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. J. Anal. Testing 2, 249–262 (2018)
Acknowledgements
This work was supported by project INDEX (INcremental Design of EXperiments) ANR-18-CE91-0007 of the French National Research Agency (ANR). The authors are grateful to Guillaume Levillain and Thomas Bittar for their code development during their work at EDF. Thanks also to Sébastien Da Veiga for fruitful discussions.
Appendix
Appendix A: Maximum Mean Discrepancy
Let K be a positive definite kernel on \(\mathcal {X}\times \mathcal {X}\), defining a reproducing kernel Hilbert space (RKHS) \(\mathcal {H}_K\) of functions on \(\mathcal {X}\), with scalar product \(\langle f,g\rangle _{\mathcal {H}_K}\) and norm \(\Vert f\Vert _{\mathcal {H}_K}\); see, e.g., [2]. For any \(f\in \mathcal {H}_K\) and any probability measures \(\mu \) and \(\xi \) on \(\mathcal {X}\), we have
\[ \int _{\mathcal {X}} f(\textbf{x})\,\textrm{d}\xi (\textbf{x}) - \int _{\mathcal {X}} f(\textbf{x})\,\textrm{d}\mu (\textbf{x}) = \left\langle f,\, P_{K,\xi }-P_{K,\mu }\right\rangle _{\mathcal {H}_K}, \qquad (16) \]
where we have denoted \(K_\textbf{x}(\cdot )=K(\textbf{x},\cdot )\) and used the reproducing property \(f(\textbf{x})=\langle f,K_\textbf{x}\rangle _{\mathcal {H}_K}\) for all \(\textbf{x}\in \mathcal {X}\), and where, for any probability measure \(\nu \) on \(\mathcal {X}\) and \(\textbf{x}\in \mathcal {X}\),
\[ P_{K,\nu }(\textbf{x}) = \int _{\mathcal {X}} K(\textbf{x},\textbf{x}')\,\textrm{d}\nu (\textbf{x}') \]
is the potential of \(\nu \) at \(\textbf{x}\); \(P_{K,\nu }\in \mathcal {H}_K\), and it is called the kernel embedding of \(\nu \) in the ML literature. In some cases the potential can be expressed analytically (see Appendix B); otherwise it can be estimated by numerical quadrature (e.g., quasi-Monte Carlo). The Cauchy-Schwarz inequality applied to (16) gives
\[ \left| \int _{\mathcal {X}} f\,\textrm{d}\xi - \int _{\mathcal {X}} f\,\textrm{d}\mu \right| \le \Vert f\Vert _{\mathcal {H}_K}\, \Vert P_{K,\xi }-P_{K,\mu }\Vert _{\mathcal {H}_K}, \]
and therefore
\[ \sup _{\Vert f\Vert _{\mathcal {H}_K}\le 1} \left| \int _{\mathcal {X}} f\,\textrm{d}\xi - \int _{\mathcal {X}} f\,\textrm{d}\mu \right| = \Vert P_{K,\xi }-P_{K,\mu }\Vert _{\mathcal {H}_K}. \]
The Maximum Mean Discrepancy (MMD) between \(\xi \) and \(\mu \) (for the kernel K and set \(\mathcal {X}\)) is \(d_K(\xi ,\mu )=\Vert P_{K,\xi }-P_{K,\mu }\Vert _{\mathcal {H}_K}\). Direct calculation gives
\[ d_K^2(\xi ,\mu ) = \mathbb {E}_{\zeta ,\zeta '\sim \xi }\,K(\zeta ,\zeta ') - 2\,\mathbb {E}_{\zeta \sim \xi ,\,\zeta '\sim \mu }\,K(\zeta ,\zeta ') + \mathbb {E}_{\zeta ,\zeta '\sim \mu }\,K(\zeta ,\zeta '), \qquad (19) \]
where the random variables \(\zeta \) and \(\zeta '\) in (19) are independent, see [49]. When K is the energy distance kernel (10), one recovers the expression (11) for the corresponding MMD. One may refer to [51] for an illuminating exposition on MMD, kernel embedding, and conditions on K (the notion of characteristic kernel) that make \(d_K\) a metric on the space of probability measures on \(\mathcal {X}\). The distance and Matérn kernels considered in this paper are characteristic.
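For empirical measures, (19) reduces to finite sums of kernel evaluations, which is how MMD-based selection criteria can be computed in practice. The following is a minimal sketch (our own illustration, not code from the paper), using a Matérn 5/2 kernel with a hypothetical correlation length `theta`:

```python
import numpy as np

def matern52_gram(X, Y, theta=1.0):
    """Pairwise Matérn 5/2 kernel matrix between the rows of X and Y."""
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    u = np.sqrt(5.0) * d / theta
    return (1.0 + u + u**2 / 3.0) * np.exp(-u)

def mmd2(X, Y, theta=1.0):
    """Squared MMD d_K^2 between the empirical measures of samples X and Y,
    i.e. (19) with xi and mu replaced by uniform measures on the two samples."""
    return (matern52_gram(X, X, theta).mean()
            - 2.0 * matern52_gram(X, Y, theta).mean()
            + matern52_gram(Y, Y, theta).mean())

rng = np.random.default_rng(0)
Y = rng.uniform(size=(1000, 2))                  # reference sample from mu = U[0,1]^2
X = rng.uniform(size=(200, 2))                   # representative candidate test set
Z = 0.5 + 0.05 * rng.standard_normal((200, 2))   # clustered, non-representative set
print(mmd2(X, Y), mmd2(Z, Y))  # the representative set yields a much smaller MMD
```

A characteristic kernel makes this quantity a genuine metric on distributions, so a test set with small empirical MMD to the candidate set can be regarded as representative of it.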
Appendix B: Analytical Computation of Potentials for Matérn Kernels
Since, for tensor-product kernels, the potential is the product of the corresponding one-dimensional potentials, we only consider one-dimensional input spaces.
For \(\mu \) the uniform distribution on [0, 1] and K the Matérn kernel \(K_{5/2,\theta }\) with smoothness \(\nu =5/2\) and correlation length \(\theta \), see (15), we get
where
The expressions \(P_{K_{\nu ,\theta },\mu }(x)\) for \(\nu =1/2\) and \(\nu =3/2\) can be found in [40].
When \(\mu \) is the standard normal distribution \(\mathcal {N}(0,1)\), the potential \(P_{K_{5/2,\theta },\mathcal {N}(0,1)}\) is \( P_{K_{5/2,\theta },\mathcal {N}(0,1)}(x) = T_\theta (x) + T_\theta (-x), \) where
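The uniform-case potential can be re-derived by direct integration of the kernel. The sketch below is our own derivation (not the paper's equation, and the function names are ours), assuming the common parametrization \(K_{5/2,\theta }(x,x')=(1+u+u^2/3)\,e^{-u}\) with \(u=\sqrt{5}\,|x-x'|/\theta \); it cross-checks the closed form against brute-force quadrature:

```python
import numpy as np

SQRT5 = np.sqrt(5.0)

def matern52(d, theta):
    """Matérn 5/2 kernel as a function of the distance d (correlation length theta)."""
    u = SQRT5 * np.abs(d) / theta
    return (1.0 + u + u**2 / 3.0) * np.exp(-u)

def potential_uniform(x, theta):
    """P(x) = int_0^1 K_{5/2,theta}(x, u) du for mu = U[0,1], with 0 <= x <= 1.

    Obtained by direct integration: P(x) = F(x) + F(1 - x), where
    F(t) = (8/(3a)) (1 - exp(-a t)) - exp(-a t) (5t/3 + a t^2/3),  a = sqrt(5)/theta.
    """
    a = SQRT5 / theta
    def F(t):
        return (8.0 / (3.0 * a)) * (1.0 - np.exp(-a * t)) \
            - np.exp(-a * t) * (5.0 * t / 3.0 + a * t**2 / 3.0)
    return F(x) + F(1.0 - x)

# Cross-check against composite-trapezoid quadrature on a fine grid
theta = 0.3
u = np.linspace(0.0, 1.0, 200001)
h = u[1] - u[0]
for x in (0.0, 0.25, 0.7, 1.0):
    f = matern52(x - u, theta)
    numeric = np.sum((f[:-1] + f[1:]) * h / 2.0)
    assert abs(numeric - potential_uniform(x, theta)) < 1e-6
```

For a tensor-product Matérn kernel in dimension d, the potential at a point is then simply the product of the d one-dimensional potentials along each coordinate.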
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Fekhari, E., Iooss, B., Muré, J., Pronzato, L., Rendas, M.-J.: Model predictivity assessment: incremental test-set selection and accuracy evaluation. In: Salvati, N., Perna, C., Marchetti, S., Chambers, R. (eds) Studies in Theoretical and Applied Statistics. SIS 2021. Springer Proceedings in Mathematics & Statistics, vol 406. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16609-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16608-2
Online ISBN: 978-3-031-16609-9