Design of Computer Experiments Using Competing Distances Between Set-Valued Inputs

  • David Ginsbourger
  • Jean Baccou
  • Clément Chevalier
  • Frédéric Perales
Conference paper
Part of the Contributions to Statistics book series (CONTRIB.STAT.)

Abstract

In many numerical simulation experiments from natural sciences and engineering, inputs depart from the classical moderate-dimensional vector set-up and include more complex objects such as parameter fields or maps. In this case, and when inputs are generated using stochastic methods or taken from a pre-existing large set of candidates, one often needs to choose a subset of “representative” elements because of practical restrictions. Here we tackle the design of experiments based on distances or dissimilarity measures between input maps, and more specifically between inputs of set-valued nature. We consider the problem of choosing experiments given dissimilarities such as the Hausdorff or Wasserstein distances but also of eliciting adequate dissimilarities not only based on practitioners’ expertise but also on quantitative and graphical diagnostics including nearest neighbour cross-validation and non-Euclidean structural analysis. The proposed approaches are illustrated on an original uncertainty quantification case study from mechanical engineering, where using partitioning around medoids with ad hoc distances gives promising results in terms of stratified sampling.
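The two ingredients named in the abstract, a set-to-set dissimilarity and partitioning around medoids (PAM) over a precomputed dissimilarity matrix, can be combined in a few lines. Below is a minimal, self-contained sketch assuming the set-valued inputs are finite planar point sets; the `hausdorff` and `pam_medoids` functions and the toy data are illustrative stand-ins, not the paper's actual implementation (which uses richer inputs and an ad hoc choice of distance).

```python
import itertools
import math

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets in R^d."""
    def directed(X, Y):
        # sup over X of the distance to the nearest point of Y
        return max(min(math.dist(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

def pam_medoids(D, k):
    """Exhaustive PAM: pick k medoids minimising the total dissimilarity
    of every item to its nearest medoid. Fine for small candidate sets."""
    n = len(D)
    best, best_cost = None, float("inf")
    for medoids in itertools.combinations(range(n), k):
        cost = sum(min(D[i][m] for m in medoids) for i in range(n))
        if cost < best_cost:
            best, best_cost = medoids, cost
    return list(best)

# Hypothetical candidate inputs: four small planar point sets,
# forming two visually obvious clusters.
sets = [
    [(0, 0), (1, 0)],
    [(0, 0.1), (1, 0.1)],
    [(5, 5), (6, 5)],
    [(5, 5.1), (6, 5.2)],
]
D = [[hausdorff(A, B) for B in sets] for A in sets]
medoids = pam_medoids(D, k=2)  # indices of the two representative sets
```

The medoids returned by such a procedure form a small "representative" design: one actual candidate per cluster, which is the stratified-sampling use of PAM the abstract alludes to. Swapping `hausdorff` for another dissimilarity (e.g. a Wasserstein-type distance) only changes the matrix `D`, leaving the selection step untouched.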


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • David Ginsbourger (1, 2)
  • Jean Baccou (3, 4)
  • Clément Chevalier (5, 6)
  • Frédéric Perales (3, 4)
  1. Idiap Research Institute, Centre du Parc, Martigny, Switzerland
  2. IMSV, Department of Mathematics and Statistics, University of Bern, Bern, Switzerland
  3. Institut de Radioprotection et de Sûreté Nucléaire, PSN-RES, SEMIA, Centre de Cadarache, France
  4. Laboratoire de Micromécanique et d'Intégrité des Structures, IRSN-CNRS-UM, Saint-Paul-lès-Durance, France
  5. Institut de Statistique, Université de Neuchâtel, Neuchâtel, Switzerland
  6. Institute of Mathematics, University of Zurich, Zurich, Switzerland
