Sankhya A

pp 1–27

Bayesian Subset Selection Methods for Finding Engineering Design Values: an Application to Lumber Strength

  • Yumi Kondo
  • James V Zidek
  • Carolyn G Taylor
  • Constance van Eeden


The paper concerns a random property T of a manufactured product that must, with high probability (e.g. P* = 95%), exceed a specified quantity η_a called the characteristic value (CV). However, the product comes from any one of K different subpopulations that may represent such things as manufacturers, regions or countries; the distribution of T will generally differ from one subpopulation to another, and so will the associated CV η_ka, k = 1,…,K. Moreover, in applications such as the one we focus on in this paper, where the subpopulations are species, the subpopulation of origin will, for either strategic or practical reasons, not be known. The problem confronted in this paper is the creation of a single CV for the population consisting of the union of all the subpopulations. A solution proposed long ago for the manufactured lumber application addressed in this paper uses random samples of the Ts to select a subset of the subpopulations, called the subset of controlling species (CS), that includes the subpopulation with the smallest of the {η_ka} with high probability. The estimated CV for the entire population is then found by combining the samples for the subpopulations in CS and treating them as one. That method has been published in an ASTM standards document for the lumber industry to ensure the structural engineering strength of manufactured lumber. However, this published method has been shown to have some unexpected and undesirable properties, which led to the search for an alternative and to this paper. The paper presents and compares three subset selection methods. The simplest of the three is an extension of a classical nonparametric method for subset selection. The remaining two, which are more complex, are variations of nonparametric Bayesian methods. Depending on the criterion ultimately used for selection, each of the three is a possible candidate for consideration by ASTM committees as a replacement for the ASTM method for lumber species. They may well apply in other contexts too.
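The general shape of a select-then-pool procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ASTM method and not any of the three methods compared in the paper: the selection rule (keep every subpopulation whose sample quantile lies within a relative margin of the smallest one), the function names, the species labels, the quantile level a = 0.05, and the Weibull parameters are all hypothetical choices made for the example.

```python
import math
import random


def sample_quantile(values, a):
    """Empirical a-quantile, taken as the ceil(a * n)-th order statistic."""
    ordered = sorted(values)
    rank = max(1, math.ceil(a * len(ordered)))
    return ordered[rank - 1]


def select_controlling_subset(samples, a, margin):
    """Hypothetical selection rule (an assumption, not a published method):
    keep each subpopulation whose sample a-quantile is within a relative
    `margin` of the smallest sample a-quantile."""
    quantiles = {name: sample_quantile(v, a) for name, v in samples.items()}
    smallest = min(quantiles.values())
    return sorted(n for n, q in quantiles.items() if q <= smallest * (1 + margin))


def pooled_cv(samples, subset, a):
    """Combine the samples of the selected subpopulations, treat them as one
    sample, and return its empirical a-quantile as the CV estimate."""
    pooled = [x for name in subset for x in samples[name]]
    return sample_quantile(pooled, a)


# Simulated strength data: hypothetical Weibull (scale, shape) per species.
rng = random.Random(0)
params = {
    "species_A": (40.0, 4.0),
    "species_B": (45.0, 4.0),
    "species_C": (60.0, 5.0),
}
samples = {
    name: [rng.weibullvariate(scale, shape) for _ in range(200)]
    for name, (scale, shape) in params.items()
}

subset = select_controlling_subset(samples, a=0.05, margin=0.10)
cv = pooled_cv(samples, subset, a=0.05)
print("controlling subset:", subset)
print("pooled CV estimate:", round(cv, 2))
```

By construction the subpopulation with the smallest sample quantile is always retained, but this naive rule carries no guarantee on the probability that the subset contains the subpopulation with the smallest true η_ka; supplying such a guarantee is what a formal subset selection procedure is for.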

Keywords and phrases

Nonparametric Bayes; Rizvi–Sobel; Dirichlet process prior; Design values; Weibull mixtures

AMS (2000) subject classification

Primary 62C10; Secondary 62P25





We are indebted to Conroy Lum from FPInnovations for introducing the second author to the topic addressed in this report and for many helpful discussions during the course of the work. Thanks also to Kyle Hambrook and John Petkau for helpful discussions during the course of the work.

Supplementary material

13171_2018_157_MOESM1_ESM.tex (TEX 17.2 KB)


  1. ASTM Standard D1990 (2007). Standard practice for establishing allowable properties for visually-graded dimension lumber from in-grade tests of full-size specimens. Technical Report DOI: 10.1520/D1990-07, ASTM International.
  2. Berger, J.O. and Deely, J. (1988). A Bayesian approach to ranking and selection of related means with alternatives to analysis-of-variance methodology. J. Am. Stat. Assoc. 83, 364–373.
  3. Berger, R.L. et al. (1979). Minimax subset selection for loss measured by subset size. Ann. Stat. 7, 6, 1333–1338.
  4. Blackwell, D. and MacQueen, J.B. (1973). Ferguson distributions via Pólya urn schemes. Ann. Stat. 1, 353–355.
  5. Caflisch, R.E. (1998). Monte Carlo and quasi-Monte Carlo methods. Acta Numerica 7, 1–49.
  6. Chakraborty, D. (2008). Statistical decision theory. Estimation, testing and selection. Investigación Operacional 29, 2, 184–185.
  7. Evans, J., Kretschmann, D., Herian, V.L. and Green, D. (2001). Procedures for developing allowable properties for a single species under ASTM D1990 and computer programs useful for the calculations. General technical report FPL, 126. Madison, WI: U.S. Dept of Agriculture, Forest Service, Forest Products Laboratory.
  8. Ferguson, T.S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Stat. 1, 209–230.
  9. Ferguson, T.S. (1983). Bayesian density estimation by mixtures of normal distributions. In: Recent advances in statistics, pp. 287–302. Elsevier.
  10. Fong, K.H. and Berger, J. (1993). Ranking, estimation and hypothesis testing in unbalanced models – a Bayesian approach. Statist. Decisions 11, 1–24.
  11. Ishwaran, H. and James, L.F. (2001). Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96, 161–173.
  12. Johnson, R.A., Evans, J.W. and Green, D.W. (1999). Nonparametric Bayesian predictive distributions for future order statistics. Statist. Probab. Lett. 41, 247–254.
  13. Johnson, R.A. and Lu, W. (2007). Proof load designs for estimation of dependence in a bivariate Weibull model. Statist. Probab. Lett. 77, 1061–1069.
  14. Jones, E. (1988). In-grade testing of structural lumber. In: Proceedings of the Workshop on the In-Grade Testing of Structural Lumber, pp. 11–14. Forest Products Research Society.
  15. Kondo, Y. and Zidek, J.V. (2013). Bayesian nonparametric subset selection procedures with Weibull components. Technical Report 273, University of British Columbia.
  16. Kottas, A. (2006). Nonparametric Bayesian survival analysis using mixtures of Weibull distributions. Journal of Statistical Planning and Inference 136, 3, 578–596.
  17. Liu, Y., Salibián-Barrera, M., Zamar, R. and Zidek, J.V. (2019). Using artificial censoring to improve extreme tail quantile estimates. Applied Statistics. To appear.
  18. McDonald, G.C. (2016). Applications of subset selection procedures and Bayesian ranking methods in analysis of traffic fatality data. Wiley Interdiscip. Rev. Comput. Stat. 8, 6, 222–237.
  19. Rizvi, M. and Sobel, M. (1967). Nonparametric procedures for selecting a subset containing the population with the largest a-quantile. Ann. Math. Stat. 38, 6, 1788–1803.
  20. Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Stat. Sin. 4, 639–650.
  21. van Eeden, C. and Zidek, J.V. (2012). Subset selection – extended Rizvi–Sobel for unequal sample sizes and its implementation. Journal of Nonparametric Statistics 24, 299–315.

Copyright information

© Indian Statistical Institute 2018

Authors and Affiliations

  1. Data Mining Services and Solutions, Robert Bosch LLC, Stuttgart, Germany
  2. Department of Statistics, University of British Columbia, Vancouver, Canada
