An information theoretical algorithm for analyzing supersaturated designs for a binary response

Balakrishnan, N.; Koukouvinos, C.; Parpoula, C.

doi:10.1007/s00184-011-0373-5

Published: 02 November 2011

Volume 76, pages 1–18, (2013)

Metrika

N. Balakrishnan¹,
C. Koukouvinos² &
C. Parpoula²

Abstract

A supersaturated design is a factorial design in which the number of effects to be estimated exceeds the number of runs. Such designs are used in many experiments for screening purposes, i.e., for studying a large number of factors and identifying the active ones. In this paper, we propose a method for screening out the important factors from a large set of potentially active variables using the symmetrical uncertainty measure combined with the information gain measure. We develop an information theoretical analysis method that applies Shannon entropy and other entropy measures, such as Rényi entropy, Havrda–Charvát entropy, and Tsallis entropy, to the data, assuming generalized linear models for a Bernoulli response. The method is advantageous because it enables supersaturated designs to be used for analyzing data under generalized linear models. An empirical study demonstrates that the method performs well, giving low Type I and Type II error rates for every entropy measure considered. Moreover, in terms of error rates, the proposed method is more efficient than the existing ROC methodology for identifying the significant factors for a dichotomous response.
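
The two measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions with Shannon entropy: information gain IG(Y; X) = H(Y) - H(Y | X) and symmetrical uncertainty SU = 2 IG / (H(X) + H(Y)). The function names and the toy data are hypothetical. The sketch does not reproduce the authors' algorithm, which also covers Rényi, Havrda–Charvát, and Tsallis entropies and a selection rule not given in this abstract.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(y, x):
    """H(Y | X): entropy of y within each level of x, weighted by level frequency."""
    n = len(y)
    total = 0.0
    for level in set(x):
        subset = [yi for yi, xi in zip(y, x) if xi == level]
        total += (len(subset) / n) * shannon_entropy(subset)
    return total


def information_gain(y, x):
    """IG(Y; X) = H(Y) - H(Y | X)."""
    return shannon_entropy(y) - conditional_entropy(y, x)


def symmetrical_uncertainty(y, x):
    """SU(Y, X) = 2 * IG(Y; X) / (H(Y) + H(X)), which lies in [0, 1]."""
    denom = shannon_entropy(y) + shannon_entropy(x)
    return 0.0 if denom == 0 else 2.0 * information_gain(y, x) / denom


# Hypothetical toy data: a binary response and two two-level factor columns.
y = [1, 1, 1, 1, 0, 0, 0, 0]
x_active = [1, 1, 1, 1, -1, -1, -1, -1]   # determines the response exactly
x_inert = [1, -1, 1, -1, 1, -1, 1, -1]    # carries no information about y

for name, column in [("active", x_active), ("inert", x_inert)]:
    print(name, information_gain(y, column), symmetrical_uncertainty(y, column))
```

In this toy example the column that determines the response attains the maximum symmetrical uncertainty of 1, while the uninformative column scores 0. A screening procedure of this kind would rank the factor columns by such scores. How the ranking is turned into a decision about active factors is left open here.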

Author information

Authors and Affiliations

Department of Mathematics and Statistics, McMaster University, Hamilton, ON, L8S 4K1, Canada
N. Balakrishnan
Department of Mathematics, National Technical University of Athens, 15773, Zografou, Athens, Greece
C. Koukouvinos & C. Parpoula

Corresponding author

Correspondence to C. Koukouvinos.

About this article

Cite this article

Balakrishnan, N., Koukouvinos, C. & Parpoula, C. An information theoretical algorithm for analyzing supersaturated designs for a binary response. Metrika 76, 1–18 (2013). https://doi.org/10.1007/s00184-011-0373-5

Received: 20 December 2010
Published: 02 November 2011
Issue Date: January 2013
DOI: https://doi.org/10.1007/s00184-011-0373-5
