
Properties of the Number of Iterations of a Feasible Solutions Algorithm

Modern Statistical Methods for Spatial and Multivariate Data

Abstract

In recent years, statistical analyses, algorithms, and modeling have been constrained by computational complexity. Further, complex relationships among response and explanatory variables, such as higher-order interaction effects, make it difficult to identify predictors using standard statistical techniques. These difficulties are exacerbated when sample sizes are small. Recent analyses have targeted the identification of interaction effects in big data, but the development of methods to identify higher-order interaction effects has been limited by computational concerns. One recently studied method is the feasible solutions algorithm (FSA), a fast, flexible method that aims to find a set of statistically optimal models via a stochastic search algorithm. Although FSA has shown promise, one current limitation is that the user must choose the number of times to run the algorithm. Here, we provide statistical guidance for this number of iterations by deriving a lower bound on the probability of obtaining the statistically optimal model within a given number of iterations of FSA. For example, when considering a two-way interaction model, a probability of at least 80% of obtaining the statistically optimal solution requires the number of random starts of FSA to be 40% of the number of possible explanatory variables in the data set. The performance of this bound is then tested on both simulated and real data. This work allows FSA users to make statistically informed choices about FSA that can improve data analysis techniques.
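The 80% and 40% figures in the abstract can be sanity-checked with a short calculation. The sketch below is not the chapter's derivation, which is not reproduced on this page. It rests on an illustrative assumption of our own: random starts are independent, and a start reaches the optimal model whenever its randomly chosen variables include at least one variable of the optimal set. The function names and the parameters `p` (number of candidate explanatory variables), `m` (interaction order), and `k` (number of random starts) are hypothetical.

```python
from math import ceil, comb, log


def start_success_prob(p: int, m: int = 2) -> float:
    """Probability that a random m-subset of p variables shares at least
    one variable with a fixed optimal m-subset.

    ASSUMPTION (ours, not the chapter's): such a start always converges
    to the optimal model.
    """
    if m < 1 or p < m:
        raise ValueError("need 1 <= m <= p")
    # comb(p - m, m) counts the starts that miss every optimal variable.
    return 1.0 - comb(p - m, m) / comb(p, m)


def prob_optimum_within(k: int, p: int, m: int = 2) -> float:
    """Probability that at least one of k independent starts succeeds."""
    q = start_success_prob(p, m)
    return 1.0 - (1.0 - q) ** k


def starts_needed(target: float, p: int, m: int = 2) -> int:
    """Smallest k for which prob_optimum_within(k, p, m) >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    q = start_success_prob(p, m)
    if q >= 1.0:
        return 1  # every start overlaps the optimal set
    return ceil(log(1.0 - target) / log(1.0 - q))


if __name__ == "__main__":
    p = 100
    k = starts_needed(0.80, p)
    print(f"p={p}: {k} starts, probability {prob_optimum_within(k, p):.3f}")
```

Under this assumption, `starts_needed(0.80, 100)` returns 40, that is, 40% of the 100 candidate variables, which agrees with the two-way interaction example in the abstract. The chapter's actual bound may be derived differently.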

Electronic Supplementary Material: The online version of this chapter (https://doi.org/10.1007/978-3-030-11431-2_5) contains supplementary material, which is available to authorized users.



Author information


Corresponding author

Correspondence to Katherine L. Thompson.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Janse, S.A., Thompson, K.L. (2019). Properties of the Number of Iterations of a Feasible Solutions Algorithm. In: Diawara, N. (ed.) Modern Statistical Methods for Spatial and Multivariate Data. STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health. Springer, Cham. https://doi.org/10.1007/978-3-030-11431-2_5
