Properties of the Number of Iterations of a Feasible Solutions Algorithm

Janse, Sarah A.; Thompson, Katherine L.

doi:10.1007/978-3-030-11431-2_5


Part of the book series: STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health (STEAM)


Abstract

In recent years, statistical analyses, algorithms, and modeling have been constrained by computational complexity. Further, complex relationships among response and explanatory variables, such as higher-order interaction effects, make it difficult to identify predictors using standard statistical techniques. These difficulties are exacerbated when sample sizes are small. Recent analyses have targeted the identification of interaction effects in big data, but the development of methods to identify higher-order interaction effects has been limited by computational concerns. One recently studied method is the feasible solutions algorithm (FSA), a fast, flexible method that aims to find a set of statistically optimal models via a stochastic search algorithm. Although FSA has shown promise, one current limitation is that the user must choose the number of times to run the algorithm. Here, we provide statistical guidance for this number of iterations by deriving a lower bound on the probability of obtaining the statistically optimal model in a given number of iterations of FSA. For example, when considering a two-way interaction model, obtaining the statistically optimal solution with probability at least 80% requires setting the number of random starts of FSA to 40% of the number of possible explanatory variables in the data set. The performance of this bound is then tested on both simulated and real data. This work allows FSA users to make statistically informed choices about FSA that can improve data analysis techniques.
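The rule of thumb in the abstract can be illustrated with a small calculation. The sketch below is not the chapter's derivation, which is not reproduced on this page. It rests on two labeled assumptions: random starts are independent, and a start reaches the optimal model whenever its initial set of k variables shares at least one variable with the optimal set. All function names are hypothetical.

```python
from math import ceil, comb, log


def start_hits_optimum(p: int, k: int = 2) -> float:
    """Chance that a random k-subset of p variables overlaps a fixed optimal k-subset.

    ASSUMPTION (illustrative, not from the chapter): such an overlap is
    treated as sufficient for a single random start to reach the optimum.
    """
    return 1.0 - comb(p - k, k) / comb(p, k)


def success_lower_bound(p: int, n_starts: int, k: int = 2) -> float:
    """Chance that at least one of n_starts independent starts succeeds."""
    q = start_hits_optimum(p, k)
    return 1.0 - (1.0 - q) ** n_starts


def starts_needed(p: int, target: float, k: int = 2) -> int:
    """Smallest number of starts whose success chance reaches target (0 < target < 1)."""
    q = start_hits_optimum(p, k)
    if q >= 1.0:  # every possible start already overlaps the optimal set
        return 1
    return ceil(log(1.0 - target) / log(1.0 - q))


if __name__ == "__main__":
    p = 100  # number of candidate explanatory variables (example value)
    n = starts_needed(p, 0.80)
    print(n, round(success_lower_bound(p, n), 3))
```

Under these assumptions, 100 candidate variables and an 80% target give 40 random starts, which agrees with the 40% guideline stated in the abstract. For much larger numbers of variables the same ratio gives a probability close to, but slightly below, 80% in this simplified model, so the chapter's own bound should be consulted for exact guidance.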

Electronic Supplementary Material The online version of this chapter (https://doi.org/10.1007/978-3-030-11431-2_5) contains supplementary material, which is available to authorized users.



Author information

Authors and Affiliations

Center for Biostatistics, The Ohio State University, Columbus, OH, USA
Sarah A. Janse
Department of Statistics, University of Kentucky, Lexington, KY, USA
Katherine L. Thompson


Corresponding author

Correspondence to Katherine L. Thompson.

Editor information

Editors and Affiliations

Department of Mathematics and Statistics, Old Dominion University, Norfolk, VA, USA
Norou Diawara


About this chapter

Cite this chapter

Janse, S.A., Thompson, K.L. (2019). Properties of the Number of Iterations of a Feasible Solutions Algorithm. In: Diawara, N. (eds) Modern Statistical Methods for Spatial and Multivariate Data. STEAM-H: Science, Technology, Engineering, Agriculture, Mathematics & Health. Springer, Cham. https://doi.org/10.1007/978-3-030-11431-2_5


DOI: https://doi.org/10.1007/978-3-030-11431-2_5
Published: 29 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11430-5
Online ISBN: 978-3-030-11431-2
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)
