Synergized Bootstrapping: The Whole is Faster than the Sum of Its Parts

Loossens, Tim; Verdonck, Stijn; Tuerlinckx, Francis

doi:10.1007/978-3-030-43469-4_18

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 322))

Included in the following conference series:

The Annual Meeting of the Psychometric Society

Abstract

Re-sampling methods are popular for assessing uncertainty, testing hypotheses, and cross-validation because of their simplicity. They all follow a similar scheme: generate replicated datasets by sampling data points from an original dataset, fit a model or conduct a statistical test on each replicate, and aggregate the results. When fitting the model or conducting the test is time-consuming, however, the many replications make re-sampling methods impractical. Many methods have been proposed to alleviate this computational burden, but they generally do not exploit two key features of re-sampled datasets. First, re-sampled datasets all stem from the same original dataset and therefore have similar characteristics. Second, for a large class of cost functions, the cost of a parameter set given the data is the sum of its costs across the individual data points. Consequently, once a parameter set’s costs for the individual data points are known, its cost under the cost function of any replicated dataset can be obtained by summing the costs of the data points that the replicate contains. The synergized bootstrap method put forward in this paper exploits these two features to accelerate the optimization procedures used in re-sampling methods. It is applied to the non-parametric bootstrapping of the parameters of a univariate mixture model, whose negative log-likelihood function can be shown to have multiple local minima, with the differential evolution (DE) heuristic as the global optimizer. The synergized method is shown to yield large speed-ups (up to 100 to 500 times faster) while being more accurate than the standard DE method.
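The additivity property described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and contains no differential evolution step; it only shows that one pass of per-point costs over the original data gives the cost of a parameter set on every bootstrap replicate at once. The two-component Gaussian mixture, the function names `pointwise_nll` and `bootstrap_counts`, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np

def pointwise_nll(x, w, mu1, sd1, mu2, sd2):
    """Negative log-likelihood of each point under a two-component
    univariate Gaussian mixture (hypothetical example model)."""
    def normal_pdf(v, mu, sd):
        return np.exp(-0.5 * ((v - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
    density = w * normal_pdf(x, mu1, sd1) + (1.0 - w) * normal_pdf(x, mu2, sd2)
    return -np.log(density)

def bootstrap_counts(n_points, n_replicates, rng):
    """Draw non-parametric bootstrap replicates.

    Returns the sampled indices and a count matrix whose row b holds
    how often each original point appears in replicate b."""
    idx = rng.integers(0, n_points, size=(n_replicates, n_points))
    counts = np.zeros((n_replicates, n_points), dtype=np.int64)
    rows = np.arange(n_replicates)[:, None]
    np.add.at(counts, (rows, idx), 1)  # accumulates repeated indices correctly
    return idx, counts

rng = np.random.default_rng(0)
# Illustrative data: 60 points from one component, 40 from another.
x = np.concatenate([rng.normal(-2.0, 1.0, 60), rng.normal(3.0, 0.5, 40)])
idx, counts = bootstrap_counts(x.size, 200, rng)

theta = (0.6, -2.0, 1.0, 3.0, 0.5)  # one candidate parameter set (assumed values)

# Evaluate the per-point costs once on the original data ...
costs = pointwise_nll(x, *theta)
# ... then the cost on all 200 replicates is a single matrix-vector product.
replicate_costs = counts @ costs

# Reference computation: re-evaluate the model on every replicate separately.
naive_costs = np.array([pointwise_nll(x[i], *theta).sum() for i in idx])
```

In this sketch the shared computation replaces 200 separate model evaluations with one evaluation and one weighted sum per replicate, which is the kind of reuse the abstract attributes to additive cost functions.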

Author information

Authors and Affiliations

KU Leuven, Leuven, Belgium
Tim Loossens, Stijn Verdonck & Francis Tuerlinckx

Corresponding author

Correspondence to Tim Loossens.

Editor information

Editors and Affiliations

Department of Statistics, USBE, Umeå University, Umeå, Sweden
Marie Wiberg
Department of Psychology, University of Amsterdam, Amsterdam, Noord-Holland, The Netherlands
Dylan Molenaar
Facultad de Matematicas, Pontificia Universidad Católica de Chile, Santiago, Chile
Jorge González
Kellogg School of Management, Northwestern University, Evanston, IL, USA
Ulf Böckenholt
Department of Educational Psychology, University of Wisconsin–Madison, Madison, WI, USA
Jee-Seon Kim

About this paper

Cite this paper

Loossens, T., Verdonck, S., Tuerlinckx, F. (2020). Synergized Bootstrapping: The Whole is Faster than the Sum of Its Parts. In: Wiberg, M., Molenaar, D., González, J., Böckenholt, U., Kim, JS. (eds) Quantitative Psychology. IMPS 2019. Springer Proceedings in Mathematics & Statistics, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-43469-4_18

DOI: https://doi.org/10.1007/978-3-030-43469-4_18
Published: 24 July 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43468-7
Online ISBN: 978-3-030-43469-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
