Abstract
Because of their simplicity, re-sampling methods are popular for assessing uncertainty, testing hypotheses, and cross-validation. They all rely on a similar scheme: generating replicated datasets by sampling data points from an original dataset, fitting a model or conducting a statistical test on each replicate, and aggregating the results. However, when fitting the model or conducting the test is time-consuming, the many replications make re-sampling methods impractical. Many methods have been proposed to alleviate this computational burden, but they generally do not exploit two key features of re-sampled datasets. First, re-sampled datasets all stem from the same original dataset and therefore have similar characteristics. Second, there is a large class of cost functions for which the cost of a parameter set given the data can be computed by summing its costs across the individual data points. As a consequence, once the costs of the individual data points are known, the cost of the parameter set can be obtained for any of the replicated datasets by summing the corresponding per-point costs. The synergized bootstrap method put forward in this paper exploits these two features to accelerate the optimization procedures for re-sampling methods. It is applied to the non-parametric bootstrapping of the parameters of a univariate mixture model, whose negative log-likelihood function can be shown to have multiple local minima, using the differential evolution (DE) heuristic as global optimizer. It is demonstrated that the synergized method can yield large speed-ups (up to 100–500 times faster) while being more accurate than the standard DE method.
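The additivity property described in the abstract can be illustrated with a minimal sketch. It assumes a two-component Gaussian mixture with the hypothetical parameterization `(w, mu1, sigma1, mu2, sigma2)` and simulated data; the function names are illustrative and are not taken from the paper, and the sketch covers only the reuse of per-point costs, not the full synergized bootstrap or the DE optimizer.

```python
import math
import random


def point_costs(data, params):
    """Per-point negative log-likelihood under a two-component Gaussian mixture.

    params = (w, mu1, sigma1, mu2, sigma2) is an illustrative assumption,
    not the parameterization used in the paper.
    """
    w, mu1, s1, mu2, s2 = params

    def normal_pdf(x, mu, s):
        return math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))

    return [
        -math.log(w * normal_pdf(x, mu1, s1) + (1 - w) * normal_pdf(x, mu2, s2))
        for x in data
    ]


def bootstrap_counts(n, n_replicates, rng):
    """For each replicate, count how often each original point is drawn."""
    counts = []
    for _ in range(n_replicates):
        row = [0] * n
        for _ in range(n):
            row[rng.randrange(n)] += 1
        counts.append(row)
    return counts


def replicate_costs(costs, counts):
    """Cost of one parameter set on every replicate, reusing the per-point costs."""
    return [sum(k * c for k, c in zip(row, costs)) for row in counts]


rng = random.Random(0)
# Simulated data (an assumption for illustration only).
data = [rng.gauss(0.0, 1.0) for _ in range(50)] + [rng.gauss(4.0, 1.0) for _ in range(50)]
params = (0.5, 0.0, 1.0, 4.0, 1.0)

costs = point_costs(data, params)               # evaluated once on the original data
counts = bootstrap_counts(len(data), 200, rng)  # 200 non-parametric bootstrap replicates
all_costs = replicate_costs(costs, counts)      # one weighted sum per replicate
```

In this sketch the expensive step, evaluating the mixture density at every data point, happens once per parameter set. Each replicate's cost is then a weighted sum of those per-point costs, with the resampling counts as weights.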
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Loossens, T., Verdonck, S., Tuerlinckx, F. (2020). Synergized Bootstrapping: The Whole is Faster than the Sum of Its Parts. In: Wiberg, M., Molenaar, D., González, J., Böckenholt, U., Kim, JS. (eds) Quantitative Psychology. IMPS 2019. Springer Proceedings in Mathematics & Statistics, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-43469-4_18
DOI: https://doi.org/10.1007/978-3-030-43469-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43468-7
Online ISBN: 978-3-030-43469-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)