Subdividing Long-Running, Variable-Length Analyses Into Short, Fixed-Length BOINC Workunits

Abstract

We describe a scheme for subdividing long-running, variable-length analyses into short, fixed-length BOINC workunits, using phylogenetic analyses as an example. Fixed-length workunits decrease variance in analysis runtime, improve overall system throughput, and make BOINC a more useful resource for analyses that require a relatively fast turnaround time, such as the phylogenetic analyses submitted by users of the GARLI web service at molecularevolution.org. Additionally, we explain why these changes will benefit volunteers who contribute their processing power to BOINC projects, such as the Lattice BOINC Project (http://boinc.umiacs.umd.edu). Our results, which demonstrate the advantages of relatively short workunits, should be of general interest to anyone who develops and deploys an application on the BOINC platform.

Author information

Corresponding author

Correspondence to Adam L. Bazinet.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Bazinet, A.L., Cummings, M.P. Subdividing Long-Running, Variable-Length Analyses Into Short, Fixed-Length BOINC Workunits. J Grid Computing 14, 429–441 (2016). https://doi.org/10.1007/s10723-015-9348-5

Keywords

  • BOINC
  • Volunteer computing
  • Grid computing
  • Scheduling
  • Checkpointing
  • GARLI
  • Phylogenetics
  • Maximum likelihood