Static Partitioning of Spreadsheets for Parallel Execution

  • Alexander Asp Bock
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11372)

Abstract

Spreadsheets are popular tools for end-user development and complex modelling but can suffer from poor performance. While end-users are usually domain experts, they are seldom IT professionals who can leverage today’s abundant multicore architectures to offset such poor performance. We present an iterative, greedy algorithm for automatically partitioning spreadsheets into load-balanced, acyclic groups of cells that can be scheduled to run on shared-memory multicore processors. A big-step cost semantics for the spreadsheet formula language is used to estimate work and guide partitioning. The algorithm does not require end-users to modify the spreadsheet in any way. We implement three extensions to the algorithm for further accelerating computation, two of which recognise common cell structures known as cell arrays that naturally express a degree of parallelism. To the best of our knowledge, no such automatic algorithm has previously been proposed for partitioning spreadsheets. We report a maximum 24-fold speed-up on 48 logical cores.
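
To make the idea of load-balanced, acyclic groups concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes the cell dependency graph and a per-cell work estimate are already available. The names `partition`, `deps`, `cost` and `k`, as well as the example costs, are hypothetical and chosen for illustration only. The sketch walks the cells in topological order and greedily closes a group once it reaches an equal share of the total estimated work. Because each group is a contiguous slice of a topological order, no group can depend on a later one, so the groups form an acyclic graph.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def partition(deps, cost, k):
    """Greedily split cells into at most k groups of roughly equal work.

    deps: maps each cell to the set of cells its formula reads (assumed input)
    cost: maps each cell to an estimated amount of work (assumed input)
    k:    desired number of groups, e.g. the number of cores
    """
    # Visit every cell after all cells it depends on.
    order = list(TopologicalSorter(deps).static_order())
    target = sum(cost[c] for c in order) / k  # ideal work per group

    groups, current, load = [], [], 0.0
    for cell in order:
        current.append(cell)
        load += cost[cell]
        # Close the group once it is full, keeping one slot for the rest.
        if load >= target and len(groups) < k - 1:
            groups.append(current)
            current, load = [], 0.0
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Hypothetical five-cell sheet with made-up work estimates.
    deps = {
        "A1": set(), "A2": set(),
        "B1": {"A1", "A2"},
        "B2": {"B1"},
        "C1": {"B1", "B2"},
    }
    cost = {"A1": 1, "A2": 1, "B1": 4, "B2": 2, "C1": 4}
    print(partition(deps, cost, 2))
```

A single greedy pass like this only balances work along one linear order of the cells. It does not expose parallelism between independent groups, which a real scheduler would also have to consider.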

Keywords

Spreadsheets, Partitioning, Parallelism

Notes

Acknowledgements

The author would like to thank Peter Sestoft and Florian Biermann for valuable insight and discussions during the development of this work, as well as Peter Sestoft and Holger Stadel Borum for proofreading.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Computer Science Department, IT University of Copenhagen, Copenhagen, Denmark
