Automatically Extracting Class Diagrams from Spreadsheets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6183)

Abstract

The use of spreadsheets to capture information is widespread in industry. Spreadsheets can thus be a rich source of domain information. We propose to automatically extract this information and transform it into class diagrams. The resulting class diagram can be used by software engineers to understand, refine, or re-implement the spreadsheet’s functionality. To enable the transformation into class diagrams, we create a library of common spreadsheet usage patterns. These patterns are located in the spreadsheet using a two-dimensional parsing algorithm. The resulting parse tree is transformed and enriched with information from the library. We evaluate our approach on the spreadsheets from the EUSES Spreadsheet Corpus by comparing a subset of the generated class diagrams with reference class diagrams created manually.
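
As a rough illustration of the idea of locating a usage pattern in a grid of cells and turning it into a class, the following Python sketch may help. It is not the algorithm from the paper: it replaces the two-dimensional parsing algorithm and the pattern library with a brute-force scan for a single, hypothetical pattern (a title cell, a row of labels beneath it, and one or more data rows). All names (`ClassBox`, `match_titled_table`, `extract_classes`) are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ClassBox:
    """A class-diagram node: a class name plus attribute names."""
    name: str
    attributes: List[str] = field(default_factory=list)


def cell(grid: List[List[str]], r: int, c: int) -> str:
    """Return the cell at (r, c), or "" when outside the (possibly ragged) grid."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[r]):
        return grid[r][c]
    return ""


def is_label(value: str) -> bool:
    """Treat a non-empty, non-numeric cell as a label (an assumption of this sketch)."""
    if not value.strip():
        return False
    try:
        float(value)
        return False
    except ValueError:
        return True


def match_titled_table(
    grid: List[List[str]], top: int, left: int
) -> Optional[Tuple[ClassBox, int, int]]:
    """Match the hypothetical pattern: title cell, header row below, >= 1 data row."""
    title = cell(grid, top, left)
    if not is_label(title):
        return None
    # The header row is the run of consecutive labels directly below the title.
    width = 0
    while is_label(cell(grid, top + 1, left + width)):
        width += 1
    if width == 0:
        return None
    # Data rows continue while any cell under the header is non-empty.
    height = 2
    while any(cell(grid, top + height, left + k).strip() for k in range(width)):
        height += 1
    if height == 2:
        return None
    headers = [cell(grid, top + 1, left + k).strip() for k in range(width)]
    return ClassBox(title.strip(), headers), height, width


def extract_classes(grid: List[List[str]]) -> List[ClassBox]:
    """Scan the grid top-left to bottom-right, skipping cells already matched."""
    covered = set()
    found = []
    for r in range(len(grid)):
        for c in range(len(grid[r])):
            if (r, c) in covered:
                continue
            match = match_titled_table(grid, r, c)
            if match is None:
                continue
            box, height, width = match
            found.append(box)
            covered.update(
                (r + i, c + j) for i in range(height) for j in range(width)
            )
    return found
```

Calling `extract_classes` on a sheet whose first row holds `Customer`, whose second row holds `Name`, `City`, `Age`, and whose remaining rows hold data would yield one `ClassBox` named `Customer` with those three attributes. Marking matched cells as covered keeps the header row from being misread as the title of a second table.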

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hermans, F., Pinzger, M., van Deursen, A. (2010). Automatically Extracting Class Diagrams from Spreadsheets. In: D’Hondt, T. (eds) ECOOP 2010 – Object-Oriented Programming. ECOOP 2010. Lecture Notes in Computer Science, vol 6183. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14107-2_4

  • DOI: https://doi.org/10.1007/978-3-642-14107-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14106-5

  • Online ISBN: 978-3-642-14107-2

  • eBook Packages: Computer Science, Computer Science (R0)
