VisualSynth: Democratizing Data Science in Spreadsheets

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track (ECML PKDD 2020)

Abstract

We introduce VisualSynth, a framework that aims to democratize data science by enabling naive end-users to specify the data science tasks that match their needs. In VisualSynth, the user and the spreadsheet application interact by highlighting parts of the data with colors. The colors define a partial specification of a data science task (such as data wrangling or clustering), which is then completed and solved automatically using artificial intelligence techniques. The user can interactively refine the specification until she is satisfied with the result.
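To make the idea of colors acting as a partial specification more concrete, the following is a minimal sketch and not the VisualSynth implementation. The cell encoding, the function names `partial_spec_from_colors` and `guess_task`, and the heuristic that maps a color pattern to a task are all assumptions introduced purely for illustration.

```python
from collections import defaultdict


def partial_spec_from_colors(colors):
    """Group colored cells by color.

    `colors` is a hypothetical encoding: a dict mapping (row, col)
    coordinates to a color name chosen by the user. Each color group is
    treated as one piece of the partial specification.
    """
    groups = defaultdict(list)
    for cell, color in colors.items():
        groups[color].append(cell)
    return {color: sorted(cells) for color, cells in groups.items()}


def guess_task(spec, n_cols):
    """Toy heuristic (an assumption, not the rule used by VisualSynth).

    - A color confined to one column is read as a prediction target.
    - A color covering complete rows is read as a hint that those rows
      belong to the same cluster.
    - Anything else is left unresolved for the user to refine.
    """
    hints = {}
    for color, cells in spec.items():
        rows = {r for r, _ in cells}
        cols = {c for _, c in cells}
        if len(cols) == 1:
            hints[color] = ("predict_column", next(iter(cols)))
        elif len(cells) == len(rows) * n_cols:
            hints[color] = ("same_cluster", sorted(rows))
        else:
            hints[color] = ("unresolved", None)
    return hints


# Example: a two-column sheet where rows 0 and 1 are fully blue and two
# cells of column 1 are green.
example_colors = {
    (0, 0): "blue", (0, 1): "blue",
    (1, 0): "blue", (1, 1): "blue",
    (3, 1): "green", (4, 1): "green",
}
example_spec = partial_spec_from_colors(example_colors)
example_hints = guess_task(example_spec, n_cols=2)
print(example_hints)
```

In this sketch, refining the specification interactively would amount to recoloring cells and calling the two functions again; the actual system completes and solves the task with learned models rather than a fixed rule of this kind.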

This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No [694980] SYNTH: Synthesising Inductive Data Models).

Notes

  1. Demonstration video: https://youtu.be/df6JgHl28Vw.

Author information

Corresponding author

Correspondence to Clément Gautrais.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Gautrais, C. et al. (2021). VisualSynth: Democratizing Data Science in Spreadsheets. In: Dong, Y., Ifrim, G., Mladenić, D., Saunders, C., Van Hoecke, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12461. Springer, Cham. https://doi.org/10.1007/978-3-030-67670-4_37

  • DOI: https://doi.org/10.1007/978-3-030-67670-4_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67669-8

  • Online ISBN: 978-3-030-67670-4

  • eBook Packages: Computer Science, Computer Science (R0)
