Abstract
We introduce VisualSynth, a framework that aims to democratize data science by enabling naive end-users to specify the data science tasks that match their needs. In VisualSynth, the user and the spreadsheet application interact by highlighting parts of the data with colors. These colors define a partial specification of a data science task (such as data wrangling or clustering), which is then completed and solved automatically using artificial intelligence techniques. The user can interactively refine the specification until she is satisfied with the result.
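To make the idea of a color-based partial specification concrete, the following is a minimal sketch and not VisualSynth's actual method. It assumes a single numeric column and treats each user-chosen color as a cluster label. The function name `complete_coloring`, the data, and the nearest-centroid rule are all hypothetical choices made for illustration.

```python
from statistics import mean


def complete_coloring(values, colors):
    """Give every uncolored row the color of the nearest color centroid.

    values: list of numbers, one per spreadsheet row (hypothetical data).
    colors: dict mapping row index -> color name for the rows the user
            highlighted; this is the partial specification.
    """
    if not colors:
        raise ValueError("at least one row must be colored")

    # Compute one centroid per color from the highlighted rows only.
    groups = {}
    for row, color in colors.items():
        groups.setdefault(color, []).append(values[row])
    centroids = {color: mean(vals) for color, vals in groups.items()}

    completed = {}
    for row, value in enumerate(values):
        if row in colors:
            # Rows colored by the user keep their color.
            completed[row] = colors[row]
        else:
            # Illustrative assumption: pick the closest centroid.
            completed[row] = min(
                centroids, key=lambda c: abs(value - centroids[c])
            )
    return completed


values = [1.0, 1.2, 0.9, 5.1, 4.8, 5.3]
user_colors = {0: "blue", 3: "green"}  # the user highlights two rows
result = complete_coloring(values, user_colors)
```

Refinement in this sketch would amount to adding or changing entries in `user_colors` and calling the function again, loosely mirroring the interactive loop described in the abstract.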
This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No [694980] SYNTH: Synthesising Inductive Data Models).
Notes
1. Demonstration video: https://youtu.be/df6JgHl28Vw
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Gautrais, C. et al. (2021). VisualSynth: Democratizing Data Science in Spreadsheets. In: Dong, Y., Ifrim, G., Mladenić, D., Saunders, C., Van Hoecke, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12461. Springer, Cham. https://doi.org/10.1007/978-3-030-67670-4_37
DOI: https://doi.org/10.1007/978-3-030-67670-4_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67669-8
Online ISBN: 978-3-030-67670-4
eBook Packages: Computer Science (R0)