VisualSynth: Democratizing Data Science in Spreadsheets

Gautrais, Clément; Dauxais, Yann; Kolb, Samuel; Jain, Arcchit; Kumar, Mohit; Teso, Stefano; Van Wolputte, Elia; Verbruggen, Gust; De Raedt, Luc

doi:10.1007/978-3-030-67670-4_37


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12461)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


Abstract

We introduce VisualSynth, a framework that aims to democratize data science by enabling naive end-users to specify the data science tasks that match their needs. In VisualSynth, the user and the spreadsheet application interact by highlighting parts of the data with colors. The colors define a partial specification of a data science task (such as data wrangling or clustering), which is then completed and solved automatically using artificial intelligence techniques. The user can interactively refine the specification until she is satisfied with the result.
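As a rough illustration of how colors could act as a partial specification for a clustering task, the sketch below is purely hypothetical and is not taken from VisualSynth itself. It assumes that each spreadsheet row is a tuple of numbers, that the user has colored only a few rows, and that the remaining rows are completed by assigning each one the color of the nearest centroid of the colored rows. The function name `complete_coloring` and the data are invented for this example; the actual system may complete the specification in an entirely different way.

```python
from collections import defaultdict
from math import dist


def complete_coloring(rows, colors):
    """Assign a color to every row, given colors for only some rows.

    rows   -- list of numeric feature tuples, one per spreadsheet row
    colors -- dict mapping row index -> color name (the user's partial spec)

    Hypothetical helper: a nearest-centroid rule stands in for whatever
    technique a real system would use to complete the specification.
    """
    if not colors:
        raise ValueError("at least one row must be colored")

    # Group the user-colored rows by color.
    groups = defaultdict(list)
    for index, color in colors.items():
        groups[color].append(rows[index])

    # Represent each color by the mean (centroid) of its colored rows.
    centroids = {
        color: tuple(sum(values) / len(members) for values in zip(*members))
        for color, members in groups.items()
    }

    # Keep the user's colors; give every other row the nearest centroid's color.
    completed = []
    for index, row in enumerate(rows):
        if index in colors:
            completed.append(colors[index])
        else:
            completed.append(
                min(centroids, key=lambda c: dist(row, centroids[c]))
            )
    return completed


# Five rows, of which the user has colored only two.
rows = [(1.0, 1.0), (1.2, 0.9), (8.0, 8.5), (7.9, 8.1), (0.8, 1.1)]
partial = {0: "blue", 2: "green"}
print(complete_coloring(rows, partial))
```

Under these assumptions, interactive refinement amounts to adding or changing an entry in `partial` (for example, recoloring a row the user disagrees with) and calling `complete_coloring` again.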

This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 694980, SYNTH: Synthesising Inductive Data Models).


Notes

1. Demonstration video: https://youtu.be/df6JgHl28Vw


Author information

Authors and Affiliations

Department of Computer Science, KU Leuven, Leuven, Belgium
Clément Gautrais, Yann Dauxais, Samuel Kolb, Arcchit Jain, Mohit Kumar, Elia Van Wolputte, Gust Verbruggen & Luc De Raedt
University of Trento, Trento, Italy
Stefano Teso


Corresponding author

Correspondence to Clément Gautrais.

Editor information

Editors and Affiliations

Microsoft Research, Redmond, WA, USA
Yuxiao Dong
University College Dublin, Dublin, Ireland
Georgiana Ifrim
Jožef Stefan Institute, Ljubljana, Slovenia
Dunja Mladenić
Amazon Alexa Knowledge, Cambridge, UK
Craig Saunders
Ghent University, Kortrijk, Belgium
Sofie Van Hoecke


About this paper

Cite this paper

Gautrais, C. et al. (2021). VisualSynth: Democratizing Data Science in Spreadsheets. In: Dong, Y., Ifrim, G., Mladenić, D., Saunders, C., Van Hoecke, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12461. Springer, Cham. https://doi.org/10.1007/978-3-030-67670-4_37


DOI: https://doi.org/10.1007/978-3-030-67670-4_37
Published: 25 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67669-8
Online ISBN: 978-3-030-67670-4
eBook Packages: Computer Science, Computer Science (R0)
