Enhancing the Interactive Visualisation of a Data Preparation Tool from in-Memory Fitting to Big Data Sets

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 394)

Abstract

To derive reliable insights or make evidence-based decisions, data must first be assessed against, and brought up to, a minimum level of quality, preferably by those who publish the data or, alternatively, by those who prepare it for analysis and develop specific analytics. Much of the (open) data shared by governments and other institutions, or crowdsourced, is in tabular format, and its volume and size are increasing rapidly. This paper presents the challenges faced and the solutions adopted while evolving the web-based graphical user interface (GUI) of a tabular data preparation tool from data sets that fit in memory to Big Data sets. Traditional standalone processing and rendering solutions are no longer usable in a Big Data context. We report on the approach adopted to asynchronously pre-compute the visualisations required by the tool, together with the visualisation aggregation strategies applied. Implementing this approach has allowed us to overcome web browsers’ client-side data handling limitations and to avoid information overload when the granular information charts of our existing in-memory data preparation tool are used with Big Data sets. The developed solution provides the user with an acceptable GUI interaction time.
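
As an illustration only, the general idea of pre-computing an aggregated visualisation asynchronously might be sketched as follows. This is not the implementation described in the paper: the function names `binned_counts` and `precompute_async`, the fixed-width binning, and the use of a thread pool are all assumptions made for the example. The point it shows is that the data is scanned chunk by chunk on the server side and only a bounded number of bin counts would ever need to reach the browser.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Iterable, List


def binned_counts(chunks: Iterable[Iterable[float]],
                  lo: float, hi: float, n_bins: int) -> List[int]:
    """Aggregate a numeric column into n_bins fixed-width bins.

    Hypothetical helper: chunks are scanned one at a time, so the
    whole column never has to fit in memory.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for chunk in chunks:
        for value in chunk:
            if value < lo or value > hi:
                continue  # ignore values outside the plotted range
            # The maximum value falls into the last bin.
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def precompute_async(executor: ThreadPoolExecutor,
                     chunks: Iterable[Iterable[float]],
                     lo: float, hi: float, n_bins: int) -> Future:
    """Schedule the aggregation in the background (assumed design).

    The caller gets a Future immediately and can fetch the chart
    data later without blocking the user interface.
    """
    return executor.submit(binned_counts, chunks, lo, hi, n_bins)


if __name__ == "__main__":
    data_chunks = [[0.0, 1.0, 2.5], [9.9, 10.0, 5.0]]
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = precompute_async(pool, data_chunks, 0.0, 10.0, 4)
        print(future.result())
```

Whatever the size of the input, the chart payload in this sketch is a list of `n_bins` integers, which is the property that keeps client-side rendering and the amount of information shown to the user bounded.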

Acknowledgments

This work was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 727721 (MIDAS).

This work was supported by the Gipuzkoan Science, Technology and Innovation Network Programme funding of the HIDRA project.

Author information

Corresponding author

Correspondence to Gorka Epelde.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Epelde, G., Álvarez, R., Beristain, A., Arrúe, M., Arangoa, I., Rankin, D. (2020). Enhancing the Interactive Visualisation of a Data Preparation Tool from in-Memory Fitting to Big Data Sets. In: Abramowicz, W., Klein, G. (eds) Business Information Systems Workshops. BIS 2020. Lecture Notes in Business Information Processing, vol 394. Springer, Cham. https://doi.org/10.1007/978-3-030-61146-0_22

  • DOI: https://doi.org/10.1007/978-3-030-61146-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61145-3

  • Online ISBN: 978-3-030-61146-0

  • eBook Packages: Computer Science, Computer Science (R0)
