VDDA: automatic visualization-driven data aggregation in relational databases

Jugel, Uwe; Jerzak, Zbigniew; Hackenbroich, Gregor; Markl, Volker

doi:10.1007/s00778-015-0396-z

Special Issue Paper
Published: 06 August 2015

Volume 25, pages 53–77 (2016)
The VLDB Journal

Uwe Jugel¹ (ORCID: orcid.org/0000-0003-0722-4544), Zbigniew Jerzak¹, Gregor Hackenbroich¹ & Volker Markl²

Abstract

Contemporary RDBMS-based systems for visualization of high-volume numerical data have difficulty coping with the hard latency requirements and high ingestion rates of interactive visualizations. Existing solutions for lowering the volume of large data sets disregard the spatial properties of visualizations, resulting in visualization errors. In this work, we introduce VDDA, a visualization-driven data aggregation that models visual aggregation at the pixel level as data aggregation at the query level. Based on the M4 aggregation for producing pixel-perfect line charts from highly reduced data subsets, we define a complete set of data reduction operators that simulate the overplotting behavior of the most frequently used chart types. Relying only on relational algebra and common data aggregation functions, our approach is generic and applicable to any visualization system that consumes data stored in relational databases. We demonstrate our visualization-driven data aggregation using real-world data sets from high-tech manufacturing, stock markets, and sports analytics, reducing data volumes by up to two orders of magnitude while preserving pixel-perfect visualizations, as would be produced from the raw data.

Notes

We use the relational algebra notations \(\pi \) for projection, \(\sigma \) for selection, and \(_{[GroupFunction]^{*}}G_{[Aggregation]^+}\) or \(G_{[GroupKey|Aggregation]^+}\) for aggregation.
Given w equally sized groups of a continuous range, we obtain \(2\cdot w\) equally sized groups by adding intersections in the center of each group. The original intersections of the value range and thus their corresponding first and last tuples are still part of the query result. Similarly, the original min and max tuples become min or max tuples in the new subgroups.
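
The second note can be illustrated with a minimal Python sketch. It assumes an M4-style reduction that keeps the first, last, min, and max tuple of each group, which are the four tuple kinds the note names. The function name `m4_reduce`, the integer timestamps, the half-open range, the unique-timestamp requirement, and the tie-breaking rule are assumptions made for illustration; the paper itself expresses the aggregation in relational algebra, not in Python.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, float]  # (t, v): integer timestamp and numeric value


def m4_reduce(points: List[Point], t_start: int, t_end: int, w: int) -> Set[Point]:
    """Keep the first, last, min and max tuple of each of w equal-width groups.

    Assumptions of this sketch: timestamps are unique integers, the groups
    partition the half-open range [t_start, t_end), and ties on the value
    are broken by the earliest timestamp.
    """
    groups: Dict[int, List[Point]] = {}
    for t, v in points:
        if t_start <= t < t_end:
            # Integer arithmetic keeps the group boundaries exact.
            k = (t - t_start) * w // (t_end - t_start)
            groups.setdefault(k, []).append((t, v))

    selected: Set[Point] = set()
    for members in groups.values():
        members.sort()  # order by timestamp
        selected.add(members[0])                                 # first tuple
        selected.add(members[-1])                                # last tuple
        selected.add(min(members, key=lambda p: (p[1], p[0])))   # min value
        selected.add(max(members, key=lambda p: (p[1], -p[0])))  # max value
    return selected


# Synthetic series, used only to exercise the sketch.
series = [(t, float((t * 37) % 101)) for t in range(1000)]

coarse = m4_reduce(series, 0, 1000, 10)  # w groups
fine = m4_reduce(series, 0, 1000, 20)    # 2*w groups

# Every tuple kept with w groups is also kept with 2*w groups.
print(coarse.issubset(fine))
```

Under these assumptions the subset check should hold for any input: halving each group keeps the original group boundaries, so the original first and last tuples remain boundary tuples, and each original min or max tuple is still the min or max of the subgroup that contains it.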

Author information

Authors and Affiliations

SAP SE, Walldorf/Dresden, Germany
Uwe Jugel, Zbigniew Jerzak & Gregor Hackenbroich
Technische Universität Berlin, Berlin, Germany
Volker Markl

Corresponding author

Correspondence to Uwe Jugel.

About this article

Cite this article

Jugel, U., Jerzak, Z., Hackenbroich, G. et al. VDDA: automatic visualization-driven data aggregation in relational databases. The VLDB Journal 25, 53–77 (2016). https://doi.org/10.1007/s00778-015-0396-z

Received: 30 December 2014
Revised: 11 April 2015
Accepted: 12 July 2015
Published: 06 August 2015
Issue Date: February 2016
DOI: https://doi.org/10.1007/s00778-015-0396-z
