Abstract
Supporting interactive database exploration (IDE) is a problem that has attracted considerable attention in recent years. Exploratory OLAP (On-Line Analytical Processing) is an important use case, where tools support the navigation and analysis of the most interesting data from the best possible perspectives. While many approaches have been proposed (query recommendation, reuse, steering, personalization, unexpected data recommendation), a recurrent problem is how to assess the effectiveness of an exploratory OLAP approach. In this paper we propose a benchmark framework to do so, relying on an extensible set of user-centric metrics that relate to the main dimensions of exploratory analysis. Specifically, we describe how to model and simulate user activity, how to formalize our metrics, and how to build exploratory tasks to properly evaluate an IDE system under test (SUT). To the best of our knowledge, this is the first proposal of such a benchmark. Our experiments are twofold: first, we evaluate the benchmark protocol and metrics on synthetic SUTs whose behavior is well known; second, we evaluate and compare two recent SUTs from the IDE literature with our benchmark. Finally, potential extensions toward an industry-strength benchmark are listed in the conclusion.
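The protocol summarized above (simulate a user exploring through a SUT, record the session, score it with user-centric metrics) can be sketched as follows. This is a minimal illustration under assumptions of our own: the class names (`SimulatedUser`, `RandomSUT`) and the `coverage` metric are hypothetical placeholders, not the paper's actual components.

```python
# Sketch of the benchmark loop: a simulated user issues queries, a
# system under test (SUT) recommends next queries, and a user-centric
# metric scores the resulting session. All names are illustrative.
import random

class SimulatedUser:
    """Crude stand-in for a user model: with some probability, the
    user follows the SUT's recommendation; otherwise explores freely."""
    def __init__(self, follow_prob=0.5, seed=0):
        self.follow_prob = follow_prob
        self.rng = random.Random(seed)

    def next_query(self, recommendation):
        if recommendation is not None and self.rng.random() < self.follow_prob:
            return recommendation                      # accept the suggestion
        return f"adhoc-{self.rng.randint(0, 9)}"       # ad hoc exploration

class RandomSUT:
    """A synthetic SUT whose behavior is fully known (it recommends
    uniformly at random), useful for calibrating the metrics."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def recommend(self, session):
        return f"rec-{self.rng.randint(0, 9)}"

def run_session(user, sut, length=20):
    """Alternate user queries and SUT recommendations for a fixed budget."""
    session, rec = [], None
    for _ in range(length):
        q = user.next_query(rec)
        session.append(q)
        rec = sut.recommend(session)
    return session

def coverage(session):
    """Example session-level metric: fraction of distinct queries,
    a rough proxy for how much of the data space was explored."""
    return len(set(session)) / len(session)

session = run_session(SimulatedUser(), RandomSUT())
print(f"session length={len(session)}, coverage={coverage(session):.2f}")
```

Comparing two SUTs then amounts to running the same simulated users against each system and comparing the metric scores over the collected sessions.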
Notes
1. More information on the benchmark can be found on its web page: http://www.info.univ-tours.fr/~marcel/benchmark.html.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Djedaini, M., Furtado, P., Labroche, N., Marcel, P., Peralta, V. (2017). Benchmarking Exploratory OLAP. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. Traditional - Big Data - Internet of Things. TPCTC 2016. Lecture Notes in Computer Science(), vol 10080. Springer, Cham. https://doi.org/10.1007/978-3-319-54334-5_5
Print ISBN: 978-3-319-54333-8
Online ISBN: 978-3-319-54334-5