Abstract
Focusing on novel database application scenarios in which datasets increasingly arise in uncertain and imprecise formats, in this paper we propose a framework for efficiently computing and querying multidimensional OLAP data cubes over probabilistic data, which capture such uncertain and imprecise data well. The models and algorithms supported by the framework are formally presented and described in detail. They build on well-understood statistical and probabilistic tools and converge on the definition of so-called probabilistic OLAP data cubes, the most prominent result of our research. Finally, we complete our analytical contribution by introducing an innovative Probability Distribution Function (PDF)-based approach for efficiently querying probabilistic OLAP data cubes.
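To make the idea of a cube cell that stores a distribution rather than a single number more concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes that each fact carries a discrete PDF over its measure, that facts are mutually independent, and that the aggregate is SUM, so the PDF of a cell is the convolution of the PDFs of its facts. All names (`convolve`, `build_sum_cube`, `range_sum_pdf`, `expected_value`) and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import product


def convolve(pdf_a, pdf_b):
    """Return the PDF of A + B for independent discrete random variables.

    Each PDF is a dict mapping a measure value to its probability.
    """
    out = defaultdict(float)
    for (va, pa), (vb, pb) in product(pdf_a.items(), pdf_b.items()):
        out[va + vb] += pa * pb
    return dict(out)


def build_sum_cube(facts):
    """Group facts by their dimension coordinates.

    Each resulting cell holds the PDF of the SUM of its facts
    (assumption: facts are independent of one another).
    """
    cube = {}
    for coords, pdf in facts:
        # {0: 1.0} is the PDF of the empty sum: value 0 with certainty.
        cube[coords] = convolve(cube.get(coords, {0: 1.0}), pdf)
    return cube


def range_sum_pdf(cube, predicate):
    """PDF of the SUM over every cell whose coordinates satisfy predicate."""
    result = {0: 1.0}
    for coords, pdf in cube.items():
        if predicate(coords):
            result = convolve(result, pdf)
    return result


def expected_value(pdf):
    """Mean of a discrete PDF."""
    return sum(value * prob for value, prob in pdf.items())


# Hypothetical facts: (region, year) coordinates and an uncertain measure.
facts = [
    (("north", 2009), {10: 0.5, 20: 0.5}),
    (("north", 2009), {5: 1.0}),
    (("south", 2009), {0: 0.25, 8: 0.75}),
]

cube = build_sum_cube(facts)
answer = range_sum_pdf(cube, lambda coords: coords[1] == 2009)
mean = expected_value(answer)
```

Here `answer` is itself a PDF, so a query can report an expected value, a variance, or the probability that the sum exceeds a threshold. Exact convolution grows quickly with the number of facts, which is one reason a practical system would need more compact representations than this sketch uses.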
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cuzzocrea, A., Gunopulos, D. (2010). Efficiently Computing and Querying Multidimensional OLAP Data Cubes over Probabilistic Relational Data. In: Catania, B., Ivanović, M., Thalheim, B. (eds) Advances in Databases and Information Systems. ADBIS 2010. Lecture Notes in Computer Science, vol 6295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15576-5_12
DOI: https://doi.org/10.1007/978-3-642-15576-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15575-8
Online ISBN: 978-3-642-15576-5
eBook Packages: Computer Science, Computer Science (R0)