Enabling OLAP in mobile environments via intelligent data cube compression techniques

Abstract

The main limitations of handheld devices (small storage space, small display screens, intermittent WLAN connectivity, etc.) are often at odds with the need to query and browse information extracted from the enormous amounts of data accessible through the network. In this application scenario, data compression and summarization play a leading role: data in a lossy compressed format can be transmitted more efficiently than the original data and, with the compression ratio set accordingly, can be stored effectively on handheld devices. In this paper, we introduce a very effective compression technique for multidimensional data cubes, together with the system Hand-OLAP, which exploits this technique to allow handheld devices to extract and browse compressed two-dimensional OLAP views derived from multidimensional data cubes stored on a remote OLAP server located on the wired network. Hand-OLAP effectively and efficiently enables OLAP in mobile environments, and also broadens the potential of Decision Support Systems by taking advantage of the “naturally” decentralized nature of such environments. The system is based on the following idea: rather than querying the original multidimensional data cubes, it may be more convenient to generate a compressed OLAP view of them, store that view on the handheld device, and query it locally (off-line), thus obtaining approximate answers that are suitable for OLAP applications.
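The idea of querying a compressed view locally can be illustrated with a minimal sketch. It is not the compression technique introduced in the paper, which the abstract does not detail; as a stand-in, it assumes a simple uniform-grid summary in which each bucket stores only the sum of its cells, and it estimates range-sum queries by assuming values are spread evenly inside each bucket. The names `compress_view` and `approx_range_sum` are hypothetical.

```python
def compress_view(view, block):
    """Summarize a 2D view into block x block buckets (assumed scheme).

    Only the sum of each bucket is kept, so the summary is lossy.
    """
    rows, cols = len(view), len(view[0])
    sums = []
    for r0 in range(0, rows, block):
        row_sums = []
        for c0 in range(0, cols, block):
            total = sum(
                view[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            )
            row_sums.append(total)
        sums.append(row_sums)
    return {"rows": rows, "cols": cols, "block": block, "sums": sums}


def approx_range_sum(summary, r_lo, r_hi, c_lo, c_hi):
    """Estimate the sum over the inclusive range [r_lo..r_hi] x [c_lo..c_hi].

    Each bucket contributes its stored sum scaled by the fraction of its
    cells that the query covers (uniform-distribution assumption).
    """
    rows, cols, block = summary["rows"], summary["cols"], summary["block"]
    estimate = 0.0
    for i, row_sums in enumerate(summary["sums"]):
        b_r0 = i * block
        b_r1 = min(b_r0 + block, rows) - 1
        overlap_r = min(r_hi, b_r1) - max(r_lo, b_r0) + 1
        if overlap_r <= 0:
            continue  # query does not touch this row of buckets
        for j, total in enumerate(row_sums):
            b_c0 = j * block
            b_c1 = min(b_c0 + block, cols) - 1
            overlap_c = min(c_hi, b_c1) - max(c_lo, b_c0) + 1
            if overlap_c <= 0:
                continue  # query does not touch this bucket
            cells = (b_r1 - b_r0 + 1) * (b_c1 - b_c0 + 1)
            estimate += total * overlap_r * overlap_c / cells
    return estimate


# A 4x4 view compressed into four 2x2 buckets.
view = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
summary = compress_view(view, block=2)

# A query aligned with bucket boundaries is answered exactly.
aligned = approx_range_sum(summary, 0, 1, 0, 3)

# A query cutting through a bucket is only approximated
# (the exact answer for cells (0,0) and (0,1) is 3).
partial = approx_range_sum(summary, 0, 0, 0, 1)
```

In this sketch the summary holds 4 numbers instead of 16, and a larger `block` trades accuracy for a smaller footprint, which mirrors the role the abstract assigns to the compression ratio.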

Author information

Corresponding author

Correspondence to Alfredo Cuzzocrea.

About this article

Cite this article

Cuzzocrea, A., Furfaro, F. & Saccà, D. Enabling OLAP in mobile environments via intelligent data cube compression techniques. J Intell Inf Syst 33, 95–143 (2009). https://doi.org/10.1007/s10844-008-0065-4

  • DOI: https://doi.org/10.1007/s10844-008-0065-4

Keywords

  • Approximate query answering
  • Synopsis data structures
  • OLAP
  • Data management on mobile environments
  • Pervasive and ubiquitous computing