Multidimensional Prediction Models When the Resolution Context Changes

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9285)


Multidimensional data is systematically analysed at multiple granularities by applying aggregate and disaggregate operators (e.g., by the use of OLAP tools). For instance, in a supermarket we may want to predict sales of tomatoes for next week, but we may also be interested in predicting sales for all vegetables (higher up in the product hierarchy) for next Friday (lower down in the time dimension). While the domain and data are the same, the operating context is different. We explore several approaches for multidimensional data when predictions have to be made at different levels (or contexts) of aggregation. One method relies on the same resolution, another approach aggregates predictions bottom-up, a third approach disaggregates predictions top-down and a final technique corrects predictions using the relation between levels. We show how these strategies behave when the resolution context changes, using several machine learning techniques in four application domains.
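The four strategies named in the abstract can be sketched on a toy product hierarchy. This is only an illustrative reading, not the authors' implementation: the sales figures, the function names and the median "model" are all hypothetical, and the correction step (rescaling low-level predictions so that they add up to the high-level prediction) is one plausible interpretation of "using the relation between levels".

```python
from statistics import median

# Hypothetical weekly sales for three products in the same category
# (illustrative numbers only, not data from the paper).
sales = {
    "tomato": [10, 12, 11, 30],
    "lettuce": [5, 7, 6, 6],
    "onion": [8, 8, 20, 9],
}


def predict(series):
    """Stand-in model (an assumption): the median of past values."""
    return median(series)


def aggregate(series_by_item):
    """Roll item-level series up to the parent level, summing per period."""
    return [sum(values) for values in zip(*series_by_item.values())]


def same_level(series_by_item):
    """Learn and predict directly at the (aggregated) target resolution."""
    return predict(aggregate(series_by_item))


def bottom_up(series_by_item):
    """Predict each item separately, then sum the predictions."""
    return sum(predict(series) for series in series_by_item.values())


def top_down(series_by_item):
    """Predict at the parent level, then split by historical shares."""
    parent = same_level(series_by_item)
    totals = {item: sum(series) for item, series in series_by_item.items()}
    grand_total = sum(totals.values())
    return {item: parent * total / grand_total for item, total in totals.items()}


def corrected(series_by_item):
    """Rescale item-level predictions so they add up to the parent prediction."""
    parent = same_level(series_by_item)
    low = {item: predict(series) for item, series in series_by_item.items()}
    factor = parent / sum(low.values())
    return {item: value * factor for item, value in low.items()}


category_same_level = same_level(sales)
category_bottom_up = bottom_up(sales)
items_top_down = top_down(sales)
items_corrected = corrected(sales)
```

With this toy data the two category-level predictions disagree (32.0 at the same resolution versus 26.0 bottom-up), because the median of a sum is not the sum of the medians. With a linear stand-in model such as the mean, the two would coincide.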


Multidimensional data, Operating context, Aggregation, Disaggregation, OLAP cubes, Quantification



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. DSIC, Universitat Politècnica de València, València, Spain
