What Can Formal Concept Analysis Do for Data Warehouses?

  • Rokia Missaoui
  • Léonard Kwuida
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5548)

Abstract

Formal concept analysis (FCA) has been successfully used in several computer science fields, such as databases, software engineering, and information retrieval, and in many application domains, such as medicine, psychology, linguistics, and ecology. In data warehouses, users exploit data hypercubes (i.e., multi-way tables) mainly through online analytical processing (OLAP) techniques to extract useful information from data for decision support purposes.

Many topics have attracted researchers in the area of data warehousing: data warehouse design and multidimensional modeling, efficient cube materialization (pre-computation), physical data organization, query optimization and approximation, discovery-driven data exploration, and cube compression and mining. Recently, there has been increasing interest in applying or adapting data mining approaches and advanced statistical analysis techniques to extract knowledge (e.g., outliers, clusters, rules, closed n-sets) from multidimensional data. Such approaches and techniques include (but are not limited to) FCA, cluster analysis, principal component analysis, log-linear modeling, and non-negative multi-way array factorization. Because data cubes are generally large and high-dimensional, and because their cells contain consolidated (e.g., mean value), multidimensional, and temporal data, mining them raises challenging research issues. In this presentation, we will give an overview of related work and show how FCA theory (with possible extensions) can be used to extract valuable and actionable knowledge from data warehouses.

Keywords

Association Rule, Data Cube, Formal Concept Analysis, Approximate Query, Galois Lattice
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Rokia Missaoui (1)
  • Léonard Kwuida (1)
  1. Département d’informatique et d’ingénierie, Université du Québec en Outaouais, Gatineau (Québec), Canada
