On Two Kinds of Dataset Decomposition

  • Pavel Emelyanov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10861)

Abstract

We consider the Cartesian decomposition of datasets, i.e., finding datasets whose unordered Cartesian product yields the source set, and a natural generalization of this decomposition. In terms of relational databases, this means reversing the SQL CROSS JOIN and INNER JOIN operators (the latter equipped with a test verifying the equality of one table's attribute to another table's attribute). First, we outline a polytime algorithm for computing the Cartesian decomposition. Then we describe a polytime algorithm for computing a generalized decomposition based on the Cartesian decomposition. Some applications and related problems are discussed.
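To make the notion of reversing a CROSS JOIN concrete, the following Python sketch checks whether a small table splits into two column groups whose cross join reproduces it. It is an illustration only: it searches column bipartitions by brute force, which is exponential in the number of columns, and it is not the polytime algorithm the paper outlines. The helper names `project`, `is_cartesian_split`, and `find_cartesian_split`, as well as the sample table, are hypothetical.

```python
from itertools import combinations


def project(rows, cols):
    """Distinct sub-tuples of `rows` restricted to the column indices `cols`."""
    return {tuple(row[c] for c in cols) for row in rows}


def is_cartesian_split(rows, left, right):
    """True if `rows` equals the cross join of its projections on `left` and `right`.

    `left` and `right` must partition the column indices. Every row already lies
    in the product of the two projections, so equal sizes imply equal sets.
    """
    rows = set(rows)
    return len(rows) == len(project(rows, left)) * len(project(rows, right))


def find_cartesian_split(rows):
    """Brute-force search over column bipartitions (assumption: illustration only)."""
    rows = set(rows)
    if not rows:
        return None
    n = len(next(iter(rows)))
    for k in range(1, n // 2 + 1):
        for left in combinations(range(n), k):
            right = tuple(c for c in range(n) if c not in left)
            if is_cartesian_split(rows, left, right):
                return left, right
    return None


# Hypothetical sample: columns 0 and 1 vary together, column 2 is independent.
table = {
    (1, "a", "x"),
    (1, "a", "y"),
    (2, "b", "x"),
    (2, "b", "y"),
}
split = find_cartesian_split(table)
print(split)
```

For the sample table, the search reports that column 2 can be separated from columns 0 and 1: the table is the cross join of a two-row table over column 2 and a two-row table over columns 0 and 1. Because column order is ignored when comparing sizes, this matches the "unordered" product in the abstract.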

Keywords

Data analysis, Databases, Decision tables, Decomposition, Knowledge discovery, Functional dependency, Compactification, Optimization of boolean functions

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. A.P. Ershov Institute of Informatics Systems, Novosibirsk, Russia
  2. Novosibirsk State University, Novosibirsk, Russia
