Abstract
The problem of incomplete data in data mining is well known. Many solutions for dealing with missing values in various knowledge discovery tasks have been presented and discussed in the literature. In the area of association rules, the problem has been studied mainly in the context of relational data. However, the methods proposed for incomplete relational databases cannot be easily adapted to incomplete transactional data. In this paper we introduce postulates of a statistically justified approach to discovering rules from incomplete transactional data and present a new approach to this problem that satisfies the postulates.
This research was supported by grant No. 3 T11C 002 29 from the Polish Ministry of Education and Science.
© 2007 Springer Berlin Heidelberg
Protaziuk, G., Rybinski, H. (2007). Discovering Association Rules in Incomplete Transactional Databases. In: Peters, J.F., Skowron, A., Düntsch, I., Grzymała-Busse, J., Orłowska, E., Polkowski, L. (eds) Transactions on Rough Sets VI. Lecture Notes in Computer Science, vol 4374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71200-8_17
DOI: https://doi.org/10.1007/978-3-540-71200-8_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71198-8
Online ISBN: 978-3-540-71200-8
eBook Packages: Computer Science, Computer Science (R0)