Discovering Association Rules in Incomplete Transactional Databases

Protaziuk, Grzegorz; Rybinski, Henryk

doi:10.1007/978-3-540-71200-8_17


Part of the book series: Lecture Notes in Computer Science (TRS, volume 4374)


Abstract

The problem of incomplete data in data mining is well known. Many solutions for dealing with missing values in various knowledge discovery tasks have been presented and discussed in the literature. In the area of association rules, the problem has been addressed mainly in the context of relational data. However, the methods proposed for incomplete relational databases cannot be easily adapted to incomplete transactional data. In this paper we introduce postulates for a statistically justified approach to discovering rules from incomplete transactional data and present a new approach to this problem that satisfies the postulates.

This research was supported by grant No. 3 T11C 002 29 from the Polish Ministry of Education and Science.



Author information

Authors and Affiliations

Institute of Computer Science, Warsaw University of Technology
Grzegorz Protaziuk & Henryk Rybinski


Editor information

James F. Peters, Andrzej Skowron, Ivo Düntsch, Jerzy Grzymała-Busse, Ewa Orłowska, Lech Polkowski


About this chapter

Cite this chapter

Protaziuk, G., Rybinski, H. (2007). Discovering Association Rules in Incomplete Transactional Databases. In: Peters, J.F., Skowron, A., Düntsch, I., Grzymała-Busse, J., Orłowska, E., Polkowski, L. (eds) Transactions on Rough Sets VI. Lecture Notes in Computer Science, vol 4374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71200-8_17


DOI: https://doi.org/10.1007/978-3-540-71200-8_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71198-8
Online ISBN: 978-3-540-71200-8
eBook Packages: Computer Science, Computer Science (R0)
