
A reduction algorithm meeting users’ requirements

Journal of Computer Science and Technology

Abstract

A database generally encompasses many kinds of knowledge and is shared by many users, and different users may prefer different kinds of knowledge. It is therefore important for a data mining algorithm to output specific knowledge according to a user's current requirement (preference). We call this kind of data mining requirement-oriented knowledge discovery (ROKD). When rough set theory is used in data mining, the ROKD problem is to find a reduct, and the corresponding rules, that interest the user. Since reducts and rules are generated in the same way, this paper is concerned only with how to find a particular reduct. The user's requirement is described by an order over the attributes, called an attribute order, which expresses the importance of the attributes for the user: more important attributes precede less important ones. The problem then becomes how to find a reduct that includes the attributes appearing early in the attribute order. An approach to this problem is proposed and its completeness for reducts is proved. Three kinds of attribute order are then developed to describe various user requirements.
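The idea of steering reduct selection with an attribute order can be illustrated with a small sketch. The code below is not the paper's algorithm (which builds on the discernibility-matrix method); it is a simplified greedy illustration with hypothetical names (`decision_table`, `discerns`, `ordered_reduct`): attributes are added most-important first until the subset discerns every pair of objects with different decisions, and then redundant attributes are pruned least-important first, so preferred attributes tend to survive into the final reduct.

```python
# A toy decision table: each row is (condition attribute values, decision).
# Condition attributes are indexed 0..3. All names here are illustrative.
decision_table = [
    ((0, 1, 1, 0), "yes"),
    ((0, 1, 0, 0), "yes"),
    ((1, 0, 1, 1), "no"),
    ((0, 1, 0, 1), "no"),
    ((1, 0, 0, 1), "yes"),
]

def discerns(attrs, table):
    """True if the attribute subset distinguishes every pair of objects
    that carry different decisions (i.e., it preserves consistency)."""
    for i in range(len(table)):
        for j in range(i + 1, len(table)):
            (xi, di), (xj, dj) = table[i], table[j]
            if di != dj and all(xi[a] == xj[a] for a in attrs):
                return False
    return True

def ordered_reduct(order, table):
    """Greedily add attributes, most important first, until the subset
    discerns all decision-relevant pairs; then drop redundant attributes,
    least important first, so preferred attributes tend to remain."""
    chosen = []
    for a in order:                 # add attributes in the user's order
        chosen.append(a)
        if discerns(chosen, table):
            break
    for a in reversed(chosen):      # prune least-important attributes first
        rest = [b for b in chosen if b != a]
        if rest and discerns(rest, table):
            chosen = rest
    return chosen

# The user ranks attribute 0 highest, then 2, then 1, then 3.
print(ordered_reduct([0, 2, 1, 3], decision_table))  # → [0, 2, 3]
```

On this toy table a different attribute order can yield a different reduct (for instance, the order `[3, 1, 2, 0]` produces the reduct `{1, 2, 3}`), which is exactly the order-dependence that lets a reduct be matched to a user's preference.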



Author information

Corresponding author

Correspondence to Zhao Kai.

Additional information

This work is supported by the National Key Project for Prime Research on Image, Speech, Natural Language Understanding and Knowledge Mining (NKBRSF, Grant No. G1998030508).

ZHAO Kai received his B.S. degree from the Beijing Institute of Technology in 1993, and his Ph.D. degree from the Institute of Automation, the Chinese Academy of Sciences. His research interests are adaptive systems, genetic programming, and data mining.

WANG Jue is a professor of computer science and artificial intelligence at the Institute of Automation, the Chinese Academy of Sciences. His research interests include artificial neural networks, machine learning, and knowledge discovery in databases.


Cite this article

Zhao, K., Wang, J. A reduction algorithm meeting users’ requirements. J. Comput. Sci. & Technol. 17, 578–593 (2002). https://doi.org/10.1007/BF02948826
