Knowledge Discovery with Words Using Cartesian Granule Features: An Analysis for Classification Problems

  • James G. Shanahan
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 95)

Abstract

Cartesian granule features were originally introduced to address some of the shortcomings of existing forms of knowledge representation, such as decomposition error and lack of transparency, and to enable the modelling with words paradigm through related learning algorithms. This chapter presents a detailed analysis of the impact of granularity on Cartesian granule feature models that are learned from example data in the context of classification problems. The analysis provides insights into how to model problems effectively with Cartesian granule features under various levels of granulation, granule characterizations, granule dimensionalities and granule generation techniques. Other modelling with words approaches, such as the data browser [1, 2] and fuzzy probabilistic decision trees [3], are also examined and compared. In addition, this chapter provides a useful platform for understanding many other learning algorithms that may or may not explicitly manipulate fuzzy events. For example, it is shown that a naive Bayes classifier is equivalent to a crisp Cartesian granule feature classifier under certain conditions.
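The last point can be made concrete with a small sketch. The abstract does not state the conditions under which the equivalence holds, so the code below is only an illustration under stated assumptions: each feature is one-dimensional, its granules are crisp, disjoint intervals, and class-conditional granule probabilities are estimated by Laplace-smoothed counts. Under those assumptions the model is an ordinary naive Bayes classifier over discretised features. The names `granule_index`, `train`, `predict` and the cut points are hypothetical and are not taken from the chapter.

```python
import math
from collections import Counter


def granule_index(x, cuts):
    """Index of the crisp granule (interval) that contains x.

    `cuts` are sorted interior cut points, so there are len(cuts) + 1 granules.
    """
    for i, cut in enumerate(cuts):
        if x < cut:
            return i
    return len(cuts)


def train(rows, labels, cuts_per_feature):
    """Count, per class and per feature, how often each granule occurs."""
    class_counts = Counter(labels)
    granule_counts = {
        c: [Counter() for _ in cuts_per_feature] for c in class_counts
    }
    for row, c in zip(rows, labels):
        for j, x in enumerate(row):
            granule_counts[c][j][granule_index(x, cuts_per_feature[j])] += 1
    return class_counts, granule_counts


def predict(row, class_counts, granule_counts, cuts_per_feature):
    """Return the class with the highest log posterior under naive Bayes."""
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior
        for j, x in enumerate(row):
            cuts = cuts_per_feature[j]
            count = granule_counts[c][j][granule_index(x, cuts)]
            # Laplace smoothing over the len(cuts) + 1 granules of feature j
            score += math.log((count + 1) / (n_c + len(cuts) + 1))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy data (assumed for illustration): two features, two crisp granules each.
rows = [[0.1, 5.0], [0.2, 6.0], [0.9, 1.0], [0.8, 2.0]]
labels = ["a", "a", "b", "b"]
cuts = [[0.5], [3.0]]
model = train(rows, labels, cuts)
print(predict([0.15, 5.5], *model, cuts))
```

Replacing the crisp intervals with overlapping fuzzy sets would mean a value contributes partial counts to several granules, which is where this sketch stops and the chapter's analysis of granule characterization begins.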

References

  1. Baldwin, J. F., Martin, T. P., and Pilsworth, B. W. (1995). FRIL - Fuzzy and Evidential Reasoning in A.I. Research Studies Press (Wiley Inc.).
  2. Baldwin, J. F., and Martin, T. P. (1995). “Fuzzy Modelling in an Intelligent Data Browser.” In the proceedings of FUZZ-IEEE, Yokohama, Japan, 1171–1176.
  3. Baldwin, J. F., Lawry, J., and Martin, T. P. (1997). “Mass assignment fuzzy ID3 with applications.” In the proceedings of Fuzzy Logic: Applications and Future Directions Workshop, London, UK, 278–294.
  4. Quinlan, J. R. (1986). “Induction of Decision Trees”, Machine Learning, 1 (1): 81–106.
  5. Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
  6. Ruspini, E. H. (1969). “A New Approach to Clustering”, Inform. Control, 15 (1): 22–32.
  7. Zadeh, L. A. (1994). “Soft Computing and Fuzzy Logic”, IEEE Software, 11 (6): 48–56.
  8. Zadeh, L. A. (1996). “Fuzzy Logic = Computing with Words”, IEEE Transactions on Fuzzy Systems, 4 (2): 103–111.
  9. Shanahan, J. G. (2000). Soft computing for knowledge discovery: Introducing Cartesian granule features. Kluwer Academic Publishers, Boston.
  10. Baldwin, J. F., Martin, T. P., and Shanahan, J. G. (1998). “Aggregation in Cartesian granule feature models.” In the proceedings of IPMU, Paris, 6.
  11. Shanahan, J. G. (1998). “Cartesian Granule Features: Knowledge Discovery of Additive Models for Classification and Prediction”, PhD Thesis, Dept. of Engineering Mathematics, University of Bristol, Bristol, UK.
  12. Baldwin, J. F. (1993). “Evidential Support logic, FRIL and Case Based Reasoning”, Int. J. of Intelligent Systems, 8 (9): 939–961.
  13. Baldwin, J. F., Lawry, J., and Martin, T. P. (1996). “Efficient Algorithms for Semantic Unification.” In the proceedings of IPMU, Granada, Spain, 527–532.
  14. Lindley, D. V. (1985). Making decisions. John Wiley, Chichester.
  15. Kohavi, R., and John, G. H. (1997). “Wrappers for feature selection”, Artificial Intelligence, 97: 273–324.
  16. Baldwin, J. F. (1995). “Machine Intelligence using Fuzzy Computing.” In the proceedings of ACRC Seminar (November), University of Bristol.
  17. Miller, G. A. (1956). “The magical number seven, plus or minus two: some limits on our capacity to process information”, Psychological Review, 63: 81–97.
  18. Bezdek, J. C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.
  19. Kohonen, T. (1984). Self-Organisation and Associative Memory. Springer-Verlag, Berlin.
  20. Sugeno, M., and Yasukawa, T. (1993). “A Fuzzy Logic Based Approach to Qualitative Modelling”, IEEE Trans on Fuzzy Systems, 1 (1): 7–31.
  21. Zadeh, L. A. (1994). “Soft computing”, LIFE Seminar, LIFE Laboratory, Yokohama, Japan (February, 24), published in SOFT Journal, 6: 1–10.
  22. Baldwin, J. F., Martin, T. P., and Shanahan, J. G. (1997). “Structure identification of fuzzy Cartesian granule feature models using genetic programming.” In the proceedings of IJCAI Workshop on Fuzzy Logic in Artificial Intelligence, Nagoya, Japan, 1–11.
  23. Silverman, B. W. (1986). Density estimation for statistics and data analysis. Chapman and Hall, New York.
  24. Baldwin, J. F., and Pilsworth, B. W. (1997). “Genetic Programming for Knowledge Extraction of Fuzzy Rules.” In the proceedings of Fuzzy Logic: Applications and Future Directions Workshop, London, UK, 238–251.
  25. Baldwin, J. F., and Martin, T. P. (1999). “Basic concepts of a fuzzy logic data browser with applications”, Report No. ITRC 250, Dept. of Engineering Maths, University of Bristol.
  26. Weiss, S. M., and Indurkhya, N. (1998). Predictive data mining: a practical guide. Morgan Kaufmann.
  27. Zell, A., Mamier, G., Vogt, M., and Mache, N. (1995). SNNS (Stuttgart Neural Network Simulator) Version 4.1. Institute for Parallel and Distributed High Performance Systems (IPVR), Applied Computer Science, University of Stuttgart, Stuttgart, Germany.
  28. Moller, M. F. (1993). “A scaled conjugate gradient algorithm for fast supervised learning”, Neural Networks, 6: 525–533.
  29. Shanahan, J. G. (2000). “A comparison between naive Bayes classifiers and product Cartesian granule feature models”, Report No. In preparation, XRCE.
  30. Breiman, L. (1996). “Bagging predictors”, Machine Learning, 24 (2): 123–140.
  31. Baldwin, J. F. (1992). “Fuzzy and Probabilistic Uncertainties”, In Encyclopaedia of AI, 2nd ed., Shapiro, ed., 528–537.
  32. Baldwin, J. F. (1991). “Combining evidences for evidential reasoning”, International Journal of Intelligent Systems, 6 (6): 569–616.
  33. Sudkamp, T. (1992). “On probability-possibility transformation”, Fuzzy Sets and Systems, 51: 73–81.
  34. Zadeh, L. A. (1968). “Probability Measures of Fuzzy Events”, Journal of Mathematical Analysis and Applications, 23: 421–427.
  35. Dubois, D., and Prade, H. (1983). “Unfair coins and necessary measures: towards a possibilistic interpretation of histograms”, Fuzzy Sets and Systems, 10: 15–20.
  36. Baldwin, J. F. (1991). “A Theory of Mass Assignments for Artificial Intelligence”, In IJCAI ’91 Workshops on Fuzzy Logic and Fuzzy Control, Sydney, Australia, Lecture Notes in Artificial Intelligence, A. L. Ralescu, ed., 22–34.
  37. Klir, G. J. (1990). “A principle of uncertainty and information invariance”, International Journal of General Systems, 17 (2, 3): 249–275.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • James G. Shanahan
    • 1
  1. Grenoble Laboratory, Xerox Research Centre Europe (XRCE), Meylan, France