Classifying Cinnamomums using rough sets classifier based on interval-discretization

  • Original Article
Plant Systematics and Evolution

Abstract

Classification, the task of assigning objects to one of several predefined categories, is a pervasive problem that encompasses many diverse applications. The decision tree classifier, a simple yet widely used classification technique, uses training data to derive decision rules; it can also handle continuous attributes by creating thresholds that split their value ranges into discrete intervals (Quinlan in Journal of Artificial Intelligence Research 4:77–90, 1996). Rough set theory (Pawlak in International Journal of Computer and Information Sciences 11:341–356, 1982; International Journal of Man-Machine Studies 20:469–483, 1984; Rough sets: theoretical aspects of reasoning about data. Kluwer, Dordrecht, 1991) has been applied to a wide variety of decision analysis problems for the extraction of rules from databases. This paper proposes a hybrid approach that combines the decision tree and rough sets classifiers and applies it to plant classification. The approach first uses a decision tree classifier (C4.5) as a preprocessing step to perform interval-discretization and then uses the rough set method to extract rules. It aims to find classification rules by analyzing lamina attributes (leaf stalk, leaf width, leaf length, length/width ratio) of Cinnamomum, which were gathered and measured by plant specialists in the field in Taiwan. A comparison with widely used algorithms (e.g., decision tree, multilayer perceptrons, naïve Bayes, and rough sets classifier) is carried out to show the advantages of the proposed approach. Finally, the approach is applied to test data in which the species are unknown, and the classification results are confirmed by consulting the relevant plant specialists.
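The two-stage idea in the abstract (threshold-based interval-discretization followed by rough set rule extraction) can be illustrated with a minimal sketch. It is not the authors' implementation: the cut points come from a single entropy-based split per attribute using plain information gain and midpoints, a simplification of what C4.5 does, and the rough set step only separates certain rules (lower approximation) from the boundary region. The function names, the species labels "A" and "B", and the leaf measurements are all hypothetical toy values, not data from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Cut point with the highest information gain, or None if no cut helps."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_cut = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if base - remainder > best_gain:
            best_gain = base - remainder
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint (a simplification)
    return best_cut


def discretize(rows, labels):
    """Replace each continuous value by 'low'/'high' relative to its attribute's cut."""
    cuts = [best_threshold([r[j] for r in rows], labels) for j in range(len(rows[0]))]
    coded = [
        tuple("any" if c is None else ("low" if v <= c else "high") for v, c in zip(r, cuts))
        for r in rows
    ]
    return cuts, coded


def certain_rules(coded, labels):
    """Group objects into indiscernibility classes on the coded attributes.

    Classes with a single decision value lie in a lower approximation and give
    certain rules; classes with mixed decisions form the boundary region.
    """
    blocks = defaultdict(set)
    for cond, lab in zip(coded, labels):
        blocks[cond].add(lab)
    certain = {cond: next(iter(labs)) for cond, labs in blocks.items() if len(labs) == 1}
    boundary = [cond for cond, labs in blocks.items() if len(labs) > 1]
    return certain, boundary


# Hypothetical toy data: (leaf width, leaf length) in cm, with made-up species labels.
rows = [(2.0, 6.0), (2.2, 6.5), (2.4, 7.0), (3.6, 6.2), (3.8, 6.8), (4.0, 7.1)]
labels = ["A", "A", "A", "B", "B", "B"]

cuts, coded = discretize(rows, labels)
certain, boundary = certain_rules(coded, labels)
print("cut points:", cuts)
for cond, species in certain.items():
    print("IF (width, length) =", cond, "THEN species =", species)
print("boundary region:", boundary)
```

Discretizing first keeps the indiscernibility classes coarse: on raw continuous measurements nearly every object would form its own class, so the extracted rules would not generalize to unseen specimens.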

References

  • Breiman L, Friedman JH, Olshen R, Stone CJ (1984) Classification and regression trees. Chapman and Hall, New York

  • Catlett J (1991) On changing continuous attributes into ordered discrete attributes. In: European Working Session on Learning

  • Dimitras AL, Slowinski R, Susmaga R, Zopounidis C (1999) Business failure prediction using rough sets. European Journal of Operational Research 114:263–280

  • Duda RO, Hart PE (1973) Pattern classification and scene analysis. Wiley, New York, p 218. ISBN 0-471-22361-1

  • Fayyad UM, Irani KB (1992) On the handling of continuous-valued attributes in decision tree generation. Machine Learning 8:87–102

  • Grzymala-Busse JW (1997) A new version of the rule induction system LERS. Fundamenta Informaticae 31:27–39

  • Han J, Kamber M (2006) Data mining: concepts and techniques. Morgan Kaufmann, San Francisco

  • Judd WS, Campbell CS, Kellogg EA, Stevens PF (1999) Plant systematics: a phylogenetic approach. Sinauer Associates, Sunderland, MA

  • Kass GV (1980) An exploratory technique for investigating large quantities of categorical data. Applied Statistics 29:119–127

  • Kerber R (1992) ChiMerge: discretization of numeric attributes. In: Proceedings AAAI-92, ninth international conference on artificial intelligence, pp 123–128

  • Khuroo AA, Dar GH, Khan ZS, Malik AH (2007) Exploring an inherent interface between taxonomy and biodiversity: current problems and future challenges. Journal for Nature Conservation 15:256–261

  • Lawrence GHM (1955) Taxonomy of vascular plants. Prentice Hall College Div, New York

  • Mitra S, Acharya T (2003) Data mining: multimedia, soft computing, and bioinformatics. Wiley, New Jersey

  • Murthy SK (1998) Automatic construction of decision trees from data: a multi-disciplinary survey. Data Mining and Knowledge Discovery 2(4):345–389

  • Pawlak Z (1982) Rough sets. International Journal of Computer and Information Sciences 11:341–356

  • Pawlak Z (1984) Rough classification. International Journal of Man-Machine Studies 20:469–483

  • Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer, Dordrecht

  • Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, San Francisco

  • Predki B, Slowinski R, Stefanowski J, Susmaga R, Wilk Sz (1998) ROSE-software implementation of the rough set theory. In: Polkowski L, Skowron A (eds) Rough sets and current trends in computing. Lecture notes in artificial intelligence, vol 1424. Springer-Verlag, Berlin, pp 605–608

  • Quinlan JR (1986) Induction of decision trees. Machine Learning 1:81–106

  • Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo

  • Quinlan JR (1996) Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research 4:77–90

  • Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. Parallel Distributed Processing 1:318–362

  • Safavian SR, Landgrebe D (1998) A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man and Cybernetics 22:660–674

  • Samuel BJ, Luchsinger AE (1979) Plant systematics. McGraw-Hill, Columbus

  • Skrypnyk I (2002) Comparison of feature selection strategies for hearing impairments diagnostics. In: Proceedings of the 15th IEEE symposium on computer-based medical systems (CBMS 2002)

  • Tan P, Steinbach M, Kumar V (2006) Introduction to data mining. Pearson Education, New York

  • Yang Y, Liu H, Lu S (1999) Introduction to vascular plants in Taiwan. Council of Agriculture, Taipei, Taiwan

Author information

Corresponding author

Correspondence to Yao-Hsien Chen.

Cite this article

Cheng, CH., Chen, YH. & Liu, JW. Classifying Cinnamomums using rough sets classifier based on interval-discretization. Plant Syst Evol 280, 89–97 (2009). https://doi.org/10.1007/s00606-009-0161-0
