On changing continuous attributes into ordered discrete attributes
Abstract
The large real-world datasets now commonly tackled by machine learning algorithms are often described in terms of attributes whose values are real numbers on some continuous interval, rather than being taken from a small number of discrete values. Many algorithms are able to handle continuous attributes, but learning requires far more CPU time than for a corresponding task with discrete attributes. This paper describes how continuous attributes can be converted economically into ordered discrete attributes before being given to the learning system. Experimental results from a wide variety of domains suggest this change of representation does not often result in a significant loss of accuracy (in fact it sometimes significantly improves accuracy), but offers large reductions in learning time, typically more than a factor of 10 in domains with a large number of continuous attributes.
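The conversion described above can be illustrated with a minimal sketch. The abstract does not specify the procedure, so everything here is an assumption for illustration: the code picks cut points by recursively maximising information gain on a single attribute, then maps each real value to the index of the interval it falls in, which gives an ordered discrete attribute. The function names (`find_cuts`, `discretise`) and the parameters `max_depth` and `min_gain` are hypothetical, not taken from the paper.

```python
import math
from bisect import bisect_right
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(pairs):
    """Find the threshold with the highest information gain.

    `pairs` is a list of (value, label) sorted by value.
    Returns (cut, gain, split_index), or None if no threshold exists.
    """
    n = len(pairs)
    labels = [label for _, label in pairs]
    base = entropy(labels)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        gain = (base
                - (i / n) * entropy(labels[:i])
                - ((n - i) / n) * entropy(labels[i:]))
        if best is None or gain > best[1]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, gain, i)
    return best


def find_cuts(values, labels, max_depth=2, min_gain=1e-9):
    """Recursively choose cut points (assumed stopping rule: depth and gain)."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts = []

    def recurse(segment, depth):
        if depth == 0 or len(segment) < 2:
            return
        found = best_cut(segment)
        if found is None or found[1] <= min_gain:
            return
        cut, _, i = found
        cuts.append(cut)
        recurse(segment[:i], depth - 1)
        recurse(segment[i:], depth - 1)

    recurse(pairs, max_depth)
    return sorted(cuts)


def discretise(values, cuts):
    """Map each real value to the index of its interval (0, 1, 2, ...)."""
    return [bisect_right(cuts, v) for v in values]


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
cuts = find_cuts(values, labels)
codes = discretise(values, cuts)
```

Because the cut points are computed once, before learning starts, the learner afterwards only sees small integer codes whose order matches the order of the original values. The sketch recomputes entropies from scratch at every candidate threshold, so it is meant to show the idea rather than to be economical.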
Keywords
Discretisation; empirical concept learning; induction of decision trees