This chapter examines how to convert a continuous attribute into a categorical one, a process known as discretisation. This matters because many data mining algorithms, including TDIDT, require all attributes to take categorical values.
Two types of discretisation are distinguished: local and global. The process of extending the TDIDT algorithm with local discretisation of continuous attributes is illustrated in detail, followed by a description of the ChiMerge algorithm for global discretisation. The effectiveness of the two methods, when used with TDIDT, is then compared on a number of datasets.
- Kerber, R. (1992). ChiMerge: discretization of numeric attributes. In Proceedings of the 10th national conference on artificial intelligence (pp. 123–128). Menlo Park: AAAI Press.