OMVD: An Optimization of MVD
Most discretization algorithms are univariate and consider only one attribute at a time. Stephen D. Bay presented a multivariate discretization (MVD) method that considers the effects of all attributes during data mining. However, as Bay noted, any test of differences has limited power. We present OMVD, which improves MVD's power to test differences by using a genetic algorithm. OMVD is more powerful than MVD because it does not require setting a difference threshold and does not depend heavily on the basic intervals. In addition, OMVD searches for partitions of multiple attributes simultaneously. Our experiments on synthetic and real datasets suggest that OMVD can obtain more interesting discretizations than MVD.
Keywords: Association Rule, Real Dataset, Synthetic Dataset, Semantic Meaning, Basic Interval