OMVD: An Optimization of MVD

  • Zhi He
  • Shengfeng Tian
  • Houkuan Huang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)


Most discretization algorithms are univariate: they consider only one attribute at a time. Stephen D. Bay presented a multivariate discretization (MVD) method that considers the effects of all attributes in the data mining process. However, as Bay himself noted, any test of differences has limited power. We present OMVD, which improves the difference-testing power of MVD by means of a genetic algorithm. OMVD is more powerful than MVD because it does not require a difference threshold to be set and does not depend heavily on the basic intervals. In addition, OMVD searches for partitions of multiple attributes simultaneously. Our experiments on synthetic and real datasets suggest that OMVD can obtain more interesting discretizations than MVD.
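The idea of searching partitions for several attributes at once with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoding, operators, or fitness function, so everything below is an assumption. Here each chromosome holds one bit per boundary between basic intervals (1 keeps the cut point, 0 merges the neighbouring intervals), the names `decode`, `evolve`, and `toy_fitness` are hypothetical, and the fitness is a placeholder that stands in for whatever difference measure a real system would use.

```python
import random


def decode(chromosome, basic_cuts):
    """Map a flat bit list to the kept cut points of every attribute.

    basic_cuts: dict attribute -> sorted list of basic-interval boundaries.
    One bit per boundary: 1 keeps it, 0 merges the two adjacent intervals.
    """
    kept, i = {}, 0
    for attr, cuts in basic_cuts.items():
        bits = chromosome[i:i + len(cuts)]
        kept[attr] = [c for c, b in zip(cuts, bits) if b]
        i += len(cuts)
    return kept


def evolve(basic_cuts, fitness, pop_size=30, generations=40,
           p_mut=0.05, seed=0):
    """Plain generational GA (assumed design): tournament selection,
    one-point crossover, bit-flip mutation, and one elite per generation.
    Needs at least two boundaries in total for the crossover point."""
    rng = random.Random(seed)
    n = sum(len(c) for c in basic_cuts.values())
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(ch):
        return fitness(decode(ch, basic_cuts))

    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # binary tournament
            p2 = max(rng.sample(pop, 2), key=score)
            point = rng.randrange(1, n)              # one-point crossover
            child = p1[:point] + p2[point:]
            child = [1 - b if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return decode(best, basic_cuts), score(best)


# Toy data: boundaries of the basic intervals for two attributes.
basic_cuts = {"age": [20, 30, 40], "income": [25_000, 50_000, 75_000]}
# Placeholder objective: pretend these cut points are the interesting ones.
target = {"age": {30}, "income": {50_000}}


def toy_fitness(kept):
    # Negative count of cut points that differ from the pretend target.
    return -sum(len(set(kept[a]) ^ target[a]) for a in target)


partition, best_score = evolve(basic_cuts, toy_fitness)
print(partition, best_score)
```

Because all attributes share one chromosome, a single individual describes a joint partition, which is one simple way to search several attributes at the same time rather than one after another. Replacing `toy_fitness` with a measure computed from data is where a real method would differ most from this sketch.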


Keywords (machine-generated, not supplied by the authors): Association Rule; Real Dataset; Synthetic Dataset; Semantic Meaning; Basic Interval




References

  1. Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proc. ACM SIGMOD International Conference on Management of Data, Washington, DC, pp. 207–216 (1993)
  2. Quinlan, J.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
  3. Bay, S.D.: Multivariate discretization for set mining. Knowledge and Information Systems 3, 491–512 (2001)
  4. Bay, S.D., Pazzani, M.J.: Detecting group differences: Mining contrast sets. Data Mining and Knowledge Discovery 5, 213–246 (2001)
  5. Kwedlo, W., Kretowski, M.: An evolutionary algorithm using multivariate discretization for decision rule induction. In: Principles of Data Mining and Knowledge Discovery, pp. 392–397 (1999)
  6. Srikant, R., Agrawal, R.: Mining quantitative association rules in large relational tables. In: Jagadish, H.V., Mumick, I.S. (eds.) Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, Canada, pp. 1–12 (1996)
  7. Miller, R.J., Yang, Y.: Association rules over interval data. In: Proceedings ACM SIGMOD International Conference on Management of Data, pp. 452–461 (1997)
  8. Monti, S., Cooper, G.F.: A latent variable model for multivariate discretization. In: The 7th Int. Workshop Artificial Intelligence and Statistics, Fort Lauderdale (1999)
  9. Ludl, M.C., Widmer, G.: Relative unsupervised discretization for association rule mining. In: Proceedings of the 4th European Conference on Principles and Practice of Knowledge Discovery in Databases. Springer, Berlin (2000)
  10. Mehta, S., Parthasarathy, S., Yang, H.: Toward unsupervised correlation preserving discretization. IEEE Transactions on Knowledge and Data Engineering 17, 1174–1185 (2005)
  11. Eiben, A., Smith, J.: Introduction to Evolutionary Computing. Springer, Heidelberg (2003)
  12. Ruggles, S., Sobek, M., Alexander, T., et al.: Integrated Public Use Microdata Series: Version 2.0. Minneapolis: Historical Census Projects (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Zhi He (1)
  • Shengfeng Tian (1)
  • Houkuan Huang (1)

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China