Discretization

  • Salvador García
  • Julián Luengo
  • Francisco Herrera
Chapter
Part of the Intelligent Systems Reference Library book series (ISRL, volume 72)

Abstract

Discretization is an essential preprocessing technique used in many knowledge discovery and data mining tasks. Its main goal is to transform a set of continuous attributes into discrete ones by associating categorical values with intervals, thus transforming quantitative data into qualitative data. An overview of discretization, together with a complete outlook and taxonomy, is supplied in Sects. 9.1 and 9.2. We conduct an experimental study in supervised classification involving the most representative discretizers, different types of classifiers, and a large number of data sets (Sect. 9.4).
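As a rough illustration of the idea of associating categorical values with intervals, the sketch below applies equal-width binning to a single continuous attribute. It is only a minimal example under stated assumptions: equal-width binning is chosen here for simplicity and is not claimed to be one of the discretizers studied in the chapter, and the function names, the sample values, and the labels are all hypothetical.

```python
from bisect import bisect_right


def equal_width_cut_points(values, n_bins):
    """Return the n_bins - 1 interior cut points that split [min, max] evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(values, cut_points, labels):
    """Map each continuous value to the label of the interval it falls in."""
    # bisect_right counts the cut points <= v, which is the interval index.
    return [labels[bisect_right(cut_points, v)] for v in values]


# Hypothetical continuous attribute and category labels (illustration only).
ages = [3.0, 17.5, 25.0, 41.0, 58.5, 63.0]
cuts = equal_width_cut_points(ages, 3)
print(cuts)  # [23.0, 43.0]
print(discretize(ages, cuts, ["low", "medium", "high"]))
# ['low', 'low', 'medium', 'medium', 'high', 'high']
```

The cut points are derived from the attribute alone, without class labels, so this sketch is an unsupervised discretizer; a supervised method would choose the cut points using the class of each instance.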

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Salvador García (1)
  • Julián Luengo (2)
  • Francisco Herrera (3)

  1. Department of Computer Science, University of Jaén, Jaén, Spain
  2. Department of Civil Engineering, University of Burgos, Burgos, Spain
  3. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
