Discretizing Numerical Attributes: An Analysis of Human Perceptions

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2022)

Abstract

Machine learning (ML) uses a variety of discretization approaches to partition a numerical attribute into intervals. However, an effective discretization method is still missing in several ML approaches, e.g., association rule mining. Moreover, existing discretization techniques do not adequately reflect the impact of an independent numerical factor on the dependent numerical target factor. The main objective of this research is to develop a benchmark approach for partitioning numerical factors. We present an in-depth analysis of human perceptions of partitioning a numerical factor and compare them with one of our proposed measures. We also examine the perceptions of experts in data science, statistics, and engineering disciplines using a series of graphs of numerical data. The analysis of the collected responses indicates that \(68.7\%\) of the human responses were close to the values obtained by the proposed method. Based on this analysis, the proposed method may be used as one of the methods for discretizing numerical attributes.
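For readers unfamiliar with the term, the sketch below illustrates what partitioning a numerical attribute into intervals means in practice. It is not the measure proposed in this paper, which the abstract does not specify. It shows two common unsupervised baselines, equal-width and equal-frequency binning. The function names and the toy `ages` values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def equal_width_cuts(values, k):
    """Return k-1 cut points that split [min, max] into k equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so that each interval holds roughly len(values)/k items."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def assign_bins(values, cuts):
    """Map each value to the index of its interval (0 .. len(cuts))."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical toy attribute, not data from the study.
ages = [18, 21, 22, 25, 30, 31, 45, 52, 60, 78]

width_cuts = equal_width_cuts(ages, 3)
freq_cuts = equal_frequency_cuts(ages, 3)

print(width_cuts)                    # cut points for three equally wide intervals
print(freq_cuts)                     # cut points for three similarly sized groups
print(assign_bins(ages, freq_cuts))  # interval index for every value
```

Neither baseline looks at a dependent target factor, which is the gap the abstract points to: the cut points depend only on the distribution of the attribute itself.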

This work has been partially conducted in the project “ICT programme” which was supported by the European Union through the European Social Fund.

Notes

  1. https://github.com/minakshikaushik/LSQM-measure.git

Corresponding author

Correspondence to Minakshi Kaushik.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Kaushik, M., Sharma, R., Vidyarthi, A., Draheim, D. (2022). Discretizing Numerical Attributes: An Analysis of Human Perceptions. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_18

  • DOI: https://doi.org/10.1007/978-3-031-15743-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15742-4

  • Online ISBN: 978-3-031-15743-1

  • eBook Packages: Computer Science, Computer Science (R0)
