Discretizing Numerical Attributes: An Analysis of Human Perceptions

Kaushik, Minakshi; Sharma, Rahul; Vidyarthi, Ankit; Draheim, Dirk

doi:10.1007/978-3-031-15743-1_18

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1652)

Included in the following conference series:

European Conference on Advances in Databases and Information Systems

Abstract

Machine learning (ML) has used a variety of discretization approaches to partition numerical attributes into intervals. However, an effective discretization method is still missing in various ML approaches, e.g., association rule mining. Moreover, existing discretization techniques do not best reflect the impact of the independent numerical factor on the dependent numerical target factor. The main objective of this research is to develop a benchmark approach for partitioning numerical factors. We present an in-depth analysis of human perceptions of partitioning a numerical factor and compare them with one of our proposed measures. We also examine the perceptions of various experts in data science, statistics and engineering disciplines by using a series of graphs with numerical data. The analysis of the collected responses indicates that \(68.7\%\) of the human responses were close to the values obtained by the proposed method. Based on this analysis, the proposed method may be used as one of the methods for discretizing numerical attributes.
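
For readers unfamiliar with discretization, the following is a minimal sketch of what partitioning a numerical attribute into intervals means. It uses plain equal-width binning, which is a generic baseline and not the method proposed in this paper. The function names, the `ages` data and the choice of three intervals are illustrative assumptions.

```python
from bisect import bisect_right


def equal_width_cut_points(values, k):
    """Return the k-1 interior cut points that split [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def assign_interval(value, cut_points):
    """Return the 0-based index of the interval that contains value."""
    # A value equal to a cut point is placed in the interval to its right.
    return bisect_right(cut_points, value)


# Hypothetical data, not taken from the paper.
ages = [18, 22, 25, 31, 40, 47, 52, 60, 66, 78]
cuts = equal_width_cut_points(ages, 3)
labels = [assign_interval(a, cuts) for a in ages]
```

Equal-width binning ignores the dependent target factor entirely, which is the kind of limitation the abstract attributes to existing techniques.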

This work has been partially conducted in the project “ICT programme” which was supported by the European Union through the European Social Fund.

Notes

1. https://github.com/minakshikaushik/LSQM-measure.git

Author information

Authors and Affiliations

Information Systems Group, Tallinn University of Technology, Akadeemia tee 15a, 12618, Tallinn, Estonia
Minakshi Kaushik, Rahul Sharma & Dirk Draheim
Jaypee Institute of Information Technology, Noida, India
Ankit Vidyarthi

Corresponding author

Correspondence to Minakshi Kaushik.

Editor information

Editors and Affiliations

Politecnico di Torino, Turin, Italy
Silvia Chiusano
Politecnico di Torino, Turin, Italy
Tania Cerquitelli
Poznań University of Technology, Poznań, Poland
Robert Wrembel
Norwegian University of Science and Technology, Trondheim, Norway
Kjetil Nørvåg
University of Genoa, Genoa, Italy
Barbara Catania
CNRS, Villeurbanne Cedex, France
Genoveva Vargas-Solar
University of Calabria, Rende, Italy
Ester Zumpano

About this paper

Cite this paper

Kaushik, M., Sharma, R., Vidyarthi, A., Draheim, D. (2022). Discretizing Numerical Attributes: An Analysis of Human Perceptions. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_18

DOI: https://doi.org/10.1007/978-3-031-15743-1_18
Published: 29 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15742-4
Online ISBN: 978-3-031-15743-1
eBook Packages: Computer Science; Computer Science (R0)
