Optimization Problem of k-NN Classifier for Missing Values Case

  • Urszula Bentkowska
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 378)


In this chapter we present a method for dealing with missing values in data sets. The method uses interval-valued fuzzy calculus, and we show that it outperforms previously known methods. The results may be useful in various computer support systems, especially those that support medical diagnosis.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Faculty of Mathematics and Natural Sciences, University of Rzeszów, Rzeszów, Poland