A review and comparison of strategies for handling missing values in separate-and-conquer rule learning

Journal of Intelligent Information Systems

Abstract

In this paper, we review possible strategies for handling missing values in separate-and-conquer rule learning algorithms and compare them experimentally on a large number of datasets. In particular, a careful study on data with controlled levels of missing values yields additional insights into the strategies’ different biases with respect to attributes with missing values. Somewhat surprisingly, a strategy that implements a strong bias against the use of attributes with missing values exhibits the best average performance on 24 datasets from the UCI repository.

Notes

  1. The amputed attributes were checking_status, duration, and credit_history for credit-g; bxqsq, rimmx, and wkna8 for krkp; and intensity-mean, rawred-mean, and hue-mean for segment.
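
To illustrate what amputing selected attributes at a controlled level might look like, here is a minimal Python sketch. It assumes that each value of a chosen attribute is removed independently with a fixed probability; this page does not describe the mechanism actually used in the study, so that choice is an assumption. The function name `ampute`, the `rate` and `seed` parameters, and the sample rows are hypothetical. Only the attribute names are taken from the note above.

```python
import random


def ampute(rows, attributes, rate, seed=0):
    """Return a copy of `rows` in which each value of the listed attributes
    is replaced by None with probability `rate`, independently per cell.

    Assumption: values are removed completely at random; the study's own
    procedure may differ.
    """
    rng = random.Random(seed)  # private generator, so runs are reproducible
    amputed = []
    for row in rows:
        new_row = dict(row)  # shallow copy; the input rows stay untouched
        for attr in attributes:
            if rng.random() < rate:
                new_row[attr] = None  # None marks a missing value
        amputed.append(new_row)
    return amputed


# Illustrative rows with made-up values; only the attribute names
# checking_status, duration, and credit_history come from the note above.
rows = [
    {"checking_status": "<0", "duration": 6, "credit_history": "critical", "class": "good"},
    {"checking_status": "0<=X<200", "duration": 48, "credit_history": "existing paid", "class": "bad"},
    {"checking_status": "no checking", "duration": 12, "credit_history": "critical", "class": "good"},
]
target_attributes = ["checking_status", "duration", "credit_history"]

# Remove roughly 30% of the values in the three target attributes.
amputed = ampute(rows, target_attributes, rate=0.3, seed=42)
```

Varying `rate` while holding the attribute list fixed gives datasets that differ only in their level of missing values, which is the kind of controlled setting the abstract refers to. The class attribute is never passed to `ampute`, so labels are left intact.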

Acknowledgements

We would like to thank Nada Lavrač and Dragan Gamberger for interesting discussions on their pessimistic value strategy. This research was supported by the German Science Foundation (DFG) under grant FU 580/2.

Author information

Corresponding author

Correspondence to Johannes Fürnkranz.

About this article

Cite this article

Wohlrab, L., Fürnkranz, J. A review and comparison of strategies for handling missing values in separate-and-conquer rule learning. J Intell Inf Syst 36, 73–98 (2011). https://doi.org/10.1007/s10844-010-0121-8

  • DOI: https://doi.org/10.1007/s10844-010-0121-8
