Techniques for dealing with missing values in classification

Liu, W. Z.; White, A. P.; Thompson, S. G.; Bramer, M. A.

doi:10.1007/BFb0052868

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1280)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

A brief overview of the history of the development of decision tree induction algorithms is followed by a review of techniques for dealing with missing attribute values in the operation of these methods. The technique of dynamic path generation is described in the context of tree-based classification methods. The waste of data which can result from casewise deletion of missing values in statistical algorithms is discussed and alternatives proposed.
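
The data loss from casewise deletion mentioned above can be illustrated with a minimal sketch. It is not taken from the chapter: the toy rows, the function names complete_cases and expected_retained_fraction, and the assumption that values go missing independently with equal probability are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def complete_cases(rows: Sequence[Row]) -> List[Row]:
    """Casewise (listwise) deletion: keep only rows with no missing value."""
    return [row for row in rows if all(value is not None for value in row)]


def expected_retained_fraction(p_missing: float, n_attributes: int) -> float:
    """Expected share of rows that survive casewise deletion.

    Illustrative assumption: each attribute value is missing independently
    with the same probability p_missing.
    """
    return (1.0 - p_missing) ** n_attributes


# Hypothetical toy data: None marks a missing attribute value.
rows = [
    (1.0, 2.0, 3.0),
    (None, 2.5, 3.1),
    (1.2, None, 2.9),
    (0.9, 2.1, None),
    (1.1, 2.2, 3.3),
]

kept = complete_cases(rows)

# Count the observed (non-missing) values before and after deletion.
observed_values = sum(value is not None for row in rows for value in row)
retained_values = sum(len(row) for row in kept)

print(f"rows kept: {len(kept)} of {len(rows)}")
print(f"observed values kept: {retained_values} of {observed_values}")
print(f"expected retention, 20 attributes at 5% missing: "
      f"{expected_retained_fraction(0.05, 20):.3f}")
```

In this toy case only 3 of the 15 values are missing, yet casewise deletion discards 3 of the 5 rows and half of the values that were actually observed. Under the independence assumption, the retained fraction shrinks geometrically with the number of attributes, which is why alternatives that use incomplete cases can be attractive.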

Author information

Authors and Affiliations

Artificial Intelligence Research Group, Department of Information Science, University of Portsmouth, Locksway Road, PO4 8JF, Milton, Hampshire, UK
W. Z. Liu, S. G. Thompson & M. A. Bramer
School of Mathematics and Statistics, University of Birmingham, B15 2TT, Edgbaston, Birmingham, UK
A. P. White

Editor information

Xiaohui Liu, Paul Cohen, Michael Berthold

Cite this paper

Liu, W.Z., White, A.P., Thompson, S.G., Bramer, M.A. (1997). Techniques for dealing with missing values in classification. In: Liu, X., Cohen, P., Berthold, M. (eds) Advances in Intelligent Data Analysis: Reasoning about Data. IDA 1997. Lecture Notes in Computer Science, vol 1280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052868

DOI: https://doi.org/10.1007/BFb0052868
Published: 19 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63346-4
Online ISBN: 978-3-540-69520-2
eBook Packages: Springer Book Archive
