Ordered Estimation of Missing Values

Lobo, Oscar Ortega; Numao, Masayuki

doi:10.1007/3-540-48912-6_67

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1574)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

When learning concepts embedded in data, it is not uncommon to find that some information is missing. Such missing information can diminish confidence in the concepts learned from the data. This paper describes a new approach to filling in missing values in the examples provided to a learning algorithm. For each attribute, a decision tree is constructed to determine its missing values from the information contained in the other attributes. An ordering for constructing the attributes' decision trees is also formulated. Experimental results on three datasets show that completing the data with decision trees leads to final concepts with lower error under different rates of randomly missing values. The approach should suit domains with strong relations among the attributes, and in which improved accuracy is desirable even at increased computational cost.
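
The idea of per-attribute decision trees built in a fixed order can be illustrated with a minimal sketch. The abstract names neither the tree learner nor the ordering criterion, so everything below is an assumption made for illustration: a small ID3-style learner over categorical attributes, an ordering that processes attributes with the fewest missing values first, and the hypothetical names `MISSING`, `build_tree`, `predict`, and `impute`. It is not the authors' implementation.

```python
import math
from collections import Counter

MISSING = None  # assumed marker for a missing value


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def build_tree(rows, target, features):
    """ID3-style tree. A leaf is a label; an inner node is a tuple
    (feature, majority_label, {feature_value: subtree}).
    MISSING in a feature is simply treated as one more value."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority

    def split_entropy(feature):
        # Weighted entropy of the target after splitting on `feature`.
        groups = {}
        for r in rows:
            groups.setdefault(r[feature], []).append(r[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    rest = [f for f in features if f != best]
    branches = {}
    for r in rows:
        branches.setdefault(r[best], []).append(r)
    children = {v: build_tree(sub, target, rest) for v, sub in branches.items()}
    return (best, majority, children)


def predict(tree, row):
    """Follow the tree; fall back to the node's majority label when the
    row has a feature value that was not seen during training."""
    while isinstance(tree, tuple):
        feature, majority, children = tree
        if row[feature] not in children:
            return majority
        tree = children[row[feature]]
    return tree


def impute(rows, attributes):
    """Fill missing values attribute by attribute and return new rows.
    Assumed ordering: fewest missing values first, so that values filled
    early are available as inputs to the trees built later."""
    rows = [dict(r) for r in rows]  # leave the caller's data untouched
    order = sorted(attributes,
                   key=lambda a: sum(r[a] is MISSING for r in rows))
    for target in order:
        known = [r for r in rows if r[target] is not MISSING]
        unknown = [r for r in rows if r[target] is MISSING]
        if not known or not unknown:
            continue
        features = [a for a in attributes if a != target]
        tree = build_tree(known, target, features)
        for r in unknown:
            r[target] = predict(tree, r)
    return rows
```

Calling `impute(rows, attributes)` on a list of dictionaries returns completed copies of the rows. A learner trained on the completed data can then be compared against one trained on the incomplete data, which is the kind of comparison the abstract reports.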

Author information

Authors and Affiliations

Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8552, Japan
Oscar Ortega Lobo & Masayuki Numao

Editor information

Editors and Affiliations

Department of Computer Science and Systems Engineering, Yamaguchi University, Tokiwa-Dai, 2557, Ube, 755, Japan
Ning Zhong
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Lizhu Zhou

About this paper

Cite this paper

Lobo, O.O., Numao, M. (1999). Ordered Estimation of Missing Values. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_67

DOI: https://doi.org/10.1007/3-540-48912-6_67
Published: 24 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65866-5
Online ISBN: 978-3-540-48912-2
eBook Packages: Springer Book Archive
