Imputing manufacturing material in data mining

Yeh, Ruey-Ling; Liu, Ching; Shia, Ben-Chang; Cheng, Yu-Ting; Huwang, Ya-Fang

doi:10.1007/s10845-007-0067-z

Published: 21 November 2007

Volume 19, pages 109–118, (2008)

Journal of Intelligent Manufacturing

Ruey-Ling Yeh¹,
Ching Liu¹,
Ben-Chang Shia²,
Yu-Ting Cheng³ &
Ya-Fang Huwang³

126 Accesses
4 Citations

Abstract

Data are a vital source of information for organizations, especially in the age of information technology. Real databases are often imperfect: values are missing, and results obtained from such a database may be biased or misleading. Imputing missing data is therefore regarded as one of the major steps in data mining. This study used different data mining methods to construct imputation models according to the type of missing data. For continuous missing data, regression models and neural networks were used to build the imputation models. For categorical missing data, logistic regression, neural networks, C5.0, and CART were employed. The results showed that the regression model provided the best estimates for continuous missing data, whereas the C5.0 model performed best for categorical missing data.
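As a rough illustration of regression-based imputation for a continuous variable, the sketch below fits an ordinary least squares line on the complete records and uses it to fill in the missing values. It is not the authors' model: the single-predictor form, the field names `temperature` and `thickness`, and the sample records are all hypothetical and chosen only to keep the example small.

```python
from statistics import mean


def fit_simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def impute_continuous(rows, predictor, target):
    """Return copies of rows with missing (None) target values predicted
    from a regression fitted on the rows where the target is observed."""
    complete = [r for r in rows if r[target] is not None]
    intercept, slope = fit_simple_regression(
        [r[predictor] for r in complete],
        [r[target] for r in complete],
    )
    filled = []
    for r in rows:
        new_row = dict(r)  # copy so the input records are left unchanged
        if new_row[target] is None:
            new_row[target] = intercept + slope * new_row[predictor]
        filled.append(new_row)
    return filled


# Hypothetical manufacturing records; 'thickness' is missing in one row.
records = [
    {"temperature": 100.0, "thickness": 2.0},
    {"temperature": 120.0, "thickness": 2.4},
    {"temperature": 140.0, "thickness": None},
    {"temperature": 160.0, "thickness": 3.2},
]
filled = impute_continuous(records, "temperature", "thickness")
```

A categorical variable would need a classifier in place of the regression line (the abstract names logistic regression, neural networks, C5.0, and CART), but the overall pattern of fitting on complete records and predicting the missing ones would stay the same.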

Author information

Authors and Affiliations

Division of Biometrics, Graduate Institute of Agronomy, National Taiwan University, Taipei, Taiwan
Ruey-Ling Yeh & Ching Liu
Department of Statistics and Information Science, Fu Jen Catholic University, Taipei, Taiwan
Ben-Chang Shia
Department of Statistics Science, National Chengchi University, Taipei, Taiwan
Yu-Ting Cheng & Ya-Fang Huwang

Corresponding author

Correspondence to Ruey-Ling Yeh.

About this article

Cite this article

Yeh, RL., Liu, C., Shia, BC. et al. Imputing manufacturing material in data mining. J Intell Manuf 19, 109–118 (2008). https://doi.org/10.1007/s10845-007-0067-z

Received: 01 May 2006
Accepted: 01 July 2007
Published: 21 November 2007
Issue Date: February 2008
DOI: https://doi.org/10.1007/s10845-007-0067-z
