
Imputing manufacturing material in data mining

Journal of Intelligent Manufacturing

Abstract

Data are a vital source of information for organizations, especially in the information age. Real databases are often incomplete, and results obtained from a database with missing values may be biased or misleading. Imputing missing data is therefore regarded as one of the major steps in data mining. The present research used different data mining methods to construct imputation models according to the type of missing data. When the missing data are continuous, regression models and neural networks are used to build the imputation models. For categorical missing data, logistic regression, neural networks, C5.0, and CART are employed. The results showed that the regression model provided the best estimates for continuous missing data, whereas for categorical missing data the C5.0 model performed best.
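As an illustration of regression-based imputation for a continuous variable, the following minimal Python sketch fits an ordinary least-squares line on the complete records and uses it to predict the missing values. The abstract does not specify the authors' model, predictors, or data, so the single-predictor form, the function names, and the `thickness` and `weight` values are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def fit_simple_ols(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def impute_by_regression(
    xs: List[float], ys: List[Optional[float]]
) -> List[float]:
    """Fill None entries of ys with predictions fitted on the complete pairs."""
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_simple_ols(
        [x for x, _ in complete], [y for _, y in complete]
    )
    # Observed values are kept as-is; only the gaps are predicted.
    return [
        y if y is not None else intercept + slope * x
        for x, y in zip(xs, ys)
    ]


# Hypothetical records: the predictor is fully observed, the target has gaps.
thickness = [1.0, 2.0, 3.0, 4.0, 5.0]
weight = [2.1, 3.9, None, 8.1, None]

filled = impute_by_regression(thickness, weight)
print(filled)
```

For a categorical variable the same pattern would apply with a classifier (for example a decision tree or logistic regression) in place of the regression line: train on the complete records, then predict the missing category.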



Author information

Corresponding author

Correspondence to Ruey-Ling Yeh.

About this article

Cite this article

Yeh, RL., Liu, C., Shia, BC. et al. Imputing manufacturing material in data mining. J Intell Manuf 19, 109–118 (2008). https://doi.org/10.1007/s10845-007-0067-z

  • DOI: https://doi.org/10.1007/s10845-007-0067-z
