Incomplete Time Series: Imputation through Genetic Algorithms

Time Series Analysis, Modeling and Applications

Abstract

Uncertainty in time series can appear in many ways, and it can be analyzed under different theories. An important problem arises when a time series is incomplete, since the analyst must impute the missing observations before any further analysis.

This chapter designs an imputation method for multiple missing observations in a time series using a genetic algorithm (GA) that replaces the missing observations in the original series. The flexibility of the GA is used to find an adequate solution to a multicriteria objective, defined as the error between some key properties of the original series and those of the imputed one. A comparative study between a classical estimation method and our proposal is presented through an example.
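The abstract does not state which properties, operators or parameters the chapter uses, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes (hypothetically) that the key properties are the mean, variance and lag-1 autocovariance ratio estimated from the observed values, that the multicriteria objective is scalarised as an unweighted sum of absolute errors, and that the GA uses binary tournament selection, blend crossover, Gaussian mutation and elitism. All function and parameter names are illustrative.

```python
import numpy as np

def key_properties(x):
    """Mean, variance and lag-1 autocorrelation, ignoring NaN entries.
    The choice of these three properties is an assumption of this sketch."""
    obs = ~np.isnan(x)
    pairs = obs[:-1] & obs[1:]          # adjacent pairs with both values observed
    mean = x[obs].mean()
    var = x[obs].var()
    a = x[:-1][pairs] - mean
    b = x[1:][pairs] - mean
    acf1 = (a * b).mean() / var if var > 0 and pairs.any() else 0.0
    return np.array([mean, var, acf1])

def ga_impute(series, pop_size=40, generations=200, mutation_rate=0.2, seed=0):
    """Fill NaN entries of `series` with values found by a simple real-coded GA."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    missing = np.isnan(series)
    n_missing = int(missing.sum())
    if n_missing == 0:
        return series.copy()
    observed = series[~missing]
    target = key_properties(series)      # properties of the incomplete series
    lo, hi = observed.min(), observed.max()
    sigma = 0.1 * observed.std()         # mutation step size (assumed)

    def fill(genes):
        out = series.copy()
        out[missing] = genes
        return out

    def fitness(genes):
        # Unweighted sum of absolute property errors (assumed scalarisation).
        return np.abs(key_properties(fill(genes)) - target).sum()

    # Each individual is one candidate vector of imputed values.
    pop = rng.uniform(lo, hi, size=(pop_size, n_missing))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        best = pop[scores.argmin()].copy()
        # Binary tournament selection.
        i, j = rng.integers(0, pop_size, size=(2, pop_size))
        parents = np.where((scores[i] < scores[j])[:, None], pop[i], pop[j])
        # Blend crossover with a randomly permuted mate.
        mates = parents[rng.permutation(pop_size)]
        w = rng.uniform(size=(pop_size, 1))
        children = w * parents + (1.0 - w) * mates
        # Gaussian mutation applied gene by gene.
        mask = rng.uniform(size=children.shape) < mutation_rate
        children = children + mask * rng.normal(0.0, sigma, size=children.shape)
        children[0] = best               # elitism: keep the best individual
        pop = children
    scores = np.array([fitness(ind) for ind in pop])
    return fill(pop[scores.argmin()])

# Usage: a noisy sine wave with four observations removed.
t = np.arange(60)
y = np.sin(t / 5.0) + 0.1 * np.random.default_rng(1).normal(size=60)
y[[7, 8, 23, 41]] = np.nan
filled = ga_impute(y)
```

Because the three properties live on different scales, a weighted or normalised sum (or a Pareto-based ranking) may be a better fit in practice; the unweighted sum is used here only to keep the sketch short.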

Author information

Corresponding author

Correspondence to Juan Carlos Figueroa-García.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Figueroa-García, J.C., Kalenatic, D., López, C.A. (2013). Incomplete Time Series: Imputation through Genetic Algorithms. In: Pedrycz, W., Chen, SM. (eds) Time Series Analysis, Modeling and Applications. Intelligent Systems Reference Library, vol 47. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33439-9_2

  • DOI: https://doi.org/10.1007/978-3-642-33439-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33438-2

  • Online ISBN: 978-3-642-33439-9

  • eBook Packages: Engineering, Engineering (R0)
