Neural Computing and Applications, Volume 18, Issue 1, pp 25–35

Applying REC analysis to ensembles of particle filters

IJCNN 2007

Abstract

Particle filters (PF) are sequential Monte Carlo methods based on the representation of probability densities with mass points. Although most current research on time series forecasting uses traditional methods, particle filters can be applied to any state-space model and generalize the traditional Kalman filter methods, often providing better results. Furthermore, it is well known that for classification and regression tasks ensembles achieve better performance than the individual algorithms that compose them. It is therefore expected that ensembles of time series predictors can provide even better results than single particle filters. Regression error characteristic (REC) analysis is a powerful technique for the visualization and comparison of regression models. The objective of this work is to advocate the use of REC curves to compare traditional Kalman filter methods with particle filters, and to analyze their use in ensembles, which can achieve better performance.
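The REC curve described above plots, for each error tolerance, the fraction of predictions whose error falls within that tolerance; a model whose curve rises faster is better, and the area over the curve estimates the expected error. A minimal sketch of the computation (the function names are my own; absolute error is one common choice of error metric):

```python
import numpy as np

def rec_curve(y_true, y_pred):
    """Regression Error Characteristic curve: for each error
    tolerance on the x-axis, the fraction of predictions whose
    absolute error is within that tolerance on the y-axis."""
    errors = np.sort(np.abs(np.asarray(y_true, dtype=float)
                            - np.asarray(y_pred, dtype=float)))
    n = len(errors)
    tolerances = np.concatenate(([0.0], errors))
    accuracy = np.concatenate(([0.0], np.arange(1, n + 1) / n))
    return tolerances, accuracy

def area_over_rec(tolerances, accuracy):
    """Area over the REC curve (trapezoidal rule): a biased
    estimate of the expected error; smaller is better."""
    area_under = np.sum(np.diff(tolerances)
                        * (accuracy[:-1] + accuracy[1:]) / 2.0)
    return tolerances[-1] - area_under
```

To compare two regression models, plot both curves on the same axes; the curve that dominates (lies above the other everywhere) belongs to the better model.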

Keywords

REC analysis · Ensemble · Particle filter · Kalman filter
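The particle filter itself, in its simplest (bootstrap) form, propagates a cloud of mass points through the state transition, weights them by the observation likelihood, and resamples. A minimal sketch, assuming an illustrative linear-Gaussian random-walk model (the model and parameter values are my own choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def bootstrap_pf(observations, n_particles=500,
                 process_std=1.0, obs_std=1.0):
    """Bootstrap particle filter for the illustrative model
    x_t = x_{t-1} + v_t,  y_t = x_t + w_t  (v, w Gaussian).
    Returns the weighted posterior-mean estimate at each step."""
    particles = rng.normal(0.0, 1.0, n_particles)  # initial cloud
    estimates = []
    for y in observations:
        # Propagate: sample each particle from the transition density.
        particles = particles + rng.normal(0.0, process_std, n_particles)
        # Weight: observation likelihood (log-space for stability).
        log_w = -0.5 * ((y - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        estimates.append(float(np.sum(w * particles)))
        # Resample (multinomial) to avoid weight degeneracy.
        particles = rng.choice(particles, size=n_particles, p=w)
    return estimates
```

Because the transition and likelihood are supplied as sampling and evaluation steps rather than closed-form Gaussians, the same loop applies unchanged to nonlinear and non-Gaussian models, which is the generality over the Kalman filter that the abstract refers to.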


Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

Department of Systems Engineering and Computer Science, COPPE, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil
