Evaluating Forecasting Methods

  • Chapter

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 30)

Abstract

Ideally, forecasting methods should be evaluated in the situations for which they will be used. Underlying the evaluation procedure is the need to test methods against reasonable alternatives. Evaluation consists of four steps: testing assumptions, testing data and methods, replicating outputs, and assessing outputs. Most principles for testing forecasting methods are based on commonly accepted methodological procedures, such as prespecifying criteria or obtaining a large sample of forecast errors. However, forecasters often violate such principles, even in academic studies. Some principles may be surprising: do not use R-square, do not use Mean Square Error, and do not use the within-sample fit of a model to select the most accurate time-series method. A checklist of 32 principles is provided to help in systematically evaluating forecasting methods.
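
The advice against within-sample fit implies comparing candidate methods on data they have not seen. As a minimal illustration (not taken from the chapter itself), the Python sketch below holds out the last few observations of a series, forecasts them with a drift method fit only on the earlier data, and scores it with the median Relative Absolute Error (MdRAE) against a random-walk benchmark, an out-of-sample, benchmark-relative measure of the kind the chapter's principles favor over R-square or Mean Square Error. The series values and the choice of a drift method are illustrative assumptions.

```python
import numpy as np

def rae(actual, forecast, benchmark):
    # Relative absolute error: each forecast error scaled by the
    # benchmark (random-walk) error on the same observation.
    return np.abs(actual - forecast) / np.abs(actual - benchmark)

# Illustrative series; hold out the last h observations for evaluation.
series = np.array([112.0, 118.0, 132.0, 129.0, 121.0,
                   135.0, 148.0, 136.0, 119.0, 104.0])
h = 3
train, test = series[:-h], series[-h:]

# Random-walk benchmark: repeat the last in-sample value.
naive = np.repeat(train[-1], h)

# Candidate method: random walk with drift, fit only on the training data.
slope = (train[-1] - train[0]) / (len(train) - 1)
drift = train[-1] + slope * np.arange(1, h + 1)

# Median RAE; values below 1 mean the candidate beat the benchmark out of sample.
mdrae = np.median(rae(test, drift, naive))
print(f"MdRAE of drift vs. random walk: {mdrae:.2f}")
```

On this made-up holdout the drift method's MdRAE exceeds 1, so it would be rejected in favor of the naive benchmark, a conclusion that within-sample fit, which rewards matching the early upward trend, could easily reverse.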




Copyright information

© 2001 Springer Science+Business Media New York

About this chapter

Cite this chapter

Armstrong, J.S. (2001). Evaluating Forecasting Methods. In: Armstrong, J.S. (ed.) Principles of Forecasting. International Series in Operations Research & Management Science, vol 30. Springer, Boston, MA. https://doi.org/10.1007/978-0-306-47630-3_20

  • DOI: https://doi.org/10.1007/978-0-306-47630-3_20

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-7923-7401-5

  • Online ISBN: 978-0-306-47630-3

