Time-Varying Dictionary and the Predictive Power of FED Minutes

Lima, Luiz Renato; Godeiro, Lucas Lúcio; Mohsin, Mohammed

doi:10.1007/s10614-020-10039-9

Published: 28 August 2020

Computational Economics, Volume 57, pages 149–181 (2021)

Luiz Renato Lima¹,² (ORCID: orcid.org/0000-0002-2336-3440), Lucas Lúcio Godeiro³ & Mohammed Mohsin¹

Abstract

This paper proposes a method to extract the most predictive information from FED minutes that is specifically adapted to the problem of forecasting. Instead of considering a dictionary (set of words) with a fixed content, we construct a dictionary whose content is allowed to change over time. Specifically, we utilize machine learning to identify the most predictive words (the most predictive content) of a given minute and use them to derive new predictors. We show that the new predictors improve real time forecasts of output growth by a statistically significant margin, suggesting that the combination of supervised machine learning and text regression can be interpreted as a powerful device for out-of-sample macroeconomic forecasting.

Notes

Wright (2012) and Altavilla and Giannone (2017) study the effects of monetary policy news on the yield curve, but they do not rely on text mining. They find that news affects market agents’ expectations about corporate and Treasury bond yields.
This restriction led Thorsrud (2018) to hold the training sample constant over time.
Another caveat with re-estimating the LDA recursively is the lack of identifiability, that is, topic estimates cannot be combined across samples for an analysis that relies on the content of specific topics (Thorsrud 2018, p. 22).
Their large dictionary uses the union of dictionaries found in Nyman et al. (2018), Loughran and McDonald (2013), Nielsen (2011), Hu and Liu (2004), Hu et al. (2017), Correa et al. (2017), Tetlock (2007). This gives 9660 unique terms of which 8030 appear in their corpus.
Using real time rather than revised GDP data implies that we are considering solely the information that was available at the time the forecast was being made. Thus, we are reproducing the forecasting problem in real time.
Two words are positively (negatively) correlated when their counts per document are positively (negatively) correlated across time.
Words that are too frequent appear in documents regardless of whether important economic events occur and, for this reason, do not contain relevant predictive information. Rare words are mostly associated with “typos”, which are not correlated with important economic events either.
A corpus at quarter s will include all FED minutes from that quarter.
Normalization implies that if a term does not appear in the minutes during quarter s, then it will receive the value \(\left( -\mu _{j}\right) /\sigma _{j}\). Notice, however, that our preprocessing of raw texts uses the term frequency-inverse document frequency (tf-idf) to remove terms that are rare. This avoids series whose observations are almost always equal to \(\left( -\mu _{j}\right) /\sigma _{j}\).
A look-ahead bias occurs when the entire sample is used to compute a model parameter that is subsequently used to make out-of-sample predictions, so that the parameter estimate has already extracted information from the future being predicted.
We end at \(T-h\) because we need to use observation T to evaluate forecasts made at \(T-h\).
The link can be found here GLMNET.
See footnote (6).
Notice that \(X_{k,s}^{*}\) is the kth element of the vector \(X_{s}^{*}\).
Bai and Ng (2005) show that the least squares estimates from factor-augmented forecasting regressions are \(\sqrt{T}\) consistent and asymptotically normal, and that pre-estimation of the factors does not affect the consistency of the second-stage parameter estimates or their standard errors.
The hawkish words are {hawkish, tighten, hike, raise, increase, boost} and the dovish are {dovish, ease, cut, lower, decrease, loose}.
Remember that \(D_{i,t}\) for \(i=1,4,5,6\) are just common factors.
Throughout this paper we assume that the target variable \(y_{t}\) is a covariance-stationary process.
The Blue Chip Indicators is a poll of around 50 top forecasting economists from banks, manufacturing industries, brokerage firms, and insurance companies. The poll has been conducted since 1976 and covers several macro series, including GDP growth.
We always use the end-of-quarter BC forecast, which is typically released 10 days after the end of the quarter.
Recall that models \(D_{i,t}\), \(i=1,4,5\), use the same procedure to compute the final predictors; they differ only in the machine learning method used to select the most predictive words.
The test by Diebold and Mariano (2002) is designed to compare non-nested models. If the forecasting models are nested, then the DM test may be undersized under the null and may have low power under the alternative hypothesis.
Recall that we include 2 auto-regressive lags so the dependent variable starts at 1976Q3.
The minutes from 1993 to 2017 can be found at https://www.federalreserve.gov/monetarypolicy/fomccalendars.htm, and the minutes from 1936 to 1992 can be found at https://www.federalreserve.gov/monetarypolicy/fomc_historical_year.htm.
The R package used to import these FOMC minutes is the “tm” package, which provides a function for importing PDF files into R. A quick tutorial can be found on the web page “https://data.library.virginia.edu/reading-pdf-files-into-r-for-text-mining/”.
This classification method works because all time series \(X_t\) are measured on the same standard normal scale.
The words appearing in the collocations are also counted as sentiment charged.
We extracted this information from Hansen et al. (2017).
We only consider the Greenbook forecasts presented in the last meeting of each quarter.
This explains why the results reported for \(D_{1,t}\) in Table 11 are different from the ones previously reported in Tables 7 and 9.
Stekler and Symington (2016) also pointed out that the forecasts of economists showed similar errors, which could be explained by the fact that it was very hard to predict a recession when the real-time data showed that the growth rate of GDP was accelerating, and by the Committee's possible belief that the stimuli already provided to the economy were sufficient to avert a downturn.
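
As an illustration of the normalization note above (a term that is absent from the minutes in quarter s receives \(\left( -\mu _{j}\right) /\sigma _{j}\)), the following is a minimal Python sketch. The function name `standardize_counts`, the toy count matrix, and the use of the population standard deviation are assumptions made for illustration; they are not taken from the paper or from the TextForecast package.

```python
import numpy as np

def standardize_counts(counts):
    """Standardize each term's count series (columns) to mean 0, std 1.

    counts: array of shape (quarters, terms) holding raw term counts.
    Returns the standardized matrix plus the per-term mean and std.
    Assumption: population std (ddof=0) and no constant columns.
    """
    counts = np.asarray(counts, dtype=float)
    mu = counts.mean(axis=0)
    sigma = counts.std(axis=0)
    return (counts - mu) / sigma, mu, sigma

# Toy data: 4 quarters (rows) by 2 terms (columns).
# Term 0 does not appear at all in the third quarter (count 0).
counts = np.array([[2, 0],
                   [4, 1],
                   [0, 2],
                   [2, 1]])

z, mu, sigma = standardize_counts(counts)

# The standardized value of the absent term equals -mu_j / sigma_j.
print(z[2, 0], -mu[0] / sigma[0])
```

If a term were absent in almost every quarter, almost all of its standardized observations would sit at this single value, which is the situation the tf-idf filter described in the note is meant to avoid.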
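
The note on look-ahead bias can be illustrated in the same spirit. The sketch below standardizes a series recursively, so that the mean and standard deviation used at date t are computed only from observations up to and including t. The function `recursive_standardize`, the `min_obs` parameter, and the toy series are hypothetical; the paper's own recursive scheme may differ in its details.

```python
import numpy as np

def recursive_standardize(x, min_obs):
    """Standardize x[t] using only x[0..t] (expanding window).

    Entries before min_obs observations are available stay NaN,
    as do entries whose window has zero variance.
    """
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, np.nan)
    for t in range(min_obs - 1, len(x)):
        window = x[: t + 1]          # information available at date t
        sigma = window.std()
        if sigma > 0:
            out[t] = (x[t] - window.mean()) / sigma
    return out

x = np.array([1.0, 2.0, 3.0, 10.0])

# Real-time version: no future observation enters the moments.
z_real_time = recursive_standardize(x, min_obs=2)

# Full-sample version: the large last observation contaminates
# the standardized values of all earlier dates.
z_full_sample = (x - x.mean()) / x.std()

print(z_real_time)
print(z_full_sample)
```

The two versions coincide only at the last date, where the expanding window equals the full sample; at earlier dates the full-sample version uses information that a forecaster would not yet have had.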
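
The hawkish and dovish word lists given in the notes above can be turned into simple counts. The sketch below shows only one possible use, assuming exact lowercase token matching with no stemming (so "raised" would not match "raise"); the function `tone_counts` and the sample sentence are hypothetical and not part of the paper.

```python
import re

# Word lists as stated in the note.
HAWKISH = {"hawkish", "tighten", "hike", "raise", "increase", "boost"}
DOVISH = {"dovish", "ease", "cut", "lower", "decrease", "loose"}

def tone_counts(text):
    """Return (hawkish_count, dovish_count) for a piece of text.

    Assumption: tokens are maximal runs of ASCII letters, lowercased,
    and matched exactly against the two lists.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    hawkish = sum(token in HAWKISH for token in tokens)
    dovish = sum(token in DOVISH for token in tokens)
    return hawkish, dovish

sample = "The Committee decided to raise rates rather than cut or ease."
print(tone_counts(sample))
```

A practical version would probably stem or lemmatize tokens first, since inflected forms such as "tightening" are missed by exact matching.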

References

Ahn, S. C., & Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3), 1203–1227.
Altavilla, C., & Giannone, D. (2017). The effectiveness of non-standard monetary policy measures: Evidence from survey data. Journal of Applied Econometrics, 32(5), 952–964.
Andreou, E., Ghysels, E., & Kourtellos, A. (2013). Should macroeconomic forecasters use daily financial data and how? Journal of Business & Economic Statistics, 31(2), 240–251.
Apel, M., & Grimaldi, M. (2012). The information content of central bank minutes. Sveriges Riksbank Working Paper Series, 92.
Armesto, M. T., Hernández-Murillo, R., Owyang, M. T., & Piger, J. (2009). Measuring the information content of the beige book: A mixed data sampling approach. Journal of Money, Credit and Banking, 41(1), 35–55.
Bai, J., & Ng, S. (2005). Tests for skewness, kurtosis, and normality for time series data. Journal of Business & Economic Statistics, 23(1), 49–60.
Bai, J., & Ng, S. (2008). Forecasting economic time series using targeted predictors. Journal of Econometrics, 146(2), 304–317.
Blei, D. M., & Lafferty, J. D. (2009). Topic models. Text mining: Classification, Clustering, and Applications, 10(71), 34.
Boukus, E., & Rosenberg, J. V. (2006). The information content of FOMC minutes. Available at SSRN 922312.
Carriero, A., Clark, T. E., & Marcellino, M. (2015). Realtime nowcasting with a Bayesian mixed frequency model with stochastic volatility. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(4), 837–862.
Cecchetti, S. G., et al. (2003). What the FOMC says and does when the stock market booms. In Asset prices and monetary policy, proceedings of the research conference of the Reserve Bank of Australia (pp. 77–96).
Chakraborty, C., & Joseph, A. (2017). Machine learning at central banks. Bank of England working paper.
Chauvet, M., & Potter, S. (2013). Forecasting output. In G. Elliott & A. Timmermann (Eds.), Handbook of economic forecasting (Vol. 2, pp. 141–194). Amsterdam: Elsevier.
Clark, T. E., & West, K. D. (2007). Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138(1), 291–311.
Correa, R., Garud, K., Londono, J. M., Mislang, N., et al. (2017). Constructing a dictionary for financial stability. Board of Governors of the Federal Reserve System (US), 6, 9.
Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business & Economic Statistics, 20(1), 134–144.
Diebold, F. X., & Shin, M. (2018). Machine learning for regularized survey forecast combination: Partially-egalitarian lasso and its derivatives. Technical Report. National Bureau of Economic Research.
Dossani, A. (2018). Essays on Inference from Option Markets. PhD thesis, UC San Diego.
Efron, B., Hastie, T., Johnstone, I., Tibshirani, R., et al. (2004). Least angle regression. The Annals of Statistics, 32(2), 407–499.
Elliott, G., & Timmermann, A. (2013). Handbook of economic forecasting. Amsterdam: Elsevier.
Engelberg, J. E., & Parsons, C. A. (2011). The causal impact of media in financial markets. The Journal of Finance, 66(1), 67–97.
Ericsson, N. R. (2016). Eliciting GDP forecasts from the FOMC’s minutes around the financial crisis. International Journal of Forecasting, 32(2), 571–583.
Garcia, D. (2013). Sentiment during recessions. The Journal of Finance, 68(3), 1267–1300.
Gentzkow, M., Kelly, B. T., & Taddy, M. (2017). Text as data. Working Paper 23276, National Bureau of Economic Research. http://www.nber.org/papers/w23276.
Gonçalves, S., McCracken, M. W., & Perron, B. (2017). Tests of equal accuracy for nested models with estimated factors. Journal of Econometrics, 198(2), 231–252.
Hansen, S., McMahon, M., & Prat, A. (2017). Transparency and deliberation within the FOMC: A computational linguistics approach. The Quarterly Journal of Economics, 133(2), 801–870.
Hu, G., Bhargava, P., Fuhrmann, S., Ellinger, S., & Spasojevic, N. (2017). Analyzing users’ sentiment towards popular consumer industries and brands on twitter. In 2017 IEEE International Conference on Data Mining Workshops (ICDMW) (pp. 381–388). IEEE.
Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, (pp. 168–177). ACM, 2004.
Kalamara, E., Turrell, A., Redl, C., Kapetanios, G., & Kapadia, S. (2020). Making text count: economic forecasting using newspaper text. In Proceedings of ASSA 2020 Annual Meeting (pp. 1–51).
Li, J. (2015). Sparse and stable portfolio selection with parameter uncertainty. Journal of Business & Economic Statistics, 33(3), 381–392.
Li, J., Tsiakas, I., & Wang, W. (2015). Predicting exchange rates out of sample: Can economic fundamentals beat the random walk? Journal of Financial Econometrics, 13(2), 293–341.
Lima, L. R., & Godeiro, L. (2019). TextForecast: Regression Analysis and Forecasting Using Textual Data from a Time-Varying Dictionary. https://github.com/lucasgodeiro/TextForecast. R package version 0.1.2. Retrieved 10 Dec 2019.
Lima, L. R., Meng, F., & Godeiro, L. (2018). Quantile forecasting with mixed-frequency data. International Journal of Forecasting, Forthcoming.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance, 66(1), 35–65.
Loughran, T., & McDonald, B. (2013). IPO first-day returns, offer price revisions, volatility, and Form S-1 language. Journal of Financial Economics, 109(2), 307–326.
Lucca, D. O., & Trebbi, F. (2009). Measuring central bank communication: an automated approach with application to FOMC statements. Technical Report. National Bureau of Economic Research.
Marcellino, M., Porqueddu, M., & Venditti, F. (2016). Short-term GDP forecasting with a mixed-frequency dynamic factor model with stochastic volatility. Journal of Business & Economic Statistics, 34(1), 118–127.
Nelson, W. (1972). Theory and applications of hazard plotting for censored failure data. Technometrics, 14(4), 945–966.
Newey, W. K., & West, K. D. (1986). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. National Bureau of Economic Research.
Nielsen, F. Å. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv preprint arXiv:1103.2903.
Nyman, R., Kapadia, S., Tuckett, D., Gregory, D., Ormerod, P., & Smith, R. (2018). News and narratives in financial systems: Exploiting big data for systemic risk assessment. Bank of England Working Paper.
Sheng, X. S. (2015). Evaluating the economic forecasts of FOMC members. International Journal of Forecasting, 31(1), 165–175.
Stekler, H., & Symington, H. (2016). Evaluating qualitative forecasts: The FOMC minutes, 2006–2010. International Journal of Forecasting, 32(2), 559–570.
Stock, J. H., & Watson, M. W. (2002). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97(460), 1167–1179.
Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139–1168.
Thorsrud, L. A. (2018). Words are the new numbers: A newsy coincident index of the business cycle. Journal of Business & Economic Statistics, (just–accepted), 1–35.
Toutanova, K., Klein, D., Manning, C. D., & Singer, Y. (2003). Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 conference of the North American chapter of the association for computational linguistics on human language technology (Vol. 1, pp. 173–180). Association for Computational Linguistics.
Wright, J. H. (2012). What does monetary policy do to long-term interest rates at the zero lower bound? The Economic Journal, 122(564), F447–F466.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320.

Author information

Authors and Affiliations

Department of Economics, The University of Tennessee, Knoxville, USA
Luiz Renato Lima & Mohammed Mohsin
Federal University of Paraiba, Joao Pessoa, PB, Brazil
Luiz Renato Lima
Federal University of the Semi-Arid Region (UFERSA), Mossoro, RN, Brazil
Lucas Lúcio Godeiro

Corresponding author

Correspondence to Luiz Renato Lima.

About this article

Cite this article

Lima, L.R., Godeiro, L.L. & Mohsin, M. Time-Varying Dictionary and the Predictive Power of FED Minutes. Comput Econ 57, 149–181 (2021). https://doi.org/10.1007/s10614-020-10039-9

Accepted: 10 August 2020
Published: 28 August 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s10614-020-10039-9
