Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports – a Neural Network Approach

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 629)

Abstract

This paper extracts both sentiment and bag-of-words information from the annual reports of U.S. banks. The sentiment analysis is based on two commonly used finance-specific dictionaries, while the bag-of-words features are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal bank stock returns using a neural network with dropout regularization and rectified linear units. We show that this method outperforms other machine learning algorithms (Naïve Bayes, Support Vector Machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns. This suggests that the neural network is well suited to text classification tasks involving sparse, high-dimensional data. We also show that prediction quality increased significantly when financial indicators were combined with bigrams or with trigrams.
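The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word lists `POSITIVE` and `NEGATIVE`, the function names, the tf-idf variant (relative term frequency times log(N/df)), the net-tone formula, and the inverted-dropout scaling are all assumptions chosen for illustration, and the two example sentences are invented.

```python
import math
import random
from collections import Counter

# Hypothetical mini word lists (assumption); the paper relies on two
# finance-specific dictionaries whose contents are not reproduced here.
POSITIVE = {"growth", "strong", "improved"}
NEGATIVE = {"loss", "decline", "impairment"}


def ngrams(tokens, n):
    """Return all n-grams in a token list, each joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {term: weight} dict per document for n-gram terms.

    Assumed variant: tf = relative frequency in the document,
    idf = log(number of documents / document frequency).
    """
    term_lists = [ngrams(doc, n) for doc in docs]
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def tone(tokens):
    """Net tone: (positive - negative) / (positive + negative), or 0.0."""
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def relu_dropout_layer(x, weights, biases, drop_rate=0.0, rng=None):
    """Dense layer with ReLU activation, then inverted dropout on its outputs.

    drop_rate must be below 1.0. Use drop_rate=0.0 at prediction time.
    """
    if rng is None:
        rng = random.Random(0)
    outputs = []
    for w_row, bias in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(w_row, x)) + bias
        a = max(0.0, z)  # rectified linear unit
        if drop_rate > 0.0:
            keep = rng.random() >= drop_rate
            # Scale kept units so the expected activation is unchanged.
            a = a / (1.0 - drop_rate) if keep else 0.0
        outputs.append(a)
    return outputs


# Two invented, already tokenized "report" snippets.
docs = [
    "net income improved on strong loan growth".split(),
    "loan loss provisions rose after a decline in credit quality".split(),
]
bigram_weights = tfidf(docs, n=2)
tones = [tone(doc) for doc in docs]
```

In a fuller pipeline, the highest-weighted n-grams would be kept as features and concatenated with the tone scores and financial indicators before being passed through layers such as `relu_dropout_layer`. Training the weights is outside the scope of this sketch.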

Keywords

Stock return, Prediction, Text mining, Sentiment, Neural network

Notes

Acknowledgments

This work was supported by the scientific research project of the Czech Science Foundation, Grant No. GA16-19590S, “Topic and sentiment analysis of multiple textual sources for corporate financial decision-making”.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute of System Engineering and Informatics, Faculty of Economics and Administration, University of Pardubice, Pardubice, Czech Republic
