Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports – a Neural Network Approach

Hájek, Petr; Boháčová, Jana

doi:10.1007/978-3-319-44188-7_5

Part of the book series: Communications in Computer and Information Science (CCIS, volume 629)

Included in the following conference series:

International Conference on Engineering Applications of Neural Networks

Abstract

This paper aims to extract both sentiment and bag-of-words information from the annual reports of U.S. banks. The sentiment analysis is based on two commonly used finance-specific dictionaries, while the bag-of-words terms are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal bank stock returns using a neural network with dropout regularization and rectified linear units. We show that this method outperforms other machine learning algorithms (Naïve Bayes, Support Vector Machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns. Thus, this neural network seems to be well suited for text classification tasks that involve sparse high-dimensional data. We also show that prediction quality increased significantly when financial indicators were combined with bigrams and with trigrams, respectively.
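
The feature construction and network components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The word lists, the tf-idf variant (relative term frequency times log(N/df)), the tone formula, the layer sizes, the weights, the drop probability, and the financial indicator value are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical stand-ins for the finance-specific dictionaries;
# the actual word lists are not reproduced here.
POSITIVE = {"gain", "improved", "strong"}
NEGATIVE = {"loss", "impairment", "decline"}


def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {term: weight} dict per document for n-gram terms.

    Assumed variant: tf = count / number of terms in the document,
    idf = log(number of documents / document frequency).
    """
    term_lists = [ngrams(d, n) for d in docs]
    df = Counter(t for terms in term_lists for t in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def tone(tokens):
    """Net tone (pos - neg) / (pos + neg); 0.0 when no dictionary word occurs."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def relu_dropout_layer(x, weights, biases, drop_prob, rng, training=True):
    """One hidden layer: ReLU activation followed by inverted dropout."""
    out = []
    for w_row, b in zip(weights, biases):
        a = max(0.0, sum(w * xi for w, xi in zip(w_row, x)) + b)
        if training:
            # Drop each unit with probability drop_prob; rescale survivors
            # so the expected activation matches inference time.
            keep = rng.random() >= drop_prob
            a = a / (1.0 - drop_prob) if keep else 0.0
        out.append(a)
    return out


# Toy corpus standing in for tokenized annual-report text.
docs = [
    "net interest income improved".split(),
    "loan loss provisions increased".split(),
    "net interest margin decline".split(),
]
bigram_weights = tfidf(docs, n=2)

# Feature vector for the first document: tone, one bigram weight,
# and a hypothetical financial indicator value.
x = [tone(docs[0]), bigram_weights[0].get("net interest", 0.0), 0.012]

hidden = relu_dropout_layer(
    x,
    weights=[[0.4, -0.2, 1.0], [-0.3, 0.8, 0.5]],  # arbitrary illustrative weights
    biases=[0.0, 0.1],
    drop_prob=0.5,
    rng=random.Random(0),
)
print(hidden)
```

Passing `training=False` disables dropout, which is how such a layer would be evaluated after training. A real model would learn the weights and end in an output unit that separates positive from negative abnormal returns.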

Acknowledgments

This work was supported by the scientific research project of the Czech Science Foundation, Grant No. GA16-19590S, “Topic and sentiment analysis of multiple textual sources for corporate financial decision-making”.

Author information

Authors and Affiliations

Institute of System Engineering and Informatics, Faculty of Economics and Administration, University of Pardubice, Pardubice, Czech Republic
Petr Hájek & Jana Boháčová

Corresponding author

Correspondence to Petr Hájek.

Editor information

Editors and Affiliations

Robert Gordon University, Aberdeen, United Kingdom
Chrisina Jayne
Lab of Forest Informatics (FiLAB), Democritus University of Thrace, Orestiada, Greece
Lazaros Iliadis

About this paper

Cite this paper

Hájek, P., Boháčová, J. (2016). Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports – a Neural Network Approach. In: Jayne, C., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2016. Communications in Computer and Information Science, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-44188-7_5

DOI: https://doi.org/10.1007/978-3-319-44188-7_5
Published: 19 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44187-0
Online ISBN: 978-3-319-44188-7
eBook Packages: Computer Science, Computer Science (R0)
