Abstract
This paper extracts both sentiment and bag-of-words information from the annual reports of U.S. banks. The sentiment analysis is based on two commonly used finance-specific dictionaries, while the bag-of-words terms are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal bank stock returns using a neural network with dropout regularization and rectified linear units. We show that this method outperforms other machine learning algorithms (Naïve Bayes, support vector machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns. This neural network therefore appears well suited to text classification tasks involving sparse, high-dimensional data. We also show that prediction quality increased significantly when financial indicators were combined with either bigrams or trigrams.
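The two building blocks named in the abstract, tf-idf selection of n-grams and a hidden layer with rectified linear units and dropout, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tf-idf variant (term frequency times log(N/df)), the whitespace tokenizer, the inverted-dropout scaling, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Score every n-gram in every document as tf * log(N / df).

    Assumption: this is one common tf-idf variant; the weighting used in
    the paper is not stated in the abstract.
    """
    grams = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]
    df = Counter(term for counts in grams for term in counts)
    n_docs = len(docs)
    scores = []
    for counts in grams:
        total = sum(counts.values())
        scores.append({t: (c / total) * math.log(n_docs / df[t])
                       for t, c in counts.items()})
    return scores


def top_terms(scores, k):
    """Keep the k terms whose best per-document tf-idf score is highest."""
    best = {}
    for doc_scores in scores:
        for term, s in doc_scores.items():
            best[term] = max(best.get(term, 0.0), s)
    return sorted(best, key=lambda t: (-best[t], t))[:k]


def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` and rescale
    the survivors so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def hidden_layer(x, weights, biases, rate, rng, training=True):
    """One dense layer followed by ReLU and dropout."""
    pre = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return dropout(relu(pre), rate, rng, training)
```

In this sketch, the selected n-gram scores would be concatenated with the sentiment and financial-indicator features to form the input vector `x`; dropout is active only during training and is switched off with `training=False` at prediction time.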
Acknowledgments
This work was supported by the scientific research project of the Czech Science Foundation, Grant No. GA16-19590S, “Topic and sentiment analysis of multiple textual sources for corporate financial decision-making”.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Hájek, P., Boháčová, J. (2016). Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports – a Neural Network Approach. In: Jayne, C., Iliadis, L. (eds) Engineering Applications of Neural Networks. EANN 2016. Communications in Computer and Information Science, vol 629. Springer, Cham. https://doi.org/10.1007/978-3-319-44188-7_5
DOI: https://doi.org/10.1007/978-3-319-44188-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44187-0
Online ISBN: 978-3-319-44188-7
eBook Packages: Computer Science, Computer Science (R0)