EANN 2016: Engineering Applications of Neural Networks, pp. 67-78
Predicting Abnormal Bank Stock Returns Using Textual Analysis of Annual Reports – a Neural Network Approach
Abstract
This paper extracts both sentiment and bag-of-words information from the annual reports of U.S. banks. The sentiment analysis is based on two commonly used finance-specific dictionaries, while the bag-of-words terms are selected according to their tf-idf weights. We combine these features with financial indicators to predict abnormal bank stock returns using a neural network with dropout regularization and rectified linear units. We show that this method outperforms other machine learning algorithms (Naïve Bayes, Support Vector Machine, C4.5 decision tree, and k-nearest neighbour classifier) in predicting positive/negative abnormal stock returns. Thus, this neural network seems to be well suited for text classification tasks that work with sparse high-dimensional data. We also show that prediction quality increases significantly when financial indicators are combined with either bigrams or trigrams.
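The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenizer, the max-over-documents ranking rule, the (positive − negative) / (positive + negative) tone formula, and the use of inverted dropout are all assumptions made for illustration, and the word lists passed to `tone` are hypothetical stand-ins for the finance-specific dictionaries.

```python
import math
import random
from collections import Counter


def top_tfidf_terms(docs, k):
    """Keep the k terms with the highest tf-idf weight in any document.

    Assumption: whitespace tokenization and ranking by the maximum
    tf-idf a term reaches over the corpus.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    best = {}
    for tokens in tokenized:
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / df[term])
            best[term] = max(best.get(term, 0.0), tf * idf)
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def tone(tokens, positive, negative):
    """Dictionary-based tone in [-1, 1]; 0.0 when no listed word occurs.

    Assumption: tone = (pos - neg) / (pos + neg).
    """
    pos = sum(t in positive for t in tokens)
    neg = sum(t in negative for t in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def relu_dropout_layer(x, weights, biases, drop_prob, rng, training=True):
    """One dense layer with rectified linear units and dropout.

    Assumption: inverted dropout (kept units are rescaled during
    training), with 0 <= drop_prob < 1.
    """
    out = []
    for w_row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b
        a = max(0.0, z)  # rectified linear unit
        if training:
            keep = rng.random() >= drop_prob
            a = a / (1.0 - drop_prob) if keep else 0.0
        out.append(a)
    return out


# Toy usage with made-up sentences (not data from the paper).
docs = [
    "loan losses increased sharply",
    "net income increased",
    "strong capital and strong income",
]
selected = top_tfidf_terms(docs, 2)
score = tone(docs[2].split(), positive={"strong"}, negative={"losses"})
hidden = relu_dropout_layer(
    [1.0, -2.0], [[1.0, 1.0], [0.5, -0.5]], [0.0, 0.0],
    drop_prob=0.5, rng=random.Random(0),
)
```

In practice, the selected terms and the tone score would be concatenated with the financial indicators into one sparse feature vector before being passed to the network.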
Keywords
Stock return; Prediction; Text mining; Sentiment; Neural network
Notes
Acknowledgments
This work was supported by the scientific research project of the Czech Science Foundation, Grant No. GA16-19590S, “Topic and sentiment analysis of multiple textual sources for corporate financial decision-making”.