Abstract
Mining financial text documents and understanding the sentiments of individual investors, institutions and markets is an important and challenging research problem. Current approaches to mining sentiment from financial texts rely largely on domain-specific dictionaries. However, dictionary-based methods often fail to accurately predict the polarity of financial texts. This paper aims to improve on the state of the art by introducing a novel sentiment analysis approach that employs the concept of financial and non-financial performance indicators. It presents an association rule mining-based hierarchical sentiment classifier model that predicts the polarity of financial texts as positive, neutral or negative. The performance of the proposed model is evaluated on a benchmark financial dataset and compared against other state-of-the-art dictionary-based and machine learning-based approaches, with promising results. The novel use of performance indicators for financial sentiment analysis offers interesting and useful insights.
Appendix
1.1 A. Parsing sentences using the NLTK toolkit
Steps:
1. Parse the sentence with a regular expression parser using the grammar specified above.
2. If a match for the 'NPJJ' tree pattern is found:
   (a) Traverse the subtree to get the combination of NP (potential performance indicator) and JJ/RB/VB (potential directionality word). Look up the dictionary for the matching indicator and directionality word.
   (b) If the combination of NP and JJ/RB/VB is not found, check for the presence of individual words (either performance indicator or directionality word) in the dictionary.
   (c) Tag the sentence based on the identified matches.
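The lookup logic in the steps of Section A can be illustrated with a minimal Python sketch. It is not the paper's implementation: the grammar referred to in step 1 is not reproduced here, so the sketch replaces the NLTK regular expression parser with a simplified stand-in that treats a run of nouns followed by a JJ/RB/VB token as an 'NPJJ' match. The lexicon entries, the `LeadInd` label, the function names and the handling of partial matches (returning only the indicator or only the direction) are all assumptions made for illustration. The input is a list of (word, POS tag) pairs such as those produced by a Penn Treebank tagger.

```python
from typing import List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (word, Penn Treebank POS tag)

# Hypothetical lexicons, for illustration only; not the paper's dictionaries.
INDICATORS = {
    "operating profit": "LagInd",
    "net sales": "LagInd",
    "customer satisfaction": "LeadInd",  # assumed label
}
DIRECTIONS = {"rose": "UP", "increased": "UP", "fell": "DOWN", "decreased": "DOWN"}


def npjj_pairs(tagged: Tagged) -> List[Tuple[str, str]]:
    """Simplified stand-in for the 'NPJJ' pattern: a run of NN* tokens
    immediately followed by one JJ*/RB*/VB* token."""
    pairs = []
    i = 0
    while i < len(tagged):
        j = i
        while j < len(tagged) and tagged[j][1].startswith("NN"):
            j += 1
        if j > i:  # tagged[i:j] is a run of nouns
            if j < len(tagged) and tagged[j][1][:2] in ("JJ", "RB", "VB"):
                phrase = " ".join(word.lower() for word, _ in tagged[i:j])
                pairs.append((phrase, tagged[j][0].lower()))
            i = j
        else:
            i += 1
    return pairs


def tag_sentence(tagged: Tagged) -> Optional[str]:
    pairs = npjj_pairs(tagged)
    # Step (a): look up the indicator/direction combination.
    for phrase, word in pairs:
        if phrase in INDICATORS and word in DIRECTIONS:
            return f"{INDICATORS[phrase]}::{DIRECTIONS[word]}"
    # Step (b): fall back to an individual indicator or direction word.
    for phrase, word in pairs:
        if phrase in INDICATORS:
            return INDICATORS[phrase]
        if word in DIRECTIONS:
            return DIRECTIONS[word]
    return None  # nothing to tag


example = [("Operating", "NN"), ("profit", "NN"), ("fell", "VBD"), ("sharply", "RB")]
print(tag_sentence(example))  # LagInd::DOWN
```

The tokens in `example` are tagged by hand so that the sketch runs without a tagger model; in practice the tags would come from a POS tagger, and the chunking would be done by the regular expression parser with the grammar the appendix refers to.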
1.2 B. Parsing numeric values to determine directionality
Preconditions:
1. The sentence has not been tagged with a combination of performance indicator and directionality using the parse rules specified in Section A above.
2. The sentence contains terms like "compared to", "versus", "down from" or "up from". These are common in sentences where one is likely to observe multiple numeric values without the use of directional words.
Example sentence: Operating profit margin was 8.3%, compared to 11.8% a year earlier.
Expected tag output: LagInd::DOWN
Steps:
1. Parse the sentence with a regular expression parser using the grammar specified above.
2. If a match for the 'NPJJ' tree pattern is found:
   (a) Traverse the subtree to get the combination of NP, CD, CD to extract the performance indicator and numeric values. The numeric values are analyzed to determine the directionality (UP/DOWN).
   (b) Tag the sentence based on the identified matches.
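The numeric comparison in Section B can be sketched in a few lines of Python. This is an illustrative approximation under stated assumptions, not the paper's code: it uses plain regular expressions on the raw sentence instead of the NP, CD, CD subtree of a parse, it assumes the first number is the current value and the second is the comparison baseline, and the lexicon entries and function name are hypothetical. The cue phrases are the ones listed in the preconditions; the first precondition (the sentence was not already tagged by Section A) is left to the caller.

```python
import re
from typing import Optional

# Hypothetical indicator lexicon, for illustration only.
INDICATORS = {"operating profit margin": "LagInd", "net sales": "LagInd"}

# Cue phrases taken from the preconditions of Section B.
CUES = ("compared to", "versus", "down from", "up from")

# Integers or decimals, with optional thousands separators (assumed English format).
NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def tag_by_numbers(sentence: str) -> Optional[str]:
    """Return 'Indicator::UP' or 'Indicator::DOWN' by comparing the first two
    numbers in the sentence, or None if the sentence does not qualify."""
    text = sentence.lower()
    if not any(cue in text for cue in CUES):
        return None
    indicator = next(
        (label for phrase, label in INDICATORS.items() if phrase in text), None
    )
    values = [float(m.replace(",", "")) for m in NUMBER.findall(text)]
    if indicator is None or len(values) < 2 or values[0] == values[1]:
        return None
    # Assumption: values[0] is the current figure, values[1] the earlier one.
    direction = "UP" if values[0] > values[1] else "DOWN"
    return f"{indicator}::{direction}"


print(tag_by_numbers(
    "Operating profit margin was 8.3%, compared to 11.8% a year earlier."
))  # LagInd::DOWN
```

A regex-only approach like this would also pick up years or other unrelated numbers, which is one reason to restrict the comparison to the CD tokens inside a matched subtree, as the steps describe.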
Cite this article
Krishnamoorthy, S. Sentiment analysis of financial news articles using performance indicators. Knowl Inf Syst 56, 373–394 (2018). https://doi.org/10.1007/s10115-017-1134-1
DOI: https://doi.org/10.1007/s10115-017-1134-1