Knowledge-Driven Approaches for Financial News Analytics



Computational finance is one of the fastest-growing application areas for natural language processing technologies. Algorithmic trading funds are already successfully using robo-readers and sentiment analysis techniques to support adaptive algorithms that make automated decisions with little or no human intervention. However, these technologies are still at a nascent stage, and competition within the industry to improve them is fierce. In this chapter, we discuss financial news analytics and learning strategies that help machines combine domain knowledge with other linguistic information extracted from text sources. We provide an overview of existing linguistic resources and methodological approaches that can be readily used to develop knowledge-driven solutions for financial news analysis.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Information and Service Economy, Aalto University School of Business, Helsinki, Finland
  2. Production and Quantitative Methods, Indian Institute of Management Ahmedabad, Ahmedabad, India
