Buy, Sell, or Hold? Information Extraction from Stock Analyst Reports

  • Yeong Su Lee
  • Michaela Geierhos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6967)


This paper presents a novel linguistic information extraction approach that exploits analysts’ stock ratings for statistical decision making. Over a period of one year, we gathered German stock analyst reports in order to determine market trends. Our goal is to provide business statistics over time that illustrate market trends for a user-selected company. To this end, we recognize named entities in the very short stock analyst reports, such as organization names (e.g. BASF, BMW, Ericsson), analyst houses (e.g. Gartner, Citigroup, Goldman Sachs), ratings (e.g. buy, sell, hold, underperform, recommended list) and price estimates, using lexicalized finite-state graphs, so-called local grammars. Company names and their acronyms are then cross-checked against the data the analysts provide. Finally, all extracted values are compared and presented in charts with different views depending on the evaluation criteria (e.g. by timeline). This approach should make it easier and more convenient to follow analysts’ buy/sell signals without reading all of their reports.
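The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a single regular expression stands in for the lexicalized finite-state graphs (local grammars), the gazetteers contain only the names given as examples in the abstract, and the German sentences, the verb list, and all identifiers (`Recommendation`, `extract`, `rating_counts`, `EXAMPLE_REPORTS`) are assumptions made up for illustration.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

# Tiny gazetteers taken from the examples in the abstract (not a real lexicon).
COMPANIES = ["BASF", "BMW", "Ericsson"]
ANALYST_HOUSES = ["Gartner", "Citigroup", "Goldman Sachs"]
RATINGS = ["buy", "sell", "hold", "underperform"]


class Recommendation(NamedTuple):
    analyst_house: str
    company: str
    rating: str
    price_target: Optional[float]


def _alt(words):
    """Build a regex alternation, longest entries first."""
    return "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))


# Assumed sentence shape: <house> <verb> <company> auf '<rating>' [... Kursziel <price>]
# A real local grammar would cover far more variation than this one pattern.
PATTERN = re.compile(
    rf"(?P<house>{_alt(ANALYST_HOUSES)})\s+"
    rf"(?:belässt|hebt|senkt)\s+"
    rf"(?P<company>{_alt(COMPANIES)})\s+"
    rf"auf\s+'?(?P<rating>{_alt(RATINGS)})'?"
    rf"(?:.*?(?:kurs)?ziel\s+(?:von\s+)?(?P<price>\d+(?:[.,]\d+)?))?",
    re.IGNORECASE,
)

# Invented example reports, not taken from the authors' corpus.
EXAMPLE_REPORTS = [
    "Goldman Sachs belässt BASF auf 'buy' mit Kursziel 60 Euro.",
    "Citigroup senkt BMW auf 'hold'.",
    "Gartner hebt BASF auf 'buy', Kursziel 62,50 Euro.",
]


def extract(report: str) -> Optional[Recommendation]:
    """Return the first recommendation found in a short report, or None."""
    m = PATTERN.search(report)
    if m is None:
        return None
    price = m["price"]
    # German reports use a decimal comma, so normalize before converting.
    target = float(price.replace(",", ".")) if price else None
    return Recommendation(m["house"], m["company"], m["rating"].lower(), target)


def rating_counts(reports, company: str) -> Counter:
    """Count ratings for one company, the kind of statistic a chart could show."""
    recs = (extract(r) for r in reports)
    return Counter(r.rating for r in recs if r and r.company == company)
```

Calling `rating_counts(EXAMPLE_REPORTS, "BASF")` tallies the ratings issued for one user-selected company. Grouping the same records by report date would give a timeline view, but the sketch omits dates because the abstract does not say how they are obtained.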


Information Extraction; Business Intelligence; Name Entity Recognition; Entity Recognition; Relation Extraction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yeong Su Lee (1)
  • Michaela Geierhos (1)
  1. CIS, University of Munich, Germany
