An Experimental Evaluation of Sentiment Analysis on Financial News Using Prior Polarity Words

  • Conference paper
Advances in Artificial Intelligence -- IBERAMIA 2014 (IBERAMIA 2014)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8864))

Abstract

Throughout the past decade, extensive research on text classification has produced fast and accurate algorithms. Most of these algorithms are based on bag-of-words representations, which generate high-dimensional data. However, only a few supervised learning methods, such as SVMs, can handle high-dimensional data efficiently. To overcome this limitation, we propose the use of prior polarity words (PPW) to create a compact and representative feature set for financial news classification. With this approach, feature sets can be reduced from thousands of features to fewer than tens without compromising the accuracy of the text classifier. We measured the accuracy, precision, recall, F-measure, and ROC AUC of text classifiers using PPW. Classifiers using PPW outperformed a wide range of feature selection methods on all of these measures. With PPW, Support Vector Machines and Naive Bayes performed consistently better than with the full feature set. PPW also made Naive Bayes comparable to SVMs, as indicated by the improved scores on all measures tested.
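
To make the idea of a compact PPW feature set concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hypothetical lexicon (`PRIOR_POLARITY`) and assumes that a document is represented by counts of positive and negative lexicon words; the abstract does not specify the actual lexicon or feature definition, and the function name `ppw_features` is illustrative only.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to a prior polarity.
# A real prior-polarity lexicon would be far larger; these entries are assumptions.
PRIOR_POLARITY = {
    "gain": "positive",
    "profit": "positive",
    "growth": "positive",
    "loss": "negative",
    "decline": "negative",
    "risk": "negative",
}

# Fixed feature order: one count per polarity class.
FEATURE_NAMES = ("positive", "negative")


def ppw_features(text):
    """Map a text to a short vector of polarity-word counts.

    A bag-of-words model would use one dimension per vocabulary word;
    here the vector has only len(FEATURE_NAMES) dimensions.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(PRIOR_POLARITY[t] for t in tokens if t in PRIOR_POLARITY)
    return [counts[name] for name in FEATURE_NAMES]


headline = "Shares decline as profit growth slows"
print(ppw_features(headline))  # two positive words, one negative word
```

Vectors of this kind could then be passed to any standard classifier, such as Naive Bayes or an SVM, in place of the full bag-of-words representation.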

Author information

Corresponding author

Correspondence to Edson Matsubara.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Campos, E., Matsubara, E. (2014). An Experimental Evaluation of Sentiment Analysis on Financial News Using Prior Polarity Words. In: Bazzan, A., Pichara, K. (eds) Advances in Artificial Intelligence -- IBERAMIA 2014. IBERAMIA 2014. Lecture Notes in Computer Science, vol 8864. Springer, Cham. https://doi.org/10.1007/978-3-319-12027-0_18

  • DOI: https://doi.org/10.1007/978-3-319-12027-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12026-3

  • Online ISBN: 978-3-319-12027-0

  • eBook Packages: Computer Science, Computer Science (R0)
