New Approach to Feature Generation by Complex-Valued Econometrics and Sentiment Analysis for Stock-Market Prediction

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 95)

Abstract

The theory of complex-valued econometrics makes it possible to generate qualitatively new features that can be used in machine learning algorithms. Our study addresses the task of determining the long-term dependence of companies’ future stock prices on a time-generated feature, i.e., a tonality coefficient calculated by semantic analysis of texts from social networks. Data were gathered from the Twitter platform using Big Data ETL scenarios. The resulting data sets were used to train machine learning algorithms designed to work with Big Data technologies. A semantic coefficient was calculated from the aggregated estimates for each day, and the methods of complex-valued econometrics were then applied. To demonstrate the new approach to feature generation, a complex-valued linear regression model based on the semantic coefficients and stock-market data was constructed. The results obtained with the new approach were compared with existing solutions in terms of accuracy. Finally, we demonstrate a possible route to improving existing trading-strategy algorithms using complex-valued regression.
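
The pipeline outlined in the abstract (daily aggregation of tonality estimates, followed by a complex-valued linear regression) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the daily coefficient is a plain mean of per-tweet scores, the complex variable is formed as price + i·sentiment, the target is the same variable one day ahead, and the dates, scores, and prices are made-up toy values. The helper names `daily_sentiment`, `fit_complex_linear`, and `predict` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def daily_sentiment(scored_tweets):
    """Average per-tweet scores into one coefficient per day (assumed aggregation)."""
    by_day = defaultdict(list)
    for day, score in scored_tweets:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


def fit_complex_linear(z, y):
    """Ordinary least-squares fit of y ~ a0 + a1 * z with complex a0 and a1."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    # Complex least squares: the regressor deviations enter conjugated.
    num = sum((zi - z_bar).conjugate() * (yi - y_bar) for zi, yi in zip(z, y))
    den = sum(abs(zi - z_bar) ** 2 for zi in z)
    a1 = num / den
    a0 = y_bar - a1 * z_bar
    return a0, a1


def predict(a0, a1, z):
    """Apply the fitted model to a sequence of complex regressors."""
    return [a0 + a1 * zi for zi in z]


# Toy inputs (hypothetical values): (day, sentiment score) pairs and closing prices.
tweets = [
    ("2019-01-02", 0.4), ("2019-01-02", -0.1),
    ("2019-01-03", 0.6), ("2019-01-03", 0.2),
    ("2019-01-04", -0.5), ("2019-01-04", -0.3),
    ("2019-01-07", 0.1), ("2019-01-07", 0.3),
    ("2019-01-08", 0.7), ("2019-01-08", 0.5),
]
prices = {
    "2019-01-02": 100.0, "2019-01-03": 101.5, "2019-01-04": 99.8,
    "2019-01-07": 100.9, "2019-01-08": 102.4,
}

sentiment = daily_sentiment(tweets)
days = sorted(sentiment)

# Assumed construction of the complex variable: real part = price, imaginary part = sentiment.
series = [complex(prices[d], sentiment[d]) for d in days]
z, y = series[:-1], series[1:]  # regress the next day's value on the current one

a0, a1 = fit_complex_linear(z, y)
fitted = predict(a0, a1, z)

# Mean squared error over the complex residuals (in-sample, for illustration only).
mse = sum(abs(f - t) ** 2 for f, t in zip(fitted, y)) / len(y)
```

Under these assumptions, the real part of each fitted value plays the role of a price forecast and the imaginary part that of a sentiment forecast; a comparison with a real-valued baseline would fit the price series alone and compute the same error measure on it.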

Acknowledgements

The study was supported by the Russian Foundation for Basic Research, Grant No. 19-010-00610\19 “Theory, Methods and Techniques for Forecasting Economic Development by Autoregressive Models of Complex Variables.”

Author information

Corresponding author

Correspondence to Dmitry Baryev.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Baryev, D., Konovalov, I., Voinov, N. (2020). New Approach to Feature Generation by Complex-Valued Econometrics and Sentiment Analysis for Stock-Market Prediction. In: Arseniev, D., Overmeyer, L., Kälviäinen, H., Katalinić, B. (eds) Cyber-Physical Systems and Control. CPS&C 2019. Lecture Notes in Networks and Systems, vol 95. Springer, Cham. https://doi.org/10.1007/978-3-030-34983-7_56

  • DOI: https://doi.org/10.1007/978-3-030-34983-7_56

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34982-0

  • Online ISBN: 978-3-030-34983-7

  • eBook Packages: Engineering, Engineering (R0)