Modeling Logistic Regression and Neural Network for Stock Selection with BSE 500 – A Comparative Study

  • Conference paper
Advances in Data Science and Artificial Intelligence (ICDSAI 2022)

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 403))

Abstract

In this study, we developed logistic regression and neural network models for selecting stocks of Indian companies listed on the Bombay Stock Exchange, drawn from the BSE 500. The models predicted twenty-five BSE 500 stocks that were more likely to yield higher returns in the subsequent year. We included these stocks in equally weighted portfolios with the aim of beating the market. We evaluated the models at the portfolio level, using the market as the benchmark, and compared them using profitability-based performance measures. We optimized the models by finding the optimal training period and then compared the optimally trained models. The results showed that the proposed models enhance stock selection for long-term investment. The logistic regression model yielded higher overall returns and higher hit rates at both the portfolio level and the stock level. However, the neural network model had lower variance in returns. Thus, there are some indications that neural networks may be more likely to yield steadier returns and more resilient models for noisy financial data. We used a moving window system to achieve higher returns with optimal training samples. The retraining method in this system makes it challenging to compare the models. Nevertheless, our study demonstrates steps for improving stock selection models through comparative studies.
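The portfolio-level evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy tickers and numbers, and the definition of a stock-level hit (a selected stock whose return exceeds the market return) are all assumptions made for illustration.

```python
from statistics import mean


def select_top_k(scores, k=25):
    """Return the k tickers with the highest model scores (hypothetical helper)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def equal_weight_return(tickers, realized):
    """An equally weighted portfolio return is the simple mean of the stock returns."""
    return mean(realized[t] for t in tickers)


def stock_hit_rate(tickers, realized, market_return):
    """Share of selected stocks that beat the market (assumed definition of a hit)."""
    hits = sum(1 for t in tickers if realized[t] > market_return)
    return hits / len(tickers)


# Toy data, invented for illustration only.
scores = {"A": 0.9, "B": 0.2, "C": 0.7, "D": 0.4}          # predicted scores
realized = {"A": 0.30, "B": -0.05, "C": 0.10, "D": 0.20}   # next-year returns
market_return = 0.12                                       # benchmark return

portfolio = select_top_k(scores, k=2)
port_ret = equal_weight_return(portfolio, realized)
hit = stock_hit_rate(portfolio, realized, market_return)
excess = port_ret - market_return  # positive means the portfolio beat the market
```

With `k=25` and scores from either model, the same three functions would give a portfolio return, an excess return over the benchmark, and a stock-level hit rate for one target year.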

Appendix

Data Widgets in Orange

We used the following widgets in our workflows (see Figs. 6 and 14) for the data and file handling tasks:

  • File widget to read data from external files into workflows.

  • Select Columns widget to select input attributes, class attribute, and meta attributes from the available columns.

  • Concatenate widget to vertically merge instances from multiple files for ML-III and ML-IV.

  • Select Rows widget to exclude samples that belong to the target year after the concatenation of multiple training sets.

  • Data Table widget to view the composed training and test sets in spreadsheet form after selecting columns and rows.

  • Save Data widget to export the end results viewed in Data Table from Orange workflows to external files.
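For readers who do not use Orange, the data-handling steps listed above can be approximated in plain Python. This is a rough sketch using only the standard library, not the Orange API or the authors' workflow. The column names (`ticker`, `year`, `pe`, `roe`, `label`) and the sample rows are hypothetical.

```python
import csv
import io


def read_tsv(text):
    """File widget analogue: parse tab-separated text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def concatenate(*tables):
    """Concatenate widget analogue: stack the rows of several tables vertically."""
    return [row for table in tables for row in table]


def select_rows(rows, exclude_year):
    """Select Rows widget analogue: drop samples that belong to the target year."""
    return [r for r in rows if r["year"] != exclude_year]


def select_columns(rows, features, target, meta):
    """Select Columns widget analogue: keep meta, input, and class columns only."""
    keep = meta + features + [target]
    return [{c: r[c] for c in keep} for r in rows]


def save_tsv(rows):
    """Save Data widget analogue: serialize rows back to tab-separated text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Two hypothetical yearly files, given inline so the sketch is self-contained.
file_a = ("ticker\tyear\tpe\troe\tlabel\n"
          "AAA\t2008-2009\t12.0\t0.18\t1\n"
          "BBB\t2008-2009\t30.5\t0.05\t0\n")
file_b = ("ticker\tyear\tpe\troe\tlabel\n"
          "AAA\t2009-2010\t10.1\t0.20\t1\n"
          "CCC\t2010-2011\t8.3\t0.25\t1\n")

merged = concatenate(read_tsv(file_a), read_tsv(file_b))
training = select_rows(merged, exclude_year="2010-2011")
training = select_columns(training, features=["pe", "roe"],
                          target="label", meta=["ticker"])
exported = save_tsv(training)
```

The row filter runs after concatenation, in the same order as the list above, so that no sample from the target year ends up in the training set.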

Fig. 14 Workflow for constructing training sets of ML-III

Fig. 15 Reading and composing training set

Fig. 16 Reading and composing test set

Fig. 17 Partial view of a training set

Fig. 18 Partial view of a test set

In our case, the external files were available as Excel worksheets or as tab-separated text files. Figure 14 shows the workflows for constructing the training sets in Orange. Figures 15, 16, 17, and 18 illustrate the widgets reading, composing, and presenting the training and test sets of the ML-I models for the target year 2010–2011.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Simon, S., Date, H. (2023). Modeling Logistic Regression and Neural Network for Stock Selection with BSE 500 – A Comparative Study. In: Misra, R., et al. Advances in Data Science and Artificial Intelligence. ICDSAI 2022. Springer Proceedings in Mathematics & Statistics, vol 403. Springer, Cham. https://doi.org/10.1007/978-3-031-16178-0_20
