Abstract
In this study, we developed logistic regression and neural network models for selecting stocks of Indian companies in the BSE 500 index of the Bombay Stock Exchange. The models predicted twenty-five BSE 500 stocks that were more likely to yield higher returns in the subsequent year. We included these stocks in equally weighted portfolios with the aim of beating the market. We evaluated the models on a portfolio basis, using the market as the benchmark, and compared them using profitability-based performance measures. We then optimized the models by finding the optimal training period and compared the optimally trained models. The results showed that the proposed models enhance stock selection for long-term investment. The logistic regression model yielded higher overall returns and higher hit rates at both the portfolio level and the stock level. However, the neural network model had lower variance in returns. Thus, there are some indications that neural networks may yield steadier returns and more resilient models for noisy financial data. We used a moving window system to achieve higher returns with optimal training samples. The retraining method in this system makes the models harder to compare. Nevertheless, our study demonstrates the steps for improving stock selection models through comparative studies.
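The portfolio-level evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy tickers, and the definition of the stock-level hit rate (the share of selected stocks whose return exceeds the benchmark return) are assumptions made only for illustration.

```python
from statistics import mean


def select_top_k(scores, k=25):
    """Return the k tickers with the highest predicted score.

    `scores` maps ticker -> model score (hypothetical input format).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def equal_weighted_return(tickers, realized):
    """Return of a portfolio that gives every selected stock the same weight."""
    return mean(realized[t] for t in tickers)


def stock_hit_rate(tickers, realized, benchmark_return):
    """Share of selected stocks that beat the benchmark (assumed definition)."""
    hits = sum(1 for t in tickers if realized[t] > benchmark_return)
    return hits / len(tickers)


if __name__ == "__main__":
    # Toy data, invented for the example; k=2 instead of 25 to keep it short.
    scores = {"A": 0.9, "B": 0.2, "C": 0.7, "D": 0.4}
    realized = {"A": 0.30, "B": -0.10, "C": 0.05, "D": 0.20}
    benchmark = 0.10

    picks = select_top_k(scores, k=2)
    print(picks)
    print(equal_weighted_return(picks, realized))
    print(stock_hit_rate(picks, realized, benchmark))
```

Under these assumptions, a portfolio counts as a hit at the portfolio level when its equal-weighted return exceeds the benchmark return for the same year.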
Appendix
Data Widgets in Orange
We used the following widgets in our workflows (see Figs. 6 and 14) for the data and file handling tasks:
- File widget to read data from external files into workflows.
- Select Columns widget to select input attributes, class attribute, and meta attributes from the available columns.
- Concatenate widget to vertically merge instances from multiple files for ML-III and ML-IV.
- Select Rows widget to exclude samples belonging to the target year after the concatenation of multiple training sets.
- Data Table widget to view the composed training and test sets in spreadsheet form after selecting columns and rows.
- Save Data widget to export the end results viewed in Data Table from Orange workflows to external files.
In our case, the external files were Excel worksheets or tab-separated text files. Figure 14 shows the workflows for constructing the training sets in Orange. Figures 15, 16, 17, and 18 illustrate the widgets reading, composing, and presenting the training and test sets of ML-I models for the target year 2010–2011.
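The Concatenate and Select Rows steps listed above, together with a moving training window, can be sketched in plain Python as follows. This is a rough illustration rather than the Orange workflow itself: the row format (a dictionary with a `year` key), the integer year labels, and the function names are hypothetical.

```python
def moving_windows(years, window):
    """Yield (training_years, target_year) pairs for a sliding window.

    Each target year is paired with the `window` years that precede it.
    """
    for i in range(window, len(years)):
        yield years[i - window:i], years[i]


def build_training_set(yearly_sets, training_years, target_year):
    """Merge several yearly sets and drop rows from the target year.

    `yearly_sets` maps a year label to a list of row dictionaries
    (assumed format; each row carries its own "year" value).
    """
    # Concatenate step: stack the rows of every selected yearly set.
    merged = [row for y in training_years for row in yearly_sets[y]]
    # Select Rows step: exclude any sample that belongs to the target year.
    return [row for row in merged if row["year"] != target_year]


if __name__ == "__main__":
    years = [2006, 2007, 2008, 2009, 2010]
    for train_years, target in moving_windows(years, window=3):
        print(train_years, "->", target)
```

With a window of three years, each target year is trained on the three preceding years, and the window slides forward by one year for every retraining.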
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Simon, S., Date, H. (2023). Modeling Logistic Regression and Neural Network for Stock Selection with BSE 500 – A Comparative Study. In: Misra, R., et al. Advances in Data Science and Artificial Intelligence. ICDSAI 2022. Springer Proceedings in Mathematics & Statistics, vol 403. Springer, Cham. https://doi.org/10.1007/978-3-031-16178-0_20
DOI: https://doi.org/10.1007/978-3-031-16178-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16177-3
Online ISBN: 978-3-031-16178-0
eBook Packages: Mathematics and Statistics (R0)