Modeling Logistic Regression and Neural Network for Stock Selection with BSE 500 – A Comparative Study

Simon, Selvan; Date, Hema

doi:10.1007/978-3-031-16178-0_20

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 403)

Included in the following conference series:

XVIII International Conference on Data Science and Intelligent Analysis of Information

Abstract

In this study, we developed logistic regression and neural network models for selecting stocks of Indian companies listed on the Bombay Stock Exchange, using the BSE 500 index. The models predicted the twenty-five BSE 500 stocks that were more likely to yield higher returns in the subsequent year. We included these stocks in equally weighted portfolios with the aim of beating the market. We evaluated the models on a portfolio basis, using the market as the benchmark, and compared them using profitability-based performance measures. We optimized the models by finding the optimal training period and then compared the optimally trained models. The results showed that the proposed models enhance stock selection for long-term investment. The logistic regression model yielded higher overall returns and higher hit rates at both the portfolio level and the stock level. However, the neural network model had lower variance in returns. Thus, there are some indications that neural networks may yield steadier returns and more resilient models for noisy financial data. We used a moving window system to achieve higher returns with optimal training samples. The retraining method in this system makes it challenging to compare the models. Nevertheless, our study demonstrates steps for improving stock selection models through comparative studies.
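The portfolio-level evaluation described in the abstract can be illustrated with a minimal sketch. It assumes that each model outputs a score per stock, that the top-scoring stocks form an equally weighted portfolio, and that a stock-level "hit" means a selected stock's realized return exceeded the market return. That hit definition, the function names, and the toy data are assumptions made for illustration and are not taken from the chapter.

```python
from statistics import mean


def select_top_n(scores, n=25):
    """Return the n tickers with the highest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def evaluate_portfolio(selected, realized_returns, market_return):
    """Compare an equally weighted portfolio with the market benchmark."""
    returns = [realized_returns[ticker] for ticker in selected]
    # Equal weights reduce the portfolio return to a simple average.
    portfolio_return = mean(returns)
    # Assumed hit definition: a selected stock beat the market return.
    hits = sum(1 for r in returns if r > market_return)
    return {
        "portfolio_return": portfolio_return,
        "excess_return": portfolio_return - market_return,
        "stock_hit_rate": hits / len(returns),
    }


# Hypothetical toy data: four stocks, top two selected.
scores = {"A": 0.9, "B": 0.2, "C": 0.7, "D": 0.4}
realized = {"A": 0.30, "B": -0.10, "C": 0.05, "D": 0.20}

picks = select_top_n(scores, n=2)
result = evaluate_portfolio(picks, realized, market_return=0.10)
print(picks, result)
```

With the study's setup, `n` would be 25 and the scores would come from the fitted logistic regression or neural network model for the target year.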

Author information

Authors and Affiliations

National Institute of Industrial Engineering, Mumbai, India
Selvan Simon & Hema Date

Editor information

Editors and Affiliations

Department of Computer Science & Engineering, Indian Institute of Technology Patna, Patna, Bihar, India
Rajiv Misra
Department of Computer Science, Central University of Rajasthan, Ajmer, Rajasthan, India
Nishtha Kesswani
Department of EE Engineering, University of London, London, UK
Muttukrishnan Rajarajan
Department of ECE, National University of Singapore, Singapore, Singapore
Bharadwaj Veeravalli
EMLYON Business School, Écully, France
Imene Brigui
Department of Computer Science, Florida Polytechnic University, Lakeland, FL, USA
Ashok Patel
Director, Indian Institute of Technology Patna, Bihar, India
T. N. Singh

Appendix

Data Widgets in Orange

We used the following widgets in our workflows (see Figs. 6 and 14) for the data and file handling tasks:

File widget to read data from external files into workflows.
Select Columns widget to select input attributes, class attribute, and meta attributes from the available columns.
Concatenate widget to vertically merge instances from multiple files for ML-III and ML-IV.
Select Rows widget to exclude samples that belong to the target year after the concatenation of multiple training sets.
Data Table widget to view the composed training and test sets in spreadsheet form after selecting columns and rows.
Save Data widget to export the end results viewed in Data Table from Orange workflows to external files.

In our case, the external files were available as Excel worksheets or tab-separated text files. Figure 14 shows the workflows for constructing the training sets in Orange. Figures 15, 16, 17, and 18 illustrate the widgets that read, compose, and present the training and test sets of the ML-I models for the target year 2010–2011.
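The Concatenate and Select Rows steps listed above can be sketched outside Orange in plain Python. This is only an illustration under stated assumptions: the column names (`ticker`, `year`, `pe_ratio`, `label`), the year labels, and the in-memory files are hypothetical, and the sketch does not reproduce the Orange widgets or the chapter's actual data.

```python
import csv
import io


def read_tab_file(handle):
    """Read a tab-separated file with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def build_training_set(tables, target_year, year_column="year"):
    """Stack the tables vertically, then drop rows from the target year."""
    # Vertical merge of instances, analogous to the Concatenate step.
    stacked = [row for table in tables for row in table]
    # Row filter, analogous to the Select Rows step.
    return [row for row in stacked if row[year_column] != target_year]


# Hypothetical in-memory stand-ins for two tab-separated input files.
file_a = io.StringIO(
    "ticker\tyear\tpe_ratio\tlabel\n"
    "A\t2008-2009\t12.5\t1\n"
    "B\t2009-2010\t30.1\t0\n"
)
file_b = io.StringIO(
    "ticker\tyear\tpe_ratio\tlabel\n"
    "C\t2009-2010\t8.4\t1\n"
    "D\t2010-2011\t15.0\t0\n"
)

tables = [read_tab_file(file_a), read_tab_file(file_b)]
training = build_training_set(tables, target_year="2010-2011")
print([row["ticker"] for row in training])
```

Filtering after the merge keeps target-year samples out of the training set, so the model is never fitted on the year it is asked to predict.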

Cite this paper

Simon, S., Date, H. (2023). Modeling Logistic Regression and Neural Network for Stock Selection with BSE 500 – A Comparative Study. In: Misra, R., et al. Advances in Data Science and Artificial Intelligence. ICDSAI 2022. Springer Proceedings in Mathematics & Statistics, vol 403. Springer, Cham. https://doi.org/10.1007/978-3-031-16178-0_20

DOI: https://doi.org/10.1007/978-3-031-16178-0_20
Published: 14 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16177-3
Online ISBN: 978-3-031-16178-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
