A Machine Learning Approach to Analyze and Reduce Features to a Significant Number for Employee’s Turn Over Prediction Model

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 857)

Abstract

Employee turnover is considered one of the major issues that every company faces; the loss is especially severe when the departing employee has advanced skills in his or her field. To identify the most dominant reasons for employee attrition, we determine relevant features and apply machine learning algorithms after the features have been processed and reduced. We propose a new model in which particular attributes related to employee turnover are selected and adjusted accordingly. In the first phase of our reduction method, the Sequential Backward Selection (SBS) algorithm reduces the features from a higher number to a relatively small significant number. In the second phase, the Chi2 and Random Forest importance algorithms are applied together, and the features that both algorithms rank as important are taken as the foremost causes of employee turnover. This two-step feature selection technique shows that mainly three features are responsible for an employee's departure. These selected minimal features are then tested with state-of-the-art machine learning algorithms: Decision Tree, Random Forest, Support Vector Machine (SVM), Multi-layer Perceptron (MLP), k-Nearest Neighbor (kNN) and Gaussian Naïve Bayes. Finally, the test results are visualized in a 3D representation to show precisely which features are involved in employee turnover.
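The two-phase reduction described above can be sketched with scikit-learn stand-ins: `SequentialFeatureSelector` with `direction="backward"` for SBS, then `SelectKBest(chi2)` scores intersected with Random Forest `feature_importances_`. This is a minimal illustration only; the feature counts, hyperparameters, and synthetic data below are assumptions, not the paper's actual dataset or settings.

```python
# Hedged sketch of the paper's two-phase feature reduction.
# Synthetic data stands in for the HR dataset; all sizes are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, chi2
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=300, n_features=9, n_informative=4,
                           random_state=0)
X = MinMaxScaler().fit_transform(X)  # chi2 requires non-negative inputs

# Phase 1: Sequential Backward Selection, 9 features down to 6.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
sbs = SequentialFeatureSelector(rf, n_features_to_select=6,
                                direction="backward", cv=3).fit(X, y)
phase1 = np.flatnonzero(sbs.get_support())

# Phase 2: keep only features ranked in the top 3 by BOTH chi2
# scores and Random Forest importances.
Xr = X[:, phase1]
chi_top = np.argsort(SelectKBest(chi2, k="all").fit(Xr, y).scores_)[::-1][:3]
rf_top = np.argsort(rf.fit(Xr, y).feature_importances_)[::-1][:3]
common = phase1[np.intersect1d(chi_top, rf_top)]
print("features selected by both rankings:", common)
```

The intersection step mirrors the paper's idea that a feature counts as "foremost" only when both ranking methods agree on it.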

Keywords

  • Component
  • Machine learning
  • SBS
  • Chi2
  • Predictive model
  • SVM
  • Decision tree
  • Random forest
  • MLP
  • Naïve bayes
  • kNN
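The final evaluation step, testing the reduced feature set with the six named classifiers, might look as follows. This is a sketch under assumed defaults (synthetic 3-feature data, default hyperparameters except where noted), not the paper's actual experiment.

```python
# Hedged sketch of benchmarking the six classifiers from the abstract
# on a reduced 3-feature dataset (synthetic stand-in data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM (RBF)": SVC(kernel="rbf"),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "Gaussian NB": GaussianNB(),
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
          for name, m in models.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

On real data, the same held-out accuracy comparison would indicate which classifier best predicts turnover from the minimal feature set.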



Author information

Correspondence to Mirza Mohtashim Alam.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Alam, M.M., Mohiuddin, K., Islam, M.K., Hassan, M., Hoque, M.AU., Allayear, S.M. (2019). A Machine Learning Approach to Analyze and Reduce Features to a Significant Number for Employee’s Turn Over Prediction Model. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 857. Springer, Cham. https://doi.org/10.1007/978-3-030-01177-2_11