GSEL: A Genetic Stacking-Based Ensemble Learning Approach for Incident Classification

  • Conference paper
Proceedings of ICETIT 2019

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 605)

Abstract

Large numbers of incident narratives are collected across industries worldwide. These narratives are usually written manually and may therefore contain a significant number of errors. Because of the size and complexity of such narrative data, advanced text mining and natural language processing (NLP) techniques are required to extract useful information from them. Automatic document classification is one such important NLP task. To reduce the dimensionality problem in the data, this study proposes a genetic stacking-based ensemble learning (GSEL) method that uses the Skip-gram model and the Doc2vec framework. Five classifiers, namely logistic regression, random forest, k-nearest neighbor, multi-layer perceptron (MLP), and support vector machine, are used, and their outputs are ensembled to improve prediction accuracy. A real-coded genetic algorithm (GA) is used to tune the parameters of the ensemble method. Results reveal that the proposed approach can handle a large amount of text data and predicts with higher accuracy than other state-of-the-art algorithms.
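The idea of tuning an ensemble with a real-coded GA can be illustrated with a minimal sketch. The abstract does not state which ensemble parameters the GA tunes, nor the GA operators or settings, so everything below is an assumption made for illustration: the chromosome is a vector of five real-valued combination weights (one per base classifier), the combiner is a weighted average of base-classifier probabilities standing in for a stacking meta-learner, and the base-classifier outputs are synthetic. The names `make_toy_data`, `ensemble_accuracy`, and `real_coded_ga` are hypothetical and are not taken from the paper.

```python
import random


def make_toy_data(n=200, noise=(0.2, 0.3, 0.4, 0.45, 0.6), seed=1):
    """Synthetic stand-in for the outputs of five base classifiers.

    Each "classifier" reports the true binary label plus Gaussian noise,
    clipped to [0, 1]. This is NOT the paper's data or feature pipeline.
    """
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    base_probs = [
        [min(1.0, max(0.0, y + rng.gauss(0, s))) for s in noise]
        for y in labels
    ]
    return base_probs, labels


def ensemble_accuracy(weights, base_probs, labels):
    """Accuracy of a weighted average of base-classifier probabilities."""
    total = sum(weights) or 1e-12  # guard against an all-zero weight vector
    correct = 0
    for probs, y in zip(base_probs, labels):
        score = sum(w * p for w, p in zip(weights, probs)) / total
        correct += int((score >= 0.5) == bool(y))
    return correct / len(labels)


def real_coded_ga(fitness, n_genes, pop_size=30, generations=40,
                  mutation_rate=0.2, mutation_scale=0.1, seed=0):
    """Maximise `fitness` over real-valued chromosomes in [0, 1]^n_genes."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Binary tournament selection: the fitter of two random individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best individual forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            alpha = rng.random()
            # Arithmetic crossover: a convex combination of the two parents.
            child = [alpha * g1 + (1 - alpha) * g2 for g1, g2 in zip(p1, p2)]
            # Gaussian mutation applied gene by gene, clipped to [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, mutation_scale)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


base_probs, labels = make_toy_data()
fitness = lambda w: ensemble_accuracy(w, base_probs, labels)
best_weights = real_coded_ga(fitness, n_genes=5)
print("weights:", [round(w, 3) for w in best_weights])
print("ensemble accuracy:", fitness(best_weights))
```

In a real setting the fitness would be measured on held-out validation data rather than on the data used to fit the base classifiers, and the chromosome could just as well encode meta-learner hyperparameters instead of combination weights.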

Author information

Corresponding author

Correspondence to Sobhan Sarkar.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Sarkar, S., Pramanik, A., Khatedi, N., Balu, A.S.M., Maiti, J. (2020). GSEL: A Genetic Stacking-Based Ensemble Learning Approach for Incident Classification. In: Singh, P., Panigrahi, B., Suryadevara, N., Sharma, S., Singh, A. (eds) Proceedings of ICETIT 2019. Lecture Notes in Electrical Engineering, vol 605. Springer, Cham. https://doi.org/10.1007/978-3-030-30577-2_64
