GSEL: A Genetic Stacking-Based Ensemble Learning Approach for Incident Classification

Sarkar, Sobhan; Pramanik, Anima; Khatedi, Nikhil; Balu, A. S. M.; Maiti, J.

doi:10.1007/978-3-030-30577-2_64

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 605)

Abstract

A large number of incident narratives are collected across industries around the world. These narratives are usually written manually and may therefore contain a significant number of errors. Because of the size and complexity of such narrative data, advanced text mining and natural language processing (NLP) techniques are required to extract useful information. Automatic document classification is one such important NLP task. To mitigate the high dimensionality of the data, this study proposes a genetic stacking-based ensemble learning (GSEL) method built on the Skip-gram model and the Doc2vec framework. Five classifiers, namely logistic regression, random forest, k-nearest neighbor, multi-layer perceptron (MLP), and support vector machine, are used, and their outputs are ensembled to improve prediction accuracy. A real-coded genetic algorithm (GA) is used to tune the parameters of the ensemble method. Results indicate that the proposed approach can handle a large volume of text data and predicts with higher accuracy than the other state-of-the-art algorithms considered.
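
The abstract does not say which ensemble parameters the real-coded GA tunes, so the following is only a minimal sketch under stated assumptions. It assumes the tuned parameters are non-negative weights used to average the positive-class probabilities of the five base classifiers. The probabilities, labels, function names, and GA settings (tournament selection, arithmetic crossover, Gaussian mutation, elitism) are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical validation data: each row holds the positive-class probability
# from five base classifiers (LR, RF, k-NN, MLP, SVM) for one narrative.
# All numbers are made up for illustration only.
BASE_PROBS = [
    [0.90, 0.80, 0.60, 0.70, 0.40],
    [0.20, 0.30, 0.55, 0.10, 0.45],
    [0.60, 0.40, 0.30, 0.80, 0.70],
    [0.45, 0.35, 0.60, 0.20, 0.30],
    [0.55, 0.70, 0.40, 0.65, 0.35],
    [0.30, 0.60, 0.65, 0.25, 0.40],
    [0.75, 0.45, 0.35, 0.85, 0.60],
    [0.40, 0.55, 0.70, 0.30, 0.20],
    [0.65, 0.50, 0.45, 0.60, 0.55],
    [0.35, 0.45, 0.50, 0.40, 0.60],
]
LABELS = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]


def normalise(weights):
    """Clip weights at zero and rescale them so they sum to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]


def accuracy(weights, probs=BASE_PROBS, labels=LABELS):
    """Fitness: accuracy of the weighted average of base probabilities."""
    w = normalise(weights)
    correct = 0
    for row, y in zip(probs, labels):
        score = sum(wi * p for wi, p in zip(w, row))
        correct += int((score >= 0.5) == bool(y))
    return correct / len(labels)


def tournament(population, fitness, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]


def real_coded_ga(n_genes, pop_size=20, generations=30, mut_sigma=0.1, seed=0):
    """Search real-valued weight vectors that maximise accuracy()."""
    rng = random.Random(seed)
    # Chromosomes are lists of floats. The first one is the uniform weighting,
    # so with elitism the result is never worse than a plain average.
    population = [[1.0 / n_genes] * n_genes]
    population += [[rng.random() for _ in range(n_genes)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        fitness = [accuracy(ind) for ind in population]
        elite = population[max(range(pop_size), key=lambda i: fitness[i])]
        children = [elite[:]]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            a = tournament(population, fitness, rng)
            b = tournament(population, fitness, rng)
            alpha = rng.random()
            # Arithmetic crossover followed by Gaussian mutation on every gene.
            child = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
            child = [g + rng.gauss(0.0, mut_sigma) for g in child]
            children.append(child)
        population = children
    fitness = [accuracy(ind) for ind in population]
    best = population[max(range(pop_size), key=lambda i: fitness[i])]
    return normalise(best), max(fitness)


best_weights, best_acc = real_coded_ga(n_genes=5)
uniform_acc = accuracy([0.2] * 5)
print("uniform accuracy:", uniform_acc)
print("GA-tuned accuracy:", best_acc)
print("weights:", [round(w, 3) for w in best_weights])
```

In a fuller pipeline the rows of `BASE_PROBS` would be out-of-fold predictions of classifiers trained on Doc2vec document vectors, and the fitness would be measured on data held out from the weight search to avoid an optimistic estimate.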

Author information

Authors and Affiliations

Department of Industrial and Systems Engineering, IIT Kharagpur, Kharagpur, India
Sobhan Sarkar, Anima Pramanik & J. Maiti
Department of Mechanical Engineering, IIT Kharagpur, Kharagpur, India
Nikhil Khatedi & A. S. M. Balu

Corresponding author

Correspondence to Sobhan Sarkar.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Jaypee University of Information Technology, Kandaghat, India
Pradeep Kumar Singh
Department of Electrical Engineering, IIT Delhi, Delhi, Delhi, India
Bijaya Ketan Panigrahi
School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Telangana, India
Nagender Kumar Suryadevara
Institute of Information Technology and Management, Delhi, Delhi, India
Sudhir Kumar Sharma
USICT, GGSIPU, Delhi, Delhi, India
Amit Prakash Singh

About this paper

Cite this paper

Sarkar, S., Pramanik, A., Khatedi, N., Balu, A.S.M., Maiti, J. (2020). GSEL: A Genetic Stacking-Based Ensemble Learning Approach for Incident Classification. In: Singh, P., Panigrahi, B., Suryadevara, N., Sharma, S., Singh, A. (eds) Proceedings of ICETIT 2019. Lecture Notes in Electrical Engineering, vol 605. Springer, Cham. https://doi.org/10.1007/978-3-030-30577-2_64

DOI: https://doi.org/10.1007/978-3-030-30577-2_64
Published: 24 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30576-5
Online ISBN: 978-3-030-30577-2
eBook Packages: Intelligent Technologies and Robotics (R0)
