
A Study on Prediction Model of Equipment Failure Through Analysis of Big Data Based on RHadoop


With the development of the Internet of Things (IoT), which is now applied widely to industrial settings as well as everyday objects, the production of big data is accelerating. Providing intelligent services without human intervention in an IoT environment depends on intelligent communication between objects. Because the failure of mechanical equipment to which sensors are attached causes objects to malfunction and products to fail, big data analysis for predicting equipment failure is becoming increasingly important. The purpose of this study is to propose a model that predicts mechanical equipment failure from the various sensor data collected in a manufacturing process. The study built an RHadoop-based big data platform to distribute the large research datasets and applied logistic regression modeling to identify, among the collected variables, the main variables that cause failure. As a result, the main manufacturing-process variables that cause equipment failure were derived from the collected sensor data, and the goodness of fit and performance of the prediction model were evaluated using the ROC curve. In the performance evaluation, the model showed fairly high prediction accuracy, with an AUC close to 1. The results are expected to be applicable to predicting malfunctions, product failures, or abnormal communication between objects caused by miscellaneous product faults in everyday IoT environments.
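The modeling and evaluation steps named in the abstract (logistic regression on sensor variables, then ROC/AUC assessment) can be illustrated with a minimal sketch. It is not the study's implementation: the study worked in R on an RHadoop platform, whereas this example uses plain Python on a single machine. The data are synthetic, and the variable names (`temperature`, `vibration`, `noise`), the coefficients used to generate the labels, and the learning-rate settings are all assumptions made only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    # w[0] is the intercept; w[1:] are the per-variable coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (failure, non-failure) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n, seed):
    """Hypothetical sensor readings: failure depends on temperature and
    vibration; the third variable is pure noise (all values are made up)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        temperature = rng.gauss(0, 1)
        vibration = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        p_fail = sigmoid(2.5 * temperature + 1.5 * vibration - 0.5)
        X.append([temperature, vibration, noise])
        y.append(1 if rng.random() < p_fail else 0)
    return X, y


X_train, y_train = make_synthetic(400, seed=0)
X_test, y_test = make_synthetic(200, seed=1)

w = fit_logistic(X_train, y_train)
test_auc = auc([predict_proba(w, x) for x in X_test], y_test)

print("coefficients (intercept, temperature, vibration, noise):",
      [round(v, 2) for v in w])
print("held-out AUC:", round(test_auc, 3))
```

In this sketch the fitted coefficients play the role of the "main variables" analysis: variables that drive failure receive large coefficients, while the noise variable stays near zero. The AUC is computed on held-out data so that it reflects predictive performance rather than fit to the training set.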




Author information

Correspondence to Jin-Hee Ku.


Cite this article

Ku, J. A Study on Prediction Model of Equipment Failure Through Analysis of Big Data Based on RHadoop. Wireless Pers Commun 98, 3163–3176 (2018).



  • Big data
  • Internet of things
  • Machine learning
  • Prediction model
  • RHadoop