A Voting-Based Robust Model for Disk Failure Prediction

  • Conference paper
Large-Scale Disk Failure Prediction (AI Ops 2020)

Abstract

Hard drive failure prediction is a vital part of operations and maintenance. With the rapid growth of data-driven artificial intelligence algorithms, a growing body of recent research applies them to this problem, and their effectiveness has been demonstrated in a large number of data experiments. Nevertheless, achieving high prediction accuracy remains challenging when the samples are extremely imbalanced, particularly at big-data scale. Rather than merely applying a single well-defined LightGBM (LGB) model, this study develops a novel ensemble learning strategy, i.e. a voting-based model, to improve prediction accuracy and reliability. The experimental results show that the voting-based model improves scores over the single LGB model. Additionally, a new type of feature, namely the day distance to important dates, proved effective in improving overall accuracy.
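
The two ideas named in the abstract, voting across several models and a day-distance feature, can be illustrated with a minimal sketch. The abstract does not specify the voting rule, which dates count as important, or how the distance is measured, so everything below is an assumption made for illustration: the function names `majority_vote` and `days_to_nearest`, the strict-majority threshold, the absolute-distance definition, and the toy dates and predictions are all hypothetical and are not taken from the paper.

```python
from datetime import date


def majority_vote(predictions_per_model, min_votes=None):
    """Combine binary predictions (0 = healthy, 1 = failing) from several models.

    Assumption: a disk is flagged when at least `min_votes` models flag it;
    by default this is a strict majority of the models.
    """
    n_models = len(predictions_per_model)
    if min_votes is None:
        min_votes = n_models // 2 + 1
    # zip(*...) groups the votes that all models cast for the same sample.
    return [int(sum(votes) >= min_votes) for votes in zip(*predictions_per_model)]


def days_to_nearest(day, important_dates):
    """Absolute distance in days from `day` to the closest important date.

    Assumption: the feature is an unsigned distance to the nearest date.
    """
    return min(abs((day - d).days) for d in important_dates)


# Hypothetical outputs of three models on four disks.
model_outputs = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
voted = majority_vote(model_outputs)

# Hypothetical "important dates"; the paper's actual dates are not given here.
distance = days_to_nearest(date(2018, 7, 10), [date(2018, 7, 1), date(2018, 8, 1)])
```

With extremely imbalanced samples, raising `min_votes` trades recall for precision, which is one way a voting rule can be tuned; whether the authors did so is not stated in the abstract.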

Supported by Alibaba and Shanghai East Low Carbon Technology.

Acknowledgements

This study and its experimental resources were strongly supported by Shanghai East Low Carbon Technology Industry Co., Ltd. and Beijing Megvii Co., Ltd. Thanks to Alibaba and PAKDD for hosting and supporting this competition.

Author information

Corresponding author

Correspondence to Manjie Li.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, M., Li, J., Yuan, J. (2020). A Voting-Based Robust Model for Disk Failure Prediction. In: He, C., Feng, M., Lee, P., Wang, P., Han, S., Liu, Y. (eds) Large-Scale Disk Failure Prediction. AI Ops 2020. Communications in Computer and Information Science, vol 1261. Springer, Singapore. https://doi.org/10.1007/978-981-15-7749-9_3

  • DOI: https://doi.org/10.1007/978-981-15-7749-9_3

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-7748-2

  • Online ISBN: 978-981-15-7749-9

  • eBook Packages: Computer Science, Computer Science (R0)
