Abstract
Hard drive failure prediction is a vital part of operations and maintenance. With the rapid growth of data-driven artificial intelligence algorithms, a growing body of recent research applies them to this problem, and their effectiveness has been demonstrated in a large number of data experiments. Nevertheless, achieving high prediction accuracy remains challenging when the samples are extremely imbalanced, particularly in big data settings. Rather than merely applying a single well-defined LightGBM (LGB) model, this study develops a novel ensemble learning strategy, i.e. a voting-based model, to improve prediction accuracy and reliability. The experimental results show that the voting-based model improves scores over the single LGB model. Additionally, a new type of feature, namely the distance in days to important dates, proved effective for improving overall accuracy.
Supported by Alibaba and Shanghai East Low Carbon Technology.
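The two ideas named in the abstract, a voting-based ensemble and a "day distance to important dates" feature, can be illustrated with a minimal sketch. The abstract does not specify the voting rule, the number of models, or which dates count as important, so everything below is an assumption made for illustration: the sketch uses hard majority voting over binary predictions, and `IMPORTANT_DATES`, `days_to_nearest`, and `majority_vote` are hypothetical names, not the authors' code.

```python
from datetime import date

# Hypothetical anchor dates; the abstract does not say which dates were used.
IMPORTANT_DATES = [date(2018, 1, 1), date(2018, 7, 1)]


def days_to_nearest(day, anchors=IMPORTANT_DATES):
    """Absolute distance in days from `day` to the closest anchor date."""
    return min(abs((day - anchor).days) for anchor in anchors)


def majority_vote(predictions):
    """Hard majority vote over binary (0/1) predictions.

    `predictions` holds one list per model, all of equal length.
    A sample is flagged as failing only if more than half of the
    models flag it (an assumed rule, not the paper's stated one).
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Outputs of three hypothetical models on four disks (1 = predicted failure).
model_outputs = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
combined = majority_vote(model_outputs)
feature = days_to_nearest(date(2018, 6, 29))
```

In practice each row of `model_outputs` would come from a separately trained classifier, for example LGB models trained with different seeds or data subsets. Requiring agreement among several models is one common way to reduce false alarms on heavily imbalanced data.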
Acknowledgements
This study and its experimental resources were strongly supported by Shanghai East Low Carbon Technology Industry Co., Ltd. and Beijing Megvii Co., Ltd. Thanks to Alibaba and PAKDD for hosting and supporting this competition.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, M., Li, J., Yuan, J. (2020). A Voting-Based Robust Model for Disk Failure Prediction. In: He, C., Feng, M., Lee, P., Wang, P., Han, S., Liu, Y. (eds) Large-Scale Disk Failure Prediction. AI Ops 2020. Communications in Computer and Information Science, vol 1261. Springer, Singapore. https://doi.org/10.1007/978-981-15-7749-9_3
DOI: https://doi.org/10.1007/978-981-15-7749-9_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7748-2
Online ISBN: 978-981-15-7749-9
eBook Packages: Computer Science, Computer Science (R0)