Abstract
This paper describes our submission to the PAKDD 2020 Alibaba AIOps Competition: Large-Scale Disk Failure Prediction. Our approach is based on a LightGBM classifier trained with a focal loss objective function. The method ranked third in the final season of the competition with an F1-score of 0.4047; the winning F1-score was 0.4903.
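The abstract names a focal loss objective without stating its form. As a minimal sketch, assuming the standard binary focal loss applied to the raw (pre-sigmoid) score, the per-sample gradient and Hessian that a gradient-boosting custom objective needs could be computed as follows. The function names and the values alpha=0.25 and gamma=2.0 are illustrative assumptions, not the settings used in the submission.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping raw scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def _pt_and_alpha(z, y, alpha):
    """Return p_t (probability of the true class), alpha_t, and the sign of dp_t/dz."""
    p = sigmoid(z)
    p_t = np.where(y == 1, p, 1.0 - p)
    p_t = np.clip(p_t, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    a_t = np.where(y == 1, alpha, 1.0 - alpha)
    sign = np.where(y == 1, 1.0, -1.0)
    return p_t, a_t, sign


def focal_loss(z, y, alpha=0.25, gamma=2.0):
    """Per-sample focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    p_t, a_t, _ = _pt_and_alpha(z, y, alpha)
    return -a_t * (1.0 - p_t) ** gamma * np.log(p_t)


def focal_grad_hess(z, y, alpha=0.25, gamma=2.0):
    """First and second derivatives of the focal loss w.r.t. the raw score z."""
    p_t, a_t, sign = _pt_and_alpha(z, y, alpha)
    log_pt = np.log(p_t)
    # u collects the terms left after applying the chain rule through the sigmoid.
    u = gamma * p_t * log_pt + p_t - 1.0
    grad = sign * a_t * (1.0 - p_t) ** gamma * u
    hess = a_t * p_t * (1.0 - p_t) ** gamma * (
        -gamma * u + (1.0 - p_t) * (gamma * log_pt + gamma + 1.0)
    )
    return grad, hess


if __name__ == "__main__":
    scores = np.array([-2.0, -0.5, 0.3, 1.7])  # hypothetical raw scores
    labels = np.array([0, 1, 1, 0])            # hypothetical labels (1 = failed disk)
    grad, hess = focal_grad_hess(scores, labels)
    print("grad:", grad)
    print("hess:", hess)
```

With gamma=0 and alpha=0.5 the expressions reduce to half of the ordinary logistic-loss gradient and Hessian, which gives a quick sanity check. LightGBM accepts custom objectives that return these two arrays; how the callable is registered depends on the installed LightGBM version, so that wiring is left out here.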
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhou, B. (2020). PAKDD 2020 Alibaba AIOps Competition - Large-Scale Disk Failure Prediction: Third Place Team. In: He, C., Feng, M., Lee, P., Wang, P., Han, S., Liu, Y. (eds) Large-Scale Disk Failure Prediction. AI Ops 2020. Communications in Computer and Information Science, vol 1261. Springer, Singapore. https://doi.org/10.1007/978-981-15-7749-9_2
DOI: https://doi.org/10.1007/978-981-15-7749-9_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7748-2
Online ISBN: 978-981-15-7749-9
eBook Packages: Computer Science (R0)