Abstract
In this paper, we describe our solution to the PAKDD Cup 2020 Alibaba intelligent operation and maintenance algorithm competition. The biggest challenge of this competition is how to model the problem. To make maximal use of the data and speed up model training, we recast it as a regression problem. By combining GBDT [5] algorithms such as XGBoost [1], LightGBM [2], and CatBoost [3, 4] with deep feature engineering, and by applying greedy methods to postprocess the models’ predictions, our method ranked first in the final standings with an F1-score of 49.0683. The corresponding precision and recall are 62.2047 and 40.5128, respectively.
The three authors contributed equally to this work.
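As a quick consistency check on the reported metrics, the following minimal sketch recomputes the F1-score from the stated precision and recall. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the helper name `f1_from_precision_recall` is hypothetical and not taken from the authors' code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition).

    The result is on the same scale as the inputs (here, percentages).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, on a 0-100 scale.
precision = 62.2047
recall = 40.5128

f1 = f1_from_precision_recall(precision, recall)
print(f"F1 = {f1:.4f}")
```

Under this assumption, the recomputed value agrees with the reported F1-score of 49.0683 to four decimal places.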
References
1. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)
2. Ke, G., et al.: LightGBM: a highly efficient gradient boosting decision tree. In: Advances in Neural Information Processing Systems, pp. 3146–3154 (2017)
3. Dorogush, A.V., Gulin, A., Gusev, G., Kazeev, N., Prokhorenkova, L.O., Vorobev, A.: Fighting biases with dynamic boosting (2017). arXiv:1706.09516
4. Dorogush, A.V., Ershov, V., Gulin, A.: CatBoost: gradient boosting with categorical features support. In: Workshop on ML Systems at NIPS 2017 (2017)
5. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001)
6. Disk fault prediction data set. https://github.com/alibaba-edu/dcbrain/tree/master/diskdata
Acknowledgement
Thanks to Tianchi, Alibaba and PAKDD for hosting, creating and supporting this competition.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, J., Sun, Z., Lu, J. (2020). First Place Solution of PAKDD Cup 2020. In: He, C., Feng, M., Lee, P., Wang, P., Han, S., Liu, Y. (eds) Large-Scale Disk Failure Prediction. AI Ops 2020. Communications in Computer and Information Science, vol 1261. Springer, Singapore. https://doi.org/10.1007/978-981-15-7749-9_4
DOI: https://doi.org/10.1007/978-981-15-7749-9_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7748-2
Online ISBN: 978-981-15-7749-9
eBook Packages: Computer Science, Computer Science (R0)