Abstract
Many data-driven decisions in manufacturing require accurate and reliable predictions. Because working conditions are highly complex and variable, a prediction model may deteriorate over time after deployment. Traditional performance evaluation indexes assess the prediction model mainly from a static perspective, which makes it difficult to meet the practical needs of model selection and proactive maintenance and results in unstable online prediction performance. For regression-based prediction models, this paper designs online prediction performance evaluation indexes (OPPEI) that evaluate a prediction model in terms of its accuracy, degradation speed, and stability. For proactive maintenance, this paper proposes a model maintenance evaluation method based on Principal Component Analysis (PCA). We use PCA to transform various performance indexes and extract the first principal component as a model maintenance evaluation index, which can reduce the over-sensitivity or insensitivity of any single indicator. The effectiveness of the online prediction performance evaluation indexes and the PCA-based proactive maintenance evaluation method is verified by simulation and several real-world load forecasting experiments.
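The idea of collapsing several performance indexes into a first principal component can be illustrated with a minimal sketch. It is not the paper's OPPEI definition: the per-window indexes chosen here (MAE, RMSE, maximum absolute error), the window length, the synthetic drifting data, and the function names `rolling_indexes` and `first_pc_index` are all assumptions made for illustration.

```python
import numpy as np

def rolling_indexes(y_true, y_pred, window):
    """Per-window MAE, RMSE and max absolute error (illustrative index choice)."""
    err = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
    n = len(err) // window                      # number of complete windows
    blocks = err[: n * window].reshape(n, window)
    return np.column_stack([
        blocks.mean(axis=1),                    # MAE
        np.sqrt((blocks ** 2).mean(axis=1)),    # RMSE
        blocks.max(axis=1),                     # max absolute error
    ])

def first_pc_index(indexes):
    """Standardize the indexes and project them onto the first principal component."""
    z = (indexes - indexes.mean(axis=0)) / indexes.std(axis=0)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    w = vt[0]
    if w.sum() < 0:                             # fix sign so larger errors give larger scores
        w = -w
    explained = s[0] ** 2 / (s ** 2).sum()      # share of variance captured by PC1
    return z @ w, explained

# Synthetic example (assumed data): prediction noise grows over time,
# mimicking a model that degrades after deployment.
rng = np.random.default_rng(0)
t = np.arange(600)
y_true = np.sin(2 * np.pi * t / 50)
y_pred = y_true + rng.normal(0.0, 0.1 + 0.002 * t)

indexes = rolling_indexes(y_true, y_pred, window=50)
score, explained = first_pc_index(indexes)
print(np.round(score, 2))
print(f"variance explained by PC1: {explained:.2f}")
```

In this sketch a rising score suggests that the monitored model is drifting away from its earlier behaviour; where to place a maintenance threshold on that score is left open here.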
Data availability
The datasets generated and analysed during the current study are not publicly available because they are part of research in progress, but they are available from the corresponding author on reasonable request.
Funding
The research leading to these results received funding from Nanjing University under Grant Agreement No 2022300018.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
We confirm that our manuscript complies with the Ethical Rules of this journal. The submitted work is original and has not been published elsewhere in any form or language.
About this article
Cite this article
Shen, Y., Wang, T. & Song, Z. Online performance and proactive maintenance assessment of data driven prediction models. J Intell Manuf (2024). https://doi.org/10.1007/s10845-024-02357-8
DOI: https://doi.org/10.1007/s10845-024-02357-8