Summary
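Objective
Models deployed in modern intelligent transportation systems may produce unreliable inference results when the train operation data they receive deviate from the data characteristics the system was built for. This paper studies how changes in the data used by a deployed model affect its performance. By investigating ways to measure model trustworthiness, it aims to detect the trustworthiness of deployed models in real time, in real-world settings, without labeled data.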
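Innovations
1. We propose the concept of complex integrity constraints, which measure how unsafe the data used by a model are when no labeled data are available. 2. To detect model trustworthiness in real time in modern intelligent transportation systems, we design a novel algorithm that uses bit-vector indexing and a rule inference system to quickly discover the complex integrity constraints of the data a model is applied to.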
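Methods
1. Given the training data of a model deployed in a modern intelligent transportation system, the system builds index vectors over the input data, which avoids repeated passes over the large-scale data. 2. Using a rule inference system and support-based pruning, semantically redundant constraints and meaningless constraints are discarded, leaving the valid complex integrity constraints. 3. The integrity constraints are used to compute the proportion of records in a dataset that violate them, which measures how unsafe the data used by the model are. 4. Tests on a real train operation dataset analyze the relationship between the data unsafety measured by the complex integrity constraints and model performance, verifying the feasibility and effectiveness of complex integrity constraints.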
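Steps 1 to 3 above can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the summary does not define the form of a complex integrity constraint, so the sketch assumes simple implication rules between predicates ("every record satisfying p also satisfies q"). The field names, predicates, sample records, and the `min_support` threshold are all hypothetical. Each predicate gets one bit vector (a Python integer) built in a single pass, rules with low-support antecedents are pruned, rules derivable by transitivity are dropped, and the unsafety degree is the fraction of records violating at least one remaining rule.

```python
from itertools import permutations

# Hypothetical predicates over hypothetical train-operation fields.
PREDICATES = {
    "moving": lambda r: r["speed"] > 0,
    "fast": lambda r: r["speed"] > 100,
    "door_closed": lambda r: not r["door_open"],
    "overspeed": lambda r: r["speed"] > 350,
}

def build_index(rows, predicates):
    """Single pass: bit i of index[name] is set when row i satisfies the predicate."""
    index = {name: 0 for name in predicates}
    for i, row in enumerate(rows):
        for name, pred in predicates.items():
            if pred(row):
                index[name] |= 1 << i
    return index

def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def discover(index, n_rows, min_support):
    """Find rules p -> q that hold on every row, using only the bit vectors."""
    rules = set()
    for p, q in permutations(index, 2):
        # Support pruning: skip antecedents that are rare (their rules are vacuous).
        if popcount(index[p]) / n_rows < min_support:
            continue
        # p -> q holds when no row satisfies p without also satisfying q.
        if index[p] & ~index[q] == 0:
            rules.add((p, q))
    return rules

def implied(rules, p, q):
    """True if q can be reached from p by chaining rules (transitivity)."""
    seen, stack = set(), [p]
    while stack:
        cur = stack.pop()
        for a, b in rules:
            if a == cur and b not in seen:
                if b == q:
                    return True
                seen.add(b)
                stack.append(b)
    return False

def prune_redundant(rules):
    """Drop each rule that the remaining rules already imply."""
    kept = set(rules)
    for rule in sorted(rules):
        if implied(kept - {rule}, *rule):
            kept.remove(rule)
    return kept

def unsafety(rows, predicates, rules):
    """Fraction of rows that violate at least one rule."""
    if not rows:
        return 0.0
    index = build_index(rows, predicates)
    violating = 0
    for p, q in rules:
        violating |= index[p] & ~index[q]
    return popcount(violating) / len(rows)

# Made-up training records (the data the deployed model was trained on).
train = [
    {"speed": 0, "door_open": True},
    {"speed": 0, "door_open": False},
    {"speed": 80, "door_open": False},
    {"speed": 120, "door_open": False},
    {"speed": 300, "door_open": False},
]
# Made-up serving records (the data the deployed model currently receives).
serving = [
    {"speed": 90, "door_open": True},
    {"speed": 0, "door_open": True},
    {"speed": 200, "door_open": False},
    {"speed": 150, "door_open": True},
]

train_index = build_index(train, PREDICATES)
rules = prune_redundant(discover(train_index, len(train), min_support=0.3))
print(sorted(rules))  # [('fast', 'moving'), ('moving', 'door_closed')]
print(unsafety(serving, PREDICATES, rules))  # 0.5
```

In this toy run the rule fast -> door_closed is discovered but then dropped, because it follows from the two rules that are kept. Two of the four serving records (a moving train with an open door) violate a kept rule, giving an unsafety degree of 0.5 without any labels.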
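Conclusions
1. Model performance suffers when the data a model uses deviate from the characteristics of its training data. 2. Discovering complex integrity constraints and using them to measure how unsafe a model's data are makes it possible to quickly detect the trustworthiness of a deployed model. 3. Studying model trustworthiness makes it possible to find untrustworthy models quickly and without labeling, so that a trustworthy model can be redeployed in time, improving the stability of modern intelligent transportation systems.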
Acknowledgments
This work is supported by the Key Research and Development Program of Zhejiang Province of China (No. 2021C01009) and the Fundamental Research Funds for the Central Universities, China.
Author information
Contributions
Wen-tao HU, Da-wei JIANG, Sai WU, Ke CHEN, and Gang CHEN designed the research. Wen-tao HU wrote the first draft of the manuscript. Da-wei JIANG and Sai WU participated in the theoretical model design. Ke CHEN processed the corresponding data. Gang CHEN helped carry out the experiments and organize the manuscript. Wen-tao HU and Gang CHEN revised and edited the final version.
Ethics declarations
Wen-tao HU, Da-wei JIANG, Sai WU, Ke CHEN, and Gang CHEN declare that they have no conflict of interest.
About this article
Cite this article
Hu, Wt., Jiang, Dw., Wu, S. et al. Complex integrity constraint discovery: measuring trust in modern intelligent railroad systems. J. Zhejiang Univ. Sci. A 23, 832–837 (2022). https://doi.org/10.1631/jzus.A2200156
DOI: https://doi.org/10.1631/jzus.A2200156