Abstract
Context: Training image recognition systems is a crucial element of the AI Engineering process in general and of automotive systems in particular. The quality of the data and of the training process can have a profound impact on the quality, performance, and safety of automotive software. Objective: Splitting data between training and test sets is a crucial step in this process, as it determines both how well the system learns and how well it generalizes to new data. Typical data splits take into consideration either the randomness or the timeliness of data points. However, in image recognition systems, the similarity of images is of equal importance. Methods: In this computational experiment, we study the impact of six data-splitting techniques. We use an industrial dataset with high-definition color images of driving sequences to train a YOLOv7 network. Results: The mean average precision (mAP) was 0.943 and 0.841 when the similarity-based and the frame-based splitting techniques were applied, respectively. In contrast, the object-based splitting technique produced the lowest mAP (0.118). Conclusion: The performance of object detection methods differs significantly depending on the data-splitting technique applied. The most favorable results come from random selection, whereas the most objective results come from splits based on sequences that represent different geographical locations.
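To illustrate the contrast the abstract draws between random selection and splits based on whole driving sequences, the following is a minimal sketch rather than the authors' implementation. The function names, the `sequence` and `frame` keys, the 20% test fraction, and the toy data are all assumptions made for illustration. It shows that a per-frame random split can place frames from one sequence on both sides of the split, while a sequence-level split keeps each sequence entirely in either the training or the test set.

```python
import random
from collections import defaultdict


def random_frame_split(frames, test_fraction=0.2, seed=0):
    """Shuffle individual frames, then cut.

    Frames from the same driving sequence may end up in both sets.
    """
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def sequence_split(frames, test_fraction=0.2, seed=0):
    """Assign whole sequences to train or test so no sequence is shared."""
    by_sequence = defaultdict(list)
    for frame in frames:
        by_sequence[frame["sequence"]].append(frame)

    sequence_ids = sorted(by_sequence)
    rng = random.Random(seed)
    rng.shuffle(sequence_ids)

    # Hold out at least one full sequence for testing.
    n_test = max(1, round(len(sequence_ids) * test_fraction))
    test_ids = set(sequence_ids[:n_test])

    train = [f for f in frames if f["sequence"] not in test_ids]
    test = [f for f in frames if f["sequence"] in test_ids]
    return train, test


def shared_sequences(train, test):
    """Return the sequence ids that appear in both sets."""
    return {f["sequence"] for f in train} & {f["sequence"] for f in test}


# Hypothetical toy data: 5 sequences with 20 frames each.
frames = [
    {"sequence": f"seq{s}", "frame": i}
    for s in range(5)
    for i in range(20)
]

rand_train, rand_test = random_frame_split(frames)
seq_train, seq_test = sequence_split(frames)

print("random split, shared sequences:", sorted(shared_sequences(rand_train, rand_test)))
print("sequence split, shared sequences:", sorted(shared_sequences(seq_train, seq_test)))
```

Because consecutive frames of a driving sequence tend to look alike, a split that shares sequences between the two sets can make test scores look better than they would on unseen locations, which is one plausible reading of why the abstract calls sequence-based splits the most objective.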
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Babu, M.A.A., Pandey, S.K., Durisic, D., Koppisetty, A.C., Staron, M. (2024). Impact of Image Data Splitting on the Performance of Automotive Perception Systems. In: Bludau, P., Ramler, R., Winkler, D., Bergsmann, J. (eds) Software Quality as a Foundation for Security. SWQD 2024. Lecture Notes in Business Information Processing, vol 505. Springer, Cham. https://doi.org/10.1007/978-3-031-56281-5_6
DOI: https://doi.org/10.1007/978-3-031-56281-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56280-8
Online ISBN: 978-3-031-56281-5
eBook Packages: Computer Science, Computer Science (R0)