Impact of Image Data Splitting on the Performance of Automotive Perception Systems

  • Conference paper
Software Quality as a Foundation for Security (SWQD 2024)

Abstract

Context: Training image recognition systems is a crucial part of the AI engineering process in general and of automotive systems in particular. The quality of the data and of the training process can profoundly affect the quality, performance, and safety of automotive software. Objective: How data is split between training and test sets is a key element of this process, as it can determine both how well the system learns and how well it generalizes to new data. Typical data splits consider either the randomness or the timeliness of data points; in image recognition systems, however, the similarity of images is equally important. Methods: In this computational experiment, we study the impact of six data-splitting techniques, using an industrial dataset of high-definition color images of driving sequences to train a YOLOv7 network. Results: The mean average precision (mAP) was 0.943 with the similarity-based splitting technique and 0.841 with the frame-based one, whereas the object-based splitting technique produced the worst mAP score (0.118). Conclusion: Object detection performance differs significantly depending on the data-splitting technique applied. Random selection yields the most favorable results, whereas splits based on sequences that represent different geographical locations yield the most objective ones.
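The contrast drawn in the conclusion, between random selection and splits that keep whole driving sequences together, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `(sequence_id, frame_index)` representation, and the toy dataset of 10 sequences with 50 frames each are all assumptions made for illustration. It shows only why a frame-level random split tends to place near-identical neighbouring frames on both sides of the split, while a sequence-level split does not.

```python
import random
from collections import defaultdict


def random_frame_split(frames, test_fraction=0.2, seed=0):
    """Shuffle individual frames, then cut. Frames from one driving
    sequence can end up in both the training and the test set."""
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def sequence_split(frames, test_fraction=0.2, seed=0):
    """Assign whole sequences to either side, so that no sequence
    contributes frames to both the training and the test set."""
    by_seq = defaultdict(list)
    for frame in frames:
        by_seq[frame[0]].append(frame)  # frame[0] is the sequence id
    seq_ids = sorted(by_seq)
    rng = random.Random(seed)
    rng.shuffle(seq_ids)
    n_test = max(1, int(round(len(seq_ids) * test_fraction)))
    test_ids = set(seq_ids[:n_test])
    train = [f for s in seq_ids if s not in test_ids for f in by_seq[s]]
    test = [f for s in seq_ids if s in test_ids for f in by_seq[s]]
    return train, test


def shared_sequences(train, test):
    """Sequence ids that appear on both sides of a split."""
    return {f[0] for f in train} & {f[0] for f in test}


# Hypothetical toy data: 10 sequences of 50 frames each,
# represented as (sequence_id, frame_index) pairs.
frames = [(s, i) for s in range(10) for i in range(50)]

rand_train, rand_test = random_frame_split(frames)
seq_train, seq_test = sequence_split(frames)

print("random split, sequences on both sides:",
      len(shared_sequences(rand_train, rand_test)))
print("sequence split, sequences on both sides:",
      len(shared_sequences(seq_train, seq_test)))
```

Under these assumptions, the sequence-level split never shares a sequence between the two sets, which is one plausible reading of why such splits give a more conservative estimate of how a detector generalizes to unseen locations.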


Notes

  1. https://figshare.com/s/48cede3fffc2ff3c92df

  2. https://github.com/WongKinYiu/yolov7


Corresponding author

Correspondence to Md. Abu Ahammed Babu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Babu, M.A.A., Pandey, S.K., Durisic, D., Koppisetty, A.C., Staron, M. (2024). Impact of Image Data Splitting on the Performance of Automotive Perception Systems. In: Bludau, P., Ramler, R., Winkler, D., Bergsmann, J. (eds) Software Quality as a Foundation for Security. SWQD 2024. Lecture Notes in Business Information Processing, vol 505. Springer, Cham. https://doi.org/10.1007/978-3-031-56281-5_6

  • DOI: https://doi.org/10.1007/978-3-031-56281-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56280-8

  • Online ISBN: 978-3-031-56281-5

  • eBook Packages: Computer Science, Computer Science (R0)
