
Towards the availability of video communication in artificial intelligence-based computer vision systems utilizing a multi-objective function

Cluster Computing

Abstract

Availability is one of the three main goals of information security. This paper contributes to system availability by introducing an optimization model for adapting live video broadcasting, that is, for controlling the capturing, coding, and sending features of the video communication system, to limited and varying network bandwidth and/or limited power sources, as in wireless and mobile networks. We first analyze the bitrate-accuracy and bitrate-power characteristics of various video transmission techniques for adapting video communication in artificial intelligence-based systems. To optimize resources for live video streaming, we analyze various video parameter settings for adapting the stream to the available resources, considering object detection accuracy, bandwidth, and power consumption requirements. The results show that adapting the signal-to-noise ratio (SNR) and spatial video encoding features (with upscaling of the frames at the destination) are the best techniques for maximizing object detection accuracy while minimizing bandwidth and energy requirements. In addition, we analyze the effectiveness of combining the SNR and spatial video encoding features with upscaling and find that combining these two techniques increases the performance of the streaming system. We present a multi-objective function for determining the parameter, or pairing of parameters, that provides the optimal object detection accuracy, power consumption, and bit rate. Results are reported based on more than 15,000 experiments using standard datasets of short video segments and a collected dataset of 300 videos from YouTube. We evaluate the results using the detection index, false-positive index, power consumption, and bandwidth requirement metrics.
For a single adaptive parameter, analysis of the experimental results demonstrates that the multi-objective function achieves object detection accuracy as high as the best achieved while drastically reducing bandwidth requirements and energy consumption. For multiple adaptive parameters, the analysis demonstrates the significant benefits of effective pairings of adaptive parameters. For example, by combining the SNR with the spatial feature in H.264, an optimal parameter setting can be reached at which the power consumption is reduced to \(20\%\) and the bandwidth requirement to \(2\%\) of the original, while keeping the Object Detection Accuracy (ODA) within 10% of the highest ODA.
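
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' multi-objective function, which this preview does not give. It assumes a simple weighted-sum scalarization and reads "within 10% of the highest ODA" as a floor of 0.9 times the best accuracy. The `Setting` class, the weights, and all numeric values are hypothetical and chosen only to mirror the shape of the abstract's example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    """One candidate encoder configuration (all values are illustrative)."""
    name: str
    accuracy: float  # object detection accuracy, 0..1
    power: float     # power relative to the unadapted stream, 0..1
    bitrate: float   # bitrate relative to the unadapted stream, 0..1


def score(s, w_acc=1.0, w_pow=0.5, w_bw=0.5):
    """Weighted sum (an assumption): reward accuracy, penalize power and bitrate."""
    return w_acc * s.accuracy - w_pow * s.power - w_bw * s.bitrate


def best_setting(candidates, min_accuracy_ratio=0.9, **weights):
    """Return the highest-scoring setting whose accuracy is at least
    min_accuracy_ratio times the best accuracy among the candidates."""
    top = max(c.accuracy for c in candidates)
    feasible = [c for c in candidates if c.accuracy >= min_accuracy_ratio * top]
    return max(feasible, key=lambda c: score(c, **weights))


# Hypothetical measurements, not results from the paper.
CANDIDATES = [
    Setting("original", accuracy=0.90, power=1.00, bitrate=1.00),
    Setting("snr_only", accuracy=0.88, power=0.60, bitrate=0.20),
    Setting("snr_plus_spatial", accuracy=0.82, power=0.20, bitrate=0.02),
    Setting("too_aggressive", accuracy=0.50, power=0.10, bitrate=0.01),
]

if __name__ == "__main__":
    print(best_setting(CANDIDATES).name)
```

With these made-up numbers, the accuracy floor excludes the most aggressive setting, and the weighted sum then favors the combined SNR and spatial setting because its power and bitrate savings outweigh its small accuracy loss. Different weights or a different accuracy floor would change the choice.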



Author information

Corresponding author

Correspondence to Izzat Alsmadi.



About this article


Cite this article

Sharrab, Y.O., Alsmadi, I. & Sarhan, N.J. Towards the availability of video communication in artificial intelligence-based computer vision systems utilizing a multi-objective function. Cluster Comput 25, 231–247 (2022). https://doi.org/10.1007/s10586-021-03391-4


  • DOI: https://doi.org/10.1007/s10586-021-03391-4
