Abstract
This paper proposes a generalized self-cueing real-time attention scheduling framework for DNN-based visual machine perception pipelines on resource-limited embedded platforms. Self-cueing means that we identify subframe-level regions of interest in a scene internally, by exploiting temporal correlations among successive video frames, rather than externally via a cueing sensor. One limitation of our original self-cueing-and-inspection strategy (Liu et al. 2022b) is its lack of computational efficiency under high workloads, such as busy traffic scenarios where a large number of objects are identified and separately inspected. We extend the conference publication by integrating image resizing with intermittent inspection and task batching in attention scheduling. The extension accelerates the processing of large objects by reducing their resolution, at the cost of only a negligible degradation in accuracy, thereby achieving a higher overall object inspection throughput. After extracting partial regions around objects of interest using an optical flow-based tracking algorithm, we allocate computation resources (i.e., DNN inspection) to them in a criticality-aware manner using a generalized batched proportional balancing (GBPB) algorithm, which minimizes a notion of generalized system uncertainty. The algorithm saves computational resources by inspecting low-priority regions intermittently at low frequencies and inspecting large objects at low resolutions. We implement the system on an NVIDIA Jetson Xavier platform and extensively evaluate its performance using a real-world driving dataset from Waymo. The proposed GBPB algorithm consistently outperforms both the previous BPB algorithm, which uses only intermittent inspection, and a set of baselines. The performance gain of GBPB is larger under more significant resource constraints (i.e., lower sampling intervals or busy traffic scenarios) because its multi-dimensional scheduling strategy achieves better resource allocation for machine perception.
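The three ingredients named above (intermittent inspection, image resizing, and task batching) can be illustrated with a minimal sketch. This is not the GBPB algorithm itself: the size set `SIZES`, the cap `max_size`, the `Region` fields, and the rule of rounding each inspection period down to a power of two are all assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical square input sizes accepted by the detector (assumption).
SIZES = (64, 128, 256)


@dataclass
class Region:
    obj_id: int
    side: int    # longer side of the cropped region, in pixels
    period: int  # desired inspection period in frames (assumed to come from criticality)


def round_down_pow2(period: int) -> int:
    """Largest power of two <= period, so inspection times of different objects align."""
    return 1 << (max(period, 1).bit_length() - 1)


def target_size(side: int, max_size: int = 128) -> int:
    """Smallest allowed size that fits the crop; larger crops are downscaled to max_size."""
    fitted = next((s for s in SIZES if s >= side), SIZES[-1])
    return min(fitted, max_size)


def batches_for_frame(regions, frame: int) -> dict:
    """Group the regions due at this frame by target size; each group is one batched DNN call."""
    groups = defaultdict(list)
    for r in regions:
        if frame % round_down_pow2(r.period) == 0:
            groups[target_size(r.side)].append(r.obj_id)
    return dict(groups)


regions = [Region(1, 50, 1), Region(2, 300, 3), Region(3, 100, 5)]
for frame in range(5):
    print(frame, batches_for_frame(regions, frame))
```

In this sketch, object 1 is inspected every frame, while objects 2 and 3 are inspected every 2 and 4 frames respectively. The 300-pixel crop of object 2 is downscaled to 128 pixels, so it can share a batch with object 3 whenever both are due.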
Notes
This operation is used to align the inspection times among objects to trigger more batching opportunities.
\(L/x_i\) is an integer since both L and \(x_i\) are power-of-2 multiples of the minimum non-zero element in \({\mathcal {C}}\) and \(x_i\le L\).
We assume that it is beneficial to slice the image and run the inspection tasks at the subframe level.
Without loss of generality, we assume that \(\left\lfloor \frac{w_N {\tilde{x}}^*_{{\hat{i}}}}{w_{{\hat{i}}}} \right\rfloor \ge 1\) and \(\frac{{\tilde{x}}^*_{{\hat{i}}}}{w_{{\hat{i}}}}\) is an integer; otherwise, we can simply take the largest i with a non-zero value of this expression and leave out the remaining objects.
The specific definition of the metric will be given later.
Note that the object size is not identical to the object target size, because the target size depends not only on the object size but also on the object motion.
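The integrality of \(L/x_i\) claimed in the note above can be checked directly. As a short sketch, write \(c_{\min}\) for the minimum non-zero element in \({\mathcal {C}}\), and let \(a\) and \(b\) be non-negative integer exponents; these three symbols are introduced here for illustration only and are not part of the original notation.
\[ L = 2^{a} c_{\min}, \quad x_i = 2^{b} c_{\min}, \quad x_i \le L \;\Rightarrow\; b \le a \;\Rightarrow\; \frac{L}{x_i} = 2^{a-b} \in \{1, 2, 4, \ldots\}. \]
For example, with \(c_{\min} = 1\), \(L = 8\) and \(x_i = 2\), we get \(L/x_i = 2^{3-1} = 4\).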
References
Amert T, Otterness N, Yang M, et al (2017) GPU scheduling on the nvidia tx2: hidden details revealed. In: 2017 IEEE real-time systems symposium (RTSS), IEEE, pp 104–115
Amert T, Tong Z, Voronov S, et al (2021) Timewall: enabling time partitioning for real-time multicore+ accelerator platforms. In: 2021 IEEE real-time systems symposium (RTSS), IEEE, pp 455–468
Bastani F, Madden S (2021) Multiscope: efficient video pre-processing for exploratory video analytics. CoRR abs/2103.14695. arXiv:2103.14695
Bateni S, Liu C (2018) Apnet: approximation-aware real-time neural network. In: 2018 IEEE real-time systems symposium (RTSS), IEEE, pp 67–79
Bewley A, Ge Z, Ott L, et al (2016) Simple online and realtime tracking. In: 2016 IEEE international conference on image processing (ICIP), IEEE, pp 3464–3468
Buckler M, Bedoukian P, Jayasuriya S, et al (2018) Eva$^2$: Exploiting temporal redundancy in live computer vision. In: 2018 ACM/IEEE 45th annual international symposium on computer architecture (ISCA), IEEE, pp 533–546
Capodieci N, Cavicchioli R, Bertogna M, et al (2018) Deadline-based scheduling for gpu with preemption support. In: 2018 IEEE real-time systems symposium (RTSS), IEEE, pp 119–130
Cavigelli L, Degen P, Benini L (2017) Cbinfer: change-based inference for convolutional neural networks on video data. In: Proceedings of the 11th international conference on distributed smart cameras, pp 1–8
Chin T, Ding R, Marculescu D (2019) Adascale: Towards real-time video object detection using adaptive scaling. In: Talwalkar A, Smith V, Zaharia M (eds) Proceedings of machine learning and systems 2019, MLSys 2019, Stanford, CA, USA, March 31–April 2, 2019. mlsys.org
Grana C, Borghesani D, Cucchiara R (2010) Optimized block-based connected components labeling with decision trees. IEEE Trans Image Process 19(6):1596–1609
Heo S, Cho S, Kim Y, et al (2020) Real-time object detection system with multi-path neural networks. In: 2020 IEEE real-time and embedded technology and applications symposium (RTAS), IEEE, pp 174–187
Heo S, Jeong S, Kim H (2022) Rtscale: Sensitivity-aware adaptive image scaling for real-time object detection. In: 34th euromicro conference on real-time systems (ECRTS 2022), Schloss Dagstuhl-Leibniz-Zentrum für Informatik
Holte R, Mok A, Rosier L, et al (1989) The pinwheel: A real-time scheduling problem. In: Proceedings of the 22nd Hawaii international conference of system science, pp 693–702
Hu Y, Liu S, Abdelzaher T, et al (2021) On exploring image resizing for optimizing criticality-based machine perception. In: 2021 IEEE 27th international conference on embedded and real-time computing systems and applications (RTCSA), IEEE, pp 169–178
Hu Y, Liu S, Abdelzaher T, et al (2022) Real-time task scheduling with image resizing for criticality-based machine perception. Real-Time Systems pp 1–26
Jang W, Jeong H, Kang K, et al (2020) R-tod: Real-time object detector with minimized end-to-end delay for autonomous driving. In: Proc. IEEE real-time systems symposium (RTSS)
Ji M, Yi S, Koo C, et al (2022) Demand layering for real-time dnn inference with minimized memory usage. In: 2022 IEEE real-time systems symposium (RTSS), IEEE, pp 291–304
Kang W, Lee K, Lee J, et al (2021) Lalarand: Flexible layer-by-layer cpu/gpu scheduling for real-time dnn tasks. In: 2021 IEEE real-time systems symposium (RTSS), IEEE, pp 329–341
Kang D, Lee S, Chwa HS, et al (2022a) Rt-mot: Confidence-aware real-time scheduling framework for multi-object tracking tasks. In: 2022 IEEE real-time systems symposium (RTSS), IEEE, pp 318–330
Kang W, Chung S, Kim JY, et al (2022b) Dnn-sam: Split-and-merge dnn execution for real-time object detection. In: 2022 IEEE 28th real-time and embedded technology and applications symposium (RTAS), IEEE, pp 160–172
Kannan T, Hoffmann H (2021) Budget rnns: Multi-capacity neural networks to improve in-sensor inference under energy budgets. In: 2021 IEEE 27th real-time and embedded technology and applications symposium (RTAS), IEEE, pp 143–156
Kroeger T, Timofte R, Dai D, et al (2016) Fast optical flow using dense inverse search. In: European conference on computer vision, Springer, pp 471–488
Kumar AR, Ravindran B, Raghunathan A (2019) Pack and detect: Fast object detection in videos using region-of-interest packing. In: Proceedings of the ACM India joint international conference on data science and management of data, pp 150–156
Lee S, Nirjon S (2020a) Fast and scalable in-memory deep multitask learning via neural weight virtualization. In: Proceedings of the 18th international conference on mobile systems, applications, and services, pp 175–190
Lee S, Nirjon S (2020b) Subflow: A dynamic induced-subgraph strategy toward real-time dnn inference and training. In: 2020 IEEE real-time and embedded technology and applications symposium (RTAS), IEEE, pp 15–29
Li X, Yin F, Zhang X, et al (2021) Adaptive scaling for archival table structure recognition. In: Lladós J, Lopresti D, Uchida S (eds) 16th International Conference on Document Analysis and Recognition, ICDAR 2021, Lausanne, Switzerland, September 5-10, 2021, Proceedings, Part I, Lecture Notes in Computer Science, vol 12821. Springer, pp 80–95
Lin TY, Maire M, Belongie S, et al (2014) Microsoft coco: Common objects in context. In: European conference on computer vision, Springer, pp 740–755
Liu S, Yao S, Fu X, et al (2020a) On removing algorithmic priority inversion from mission-critical machine inference pipelines. In: Proc. IEEE real-time systems symposium (RTSS)
Liu S, Yao S, Li J, et al (2020) Globalfusion: a global attentional deep learning framework for multisensor information fusion. Proc ACM Interactive Mob Wearable Ubiquitous Technol 4(1):1–27
Liu S, Yao S, Fu X, et al (2021) Real-time task scheduling for machine perception in intelligent cyber-physical systems. IEEE Trans Comput
Liu L, Dong Z, Wang Y, et al (2022a) Prophet: Realizing a predictable real-time perception pipeline for autonomous vehicles. In: 2022 IEEE real-time systems symposium (RTSS), IEEE, pp 305–317
Liu S, Fu X, Wigness M, et al (2022b) Self-cueing real-time attention scheduling in criticality-aware visual machine perception. In: Proceedings of the 28th IEEE real-time and embedded technology and applications symposium (RTAS)
Liu S, Wang T, Guo H, et al (2022c) Multi-view scheduling of onboard live video analytics to minimize frame processing latency. In: 2022 IEEE 42nd international conference on distributed computing systems (ICDCS), pp 503–514
Liu S, Wang T, Li J, et al (2022d) Adamask: Enabling machine-centric video streaming with adaptive frame masking for dnn inference offloading. In: Proceedings of the 30th ACM international conference on multimedia, pp 3035–3044
Mao H, Kong T, Dally WJ (2018) Catdet: cascaded tracked detector for efficient object detection from video. arXiv:1810.00434
Minnehan B, Savakis A (2019) Cascaded projection: End-to-end network compression and acceleration. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 10,715–10,724
Najibi M, Singh B, Davis L (2019) Autofocus: Efficient multi-scale inference. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 9744–9754
Razavi K, Luthra M, Koldehofe B, et al (2022) Fa2: fast, accurate autoscaling for serving deep learning inference with sla guarantees. In: 2022 IEEE 28th real-time and embedded technology and applications symposium (RTAS), IEEE, pp 146–159
Redmon J, Divvala S, Girshick R, et al (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788
Restuccia F, Biondi A (2021) Time-predictable acceleration of deep neural networks on FPGA SOC platforms. In: 2021 IEEE real-time systems symposium (RTSS), IEEE, pp 441–454
Song Z, Fu B, Wu F, et al (2020) Drq: dynamic region-based quantization for deep neural network acceleration. In: 2020 ACM/IEEE 47th annual international symposium on computer architecture (ISCA), IEEE, pp 1010–1021
Soyyigit A, Yao S, Yun H (2022) Anytime-lidar: deadline-aware 3D object detection. In: 2022 IEEE 28th international conference on embedded and real-time computing systems and applications (RTCSA), IEEE, pp 31–40
Sun P, Kretzschmar H, Dotiwalla X, et al (2020) Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2446–2454
Torralba A (2009) How many pixels make an image? Vis Neurosci 26(1):123–131
Wang S, Lu H, Deng Z (2019) Fast object detection in compressed video. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 7104–7113
Wu J, Subasharan V, Tran T, et al (2022) MRIM: enabling mixed-resolution imaging for low-power pervasive vision tasks. In: IEEE international conference on pervasive computing and communications, PerCom 2022, Pisa, Italy, March 21–25, 2022. IEEE, pp 44–53
Xiang Y, Kim H (2019) Pipelined data-parallel CPU/GPU scheduling for multi-DNN real-time inference. In: 2019 IEEE real-time systems symposium (RTSS), IEEE, pp 392–405
Xu M, Zhu M, Liu Y, et al (2018) Deepcache: principled cache for mobile deep vision. In: Proceedings of the 24th annual international conference on mobile computing and networking, pp 129–144
Yang Z, Nahrstedt K, Guo H, et al (2021) Deeprt: a soft real time scheduler for computer vision applications on the edge. arXiv:2105.01803
Yao S, Zhao Y, Shao H, et al (2018) Fastdeepiot: towards understanding and optimizing neural network execution time on mobile and embedded devices. In: Proceedings of the 16th ACM conference on embedded networked sensor systems, pp 278–291
Yao S, Hao Y, Zhao Y, et al (2020a) Scheduling real-time deep learning services as imprecise computations. In: Proc. IEEE international conference on embedded and real-time computing systems and applications (RTCSA)
Yao S, Li J, Liu D, et al (2020b) Deep compressive offloading: Speeding up neural network inference by trading edge computation for network latency. In: Proceedings of the international conference on embedded networked sensor systems (SenSys)
Zhang S, Lin W, Lu P, et al (2017) Kill two birds with one stone: boosting both object detection accuracy and speed with adaptive patch-of-interest composition. In: 2017 IEEE international conference on multimedia & expo workshops (ICMEW), IEEE, pp 447–452
Zhou Y, Moosavi-Dezfooli SM, Cheung NM, et al (2018) Adaptive quantization for deep neural network. In: Thirty-Second AAAI conference on artificial intelligence
Zhu X, Wang Y, Dai J, et al (2017a) Flow-guided feature aggregation for video object detection. In: Proceedings of the IEEE international conference on computer vision, pp 408–417
Zhu X, Xiong Y, Dai J, et al (2017b) Deep feature flow for video recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2349–2358
Acknowledgements
Research reported in this paper was sponsored in part by the U.S. DEVCOM Army Research Laboratory under Cooperative Agreement W911NF-17-20196, NSF CNS 20-38817, IBM (IIDAI), and the Boeing Company. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies of the U.S. DEVCOM Army Research Laboratory or the U.S. government. The U.S. government is authorized to reproduce and distribute reprints for government purposes notwithstanding any copyright notation hereon.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, S., Fu, X., Hu, Y. et al. Generalized self-cueing real-time attention scheduling with intermittent inspection and image resizing. Real-Time Syst 59, 302–343 (2023). https://doi.org/10.1007/s11241-023-09396-z
DOI: https://doi.org/10.1007/s11241-023-09396-z