A note on slack enforcement mechanisms for self-suspending tasks

This paper provides counterexamples to the slack enforcement mechanisms for segmented self-suspending real-time tasks proposed by Lakshmanan and Rajkumar (Proceedings of the Real-Time and Embedded Technology and Applications Symposium (RTAS), pp 3–12, 2010).


Introduction
During its execution, a job may suspend itself, i.e., its computation stops until certain activities complete, after which it is resumed. Such suspension behavior can appear in complex cyber-physical real-time systems, e.g., due to multiprocessor locking protocols, computation offloading, and multicore resource sharing, as demonstrated in (Chen et al. 2019, Sect. 2). The impact of self-suspension behavior has been investigated since 1990. However, the literature on this research topic before 2015 has been flawed, as reported in the review by Chen et al. (2019).
The review by Chen et al. (2019) examines the literature in detail, but two unresolved issues are listed in its concluding remarks. One of them has recently been resolved by Günzel and Chen (2020). The remaining open problem concerns the correctness of the "slack enforcement mechanisms to shape the demand of a self-suspending task so that the task behaves like an ideal ordinary periodic task" (Chen et al. 2019, Sect. 9.1), proposed by Lakshmanan and Rajkumar (2010). This paper provides counterexamples which show that their slack enforcement mechanisms (1) may provoke deadline misses and therefore (2) do not guarantee the same worst-case response time as without slack enforcement when all higher-priority self-suspending tasks behave like ideal ordinary periodic tasks.

The slack enforcement mechanisms by Lakshmanan and Rajkumar (2010) were argued to be applicable to one-segment self-suspending task systems under uniprocessor fixed-priority preemptive scheduling. Specifically, they used the classical rate-monotonic priority assignment. They considered a set of implicit-deadline sporadic real-time tasks {τ_1, …, τ_n}, in which each task τ_i has a minimum inter-arrival time T_i and a relative deadline that is also T_i. A task τ_i is either an ordinary sporadic task with worst-case execution time C_i (without any suspension) or a one-segment self-suspending task with an execution pattern (C_i^1, S_i^1, C_i^2). That is, a job of a one-segment self-suspending task τ_i has a worst-case execution time C_i^1 for its first computation segment τ_{i,1}, then is suspended from the system for up to S_i^1 time units, and then is resumed with its second computation segment τ_{i,2}, which has worst-case execution time C_i^2. Note that we follow the notation used in the survey paper by Chen et al. (2019).
It is well known that the suspension behavior of higher-priority tasks can result in more interference on a lower-priority task. Three mechanisms have been developed in the literature to reduce this impact of the higher-priority tasks:
• Period enforcer, proposed by Rajkumar (1991), applies a runtime rule so that "it forces tasks to behave like ideal periodic tasks from the scheduling point of view with no associated scheduling penalties." This is termed dynamic online period enforcement in Sect. 4.3.1 of the survey paper by Chen et al. (2019).
• Release guard (Sun and Liu 1996) or release enforcement (Huang and Chen 2016) mechanisms enforce that the computation segments are released with a guaranteed minimum inter-arrival time. This is termed static period enforcement in Sect. 4.3.2 of the survey paper by Chen et al. (2019).
• Slack enforcement, proposed by Lakshmanan and Rajkumar (2010), intends to create execution enforcement for self-suspending tasks by utilizing the available slack so that a self-suspending task behaves like an ideal (ordinary) periodic task.
However, Chen and Brandenburg (2017) recently concluded that "period enforcement Rajkumar (1991) is not strictly superior (compared to the base case without enforcement) as it can cause deadline misses in self-suspending task sets that are schedulable without enforcement." Lakshmanan and Rajkumar (2010) present a static and a dynamic version of slack enforcement. Moreover, they provide a critical instant theorem to compute the worst-case response time for self-suspending tasks. Nelissen et al. (2015) later showed that this critical instant theorem is flawed. Despite that, the slack enforcement mechanisms proposed in Lakshmanan and Rajkumar (2010) can still be applied when worst-case response times are given beforehand. Hence, the correctness of the slack enforcement mechanisms is not affected directly by the incorrect critical instant theorem. The review paper by Chen et al. (2019) calls for more rigorous proofs to support the correctness of the mechanisms, as the proof of the key lemma of the slack enforcement mechanisms in Lakshmanan and Rajkumar (2010) is incomplete. Since the correctness of the slack enforcement mechanisms has been unclear, to the best of our knowledge, there is no published work based on slack enforcement.
The ultimate goal of the period enforcer and the slack enforcement mechanisms is to make it possible to ignore the self-suspension behavior of higher-priority tasks. This property is highly desirable in many practical applications in which self-suspensions are inevitable. Unfortunately, neither the period enforcer nor the slack enforcement mechanisms can achieve this goal, as shown in Chen and Brandenburg (2017) and in this paper. Moreover, we note that the release enforcement mechanisms do not pursue this goal, but only aim for better and easier schedulability analyses.

Misconception of the static slack enforcement mechanism
The static slack enforcement mechanism, as presented in (Lakshmanan and Rajkumar 2010, Section V), delays the second computation segment of each job generated by a self-suspending task τ_i such that the processor indeed idles for the maximal suspension time S_i^1 between both segments. Its formulation relies on the definition of level-i slack:
Definition of level-i slack in Section IV in Lakshmanan and Rajkumar (2010): The level-i slack over any time interval [t_1, t_2] (with t_2 ≥ t_1) is defined as the total time within [t_1, t_2] during which no tasks with priority greater than or equal to τ_i are executing. ◻
Definition of static slack enforcement in Section V in Lakshmanan and Rajkumar (2010): Static slack enforcement is defined as an execution control policy that delays the release of the second segment of a self-suspending task τ_i = ((C_i^1, S_i^1, C_i^2), T_i) such that the level-i slack between the two segments of τ_i is at least S_i^1. ◻
The work of Lakshmanan and Rajkumar (2010) does not explain how self-suspending tasks may meet their deadlines when this mechanism is used. In fact, static slack enforcement can itself cause deadline misses of self-suspending tasks, since the response time increases whenever the level-i slack available during the suspension interval is less than the suspension time. Figure 1 shows a schedule where static slack enforcement leads to a deadline miss: Consider a task set with only two tasks τ_1 = ((1), 5) and τ_2 = ((1, 7, 2), 12). At most one job of τ_1 interferes with each execution segment of τ_2. Hence, the worst-case response time of τ_2 is (1 + 1) + 7 + (2 + 1) = 12, as depicted on the left-hand side of Fig. 1. The level-2 slack in [2, 9] is 6 since τ_1 utilizes the processor for 1 time unit. To obtain a level-2 slack of 7, the second segment of the job of τ_2 is delayed. This leads to a deadline miss, as depicted on the right-hand side of Fig. 1. Moreover, since the schedule on the left-hand side does not involve any suspension of the higher-priority task τ_1, this also shows that static slack enforcement does not guarantee the same worst-case response time as without enforcement when all higher-priority self-suspending tasks behave like ideal ordinary periodic tasks.
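The two-task example can be replayed with a few lines of code. The following Python sketch is an illustration only and is not taken from Lakshmanan and Rajkumar (2010): it assumes integer time slots and that τ_1 releases a job at time 0 and then strictly periodically every 5 time units; all function names are hypothetical.

```python
def tau1_busy(t, period=5, wcet=1):
    """True if tau_1 (highest priority) executes in slot [t, t+1).

    Assumption: tau_1 is released at 0, 5, 10, ... and always runs for wcet.
    """
    return t % period < wcet


def run_segment(release, wcet):
    """Finish time of a tau_2 segment released at `release` under preemption by tau_1."""
    t, left = release, wcet
    while left > 0:
        if not tau1_busy(t):
            left -= 1
        t += 1
    return t


def second_release(finish_first, suspension, enforce):
    """Release time of the second segment of tau_2."""
    if not enforce:
        return finish_first + suspension
    # Static slack enforcement: wait until the level-2 slack since the end of
    # the first segment (slots in which tau_1 does not execute) reaches S.
    t, slack = finish_first, 0
    while slack < suspension:
        if not tau1_busy(t):
            slack += 1
        t += 1
    return t


def response_time(enforce):
    """Response time of a job of tau_2 = ((1, 7, 2), 12) released at time 0."""
    finish_first = run_segment(0, 1)
    release_second = second_release(finish_first, 7, enforce)
    return run_segment(release_second, 2)
```

Under these assumptions, `response_time(False)` evaluates to 12 and `response_time(True)` to 13, which exceeds the relative deadline of 12.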
We note that the proof related to the static slack enforcement mechanism was provided in a technical report but not in the published paper by Lakshmanan and Rajkumar (2010). We are therefore not able to identify the cause of the misconception.

Misconception of the dynamic slack enforcement mechanism
The dynamic slack enforcement mechanism, presented in (Lakshmanan and Rajkumar 2010, Section IV), ensures that no deadline misses occur in the delayed task by calculating the response time of each job at runtime and comparing it with the worst case:
Definition of dynamic slack enforcement in Sect. IV in Lakshmanan and Rajkumar (2010): Dynamic slack enforcement is an execution control policy that delays the release of the second segment of a self-suspending sporadic task τ_i such that τ_i can still meet its normal (non-execution-controlled) worst-case response time R_i. ◻
To establish the correctness of the dynamic slack enforcement algorithm, Lakshmanan and Rajkumar (2010) formulate the following two properties, based on their Lemma 4 and Lemma 5.
• Property P1: If a task τ_i in the task set under static-priority preemptive scheduling has a worst-case response time (WCRT) of R_i, applying the slack enforcement mechanism makes its WCRT always the same or shorter.
• Property P2: The worst-case response time (WCRT) R_i of τ_i under the dynamic slack enforcement mechanism and static-priority preemptive scheduling is not longer than the WCRT in the corresponding scenario that considers only τ_i's suspension behavior and treats all higher-priority tasks as non-self-suspending tasks.
In Appendix A, we discuss the worst-case response times of τ_3 and τ_4. In particular, we show that the worst-case response time of τ_3 is 15 + ε. Moreover, by replacing the suspension of τ_3 by execution, we show that the worst-case response time of τ_4 is upper bounded by 36 + ε, as depicted in Fig. 2. However, the concrete example in Fig. 3 demonstrates that the dynamic slack enforcement mechanism presented in Lakshmanan and Rajkumar (2010) leads to a deadline miss of τ_4 since T_4 < 37. According to the dynamic slack enforcement mechanism, the second computation segment of τ_3 is delayed to the latest time such that τ_3 still meets its worst-case response time of 15 + ε, i.e., it finishes no later than 12 + 15 + ε = 27 + ε. This disproves Property P1.
For Property P2, we consider the schedule depicted in Fig. 4, which treats all higher-priority tasks as non-suspending tasks. Since the obtainable schedules without suspension of τ_3 are a subset of the obtainable schedules with suspension, the worst-case response time of τ_4 is again bounded by 36 + ε. However, we have already shown that dynamic slack enforcement leads to a deadline miss. This disproves Property P2. We note that the stated properties for the dynamic slack enforcement mechanism are invalidated even if the mechanism is restricted to periodic or synchronous task sets, due to the following consideration. Let ε = 0.2 and consider the task set to be a synchronous periodic task set, i.e., job releases are aligned with the previous deadline and the first job release of each task is at time 0. In this case, the release pattern from Fig. 3 starts at time 393,120, since 393,120 is an integer multiple of 7, 24, and 36.4, and 10,860 ⋅ 36.2 − 393,120 = 12. Hence, the dynamic slack enforcement causes a deadline miss of τ_4 at time 393,120 + 36.4.
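The arithmetic behind this synchronous release pattern can be checked with exact rational numbers. The sketch below assumes ε = 0.2 as chosen in the text; the variable and function names are hypothetical, and reading 36.4 as 36 + 2ε and 36.2 as 36 + ε is an assumption made for illustration.

```python
from fractions import Fraction

EPS = Fraction(1, 5)   # epsilon = 0.2, as chosen in the text
T0 = 393_120           # claimed start of the release pattern


def is_integer_multiple(t, period):
    """True if t is an integer multiple of period, using exact arithmetic."""
    return (Fraction(t) / Fraction(period)).denominator == 1


# 393,120 must be a multiple of 7, 24 and 36.4 (read here as 36 + 2*eps).
aligned = [is_integer_multiple(T0, p) for p in (7, 24, 36 + 2 * EPS)]

# The 10,860th multiple of 36.2 (read here as 36 + eps) lies 12 units after T0.
offset = 10_860 * (36 + EPS) - T0

# Time of the deadline miss stated in the text.
miss_time = T0 + 36 + 2 * EPS
```

Here `aligned` contains only `True` values and `offset` equals 12, which matches the offset of 12 used in the expression 12 + 15 + ε earlier in this section.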
Source of misconception
We believe that the main source of the misconception of the dynamic slack enforcement mechanism is inherited from the misconception of the critical instant theorem for self-suspending task systems claimed in Lakshmanan and Rajkumar (2010). They argued that the dynamic slack enforcement mechanism releases the second computation segment as late as possible and therefore does not worsen the schedulability of lower-priority tasks. However, this argument is incorrect. Our counterexample is based on the following case distinction:
• If task τ_3 interferes with only one computation segment of a job of τ_4, the response time of the job of τ_4 is at most 36 + ε.
• If task τ_3 interferes with two computation segments of a job of τ_4, the response time of the job of τ_4 can be up to 37.
The dynamic slack enforcement mechanism delays the second computation segment of τ_3 in this counterexample and forces the latter case to take place, whilst the original fixed-priority scheduler has a safe worst-case response time of 36 + ε. This is the counterpart of the misconception of the critical instant theorem claimed in Lakshmanan and Rajkumar (2010). Imagine that we split task τ_3 into two ordinary sporadic tasks τ_3^1 and τ_3^2 that do not suspend themselves, both with execution time 1 and minimum inter-arrival time 36. If we apply the (incorrect) critical instant theorem in Lakshmanan and Rajkumar (2010), the worst-case response time of τ_4 follows exactly Fig. 4. However, the actual worst case for this pattern is to release τ_3^1 and τ_3^2 so that each of them interferes with one computation segment of τ_4, i.e., exactly Fig. 3.
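The two cases can be compared numerically. The sketch below is an illustration under assumptions: the parameters (execution time 1 and period 7 for τ_1, execution time 10 and period 24 for τ_2, two segments of length 2 and a suspension of 5 for τ_4, ε = 0.2, T_4 = 36.4) are inferred from the numbers used in this note rather than quoted from a task-set definition, and the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil

EPS = Fraction(1, 5)                 # epsilon = 0.2 (value used in the text)
HIGHER = [(1, 7), (10, 24)]          # inferred (C_j, T_j) of tau_1 and tau_2
C4, S4, T4 = 2, 5, 36 + 2 * EPS      # inferred segment length, suspension, period


def segment_bound(interference):
    """Least fixed point of the time demand of one tau_4 segment.

    `interference` is the execution of tau_3 that falls into the segment.
    """
    demand = Fraction(C4 + interference)
    t = demand
    while True:
        nxt = demand + sum(c * ceil(t / p) for c, p in HIGHER)
        if nxt == t:
            return t
        t = nxt


def job_bound(i_first, i_second):
    """Bound for a job of tau_4: two segment bounds plus the suspension."""
    return segment_bound(i_first) + S4 + segment_bound(i_second)


one_segment = job_bound(2 + EPS, 0)   # tau_3 hits only the first segment
both_segments = job_bound(1, 1)       # one segment of tau_3 per segment of tau_4
```

With these inferred values, `one_segment` equals 36 + ε and stays below T_4, whereas `both_segments` equals 37 and exceeds T_4, in line with the two bullet cases above.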
The proof of Lemma 4 in Lakshmanan and Rajkumar (2010) is incorrect because it does not inspect the impact of the two computation segments of τ_3 on the two computation segments of τ_4, as in this counterexample. It solely argues that I_j^ns(R) = I_j^1(R) + I_j^2(R) (here, the notation is taken directly from Lakshmanan and Rajkumar (2010)), i.e., it only relates the interference terms for an interval of length R. Such an argument does not yield a formal proof of the worst-case response time. A correct treatment in the proof should analyze the worst-case response times of a task for both cases, e.g., using an iterative approach like time demand analysis (TDA), and demonstrate their equivalence.
We also note that our counterexample is unrelated to the issue that led Chen et al. (2019) to call for a rigorous proof of Lemma 4 in Lakshmanan and Rajkumar (2010). The main concern in Chen et al. (2019) was the incomplete proof regarding the level-i busy period, which is irrelevant in our counterexample.

Appendix A: Analysis of Sect. 3
The following analysis consists of two parts. First, we derive the worst-case response time of τ_3 as the foundation for the response time analysis of τ_4. Afterwards, we provide a bound on the response time of τ_4 that is sufficient for the counterexample in Sect. 3.
Response time of τ_3: To analyze the worst-case response time R_3 of task τ_3, we consider the suspension-oblivious schedule where suspension is replaced by computation. Using the time demand function for this case yields W_3(15 + ε) = (2 + ε) + ⌈(15 + ε)/7⌉ ⋅ 1 + ⌈(15 + ε)/24⌉ ⋅ 10 = 15 + ε, i.e., a worst-case response time of 15 + ε. This also bounds the worst-case response time of τ_3 in the case with suspension, i.e., R_3 ≤ 15 + ε. The schedule in Fig. 5 shows a case where the response time is actually 15 + ε. We conclude that R_3 = 15 + ε. Moreover, we note that the worst-case offset of the second computation segment of τ_3 is 13 + ε since 1 + ⌈13/7⌉ ⋅ 1 + ⌈13/24⌉ ⋅ 10 = 13 is the worst-case response time of the first computation segment.
Response time of τ_4: To analyze the worst-case response time of τ_4, we consider a concrete fixed-priority preemptive schedule of the task set. Suppose that the first job J of τ_4 is released at time a_4 and finished at time f_4. We bound the response time of J and prove that in any circumstances f_4 − a_4 ≤ T_4. When this property holds, we can remove the first job of τ_4 from the schedule and use the same argument to bound the response time of every job of τ_4 inductively. Suppose that the schedule is busy from t_0 to a_4 with t_0 ≤ a_4 and the processor idles right before t_0. Such a time point t_0 exists. Since the job of τ_4 released at time a_4 is not constrained by the inter-arrival time constraint T_4 of τ_4, we can move its release time to t_0. After this change of arrival time, the schedule remains unchanged, but the response time of the job J is increased. For notational brevity, we set t_0 to 0 in this proof.
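The time demand computations for τ_3 can be reproduced with a standard fixed-point iteration. The sketch below is a minimal illustration, assuming that τ_1 has execution time 1 and period 7 and that τ_2 has execution time 10 and period 24; these parameters are inferred from the numbers appearing in this note and are not quoted from a task-set definition. The name `tda` and the iteration limit are hypothetical.

```python
from fractions import Fraction
from math import ceil

EPS = Fraction(1, 5)   # epsilon = 0.2, the value used in Sect. 3
# Inferred (C_j, T_j) of the higher-priority tasks tau_1 and tau_2.
HIGHER = [(Fraction(1), Fraction(7)), (Fraction(10), Fraction(24))]


def tda(demand, higher=HIGHER, limit=Fraction(1000)):
    """Least fixed point t = demand + sum_j ceil(t / T_j) * C_j, or None."""
    demand = Fraction(demand)
    t = demand
    while t <= limit:
        nxt = demand + sum(c * ceil(t / p) for c, p in higher)
        if nxt == t:
            return t
        t = nxt
    return None


# Suspension-oblivious tau_3: both segments plus suspension, 2 + eps in total.
r3_oblivious = tda(2 + EPS)

# First computation segment of tau_3 alone (execution time 1).
r3_first_segment = tda(1)
```

Under these assumptions, the iteration converges to 15 + ε for the suspension-oblivious task and to 13 for the first computation segment, matching the values stated above.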
As a fundamental tool for our analysis, we use the time demand function on each computation segment of τ_4. For a segment of J, let I be the interference of τ_3 during the segment. We define the time demand function for that segment by
W*_4(t, I) = 2 + I + ⌈t/7⌉ ⋅ 1 + ⌈t/24⌉ ⋅ 10. (1)
If there exists some t ∈ [0, T_4] with W*_4(t, I) ≤ t, then t is an upper bound on the response time R*_4(I) of that segment, i.e., R*_4(I) ≤ t. To derive an upper bound on the worst-case response time of τ_4 that is sufficient for the counterexample, we fix the releases of the job segments of τ_4 and replace the suspension in τ_3 by execution. This conversion does not decrease the response time of J. We call the new task τ_3^obl the suspension-oblivious τ_3, with worst-case execution time 2 + ε. If there is some busy interval [x, 0] before 0 (choose the smallest x possible), then we move the release of J to x. This does not change the schedule and only increases the response time of J. Moreover, after this procedure, only jobs which are released at or after the release of J can interfere with J. Therefore, we delete all jobs released before the release of J without changing the response time of J.
It remains to analyze the worst-case response time of τ_4 under the interference of three ordinary sporadic tasks τ_1, τ_2, and τ_3^obl, which can be achieved by adopting the response time analysis in Nelissen et al. (2015). We use the time demand function from Eq. (1) on each segment of J. If a job of τ_3 interferes with a segment of J, then the worst-case response time of that segment is R*_4(2 + ε) ≤ 17 + ε since W*_4(17 + ε, 2 + ε) = 17 + ε. If no job of τ_3 interferes with the segment, then its worst-case response time is R*_4(0) ≤ 14 since W*_4(14, 0) = 14. If no job of τ_3 interferes with J, then J is finished after at most R*_4(0) + 5 + R*_4(0) ≤ 33 time units. If only the first job of τ_3 interferes with J, then the total worst-case response time is at most R*_4(2 + ε) + 5 + R*_4(0) ≤ 36 + ε. We note that the second job of τ_3 cannot interfere with J since it is released when J is already finished.
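The evaluations of the time demand function used in this paragraph can be checked directly. The sketch below assumes that the time demand of a segment of τ_4 has the standard form "segment length + I + ceiling terms for τ_1 and τ_2", with a segment length of 2 and higher-priority parameters (1, 7) and (10, 24); these values are inferred from the numbers in this appendix, and the name `w4` is hypothetical.

```python
from fractions import Fraction
from math import ceil

EPS = Fraction(1, 5)   # epsilon = 0.2, the value used in Sect. 3


def w4(t, interference):
    """Time demand of one tau_4 segment at time t (inferred parameters)."""
    t = Fraction(t)
    return 2 + interference + 1 * ceil(t / 7) + 10 * ceil(t / 24)


bound_with = 17 + EPS          # claimed bound when a job of tau_3 interferes
bound_without = Fraction(14)   # claimed bound without interference of tau_3

holds_with = w4(bound_with, 2 + EPS) == bound_with
holds_without = w4(bound_without, 0) == bound_without

# Job-level bounds: two segment bounds plus the suspension of 5 time units.
total_none = bound_without + 5 + bound_without
total_first = bound_with + 5 + bound_without
```

Both `holds_with` and `holds_without` evaluate to `True` under these assumptions, and the totals are 33 and 36 + ε, as stated above.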
Funding Open Access funding enabled and organized by Projekt DEAL. This work has been supported by Deutsche Forschungsgemeinschaft (DFG), as part of Sus-Aware (Project No. 398602212).