Optimal priority assignment for real-time systems: a coevolution-based approach

In real-time systems, priorities assigned to real-time tasks determine the order of task executions, by relying on an underlying task scheduling policy. Assigning optimal priority values to tasks is critical to allow the tasks to complete their executions while maximizing safety margins from their specified deadlines. This enables real-time systems to tolerate unexpected overheads in task executions and still meet their deadlines. In practice, priority assignments result from an interactive process between the development and testing teams. In this article, we propose an automated method that aims to identify the best possible priority assignments in real-time systems, accounting for multiple objectives regarding safety margins and engineering constraints. Our approach is based on a multi-objective, competitive coevolutionary algorithm mimicking the interactive priority assignment process between the development and testing teams. We evaluate our approach by applying it to six industrial systems from different domains and several synthetic systems. The results indicate that our approach significantly outperforms both our baselines, i.e., random search and sequential search, and solutions defined by practitioners. Our approach scales to complex industrial systems as an offline analysis method that attempts to find near-optimal solutions within acceptable time, i.e., less than 16 hours.

Keywords Priority Assignment, Schedulability Analysis, Real-Time Systems, Coevolutionary Search, Search-Based Software Engineering

Introduction
Mission-critical systems are found in many different application domains, such as aerospace, automotive, and healthcare domains. The success of such systems depends on both functional and temporal correctness. For functional correctness, systems are required to provide appropriate outputs in response to the corresponding stimuli. Regarding temporal correctness, systems are supposed to generate outputs within specified time constraints, often referred to as deadlines. The systems that have to comply with such deadlines are known as real-time systems (Liu, 2000). Real-time systems typically run multiple tasks in parallel and rely on a real-time scheduling policy to decide which tasks should have access to processing cores, i.e., CPUs, at any given time.
While developing a real-time system, one of the most common problems that engineers face is the assignment of priorities to real-time tasks in order for the system to meet its deadlines. Based on priorities of real-time tasks, the system's task scheduler determines a particular order for allocating real-time tasks to processing cores. Hence, a priority assignment that is poorly designed by engineers makes the system scheduler execute tasks in an order that is far from optimal. In addition, the system will likely violate its performance and time constraints, i.e., deadlines, if a poor priority assignment is used.
In real-time systems, the problem of optimally assigning priorities to tasks is important not only to avoid deadline misses but also to maximize safety margins from task deadlines and is subject to engineering constraints. Tasks may exceed their expected execution times due to unexpected interrupts. For example, it is infeasible to test an aerospace system exhaustively on the ground such that potential environmental uncertainties, e.g., those related to space radiations, are accounted for. Hence, engineers assign optimal priorities to tasks such that the remaining times from tasks' completion times to their deadlines, i.e., safety margins, are maximized to cope with potential uncertainties. Furthermore, engineers typically have to account for additional engineering constraints, e.g., they assign higher priorities to critical tasks that must always meet their deadlines compared to the tasks that are less critical or non-critical.
A brute-force approach to finding an optimal priority assignment would have to examine all n! distinct priority assignments, where n denotes the number of tasks. Furthermore, for a given priority assignment, schedulability analysis, which determines whether or not tasks will always complete their executions within their specified deadlines, is, in general, known to be a hard problem (Audsley, 2001). Thus, optimizing priority assignments is also a hard problem because the space of all possible system states to explore in order to find optimal priority assignments is very large. Most prior work on optimizing priority assignments provides analytical methods (Fineberg and Serlin, 1967; Leung and Whitehead, 1982; Audsley, 1991; Davis and Burns, 2007; Chu and Burns, 2008; Davis and Burns, 2009; Davis and Bertogna, 2012), which rely on well-defined system models and are very restrictive. For example, they assume that tasks are independent, i.e., tasks do not share resources (Davis et al., 2016; Zhao and Zeng, 2017). Industrial systems, however, are typically not compatible with such (simple) system models. In addition, none of the existing work addresses the problem of optimizing priority assignments by simultaneously accounting for multiple objectives, such as safety margins and engineering constraints, as discussed above.
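To give a sense of how quickly the n! search space grows, the following Python sketch evaluates n! for a few illustrative task counts and enumerates every assignment for a three-task toy set. The task counts and task names are assumptions chosen for illustration; they are not taken from the case study systems.

```python
import math
from itertools import permutations

def count_priority_assignments(n):
    """Number of distinct priority orderings of n tasks, i.e., n!."""
    return math.factorial(n)

# Illustrative task counts only; not taken from any case study system.
for n in (5, 10, 20):
    print(n, count_priority_assignments(n))

# Exhaustive enumeration is only practical for tiny task sets.
toy_tasks = ["j1", "j2", "j3"]  # hypothetical task names
assignments = [dict(zip(toy_tasks, perm)) for perm in permutations(range(1, 4))]
print(len(assignments))  # one dictionary per possible priority ordering
```

Each enumerated assignment would still require a schedulability analysis of its own, which is why brute force does not scale beyond very small task sets.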
Search-based software engineering (SBSE) has been successfully applied in many application domains, including software testing (Wegener et al., 1997;Wegener and Grochtmann, 1998;Lin et al., 2009;Arcuri et al., 2010;Shin et al., 2018), program repair (Weimer et al., 2009;Tan et al., 2016;Abdessalem et al., 2020), and self-adaptation (Andrade and Macêdo, 2013;Chen et al., 2018;Shin et al., 2020), where the search spaces are very large. Despite the success of SBSE, engineering problems in real-time systems have received much less attention in the SBSE community. In the context of real-time systems, there exists limited work on finding stress test scenarios (Briand et al., 2005) and predicting worst-case execution times (Lee et al., 2020b), which complements our work.
In practice, priority assignments result from an interactive process between the development and testing teams. While developing a real-time system, developers assign priorities to real-time tasks in the system and then testers stress the system to check whether or not the system meets its specified deadlines. If testers find a problematic condition under which any of the tasks violates its deadline, developers have to modify the priority assignment to address the problem. The back-and-forth between the development and testing teams continues until a priority assignment that does not lead to any deadline miss is found or the one that yields the least critical deadline misses is identified. The process is, however, not automated.
In this article, we use metaheuristic search algorithms to automate the process of assigning priorities to real-time tasks. To mimic the interactive back-and-forth between the development and testing teams, we use competitive coevolutionary algorithms (Luke, 2013). Coevolutionary algorithms are a specialized class of evolutionary search algorithms. They simultaneously coevolve two populations (also called species) of (candidate) solutions for a given problem. They can be cooperative or competitive. Competitive coevolution is similar to what happens in nature between predators and prey. For example, faster prey escape predators more easily, and hence they have a higher probability of generating offspring. This impacts the predators, because they need to evolve as well to become faster if they want to feed and survive (Meneghini et al., 2016). Hence, the two species, i.e., predators and prey, have coevolved competitively. We note that no species has the competing traits of predators and prey simultaneously, as such a species could not evolve to survive. In our context, priority assignments defined by developers can be seen as prey and stress test scenarios as predators. The priority assignments need to evolve so that stress testing is not able to push the system into breaking its real-time constraints. Dually, stress test scenarios should evolve to be able to break the system when there is a chance to do so.
Contributions. We propose an Optimal Priority Assignment Method for real-time systems (OPAM). Specifically, we apply multi-objective, two-population competitive coevolution (Popovici et al., 2012) to address the problem of finding near-optimal priority assignments, aiming at maximizing the magnitude of safety margins from deadlines and constraint satisfaction. In OPAM, two species, related to priority assignment and stress testing, coevolve synchronously and compete against each other to find the best possible solutions.
We evaluated OPAM by applying it to six complex, industrial systems from different domains, including the aerospace, automotive, and avionics domains, and several synthetic systems. Our results show that: (1) OPAM finds significantly better priority assignments compared to our baselines, i.e., random search and sequential search, (2) the execution time of OPAM scales linearly with the number of tasks in a system and the time required to simulate task executions, and (3) OPAM priority assignments significantly outperform those manually defined by engineers based on domain expertise.
We note that OPAM is the first attempt to apply coevolutionary algorithms to address the problem of priority assignment. Further, it enables engineers to explore trade-offs among different priority assignments with respect to two objectives: maximizing safety margins and satisfying engineering constraints. Our full evaluation package is available online (Lee et al., 2021).
Organization. The remainder of this article is structured as follows: Section 2 motivates our work. Section 3 defines our specific problem of priority assignment in practical terms. Section 4 discusses related work. Sections 5 and 6 describe OPAM. Section 7 evaluates OPAM. Section 8 concludes this article.

Motivating case study
We motivate our work using an industrial case study from the satellite domain. Our case study concerns a mission-critical real-time satellite, named ESAIL (LuxSpace, 2021), which has been developed by LuxSpace, a leading system integrator for microsatellites and aerospace systems. ESAIL tracks vessels' movements over the entire globe as the satellite orbits the earth. The vessel-tracking service provided by ESAIL requires real-time processing of messages received from vessels in order to ensure that their voyages are safe with the assistance of accurate, prompt route provisions. Also, as ESAIL orbits the planet, it must be oriented in the proper position on time in order to provide services correctly. Hence, ESAIL's key operations, implemented as real-time tasks, need to be completed within acceptable times, i.e., deadlines.
Engineers at LuxSpace analyze the schedulability of ESAIL across different development stages. At an early design stage, the engineers use a priority assignment method that extends the rate monotonic scheduling policy (Fineberg and Serlin, 1967), which is a theoretical priority assignment algorithm used in real-time systems. At a later development stage, if the engineers find that any real-time task of ESAIL cannot complete its execution within its deadline, they reassign priorities to tasks in order to address the problem of deadline violations.
The rate monotonic policy assigns priorities to tasks that arrive to be executed periodically and must be completed within a certain amount of time, i.e., periodic tasks with hard deadlines. According to the policy, periodic tasks that arrive frequently have higher priorities than those of other tasks that arrive rarely. In ESAIL, for example, if the vessel-tracking task arrives every 100ms and the satellite-position control task arrives every 150ms, the former has a higher priority than the latter. However, the rate monotonic policy does not account for tasks that arrive irregularly and should be completed within a reasonable amount of time, i.e., aperiodic tasks with soft deadlines. ESAIL contains aperiodic tasks with soft deadlines as well, such as a task for updating software. Hence, the engineers extend the rate monotonic policy to assign priorities to all tasks of ESAIL. The extensions are as follows: First, the engineers assign priorities to periodic tasks based on the rate monotonic policy. Second, the engineers assign lower priorities to aperiodic tasks than those of periodic tasks. As aperiodic tasks with soft deadlines are typically considered less critical than periodic tasks with hard deadlines, the engineers aim to ensure that periodic tasks complete their executions within their deadlines by assigning lower priorities to aperiodic tasks while periodic tasks have higher priority. Engineers use a heuristic to assign priorities to aperiodic tasks. They treat aperiodic tasks as (pseudo-)periodic tasks by setting aperiodic tasks' (expected) minimum arrival rates as their fixed arrival periods, making the tasks frequently arrive. The engineers then apply the rate monotonic policy for the aperiodic tasks with the synthetic periods while ensuring that aperiodic tasks have lower priorities than those of periodic tasks.
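The extended rate monotonic heuristic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the engineers' actual tooling: the `Task` class, its field names, the task names, and the 50 ms minimum inter-arrival time of the software update task are hypothetical, and a larger number is taken to mean a higher priority.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Task:
    name: str
    period: Optional[float] = None            # set for periodic tasks only
    min_interarrival: Optional[float] = None  # pseudo-period for aperiodic tasks

def extended_rate_monotonic(tasks):
    """Return {task name: priority}; a larger number means a higher priority."""
    periodic = sorted((t for t in tasks if t.period is not None),
                      key=lambda t: t.period)
    aperiodic = sorted((t for t in tasks if t.period is None),
                       key=lambda t: t.min_interarrival)
    # Shorter (pseudo-)period first within each group, and every aperiodic
    # task is ranked below every periodic task.
    ordered = periodic + aperiodic
    n = len(ordered)
    return {t.name: n - i for i, t in enumerate(ordered)}

tasks = [
    Task("vessel_tracking", period=100),          # arrives every 100 ms
    Task("position_control", period=150),         # arrives every 150 ms
    Task("software_update", min_interarrival=50), # hypothetical value
]
priorities = extended_rate_monotonic(tasks)
print(priorities)
```

Note that the software update task receives the lowest priority even though its pseudo-period is the shortest, because the heuristic ranks all aperiodic tasks below all periodic ones.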
A priority assignment made at an early design stage keeps changing while developing ESAIL for various reasons, such as changes in requirements and implementation constraints. At later development stages, instead of relying on the extended rate monotonic policy, the engineers assign priorities based on their domain expertise, manually inspecting schedulability analysis results. Hence, a priority assignment at later development stages often does not follow the extended rate monotonic policy. For example, as aperiodic tasks are also expected to be completed within a reasonable amount of time, some aperiodic tasks may have higher priorities than some periodic tasks as long as they are schedulable.
Engineers at LuxSpace, however, are still faced with the following issues: (1) Their priority assignment method, which extends the rate monotonic scheduling policy, assigns priorities with the sole aim of making tasks schedulable. However, engineers have a pressing need to understand the quality of priority assignments in detail as they impact ESAIL operations differently. For example, once ESAIL is launched into orbit, the satellite operates in the space environment, which cannot be fully tested on the ground. Unexpected space radiation may trigger unusual system interrupts that have not been observed on the ground, resulting in overruns of ESAIL tasks' executions. In such cases, a priority assignment assessed on the ground may not be able to tolerate such unexpected uncertainties. Hence, engineers need a priority assignment that keeps ESAIL tasks schedulable and enables them to tolerate unpredictable uncertainties as much as possible.
(2) Engineers at LuxSpace assign priorities to tasks without any systematic assistance. Instead, they rely on their expertise and the current practices described above to manually assign priorities that keep tasks schedulable. To this end, we are collaborating with LuxSpace to develop a solution for addressing these issues in assigning task priorities.

Problem description
This section defines the task, scheduler, and schedulability concepts, which extend the concepts defined in our previous work (Lee et al., 2020b) by augmenting our previous definitions with the notions of safety margins, constraints in assigning priorities, and relationships between real-time tasks. We then describe the problem of optimizing priority assignments such that we maximize the magnitude of safety margins and the degree of constraint satisfaction. Figure 1 shows an overview of the conceptual model that represents the key abstractions required to analyze optimal priority assignments for real-time systems. The entities in the conceptual model are described below.
Task. We denote by $j$ a real-time task that should complete its execution within a specified deadline after it is activated (or arrived). Every real-time task $j$ has the following properties: priority denoted by $pr(j)$, deadline denoted by $dl(j)$, and worst-case execution time (WCET) denoted by $wcet(j)$. Task priority $pr$ determines if an execution of a task is preempted by another task. Typically, a task $j$ preempts the execution of a task $j'$ if the priority of $j$ is higher than the priority of $j'$, i.e., $pr(j) > pr(j')$. The $pr(j)$ priority is a fixed value assigned to task $j$. Such fixed priorities are determined offline; hence, they are not changed online for any reason. Note that a real-time task scheduler that relies on fixed priorities is applied in all the study subjects in this article (see Section 7.2) and is commonly used in industrial systems (Briand et al., 2005; Guan et al., 2009; Lin et al., 2009; Anssi et al., 2011; Zeng et al., 2014; Di Alesio et al., 2015; Dürr et al., 2019; Lee et al., 2020a).
The dl (j) function determines the deadline of a task j relative to its arrival time. A task deadline can be either hard or soft. A hard deadline of a task j constrains that j must complete its execution within a deadline dl (j) after j is activated. While violations of hard deadlines are not acceptable, depending on the operating context of a system, violating soft deadlines may be to some extent tolerated. Note that we use a metaheuristic search relying on fitness functions quantifying the degrees of deadline misses, safety margins, and constraint satisfaction. Such functions do not depend on the nature of the deadlines. Our approach outputs a set of priority assignments that are Pareto optimal with respect to safety margins and constraint satisfaction. Engineers then perform domain-specific trade-off analysis among Pareto solutions. Hence, in this article, we handle hard and soft deadline tasks in the same manner.
Real-time tasks are either periodic or aperiodic. Periodic tasks, which are typically triggered by timed events, are invoked at regular intervals specified by their period. We denote by pd (j) the period of a periodic task j, i.e., a fixed time interval between subsequent activations (or arrivals) of j. Any task that is not periodic is called aperiodic. Aperiodic tasks have irregular arrival times and are activated by external stimuli which occur irregularly. In real-time analysis, based on domain knowledge, we typically specify a minimum interarrival time denoted by pmin(j) and a maximum inter-arrival time denoted by pmax (j) indicating the minimum and maximum time intervals between two consecutive arrivals of an aperiodic task j. In real-time analysis, sporadic tasks are often separately defined as having irregular arrival intervals and hard deadlines (Liu, 2000). In our conceptual definitions, however, we do not introduce new notations for sporadic tasks because the deadline and period concepts defined above sufficiently characterize sporadic tasks. Note that for periodic tasks j, we have pmin(j) = pmax (j) = pd (j). Otherwise, for aperiodic tasks j, we have pmax (j) > pmin(j).
Task relationships. The execution of a task $j$ depends not only on its own parameters described above, e.g., priority $pr(j)$ and period $pd(j)$, but also on its relationships with other tasks. Relationships between tasks are typically determined by task interactions related to accessing shared resources and triggering arrivals of other tasks. Specifically, if two tasks $j$ and $j'$ access a shared resource $r$ in a mutually exclusive way, $j$ may be blocked from executing for the period during which $j'$ accesses $r$. We denote by $dp(j, j')$ the resource-dependency relation between tasks $j$ and $j'$ that holds if $j$ and $j'$ have mutually exclusive access to a shared resource $r$ such that they cannot be executed in parallel or preempt each other, but one can execute only after the other has completed accessing $r$.
The other type of relationship between tasks is related to a task $j$ triggering the arrival of another task $j'$. This is a common interaction between tasks (Locke et al., 1990; Anssi et al., 2011; Di Alesio et al., 2015). For example, $j$ may hand over some of its workload to $j'$ due to performance or reliability reasons. We denote by $tr(j, j')$ the triggering relation between tasks $j$ and $j'$ that holds if $j$ triggers the arrival of $j'$. We note that both relationships are defined at the level of tasks, following prior works (Locke et al., 1990; Anssi et al., 2011; Di Alesio et al., 2015) describing the five industrial case study systems used in our experiments (see Section 7.2).
Scheduler. Let $J$ be a set of tasks to be scheduled by a real-time scheduler. A scheduler then dynamically schedules executions of tasks in $J$ according to the tasks' arrivals and the scheduler's scheduling policy over the scheduling period $\mathbb{T} = [0, T]$. We denote by $at_k(j)$ the $k$th arrival time of a task $j \in J$. The first arrival of a periodic task $j$ does not always occur immediately at the system start time (0). Such offset time from the system start time to the first arrival time $at_1(j)$ of $j$ is denoted by $offset(j)$. For a periodic task $j$, the $k$th arrival of $j$ within $\mathbb{T}$ satisfies $at_k(j) \leq T$ and is computed by $at_k(j) = offset(j) + (k - 1) \cdot pd(j)$. For an aperiodic task $j'$, $at_k(j')$ is determined based on the $(k-1)$th arrival time of $j'$ and its minimum and maximum inter-arrival times.
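As an illustration of the arrival-time definitions above, the following Python sketch computes the arrivals of a periodic task from its offset and period, and samples one possible arrival sequence for an aperiodic task. The function names are hypothetical. The uniform sampling of inter-arrival times, including for the first arrival measured from time 0, is an assumption made for this sketch; the text only states that each arrival depends on the previous one and on the minimum and maximum inter-arrival times.

```python
import random

def periodic_arrivals(offset, period, horizon):
    """Arrival times at_k = offset + (k - 1) * period that fall within [0, horizon]."""
    arrivals = []
    k = 1
    while offset + (k - 1) * period <= horizon:
        arrivals.append(offset + (k - 1) * period)
        k += 1
    return arrivals

def aperiodic_arrivals(pmin, pmax, horizon, rng):
    """One possible arrival sequence for an aperiodic task.

    ASSUMPTION: each inter-arrival time is drawn uniformly from [pmin, pmax],
    and the first arrival is drawn the same way, measured from time 0.
    """
    arrivals = []
    t = rng.uniform(pmin, pmax)
    while t <= horizon:
        arrivals.append(t)
        t += rng.uniform(pmin, pmax)
    return arrivals

print(periodic_arrivals(offset=2, period=5, horizon=23))  # [2, 7, 12, 17, 22]
print(aperiodic_arrivals(pmin=5, pmax=13, horizon=23, rng=random.Random(0)))
```

Different seeds yield different aperiodic arrival sequences, which is one source of the variation between schedule scenarios of the same system.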
A scheduler reacts to a task arrival at $at_k(j)$ by scheduling the execution of $j$. Depending on a scheduling policy (e.g., rate monotonic scheduling policy for single-core systems (Fineberg and Serlin, 1967) and single-queue multi-core scheduling policy (Arpaci-Dusseau and Arpaci-Dusseau, 2018)), an arrived task $j$ may not start its execution at the same time as it arrives when higher priority tasks are executing on all processing cores. Also, task executions may be interrupted due to preemption. We denote by $et_k(j)$ the completion time for the $k$th arrival of a task $j$. According to the worst-case execution time of a task $j$, we have: $et_k(j) \geq at_k(j) + wcet(j)$.
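The effect of priorities and preemption on completion times can be illustrated with a deliberately simplified simulator. The Python sketch below assumes a single processing core, integer time steps, jobs that always run for exactly their WCET, and ties broken in favor of the earlier arrival; none of these simplifications come from the text, and the function and task names are hypothetical.

```python
def simulate_fixed_priority(tasks, arrivals, horizon):
    """Single-core, preemptive, fixed-priority simulation in unit time steps.

    tasks:    {name: (priority, wcet)}; a larger number means a higher priority
    arrivals: {name: [integer arrival times]}
    Returns a list of (name, arrival time, completion time) tuples.
    ASSUMPTION: every job executes for exactly its WCET.
    """
    pending, completed = [], []
    for t in range(horizon):
        # Release the jobs that arrive at time t.
        for name, times in arrivals.items():
            if t in times:
                priority, wcet = tasks[name]
                pending.append({"name": name, "prio": priority, "at": t, "left": wcet})
        if pending:
            # Run the highest-priority pending job for one time unit.
            job = max(pending, key=lambda j: (j["prio"], -j["at"]))
            job["left"] -= 1
            if job["left"] == 0:
                pending.remove(job)
                completed.append((job["name"], job["at"], t + 1))
    return completed

tasks = {"hi": (2, 2), "lo": (1, 3)}      # hypothetical tasks
arrivals = {"hi": [0, 4], "lo": [0]}
scenario = simulate_fixed_priority(tasks, arrivals, horizon=10)
print(scenario)  # [('hi', 0, 2), ('hi', 4, 6), ('lo', 0, 7)]
```

In this run, the low-priority task is delayed at time 0 and preempted at time 4, so it completes at time 7 rather than at time 3, which is consistent with the inequality on completion times stated above.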
During system operation, a scheduler generates a schedule scenario which describes a sequence of task arrivals and their completion time values. We define a schedule scenario as a set $S$ of tuples $(j, at_k(j), et_k(j))$ indicating that a task $j$ has arrived at $at_k(j)$ and completed its execution at $et_k(j)$. Due to a degree of randomness in task execution times and aperiodic task arrivals, a scheduler may generate a different schedule scenario for different runs of a system. Figure 2 shows two schedule scenarios $S$ (Figure 2a) and $S'$ (Figure 2b) produced by a scheduler over the $[0, 23]$ time period of a system run. Both $S$ and $S'$ describe executions of three tasks, $j_1$, $j_2$, and $j_3$, arrived at the same time stamps (see $at_i$ in the figures). In both scenarios, the aperiodic task $j_1$ is characterized by: $pmin(j_1) = 5$, $pmax(j_1) = 13$, $dl(j_1) = 4$, and $wcet(j_1) = 2$.
Figure 2: (a) The $S$ schedule scenario is produced when $pr(j_1) = 3$, $pr(j_2) = 2$, and $pr(j_3) = 1$. (b) The $S'$ schedule scenario is produced when $pr(j_1) = 1$, $pr(j_2) = 3$, and $pr(j_3) = 3$.
Schedulability. Given a schedule scenario $S$, a task $j$ is schedulable if $j$ completes its execution by its deadline, i.e., for all $et_k(j)$ observed in $S$, $et_k(j) \leq at_k(j) + dl(j)$. Let $J$ be a set of tasks to be scheduled by a scheduler. A set $J$ of tasks is then schedulable if for every schedule scenario $S$ of $J$, we have no task $j \in J$ that misses its deadline.
As shown in schedule scenarios $S$ and $S'$ presented in Figures 2a and 2b, respectively, all three tasks, $j_1$, $j_2$, and $j_3$, are schedulable. However, we note that the overall amounts of remaining time, i.e., safety margins, from the tasks' completions to their deadlines observed in $S$ and $S'$ are different (see the second completion times and deadlines of $j_1$, $j_2$, and $j_3$ in $S$ and $S'$) because $S$ and $S'$ are produced by using different priority assignments. Engineers typically desire to assign optimal priorities to real-time tasks that aim at maximizing such safety margins, as discussed below.
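The schedulability condition and the notion of safety margin can be checked mechanically on a schedule scenario represented as (task, arrival time, completion time) tuples. The Python sketch below uses two hypothetical scenarios, not the ones shown in Figure 2, and the function names are assumptions. It checks a single observed scenario only, whereas schedulability of a task set requires the condition to hold for every scenario. The summed margin is a simple illustration and is not the fitness function used by OPAM.

```python
def safety_margins(scenario, deadline):
    """Per-job margin (arrival + deadline) - completion; a negative value is a miss.

    scenario: iterable of (task, arrival time, completion time)
    deadline: {task: relative deadline}
    """
    return [(j, at, at + deadline[j] - et) for (j, at, et) in scenario]

def is_schedulable_in(scenario, deadline):
    """True if no job in this one scenario completes after its deadline."""
    return all(margin >= 0 for _, _, margin in safety_margins(scenario, deadline))

def total_margin(scenario, deadline):
    """Sum of all per-job margins (illustrative aggregate only)."""
    return sum(margin for _, _, margin in safety_margins(scenario, deadline))

# Hypothetical deadlines and scenarios with identical arrivals but
# different completion times, as two priority assignments might produce.
deadline = {"j1": 4, "j2": 6}
scenario_a = [("j1", 0, 2), ("j2", 0, 5), ("j1", 8, 10)]
scenario_b = [("j1", 0, 4), ("j2", 0, 2), ("j1", 8, 10)]

for s in (scenario_a, scenario_b):
    print(is_schedulable_in(s, deadline), total_margin(s, deadline))
```

Both scenarios meet every deadline, yet their margins differ, which is the distinction between merely feasible and robust priority assignments drawn above.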
Problem. In real-time systems, fixed priorities are typically assigned to tasks (Davis et al., 2016;Lee et al., 2020a). Finding an appropriate priority assignment is important not only for ensuring the schedulability of a system but also for maximizing the safety margins within which a system can tolerate unexpected execution time overheads. For example, if an unpredictable error occurs and triggers check-point mechanisms (Davis and Burns, 2007), which re-execute part or all of a task j, then the execution time of j unexpectedly overruns. Hence, engineers need an optimal priority assignment that maximizes the overall remaining times from task completion times to task deadlines, i.e., safety margins.
While assigning priorities to tasks, engineers also account for constraints, which are often but not always domain-specific. For example, aperiodic tasks' priorities should be lower than those of periodic tasks because periodic tasks are often more critical than aperiodic tasks. Hence, engineers develop a system that prioritizes executions of periodic tasks over aperiodic tasks. Recall from Section 2 that engineers consider this constraint desirable. When needed, however, engineers can violate the constraint to some extent in order to ensure that aperiodic tasks complete within a reasonable amount of time while periodic tasks meet their deadlines. Constraints can be either hard constraints, which must be satisfied, or soft constraints, which are desired to be satisfied. In our study, hard constraints, e.g., a running task's priority must be higher than a ready task's priority, are enforced by the scheduler while scheduling tasks. In the context of optimizing priority assignments, we focus on maximizing the extent to which soft constraints are satisfied. In the remainder of this article, we refer to a soft constraint simply as a constraint.
Our work aims at optimizing priority assignments that maximize the safety margins while satisfying such constraints. Specifically, for a set $J$ of tasks to be analyzed, we define three concepts as follows: (1) a priority assignment for $J$ denoted by $\vec{P}$, (2) the magnitude of safety margins for a priority assignment $\vec{P}$ denoted by $fs(\vec{P})$, and (3) the degree of constraint satisfaction denoted by $fc(\vec{P})$. We note that Section 6.3 describes how we optimize $\vec{P}$, and compute $fs(\vec{P})$ and $fc(\vec{P})$ in detail. Our study aims at finding a set $B$ of best possible priority assignments that are Pareto optimal (Knowles and Corne, 2000) such that a priority assignment $\vec{P} \in B$ maximizes both $fs(\vec{P})$ and $fc(\vec{P})$, and any other priority assignments in $B$ are equally viable.
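The notion of a Pareto-optimal set over two maximized objectives can be made concrete with a small Python sketch. The candidate names and their (fs, fc) values below are made up for illustration, and the quadratic-time filter is a generic textbook formulation rather than the selection procedure used inside OPAM.

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b in every
    objective and strictly better in at least one (maximization)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates.

    candidates: {assignment name: (fs, fc)}
    """
    return {
        name: objs
        for name, objs in candidates.items()
        if not any(dominates(other, objs) for other in candidates.values())
    }

# Hypothetical priority assignments with made-up (fs, fc) values.
candidates = {
    "A": (10.0, 0.0),
    "B": (8.0, 2.0),
    "C": (7.0, 1.0),   # dominated by B
    "D": (10.0, -1.0), # dominated by A
}
front = pareto_front(candidates)
print(sorted(front))  # ['A', 'B']
```

Assignments A and B are equally viable in the Pareto sense: A offers larger safety margins, B satisfies the constraints to a greater degree, and choosing between them is the domain-specific trade-off left to engineers.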

Related Work
This section discusses related research strands in the areas of priority assignments, real-time analysis using exhaustive techniques, search-based analysis in real-time systems, and coevolutionary analysis in software engineering.
Table 1: Comparing our work, OPAM, with existing priority assignment techniques with respect to the properties captured in their underlying system models (columns: Properties, OPAM, RMPO, DMPO, OPA, OPA-MLD, RPA, FNR-PA, PRPA, OPTA, EPAF).
Priority assignment. The problem of optimally assigning priorities to real-time tasks has been widely studied (Fineberg and Serlin, 1967; Liu and Layland, 1973; Leung and Whitehead, 1982; Audsley, 1991; Tindell et al., 1994; George et al., 1996; Audsley, 2001; Davis and Burns, 2007; Chu and Burns, 2008; Davis and Burns, 2009, 2011; Davis and Bertogna, 2012; Davis et al., 2016; Zhao and Zeng, 2017; Hatvani et al., 2018). Fineberg and Serlin (1967) reported early work that relies on a simple system model, assuming, for example, that all tasks arrive periodically, tasks run on a single processing core, tasks' deadlines are equal to their periods, and task executions are independent from one another. They proposed a priority assignment method, named rate-monotonic priority ordering (RMPO), that assigns higher priorities to the tasks with shorter periods. RMPO can find a feasible priority assignment that guarantees periodic tasks to be schedulable when such priority assignments exist (Liu and Layland, 1973). Leung and Whitehead (1982) extended RMPO to relax one of the underlying assumptions made in RMPO. Specifically, their priority assignment approach, known as deadline-monotonic priority ordering (DMPO), accounts for task deadlines that can be less than or equal to their periods. In contrast to our work, however, these methods are often not applicable to industrial systems that are not compatible with their simplified system models. Recall from Section 3 that a realistic system typically consists of both periodic and aperiodic tasks, and that task executions depend on their relationships, i.e., resource dependencies and triggering relationships, with other tasks.
Audsley (2001) designed a priority assignment method, named optimal priority assignment (OPA), that relies on an existing schedulability analysis method M. OPA is guaranteed to find a feasible priority assignment that is schedulable according to M if such priority assignments exist. OPA is applicable to more complex systems than those supported by the methods mentioned above, i.e., RMPO and DMPO. Specifically, OPA can find a feasible priority assignment even in the following situations: (1) First arrivals of periodic tasks occur after some offset time (Audsley, 1991). (2) Aperiodic tasks have arbitrary deadlines (Tindell et al., 1994). (3) Task executions are scheduled based on a non-preemptive scheduling policy (George et al., 1996). (4) Tasks run on multiple processing cores. Unlike our approach, which accounts for two objectives, safety margins and engineering constraints (see Section 3), OPA attempts to find a feasible priority assignment whose only objective is to make all tasks schedulable. Note that such a feasible priority assignment does not necessarily maximize safety margins, as discussed in Section 3. Hence, a feasible priority assignment obtained by OPA is often fragile: it is sensitive to any changes in task executions and unable to accommodate unexpected overheads in task execution times, which are commonly observed in industrial systems (Davis and Burns, 2007).
OPA has been extended by several works (Davis and Burns, 2007;Chu and Burns, 2008;Davis and Burns, 2009;Davis and Bertogna, 2012). Davis and Burns (2007) presented a robust priority assignment method (RPA) with a degree of tolerance for unexpected overruns of task execution times. Chu and Burns (2008) introduced an extended OPA algorithm (OPA-MLD) that minimizes the lexicographical distance between the desired priority assignment and the one obtained by the algorithm. OPA-MLD enables important tasks to have higher priorities. Davis and Bertogna (2012) proposed an RPA extension (FNR-PA) to make RPA work when a system allows task preemption to be deferred for some interval of time. Davis and Burns (2009) developed a probabilistic robust priority assignment method (PRPA) for a real-time system to be less likely to violate its deadlines. Even though the prior works mentioned above improve OPA to some extent, they assume that task executions are independent of one another. In contrast to these existing approaches, OPAM accounts for dependencies among task executions, i.e., resource dependencies and triggering relationships (see our problem description in Section 3).
Some recent priority assignment techniques address scalability. Hatvani et al. (2018) presented an optimal priority and preemption-threshold assignment algorithm (OPTA) that attempts to decrease the computation time for finding a feasible priority assignment. OPTA uses a heuristic to traverse a problem space while pruning infeasible paths to efficiently and effectively explore the problem space. Zhao and Zeng (2017) introduced an effective priority assignment framework (EPAF) that combines a commercial solver for integer linear programs and their problem-specific optimization algorithm. However, these methods rely on simple system models that assume, for example, task executions to be independent and running on a single processing core. Therefore, the applicability of these techniques is limited. In contrast, recall from Sections 2 and 3 that our approach aims at scaling to complex industrial systems while accounting for realistic system characteristics regarding task periods, inter-arrival times, resource dependencies, triggering relationships, and multiple processing cores. Table 1 compares our work, OPAM, with the other priority assignment techniques mentioned above. As shown in the table, we note that prior works rely on system models that are very restrictive. In particular, existing work assumes that task executions are independent of one another. However, task dependencies such as resource dependencies and triggering relationships are commonly observed in industrial systems. In addition, we note that no existing solution simultaneously accounts for safety margins and engineering constraints. Hence, to our knowledge, OPAM is the first attempt to provide engineers with a set of equally viable priority assignments, allowing trade-off analysis with respect to the two objectives: maximizing safety margins and satisfying engineering constraints. Real-time analysis using exhaustive techniques. 
Constraint programming and model checking have been applied to conclusively and exhaustively verify whether or not a system meets its deadlines (Kwiatkowska et al., 2011; Di Alesio et al., 2012; Nejati et al., 2012; Di Alesio et al., 2013). Existing research on priority assignment based on OPA relies on such exhaustive techniques to prove the schedulability of a set of tasks for a given priority assignment. We note that schedulability analysis is, in general, an NP-hard problem (Davis et al., 2016), for which no polynomial-time algorithm is known. As a result, exhaustive techniques based on model checking and constraint solving are often not amenable to analyzing large industrial systems such as ESAIL, our motivating case study system described in Section 2. To assess if exhaustive techniques could scale to ESAIL, as discussed in Section 7.8, we performed a preliminary experiment using UPPAAL (Behrmann et al., 2004), a model checker for real-time systems. We observed that UPPAAL was not able to verify schedulability of ESAIL tasks for a fixed priority assignment even after letting it run for several days (see Section 7.8 for more details).
Search-based analysis in real-time systems. In real-time systems, most of the existing works that use search-based techniques focus on testing (Wegener et al., 1997; Wegener and Grochtmann, 1998; Briand et al., 2005; Lin et al., 2009; Arcuri et al., 2010). Wegener et al. (1997) and Wegener and Grochtmann (1998) introduced a testing approach based on a genetic algorithm that aims to check computation time, memory usage, and task synchronization by analyzing the control flow of a program. Briand et al. (2005) applied a genetic algorithm to find stress test scenarios for real-time systems. Lin et al. (2009) proposed a search-based approach to check whether a real-time system meets its timing and security constraints. Arcuri et al. (2010) presented a black-box system testing approach based on a genetic algorithm. Beyond testing real-time systems, Nejati et al. (2013, 2014) developed a search-based trade-off analysis technique that helps engineers balance the satisfaction of temporal constraints and keeping the CPU time usage at an acceptable level. Lee et al. (2020b) combined a search algorithm and machine learning to estimate safe ranges of worst-case task execution times within which tasks likely meet their deadlines. In contrast to these prior works, OPAM addresses the problem of optimally assigning priorities to real-time tasks while accounting for multiple objectives regarding safety margins and engineering constraints, thus enabling Pareto (trade-off) analysis. Further, OPAM uses a multi-objective, competitive coevolutionary search algorithm, which has been rarely applied to date in prior studies of real-time systems, as discussed next.
Coevolutionary analysis in software engineering. Despite the success of search-based software engineering (SBSE) in many application domains including software testing (Wegener et al., 1997; Wegener and Grochtmann, 1998; Lin et al., 2009; Arcuri et al., 2010; Shin et al., 2018), program repair (Weimer et al., 2009; Tan et al., 2016; Abdessalem et al., 2020), and self-adaptation (Andrade and Macêdo, 2013; Chen et al., 2018; Shin et al., 2020), coevolutionary algorithms have been applied in only a few prior studies (Wilkerson and Tauritz, 2010; Wilkerson et al., 2012; Boussaa et al., 2013). Wilkerson and Tauritz (2010) and Wilkerson et al. (2012) present a coevolution-based approach to automatically correct software. Their work introduced a program representation language to facilitate their automated corrections. Boussaa et al. (2013) developed a code-smells detection approach. The main idea is to evolve two competing populations of code-smell detection rules and artificial code-smells. Unlike these prior works, we study the problem of optimally assigning priorities to tasks in real-time systems.
To our knowledge, we are the first to address the priority assignment problem using a multi-objective, competitive coevolutionary search algorithm.

Approach Overview
Finding an optimal priority assignment is an inherently interactive process. In practice, once engineers assign priorities to the real-time tasks in a system, testers then stress the system to find a condition, i.e., a particular sequence of task arrivals, in which a task execution violates its deadline. Testers typically use a simulator or hardware equipment to stress the system by triggering plausible worst-case arrivals of tasks that maximize the likelihood of deadline misses. If testers find task arrivals that induce deadline misses, the task arrivals are reported to engineers in order to fix the problem by reassigning priorities. This interactive process of assigning priorities and testing schedulability continues until both engineers and testers ensure that the tasks meet their deadlines.
For such intrinsically interactive problem-solving domains, we conjecture that coevolutionary algorithms are potentially suitable solutions. A coevolutionary algorithm is a search algorithm that mutually adapts individuals of different species, e.g., in our study, two populations of priority assignments and task-arrival sequences, acting as foils against one another. Specifically, we apply multi-objective, two-population competitive coevolution (Luke, 2013) to address our problem of finding optimal priority assignments (see Section 3). In our approach, the two populations of priority assignments and stress test scenarios, i.e., task-arrival sequences, evolve synchronously, competing with each other in order to search for optimal priority assignments that maximize the magnitude of safety margins from deadlines and the extent of constraint satisfaction. Note that better priority assignments enable a system to achieve larger safety margins. Hence, those priority assignments have a higher chance to pass stress test scenarios. This impacts the stress test scenarios because they need to evolve as well, aiming at inducing deadline misses in the system.
Recall from Section 4 that most of the existing SBSE research relies on search algorithms using a single population (Chen et al., 2018; Abdessalem et al., 2020; Shin et al., 2020). However, such algorithms do not fit the problem of priority assignments targeted here. When (1) two competing traits between task arrivals and priority assignments are encoded together in an individual of a single population and (2) two contradicting fitness functions regarding safety margins and deadline misses, which are exact opposites, assess such individuals, the notion of Pareto optimality is not applicable. In that case, maximizing the magnitude of safety margins necessarily entails minimizing the magnitude of deadline misses. Hence, a single population-based search algorithm cannot make Pareto improvements that maximize safety margins (resp. deadline misses) while not minimizing deadline misses (resp. safety margins). Specifically, the dominance relation over such individuals does not exist because if an individual $I$ is strictly better than another individual $I'$ in one fitness value, $I$ is always worse than $I'$ in the other fitness value. Hence, we are not able to obtain equally viable solutions with respect to the contradicting objectives using such a method.
Figure 3 shows an overview of our proposed solution: Optimal Priority Assignment Method for real-time tasks (OPAM). OPAM requires as input task descriptions defined by engineers, which specify task characteristics and their relationships (see Section 3). Given such input task descriptions, the "find worst task arrivals" and "find best priority assignments" steps aim at generating worst-case sequences of task arrivals and best-case priority assignments, respectively. A worst-case sequence of task arrivals means that the magnitude of deadline misses, i.e., the amounts of time from task deadlines to task completion times, is maximized when tasks arrive as defined in the sequence. 
Note that if there is no deadline miss, a task-arrival sequence is considered worst-case if tasks complete their executions as close to their deadlines as possible. In contrast, a priority assignment is best-case when the magnitude of safety margins is maximized. Beyond maximizing safety margins, the "find best priority assignments" step accounts for satisfying engineering constraints in assigning priorities to tasks. OPAM evolves two competing populations of task-arrival sequences and priority assignments synchronously generated from the two steps. OPAM then outputs a set of priority assignments that are Pareto optimal with regard to the magnitude of safety margins and the extent of satisfying constraints. Hence, OPAM allows engineers to perform domain-specific trade-off analysis among Pareto solutions and is useful in practice to support decision making with respect to their task design. For example, suppose engineers develop a weakly hard real-time system (Bernat et al., 2001) that can tolerate occasional deadline misses. In that case, engineers may consider a few deadline misses as less important (as long as their consequences are negligible) than the overall magnitude of safety margins in their trade-off analysis. Section 6 describes OPAM in detail.
Figure 4 describes the OPAM algorithm for finding optimal priority assignments, which employs multi-objective, two-population competitive coevolution. The algorithm first randomly initializes two populations A and P for task-arrival sequences and priority assignments, respectively (lines 13-15). For A, OPAM randomly varies task arrivals of aperiodic tasks to create ps_a task-arrival sequences, according to the input task descriptions D. Regarding P, OPAM randomly creates ps_p priority assignments that may include one defined by engineers if available.

Competitive Coevolution
The two populations sequentially evolve during the allotted analysis budget (see line 17 in Figure 4). The best priority assignment is the one that makes tasks schedulable and maximizes the magnitude of safety margins, while satisfying engineering constraints for a given worst sequence of task arrivals. Hence, searching for the best priority assignments involves searching for the worst sequences of task arrivals. We create two populations A and P searching for the worst arrival sequences and the best priority assignments, respectively. The fitness values of task-arrival sequences in A are computed based on how well they challenge the priority assignments in P, i.e., maximizing the magnitude of deadline misses (line 20). Likewise, the priority assignments in P are evaluated based on how well they perform against the task-arrival sequences in A, i.e., maximizing the magnitude of safety margins while satisfying constraints (line 25). Once the two populations are assessed against each other, OPAM generates the next populations based on the computed fitness values (lines 21 and 26). OPAM tailors the breeding mechanisms of steady-state genetic algorithms (GA) (Whitley and Kauth, 1988) for A and NSGAII (Deb et al., 2002) for P.
1 Algorithm Search optimal priority assignments
2 Input D: task descriptions
3 Input nc: number of coevolution cycles //budget
4 Input ps_a: population size //task-arrival sequences
5 Input ps_p: population size //priority assignments
6 Input cp_a: crossover probability //task-arrival sequences
7 Input cp_p: crossover probability //priority assignments
8 Input mp_a: mutation probability //task-arrival sequences
9 Input mp_p: mutation probability //priority assignments
10 Input E: set of task-arrival sequences //external evaluation
11 Output B: best Pareto front
12
13 //initialize populations
14 A ← randomize_arrivals(D, ps_a)
15 P ← randomize_priorities(D, ps_p)
16
OPAM uses two types of fitness functions, namely internal and external fitness evaluations, which play different and complementary roles as described below. The two internal fitness evaluations in lines 20 and 25 of the listing in Figure 4 aim at selecting individuals (task-arrival sequences and priority assignments) for breeding the next A and P populations. OPAM evaluates the external fitness for the P population of priority assignments to find a best Pareto front (lines 28-31). As shown in lines 20 and 25, the internal fitness values of individuals in A (resp. P) are computed based on how they perform with respect to individuals in P (resp. A). Hence, an individual's internal fitness is assessed through interactions with competing individuals. For example, a priority assignment in the first generation may have acceptable fitness values regarding safety margins and constraint satisfaction with respect to the first generation of task-arrival sequences, which are likely far from worst-case sequences. However, priority assignment fitness may get worse in later generations as the task-arrival sequences evolve towards larger deadline misses. 
Thus, if OPAM simply monitors internal fitness, it cannot reliably detect coevolutionary progress as an individual's internal fitness changes according to competing individuals. The problem of monitoring progress in coevolution has been observed in many studies (Ficici, 2004;Popovici et al., 2012). To address it, OPAM computes external fitness values of priority assignments in P based on a set E of task-arrival sequences generated independently from the coevolution process. By doing so, OPAM can observe the monotonic improvement of external fitness for priority assignments. We note that, in general, if interactions between two competing populations are finite and any interaction can be examined with non-zero probability at any time, monotonicity guarantees that a coevolutionary algorithm converges to a solution (Popovici et al., 2012).
We note that our approach for evolving task-arrival sequences is based on past work (Briand et al., 2005), where a specific genetic algorithm configuration was proposed to find worst-case task-arrival sequences. One significant modification is that OPAM accounts for task relationships (resource-dependency and task triggering relationships) and a multi-core scheduling policy based on simulations to evaluate the magnitude of deadline misses.
Following standard practice (Ralph et al., 2020), the next sections describe OPAM in detail by defining the representations, the scheduler, the fitness functions, and the evolutionary algorithms for coevolving the task-arrival sequences and priority assignments. We then describe the external fitness evaluation of OPAM.

Representations
OPAM coevolves two populations of task-arrival sequences and priority assignments. Task-arrival sequences are defined by the inter-arrival time characteristics of the tasks (see Section 3). A priority assignment is defined by a function that maps priorities to tasks.
Task-arrival sequences. Given a set $J$ of tasks to be scheduled, a feasible sequence of task arrivals is a set $A$ of tuples $(j, at_k(j))$ where $j \in J$ and $at_k(j)$ is the $k$th arrival time of a task $j$. Thus, a solution $A$ represents a valid sequence of task arrivals of $J$ (see valid $at_k(j)$ computation in Section 3). Let $\mathbf{T} = [0, T]$ be the time period during which a scheduler receives task arrivals. The size of $A$ is equal to the number of task arrivals over the $\mathbf{T}$ time period. Due to the varying inter-arrival times of aperiodic tasks (Section 3), the size of $A$ will vary across different sequences.
Priority assignments. Given a set $J$ of tasks to be scheduled, a feasible priority assignment is a list $\vec{P}$ of priority $pr(j)$ for each task $j \in J$. OPAM assigns a non-negative integer to a priority $pr(j)$ of $j$ such that priorities are comparable to one another. The size of $\vec{P}$ is equal to the number of tasks in $J$. Each task in $J$ has a unique priority. Hence, a priority assignment $\vec{P}$ is a permutation of all tasks' priorities. We note that these characteristics of priority assignments are common in many real-time analysis methods (Audsley, 2001; Davis and Burns, 2007; Zhao and Zeng, 2017) and industrial systems (e.g., see our six industrial case study systems described in Section 7.2).
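The two representations can be illustrated with a minimal Python sketch. It is not OPAM's implementation: the function names, the encoding of arrivals as `(task, time)` tuples, and the encoding of a priority assignment as a dictionary are assumptions made for illustration. The arrival check only enforces that consecutive arrivals of a task are separated by a value in its inter-arrival range and that all arrivals fall within $[0, T]$; the complete validity rule is the one given in Section 3.

```python
from collections import defaultdict


def is_valid_priority_assignment(priorities, tasks):
    """True if every task has exactly one unique, non-negative integer priority."""
    values = list(priorities.values())
    return (
        set(priorities) == set(tasks)
        and all(isinstance(p, int) and p >= 0 for p in values)
        and len(set(values)) == len(values)
    )


def is_valid_arrival_sequence(arrivals, pmin, pmax, horizon):
    """Check a set of (task, arrival_time) tuples.

    Assumption: validity means every arrival lies in [0, horizon] and
    consecutive arrivals of the same task are pmin..pmax time units apart.
    """
    by_task = defaultdict(list)
    for task, time in arrivals:
        if not 0 <= time <= horizon:
            return False
        by_task[task].append(time)
    for task, times in by_task.items():
        times.sort()
        for prev, nxt in zip(times, times[1:]):
            if not pmin[task] <= nxt - prev <= pmax[task]:
                return False
    return True
```

Because each task keeps its own list of arrival times, two sequences over the same task set may legitimately differ in size, which is the property the crossover operator later has to handle.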

Simulation
OPAM relies on simulation for analyzing the schedulability of tasks in a scalable way. For instance, the inter-arrival time of a software update task in a satellite system is at most approximately three months. In such cases, conducting an analysis based on an actual scheduler is prohibitively expensive. Also, applying an exhaustive technique for schedulability analysis typically does not scale to an industrial system (e.g., see our experiment results using a model checker described in Section 7.8). Instead, OPAM uses a real-time task scheduling simulator, named OPAMScheduler, which applies a scheduling policy, i.e., single-queue multi-core scheduling policy (Arpaci-Dusseau and Arpaci-Dusseau, 2018), based on discrete simulation time events. Note that we chose the single-queue multi-core scheduling policy for OPAMScheduler since our case study systems (described in Section 7.2) rely on this policy.
OPAMScheduler takes as input a feasible task-arrival sequence $A$ and a priority assignment $\vec{P}$ for scheduling a set $J$ of tasks. It then outputs a schedule scenario as a set $S$ of tuples $(j, at_k(j), et_k(j))$ where $at_k(j)$ and $et_k(j)$ are the $k$th arrival and end time values of a task $j$, respectively (see Section 3). For each task $j$, OPAMScheduler computes $et_k(j)$ based on its WCET and scheduling policy while accounting for task relationships (see the $dp(j, j')$ resource-dependency relationship and the $tr(j, j')$ task triggering relationship in Section 3). To simulate the worst-case executions of tasks, OPAMScheduler assigns tasks' WCETs to their execution times.
OPAMScheduler implements a single-queue multi-core scheduling policy (Arpaci-Dusseau and Arpaci-Dusseau, 2018), which schedules a task $j$ with explicit priority $pr(j)$ and deadline $dl(j)$. When tasks arrive, OPAMScheduler puts them into a single queue that contains tasks to be scheduled. At any simulation time, if there are tasks in the queue and multiple cores are available to execute tasks, OPAMScheduler first fetches from the queue the task $j$ that has the highest priority $pr(j)$. OPAMScheduler then allocates task $j$ to any available core. Note that if task $j$ shares a resource with a running task $j'$ in another core, i.e., the $dp(j, j')$ resource-dependency relationship holds, $j$ will be blocked until $j'$ releases the shared resource.
OPAMScheduler works under the assumption that context switching time is negligible, which is also a working assumption in many scheduling analysis methods (Liu and Layland, 1973;Audsley, 2001;Di Alesio et al., 2015). Note that the assumption is practically valid and useful at an early development step in the context of real-time analysis. For instance, our collaborating partner, LuxSpace, accounts for the waiting time of tasks due to context switching between tasks through adding some extra time to WCET estimates at the task design stage. Note that OPAM can be applied with any scheduling policy, including those that account for context switching time and multiple queues.
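To make the single-queue multi-core policy concrete, the following Python sketch simulates it in unit time steps. It is a deliberately simplified illustration and not OPAMScheduler itself: it assumes integer WCETs of at least one time unit, preemption at every tick, and larger numbers denoting higher priorities, and it ignores the resource-dependency and triggering relationships. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(arrivals, priorities, wcet, cores):
    """Return a schedule scenario as a list of (task, arrival_time, end_time).

    arrivals:   iterable of (task, arrival_time) with integer times
    priorities: dict task -> priority (assumption: larger value = higher priority)
    wcet:       dict task -> worst-case execution time in ticks (>= 1)
    cores:      number of processing cores
    """
    pending = sorted(arrivals, key=lambda a: a[1])  # jobs that have not arrived yet
    queue = []      # single ready queue of [task, arrival_time, remaining_ticks]
    schedule = []
    time, nxt = 0, 0
    while nxt < len(pending) or queue:
        # Admit every job whose arrival time has been reached.
        while nxt < len(pending) and pending[nxt][1] <= time:
            task, arrival = pending[nxt]
            queue.append([task, arrival, wcet[task]])
            nxt += 1
        # The highest-priority ready jobs occupy the available cores for one tick.
        queue.sort(key=lambda job: (-priorities[job[0]], job[1]))
        for job in queue[:cores]:
            job[2] -= 1
        time += 1
        # Jobs with no remaining work end at the current time.
        for task, arrival, remaining in queue:
            if remaining == 0:
                schedule.append((task, arrival, time))
        queue = [job for job in queue if job[2] > 0]
    return schedule
```

For example, with one core, a low-priority job arriving at time 0 with WCET 3 and a high-priority job arriving at time 1 with WCET 2, the high-priority job ends at time 3 and the low-priority job at time 5; with two cores both end at time 3.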

Fitness functions
Internal fitness: deadline misses. Given a feasible task-arrival sequence $A$ and a priority assignment $\vec{P}$, we formulate a function, $fd(A, \vec{P})$, to quantify the degree of deadline misses regarding a set $J$ of tasks to be scheduled. To compute $fd(A, \vec{P})$, OPAM runs OPAMScheduler for $A$ and $\vec{P}$ and obtains a schedule scenario $S$. We denote by $dist_k(j)$ the distance between the end time and the deadline of the $k$th arrival of task $j$ observed in $S$ and define $dist_k(j) = et_k(j) - (at_k(j) + dl(j))$ (see Section 3 for the notation end time $et_k(j)$, arrival time $at_k(j)$, and deadline $dl(j)$). We denote by $lk(j)$ the last arrival index of a task $j$ in $A$. Given a set $J$ of tasks to be scheduled, the $fd(A, \vec{P})$ function is defined as an exponential function of the distances $dist_k(j)$, aggregated over every arrival $k \in [1, lk(j)]$ of every task $j \in J$. Hence, when all task executions observed in a schedule scenario $S$ meet their deadlines, $fd(A, \vec{P})$ is a small value as any distance $dist_k(j)$ between the task end time and the deadline of the $k$th arrival of task $j$ is a negative value. In contrast, deadline misses result in positive values for $dist_k(j)$. In such cases, $fd(A, \vec{P})$ is a large value. The exponential form of $fd(A, \vec{P})$ was selected precisely for this reason: it assigns large values to deadline misses but small values when deadlines are met. By doing so, $fd(A, \vec{P})$ prevents the undesirable situation in which many task executions that meet their deadlines obscure a smaller number of deadline misses.
Following the principles of competitive coevolution, individuals in a population $\mathbf{A}$ of task-arrival sequences need to be assessed by pitting them against individuals in the other population $\mathbf{P}$ of priority assignments. We denote by $fd(A, \mathbf{P})$ the internal fitness function that quantifies the overall magnitude of deadline misses across all priority assignments $\vec{P} \in \mathbf{P}$, regarding a set $J$ of tasks to be scheduled. The $fd(A, \mathbf{P})$ fitness is used for breeding the next population of task-arrival sequences. OPAM aims to maximize $fd(A, \mathbf{P})$, which aggregates $fd(A, \vec{P})$ over all $\vec{P} \in \mathbf{P}$.
Internal fitness: safety margins. Given a feasible priority assignment $\vec{P}$ and a task-arrival sequence $A$, we denote by $fs(\vec{P}, A)$ the magnitude of safety margins regarding a set $J$ of tasks to be scheduled. The computation of $fs(\vec{P}, A)$ is similar to the computation of $fd(A, \vec{P})$ regarding the use of OPAMScheduler, which outputs a schedule scenario $S$. The difference is that OPAM reverses the sign of $fd(A, \vec{P})$ as OPAM aims at maximizing the magnitude of safety margins. The $fs(\vec{P}, A)$ function is therefore the sign-reversed counterpart of $fd(A, \vec{P})$.
Given two populations $\mathbf{P}$ and $\mathbf{A}$ of priority assignments and task-arrival sequences, similar to internal fitness $fd(A, \mathbf{P})$, priority assignments in $\mathbf{P}$ need to be assessed against task-arrival sequences in $\mathbf{A}$. We formulate an internal fitness function, $fs(\vec{P}, \mathbf{A})$, to quantify the overall magnitude of safety margins across all task-arrival sequences $A \in \mathbf{A}$, regarding a set $J$ of tasks to be scheduled and a priority assignment $\vec{P}$. OPAM relies on the $fs(\vec{P}, \mathbf{A})$ function to breed the next population of priority assignments. OPAM aims to maximize $fs(\vec{P}, \mathbf{A})$, which aggregates $fs(\vec{P}, A)$ over all $A \in \mathbf{A}$.
Internal fitness: constraints. Given a priority assignment $\vec{P}$, we formulate an internal fitness function, $fc(\vec{P})$, to quantify the degree of satisfaction of soft constraints set by engineers. 
Such a function is required as we recast the satisfaction of such constraints into an optimization problem, in order to minimize constraint violations. Specifically, OPAM accounts for the following constraint: aperiodic tasks should have lower priorities than those of periodic tasks. Recall from Section 2 that engineers consider this constraint to be desirable. We denote by $lp(\vec{P})$ the lowest priority of periodic tasks in $\vec{P}$. For a set $J$ of tasks to be scheduled, OPAM aims to maximize $fc(\vec{P})$, which is defined as follows:
$$fc(\vec{P}) = \sum_{j \in J} \begin{cases} lp(\vec{P}) - pr(j), & \text{if } j \text{ is an aperiodic task} \\ 0, & \text{otherwise} \end{cases}$$
Greater $pr(j)$ values denote higher priorities. Given a priority assignment $\vec{P}$, if $pr(j)$ for an aperiodic task $j$ is lower than the priority of any of the periodic tasks, $lp(\vec{P}) - pr(j)$ is a positive value. OPAM measures the difference between priorities of aperiodic and periodic tasks. By doing so, $fc(\vec{P})$ rewards aperiodic tasks that satisfy the above constraint and consistently penalizes those that violate it. Hence, OPAM aims at maximizing $fc(\vec{P})$.
External fitness: safety margins and constraints. To examine the quality of priority assignments and monitor the progress of coevolution, OPAM takes as input a set $\mathbf{E}$ of task-arrival sequences created independently from the coevolution process. Given a set $\mathbf{E}$ of task-arrival sequences and a priority assignment $\vec{P}$, OPAM utilizes $fs(\vec{P}, \mathbf{E})$ and $fc(\vec{P})$ described above as external fitness functions for quantifying the magnitude of safety margins and the extent of constraint satisfaction, respectively. As $\mathbf{E}$ does not change over the coevolution process, $fs(\vec{P}, \mathbf{E})$ is used for evaluating a priority assignment $\vec{P}$ since it is not impacted by the evolution of task-arrival sequences. Hence, external fitness functions ensure that OPAM monitors the progress of coevolution in a stable manner. 
Given two populations $\mathbf{P}$ and $\mathbf{A}$ of priority assignments and task-arrival sequences, we recall that the $fd(A, \mathbf{P})$ internal fitness function quantifies the overall magnitude of deadline misses across all priority assignments in $\mathbf{P}$ for the given sequence of task arrivals $A$. The $fs(\vec{P}, \mathbf{A})$ internal fitness function quantifies the overall magnitude of safety margins across all sequences of task arrivals in $\mathbf{A}$ for the given priority assignment $\vec{P}$. Hence, the internal fitness of $A$ (resp. $\vec{P}$) is assessed through interactions with competing individuals in $\mathbf{P}$ (resp. $\mathbf{A}$). Therefore, if OPAM relies only on the internal fitness functions, it cannot gauge the progress of coevolution in a stable manner as an individual's internal fitness depends on competing individuals.
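The fitness functions described in this section can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the exact definitions used by OPAM: the exponential base (2 here) and the plain sum over all arrivals are assumptions, since the text only states that the deadline-miss fitness is exponential in the distance to the deadline, and returning 0 for the constraint fitness when there is no periodic task is also an assumption. A schedule scenario is taken to be a list of `(task, arrival_time, end_time)` tuples and all function names are hypothetical.

```python
def dist(arrival, end, deadline):
    """Distance between end time and absolute deadline (positive = deadline miss)."""
    return end - (arrival + deadline)


def fd(schedule, deadline, base=2.0):
    """Deadline-miss fitness of one schedule scenario.

    Assumption: sum of base**dist over all observed task executions.
    """
    return sum(base ** dist(at, et, deadline[task]) for task, at, et in schedule)


def fs(schedule, deadline, base=2.0):
    """Safety-margin fitness: the sign-reversed deadline-miss fitness."""
    return -fd(schedule, deadline, base)


def fc(priorities, aperiodic):
    """Constraint fitness: sum of lp - pr(j) over aperiodic tasks j.

    lp is the lowest priority among periodic tasks; larger numbers denote
    higher priorities. Assumption: returns 0 when there is no periodic task.
    """
    periodic = [p for task, p in priorities.items() if task not in aperiodic]
    if not periodic:
        return 0
    lp = min(periodic)
    return sum(lp - priorities[task] for task in aperiodic)
```

With deadlines of 4 time units, an execution that arrives at 1 and ends at 3 contributes `2 ** -2 = 0.25`, whereas one that arrives at 0 and ends at 5 contributes `2 ** 1 = 2`, so a single miss outweighs several met deadlines. Evaluating an individual against a whole competing population (or against the external set) would then combine such per-scenario values across all opponents.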
We note that soft deadline tasks are also required to execute within a reasonable execution time, i.e., their (soft) deadline. As the above fitness functions return quantified degrees of deadline misses and safety margins, OPAM uses the same fitness functions for both soft and hard deadline tasks.
1 Algorithm Task-arrival sequences evolution
2 Input A: population of task-arrival sequences
3 Input P: population of priority assignments
4 Input cp_a: crossover probability //task-arrival sequences
5 Input mp_a: mutation probability //task-arrival sequences
6 Output A: population of task-arrival sequences
7
8 //evaluate internal fitness values for A
9 for each $A_i \in \mathbf{A}$
10   for each

Evolution: Worst-case task arrivals
The algorithm in Figure 5 describes in detail the evolution of task-arrival sequences in lines 18-21 of the listing in Figure 4. OPAM adapts a steady-state Genetic Algorithm (GA) (Luke, 2013) for evolving task-arrival sequences. As shown in lines 8-14, OPAM first evaluates each task-arrival sequence in the A population against the P population of priority assignments. OPAM executes OPAMScheduler to obtain a schedule scenario $S$ for a task-arrival sequence $A_i \in \mathbf{A}$ and a priority assignment $\vec{P}_l \in \mathbf{P}$ (line 11). OPAM then computes the internal fitness $fd(A_i, \mathbf{P})$ capturing the magnitude of deadline misses (lines 12-14). We note that a steady-state GA iteratively breeds offspring, assesses their fitness, and then reintroduces them into a population. However, OPAM computes internal fitness of all task-arrival sequences in A at every generation. This is because internal fitness is computed in relation to P, which is coevolving with A.
Breeding the next population is done by using the following genetic operators: (1) Selection: OPAM selects candidate task-arrival sequences using a tournament selection technique, with the tournament size equal to two, which is the most common setting (Gendreau and Potvin, 2010) (line 17 in Figure 5).
(2) Crossover: Selected candidate task-arrival sequences serve as parents to create offspring using a crossover operation (line 18).
(3) Mutation: The offspring are then mutated (line 19). Below, we describe our crossover and mutation operators.
Crossover. A crossover operator is used to produce offspring by mixing traits of parent solutions. OPAM modifies the standard one-point crossover operator (Luke, 2013) as two parent task-arrival sequences $A_p$ and $A_q$ may have different sizes, i.e., $|A_p| \neq |A_q|$. Let $J = \{j_1, j_2, \ldots, j_m\}$ be a set of tasks to be scheduled. Our crossover operator first randomly selects an aperiodic task $j_r \in J$. For all $i \in [1, r]$ and $j_i \in J$, OPAM then swaps all $j_i$ arrivals between the two task-arrival sequences $A_p$ and $A_q$. Since $J$ is fixed for all solutions, OPAM can cross over two solutions that may have different sizes.
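The crossover described above can be sketched in Python as follows. The sketch assumes that a task-arrival sequence is a list of `(task, time)` tuples and that the tasks are given as an ordered list, so that "all $j_i$ with $i \in [1, r]$" corresponds to a prefix of that list; the function name and signature are hypothetical.

```python
import random


def crossover(parent_p, parent_q, tasks, aperiodic, rng=random):
    """One-point crossover over the task list rather than over arrival positions.

    tasks:     ordered list of task names [j_1, ..., j_m]
    aperiodic: set of aperiodic task names (the cut task j_r is drawn from these)
    """
    cut_task = rng.choice([t for t in tasks if t in aperiodic])
    r = tasks.index(cut_task)
    swapped = set(tasks[: r + 1])  # arrivals of j_1..j_r are exchanged

    def mix(keep_from, take_from):
        child = [a for a in keep_from if a[0] not in swapped]
        child += [a for a in take_from if a[0] in swapped]
        return sorted(child, key=lambda a: (a[1], a[0]))

    return mix(parent_p, parent_q), mix(parent_q, parent_p)
```

Because whole per-task arrival lists are exchanged, each child inherits, for every task, an arrival list taken intact from one parent, even when the parents contain different numbers of arrivals.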
Mutation. OPAM uses a heuristic mutation algorithm. For a task-arrival sequence $A$, OPAM mutates the $k$th task arrival time $at_k(j)$ of an aperiodic task $j$ with a mutation probability. OPAM chooses a new arrival time value of $at_k(j)$ based on the $[pmin(j), pmax(j)]$ inter-arrival time range of $j$. If such a mutation of the $k$th arrival time of $j$ does not affect the validity of the $(k+1)$th arrival time of $j$, the mutation operation ends. Specifically, let $d$ be a mutated value of $at_k(j)$. In case $at_{k+1}(j) \in [d + pmin(j), d + pmax(j)]$, OPAM returns the mutated task-arrival sequence $A$.
After mutating the $k$th arrival time $at_k(j)$ of a task $j$ in a solution $A$, if the $(k+1)$th arrival becomes invalid, OPAM corrects the remaining arrivals of $j$. Let $o$ and $d$ be, respectively, the original and mutated $k$th arrival time of $j$. For all the arrivals of $j$ after $d$, OPAM first updates their original arrival time values by adding the difference $d - o$. Let $\mathbf{T} = [0, T]$ be the scheduling period. OPAM then removes some arrivals of $j$ if they are mutated to arrive after $T$ or adds new arrivals of $j$ while ensuring that all tasks arrive within $\mathbf{T}$.
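The repair step of the mutation can be sketched in Python for the arrival times of a single aperiodic task. The sketch takes the mutated value $d$ as given and only illustrates the validity check, the shift by $d - o$, the removal of arrivals beyond $T$, and the addition of new arrivals. The rule used for adding arrivals (append one while the last arrival plus $pmax(j)$ still fits within $T$) is an assumption, as are the function name and the integer time values.

```python
import random


def repair_after_mutation(times, k, d, pmin, pmax, horizon, rng=random):
    """Return the arrival times of one task after mutating times[k] to d.

    times:   sorted integer arrival times of the task (k is a 0-based index)
    pmin, pmax: inter-arrival time range of the task (pmin >= 1)
    horizon: end T of the scheduling period [0, T]
    """
    o = times[k]
    head, rest = times[:k] + [d], times[k + 1:]
    # If the next arrival is still valid with respect to d, nothing else changes.
    if rest and pmin <= rest[0] - d <= pmax:
        return head + rest
    # Otherwise shift the remaining arrivals by d - o and drop those after T.
    result = head + [t + (d - o) for t in rest if t + (d - o) <= horizon]
    # Assumption: add arrivals while one more is still due within the period.
    while result[-1] + pmax <= horizon:
        result.append(result[-1] + rng.randint(pmin, pmax))
    return result
```

For instance, with `times = [0, 4, 8, 12]`, `pmin = 4`, `pmax = 8`, and `T = 20`, mutating the second arrival to `d = 8` makes the third arrival invalid, so the later arrivals are shifted by 4, which yields `[0, 8, 12, 16]`.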
As shown in lines 20-26 in Figure 5, the internal fitness of the generated offspring is computed based on the P population. OPAM then updates the A population of task-arrival sequences by comparing the offspring and individuals in A (line 27).
We note that when a system is only composed of periodic tasks, OPAM will skip evolving for worst-case arrival sequences as arrivals of periodic tasks are deterministic (see Section 3). Nevertheless, OPAM will optimize priority assignments based on given arrivals of periodic tasks. When needed, OPAM can be easily extended to manipulate offset and period values for periodic tasks, in a way identical to how we currently handle inter-arrival times for aperiodic tasks.
1 Algorithm Priority assignments evolution
2 Input A: population of task-arrival sequences
3 Input P: population of priority assignments
4 Input ps_p: population size //priority assignments
5 Input cp_p: crossover probability //priority assignments
6 Input mp_p: mutation probability //priority assignments
7 Output P: population of priority assignments
8
9 //evaluate internal fitness values for P
10 for each $\vec{P}_i \in \mathbf{P}$
11   for each $A_l \in \mathbf{A}$
12     S ← simulate($A_l$, ...
...
   ... R, ps_p)
33
34 return P
Fig. 6: An NSGAII-based algorithm for evolving priority assignments.

Evolution: Best-case priority assignments
Figure 6 shows the evolution procedure of priority assignments, which refines lines 23-26 in Figure 4. OPAM tailors the Non-dominated Sorting Genetic Algorithm version 2 (NSGAII) (Deb et al., 2002) to generate a non-dominated (equally viable) set of priority assignments, representing the best trade-offs found among the given internal fitness functions. This is referred to as a Pareto non-dominated front (Knowles and Corne, 2000), where the dominance relation over priority assignments is defined as follows: A priority assignment $\vec{P}$ dominates another priority assignment $\vec{P}'$ if $\vec{P}$ is not worse than $\vec{P}'$ in all fitness values, and $\vec{P}$ is strictly better than $\vec{P}'$ in at least one fitness value. NSGAII has been applied to many multi-objective optimization problems (Langdon et al., 2010; Shin et al., 2018; Wang et al., 2020).
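The dominance relation can be written down directly. The following Python sketch assumes that the fitness values of a priority assignment are given as a tuple in which every objective is to be maximized, as is the case for the safety-margin and constraint fitness values; the function names are hypothetical, and the quadratic filter is only meant to illustrate the definition, not the fast non-dominated sorting procedure of NSGAII.

```python
def dominates(f1, f2):
    """True if fitness tuple f1 dominates f2 (all objectives are maximized)."""
    not_worse = all(a >= b for a, b in zip(f1, f2))
    strictly_better = any(a > b for a, b in zip(f1, f2))
    return not_worse and strictly_better


def pareto_front(points):
    """Keep the fitness tuples that no other tuple dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the fitness tuples `(1, 5)`, `(2, 4)`, `(1, 4)`, and `(3, 1)`, only `(1, 4)` is dominated, so the other three form the non-dominated front.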
OPAM maintains a population P of priority assignments as an archive that contains the best priority assignments discovered during coevolution. Unlike a standard application of NSGAII, in our study, we need to reevaluate the internal fitness values for priority assignments in P at every generation as the internal fitness values are computed based on the A population of task-arrival sequences, which coevolves. As shown in lines 9-16 in Figure 6, OPAM first computes the internal fitness functions that measure the magnitude of safety margins and the extent of constraint satisfaction. OPAM then sorts non-dominated Pareto fronts (line 19) and assigns crowding distance (line 20) to introduce diversity among non-dominated priority assignments (Deb et al., 2002).
For breeding the next population of priority assignments (line 21 in Figure 6), OPAM applies the following standard genetic operators (Sivanandam and Deepa, 2008) that have been applied to many similar problems (Islam et al., 2012; Marchetto et al., 2016; Shin et al., 2018): (1) Selection. OPAM uses a binary tournament selection based on non-domination ranking and crowding distance. The binary tournament selection has been used in the original implementation of NSGAII (Deb et al., 2002). (2) Crossover. OPAM applies a partially mapped crossover (PMX) (Goldberg and Lingle, 1985). PMX ensures that the generated offspring are valid permutations of priorities. (3) Mutation. OPAM uses a permutation swap method for mutating a priority assignment. This mutation method interchanges two randomly selected priorities in a priority assignment according to a given mutation probability.
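The three operators can be illustrated with the sketch below. It assumes that a priority assignment is encoded as a permutation stored in a Python list, that a lower rank means a better front, and that the mutation probability applies once per individual; the function names are hypothetical and the code is not taken from OPAM or jMetal.

```python
import random


def binary_tournament(population, rank, crowding, rng):
    """Pick two distinct individuals; prefer lower rank, then larger crowding."""
    i, j = rng.sample(range(len(population)), 2)
    if rank[i] != rank[j]:
        winner = i if rank[i] < rank[j] else j
    elif crowding[i] != crowding[j]:
        winner = i if crowding[i] > crowding[j] else j
    else:
        winner = rng.choice((i, j))
    return population[winner]


def pmx_with_cuts(p1, p2, a, b):
    """Partially mapped crossover: keep p1[a:b], fill the rest from p2."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]
    segment = set(p1[a:b])
    pos_in_p2 = {v: i for i, v in enumerate(p2)}
    for i in range(a, b):
        v = p2[i]
        if v in segment:
            continue
        # Follow the mapping until we leave the copied segment.
        pos = i
        while a <= pos < b:
            pos = pos_in_p2[p1[pos]]
        child[pos] = v
    for i in range(n):
        if child[i] is None:
            child[i] = p2[i]
    return child


def pmx(p1, p2, rng):
    """PMX with two random cut points."""
    a, b = sorted(rng.sample(range(len(p1) + 1), 2))
    return pmx_with_cuts(p1, p2, a, b)


def swap_mutation(perm, probability, rng):
    """With the given probability, swap two randomly selected positions."""
    mutant = list(perm)
    if rng.random() < probability:
        i, j = rng.sample(range(len(mutant)), 2)
        mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant
```

Both PMX and the swap mutation map permutations to permutations, so no repair step is needed to keep priorities unique.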
For the generated population P α of priority assignments, OPAM computes the two internal fitness functions (lines 22-29 in Figure 6). OPAM then sorts non-dominated Pareto fronts for the union of the current P and next P α populations (line 30), assigns crowding distance (line 31), and selects the best archive by accounting for the computed non-domination ranking and crowding distance (line 32).

Figure 7 shows an algorithm that computes the external fitness functions and finds the best Pareto front, which refines lines 28-31 in Figure 4. To monitor the coevolution progress in a stable manner, OPAM takes as input a set E of task-arrival sequences that are generated independently from the coevolution process. We use an adaptive random search technique (Chen et al., 2010) to sample task-arrival sequences in order to create E. The adaptive random search extends the naive random search by maximizing the Euclidean distance between the sampled points such that it maximizes the diversity of task-arrival sequences in E.
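One common way to realize adaptive random search is the fixed-size-candidate-set scheme: draw a few random candidates and keep the one farthest from everything already selected. The sketch below follows that scheme under simplifying assumptions: a task-arrival sequence is flattened into a fixed-length vector of values, each constrained to a `(low, high)` range, and the function and parameter names are hypothetical.

```python
import math
import random


def adaptive_random_sample(bounds, size, seeds=(), candidates=10, rng=None):
    """Select `size` points that are spread out in Euclidean distance.

    bounds: list of (low, high) ranges, one per vector component.
    seeds:  optional points that are placed in the result first.
    """
    rng = rng or random.Random()

    def random_point():
        return tuple(rng.uniform(lo, hi) for lo, hi in bounds)

    selected = [tuple(s) for s in seeds]
    if not selected:
        selected.append(random_point())
    while len(selected) < size:
        pool = [random_point() for _ in range(candidates)]
        # Keep the candidate whose nearest selected point is farthest away.
        best = max(pool,
                   key=lambda c: min(math.dist(c, s) for s in selected))
        selected.append(best)
    return selected
```

Each accepted point maximizes its minimum distance to the points chosen so far, which is what pushes the resulting set toward diversity.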

External fitness evaluation
1  Algorithm Priority assignments evolution
2  Input E: set of task-arrival sequences //external evaluation
3  Input P: population of priority assignments
4  Input ps_p: population size //priority assignments
5  Input cp_p: crossover probability //priority assignments
6  Input mp_p: mutation probability //priority assignments
7  Output P: population of priority assignments
8
9  //evaluate external fitness values for P
10 for each $\vec{P}_i$ ∈ P
11   for each E_l ∈ E
12     S ← simulate(E_l,

As shown in lines 9-16 in Figure 7, OPAM computes the two external fitness values for each priority assignment in the P population based on a given set E of task-arrival sequences. OPAM then sorts non-dominated Pareto fronts for the union of the P population and the current best Pareto front (line 17), assigns crowding distance (line 18), and selects the best Pareto front by accounting for the computed non-domination ranking and crowding distance (line 32). OPAM adopts NSGAII in order to maximize the diversity of priority assignments in the best Pareto front.

Evaluation
This section describes our evaluation of OPAM through six industrial case studies from different domains and several synthetic subjects. Our full evaluation package is available online (Lee et al., 2021).

RQ1 (Sanity check): How does OPAM perform compared with Random Search? For search-based solutions, this RQ is an important sanity check to ensure that success is not due to the search problem being easy (Arcuri and Briand, 2014). Our conjecture is that a search-based algorithm, although expensive, will significantly outperform naive random search (RS).

RQ2 (Coevolution): Is competitive coevolution suitable to find best-case priority assignments? We conjecture that a coevolutionary algorithm is a suitable solution to address the priority assignment problem since it is solved, in practice, through a competing interactive process between the development and testing teams. To answer this RQ, we compare OPAM with a sequential approach that first looks for worst-case sequences of task arrivals and then tries to find best-case priority assignments.

RQ3 (Scalability): Can OPAM find (near-)optimal solutions for large-scale systems in a reasonable time budget? In this RQ, we investigate the scalability of OPAM by conducting experiments with systems of various sizes, including six industrial and several synthetic subjects. We study the relationship between OPAM's performance measures and the characteristics of study subjects.

RQ4 (Usefulness): How do priority assignments generated by OPAM compare with priority assignments defined by engineers? OPAM can be considered useful only when it finds priority assignments that show benefits over those defined (manually) by engineers with domain expertise. This RQ therefore compares the quality of priority assignments generated by OPAM with those defined by engineers. We further discuss the usefulness of OPAM from a practical perspective, based on the feedback received from engineers at LuxSpace.

Industrial study subjects
To answer the RQs in realistic and diverse settings, we apply OPAM to six industrial study subjects from different domains, such as aerospace, automotive, and avionics. Specifically, we obtained one case study subject from our industry partner, LuxSpace. We found the other five industrial study subjects in the literature (Di Alesio et al., 2015), which, consistent with the LuxSpace system, all assume a single-queue, multi-core, fixed-priority scheduling policy.
Note that OPAM uses the same scheduling policy (described in Section 6.2) as in Di Alesio et al.'s work. This policy uses fixed priorities that are determined offline and therefore do not change dynamically. Table 2 summarizes the relevant attributes of these subjects, presenting the number of periodic and aperiodic tasks, resource dependencies, triggering relations, and platform cores. The subjects are characterized by real-time parameters, e.g., periods, deadlines, and priorities, described in Section 3. We note that all the study subjects are deadlock-free systems as they do not have circular resource dependencies. Regarding task priorities, all tasks in the six subjects have fixed priorities, which are defined by experts in their domains. The full task descriptions (including WCET, inter-arrival times, periods, deadlines, priorities, and relationship details) of the subjects are available online (Lee et al., 2021). The main missions of the six subjects are described as follows:
- ICS is an ignition control system that checks the status of an automotive engine and corrects any errors of the engine (Peraldi-Frati and Sorel, 2008). The system was developed by Bosch GmbH.
- CCS is a cruise control system that acquires data from vehicle sensors and maintains the specified vehicle speed (Anssi et al., 2011). Continental AG developed the system.
- UAV is a mini unmanned air vehicle that follows dynamically defined way-points and communicates with a ground station to receive instructions (Traore et al., 2006). The system was developed in collaboration with the University of Poitiers, France, and ENSMA.
- GAP is a generic avionics platform for a military aircraft (Locke et al., 1990). The system was designed in a joint project with Carnegie Mellon University, the US Navy, and IBM, aiming at supporting several missions regarding air-to-surface attacks.
- HPSS is a satellite system for two satellites, named Herschel and Planck (Mikučionis et al., 2010). The two satellites share the same computational architecture, although they have different scientific missions. Herschel aims at studying the origin and evolution of stars and galaxies. Planck's primary mission is the study of the relic radiation from the Big Bang. ESA carried out the HPSS project.
- ESAIL is a microsatellite for tracking ships worldwide by detecting messages that ships radio-broadcast (see Section 2). LuxSpace, our industry partner, developed ESAIL in an ESA project.

Synthetic study subjects
To investigate RQ3, we use synthetic subjects in order to freely control key parameters in real-time systems. We create a set of tasks by adopting a well-known procedure (Emberson et al., 2010) for synthesizing real-time tasks, which has been applied in many schedulability analysis studies (Davis et al., 2008; Zhang and Burns, 2009; Davis and Burns, 2011; Grass and Nguyen, 2018; Dürr et al., 2019). Figure 8 describes a procedure that synthesizes a set of real-time tasks. For a given number n of tasks and a target utilization $u_t$, the procedure first generates a set U of task utilization values by using the UUniFast-Discard algorithm (Davis and Burns, 2011) (line 13). The UUniFast-Discard algorithm is devised to give an unbiased distribution of utilization values, where each utilization $U_j \in U$ is a positive value and $\sum_{U_j \in U} U_j = u_t$.
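As an illustration of this step, the sketch below draws n utilization values that sum to $u_t$ using the UUniFast recurrence and discards any draw containing a value outside (0, 1]. The function names and the retry limit `max_tries` are assumptions made for this example; Figure 8 may organize the step differently.

```python
import random


def uunifast(n, u_total, rng):
    """Draw n utilizations that sum to u_total (UUniFast recurrence)."""
    utils = []
    remaining = u_total
    for i in range(1, n):
        next_remaining = remaining * rng.random() ** (1.0 / (n - i))
        utils.append(remaining - next_remaining)
        remaining = next_remaining
    utils.append(remaining)
    return utils


def uunifast_discard(n, u_total, rng, max_tries=10000):
    """Repeat UUniFast until every utilization lies in (0, 1]."""
    for _ in range(max_tries):
        utils = uunifast(n, u_total, rng)
        if all(0.0 < u <= 1.0 for u in utils):
            return utils
    raise RuntimeError("no valid utilization set found")
```

For a target utilization below 1, as in a setting with $u_t$ = 0.7, no single value can exceed 1, so the discard step only matters for larger multi-core targets.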
The procedure then generates a set I of n task periods according to a log-uniform distribution within a range $[pd_{min}, pd_{max}]$, i.e., given a task period (random variable) $I_j \in I$, $\log I_j$ follows a uniform distribution (line 14 in Figure 8). For example, when the minimum and maximum task periods are $pd_{min}$ = 10ms and $pd_{max}$ = 1000ms, respectively, the procedure generates (approximately) an equal number of tasks in time intervals [10ms, 100ms] and [100ms, 1000ms]. The parameter g is used to choose the granularity of the periods, i.e., task periods are multiples of g. Such a distribution of task periods provides a reasonable degree of realism with respect to what is usually observed in real systems (Baruah et al., 2011).
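A formulation often used for this step draws a value uniformly in log space between $pd_{min}$ and $pd_{max} + g$ and rounds it down to a multiple of g. The sketch below follows that formulation; whether Figure 8 uses exactly this rounding is an assumption, as are the function name and the use of integer milliseconds.

```python
import math
import random


def log_uniform_period(pd_min, pd_max, g, rng):
    """Draw one period (in ms) that is log-uniform and a multiple of g.

    Assumption: pd_min and pd_max are themselves multiples of g.
    """
    r = math.exp(rng.uniform(math.log(pd_min), math.log(pd_max + g)))
    period = int(r // g) * g
    # Guard against floating-point edge cases at either end of the range.
    return min(max(period, pd_min), pd_max)


def generate_periods(n, pd_min, pd_max, g, rng):
    """Draw n independent periods."""
    return [log_uniform_period(pd_min, pd_max, g, rng) for _ in range(n)]
```

For example, `generate_periods(20, 10, 1000, 10, random.Random(0))` yields 20 periods between 10ms and 1000ms, all multiples of 10ms.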
As shown in lines 15-16 of the procedure in Figure 8, a set C of task WCETs is computed based on the set U of task utilization values and the set I of task periods. Specifically, a task WCET $C_j \in C$ is computed as $C_j = U_j \cdot I_j$.
As per line 17 of the listing in Figure 8, the procedure synthesizes a set S of tasks. A task j is characterized by a period $I_j$ and a WCET $C_j$, and it is associated with a deadline dl(j) and a priority pr(j). According to the rate-monotonic scheduling policy (Liu and Layland, 1973), tasks' deadlines are equal to their periods and tasks with shorter periods are given higher priorities.
To synthesize aperiodic tasks, the procedure converts some periodic tasks to aperiodic tasks according to a given ratio γ of aperiodic tasks among all tasks (see line 19 in Figure 8). A range factor µ is used to determine maximum inter-arrival times of aperiodic tasks. Specifically, for a task j to be converted, the procedure sets the minimum inter-arrival time pmin(j) as pmin(j) = $I_j$. The procedure then selects a uniformly distributed value x from the range (1, µ] and computes the maximum inter-arrival time pmax(j) as pmax(j) = $x \cdot I_j$.
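The remaining steps (WCETs, rate-monotonic deadlines and priorities, and the conversion to aperiodic tasks) can be sketched as below. The `Task` record, the convention that a larger number means a higher priority, and the random choice of which round(γ · n) tasks to convert are assumptions made for this illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    wcet: float
    period: float
    deadline: float
    priority: int                 # assumption: larger value = higher priority
    pmin: Optional[float] = None  # minimum inter-arrival time (aperiodic only)
    pmax: Optional[float] = None  # maximum inter-arrival time (aperiodic only)

    @property
    def aperiodic(self) -> bool:
        return self.pmin is not None


def synthesize_tasks(utils, periods, gamma, mu, rng) -> List[Task]:
    n = len(utils)
    # Rate-monotonic: the shorter the period, the higher the priority.
    order = sorted(range(n), key=lambda j: periods[j])
    priority = {j: n - rank for rank, j in enumerate(order)}
    tasks = [
        Task(name=f"T{j}",
             wcet=utils[j] * periods[j],  # C_j = U_j * I_j
             period=periods[j],
             deadline=periods[j],         # deadline equals period
             priority=priority[j])
        for j in range(n)
    ]
    # Convert a ratio gamma of the tasks to aperiodic tasks.
    for j in rng.sample(range(n), round(gamma * n)):
        x = mu - rng.random() * (mu - 1.0)  # uniform in (1, mu]
        tasks[j].pmin = periods[j]
        tasks[j].pmax = x * periods[j]
    return tasks
```

Since x never exceeds µ, the maximum inter-arrival time of any converted task is bounded by µ times the largest period.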

Experimental Design
This section describes how we design experiments to answer the RQs described in Section 7.1. We conducted four experiments, EXP1, EXP2, EXP3, and EXP4, as described below.
EXP1. To answer RQ1, EXP1 compares OPAM with our baseline, which relies on random search, to ensure that the effectiveness of OPAM is not due to the search problem being simple. Our baseline, named RS, replaces GA with a random search for finding worst-case sequences of task arrivals and NSGAII with a random search for finding best-case priority assignments. Note that RS uses the same internal and external fitness functions (see Section 6.3) and also maintains the best populations during search; however, it does not employ any genetic operators, i.e., crossover and mutation. In EXP1, we applied OPAM and RS to the six industrial subjects described in Section 7.2.
Recall from Section 6.3 that OPAM uses a set E of task-arrival sequences that are generated independently from the coevolution process in order to monitor the coevolution progress in a stable manner. As OPAM and RS use the same set E of task-arrival sequences, EXP1 first compares OPAM and RS based on E. In addition, EXP1 examines how well the solutions, i.e., priority assignments, found by OPAM and RS perform with other sequences of task arrivals. To do so, we create six sets of sequences of task arrivals for each study subject by varying the method to generate task-arrival sequences and the number of task-arrival sequences. Note that task-arrival sequences generated by different methods are valid with respect to the inter-arrival times defined in each study subject. Below we describe the six sets of task-arrival sequences generated for each subject.
- $T^{10}_a$: A set of task-arrival sequences generated by using an adaptive random search technique (Chen et al., 2010) that aims at maximizing the diversity of task-arrival sequences. The $T^{10}_a$ set contains 10 sequences of task arrivals.
- $T^{10}_w$: A set of task-arrival sequences generated by using a stress test case generation method that aims at maximizing the chances of deadline misses in task executions. The stress test case generation method extends prior work (Briand et al., 2005). The extended method uses the fitness function regarding deadline misses and genetic operators that OPAM introduces for evolving worst-case task-arrival sequences (see Section 6). The $T^{10}_w$ set contains 10 sequences of task arrivals.
- $T^{10}_r$: A set of task-arrival sequences generated randomly. The $T^{10}_r$ set has 10 sequences of task arrivals.
- $T^{500}_a$: A set of task-arrival sequences generated by using the adaptive random search technique. The $T^{500}_a$ set contains 500 sequences of task arrivals.
- $T^{500}_w$: A set of task-arrival sequences generated by using the stress test case generation method. The $T^{500}_w$ set contains 500 sequences of task arrivals.
- $T^{500}_r$: A set of task-arrival sequences generated randomly. The $T^{500}_r$ set has 500 sequences of task arrivals.
EXP2. To answer RQ2, EXP2 compares OPAM with a priority assignment method, named SEQ, that relies on one-population search algorithms. SEQ first finds a set of worst-case sequences of task arrivals using GA with the fitness function that measures the magnitude of deadline misses (see fd() in Section 6.3) and the genetic operators described in Section 6.4. Given a set of worst-case task-arrival sequences obtained from GA, SEQ then aims at finding best-case priority assignments using NSGAII with the fitness functions that quantify the magnitude of safety margins and the degree of constraint satisfaction (see fs() and fc(), respectively, in Section 6.3) and the genetic operators described in Section 6.5.
We note that SEQ does not use the external fitness functions as it does not coevolve task-arrival sequences and priority assignments. Hence, the numbers of fitness evaluations of the two methods are not comparable. To fairly compare OPAM and SEQ, we set the same time budget for the two methods. Specifically, we first measure the execution time of OPAM for analyzing each subject. We then split the execution time in half and use each half as the execution budget of the GA and NSGAII steps in SEQ for the corresponding subject. In order to assess the quality of priority assignments obtained from OPAM and SEQ, we use the sets of task-arrival sequences described in EXP1, i.e., $T^{10}_a$, $T^{10}_w$, $T^{10}_r$, $T^{500}_a$, $T^{500}_w$, and $T^{500}_r$, which are created independently from the two methods.
EXP3. To answer RQ3, EXP3 examines not only the six industrial subjects but also 370 synthetic subjects. We create the synthetic subjects to study correlations between the execution time and memory usage of OPAM and the following parameters: the number of tasks (n), a (part-to-whole) ratio of aperiodic tasks (γ), a range factor for maximum inter-arrival times (µ), and simulation time (T), as described in Sections 7.3 and 6. We note that we chose to control parameters n, γ, and µ because they are the main parameters on which engineers have control to define tasks in real-time systems. Simulation time T also directly impacts the execution time of OPAM; EXP3 aims at modeling these correlations precisely and providing experimental results. Regarding the other factors that define, for example, task relationships and platform cores, we note significant diversity across the six industrial subjects.
Recall from Section 7.3 that we use the task generation procedure presented in Figure 8 to synthesize tasks. For EXP3, we set some parameter values of the procedure as follows: (1) Target utilization $u_t$ = 0.7, which is a common objective in the development of a real-time system in order to guarantee the schedulability of tasks (Fineberg and Serlin, 1967; Dürr et al., 2019). (2) The range of task periods $[pd_{min}, pd_{max}]$ = [10ms, 1s], which are common values in many real-time systems (Emberson et al., 2010; Baruah et al., 2011). (3) The granularity of task periods g = 10ms in order to increase realism as most of the task periods in our industrial subjects are multiples of 10ms. Because of some degree of randomness in the procedure of Figure 8, we create ten synthetic subjects per configuration. Below we further describe how synthetic subjects are created for each controlled experiment.
EXP3.1. To study the correlations between the execution time and memory usage of OPAM with the number of tasks n, we create nine sets of ten synthetic subjects such that no two sets have the same number of tasks. Specifically, we create sets with 10, 15, ..., 50 tasks, respectively. Regarding the ratio of aperiodic tasks, γ = 0.4 as, on average, the ratio of aperiodic tasks to periodic tasks in our industrial subjects is 2/3, i.e., two out of every five tasks are aperiodic. For the range factor, µ = 2, which is determined based on the inter-arrival times of aperiodic tasks in our industrial subjects. We set the simulation time T to 2s in order to ensure that any aperiodic task arrives at least once during that time. We note that, given the maximum task period $pd_{max}$ = 1s and the range factor µ = 2, the maximum inter-arrival time of an aperiodic task is at most µ · $pd_{max}$ = 2s (see Section 7.3).
EXP3.2. To study the correlations between the execution time and memory usage of OPAM with the ratio of aperiodic tasks γ, we create ten sets of synthetic subjects by setting this ratio to the following values: 0.05, 0.10, ..., 0.50. We set the number of tasks to 20 (n = 20), which is the average number of tasks in our six industrial subjects. Regarding the other parameters, range factor and simulation time, µ = 2 and T = 2s are set as discussed in EXP3.1.
EXP3.3. To study the correlations between the execution time and memory usage of OPAM with the range factor µ that is used to determine the maximum inter-arrival times, we create nine sets of synthetic subjects by setting µ to 2, 3, ..., 10. We set the simulation time as follows: T = 10s. This ensures that any aperiodic task arrives at least once during the simulation time when µ is at most 10 (see Section 7.3). The other parameters, the number of tasks and ratio of aperiodic tasks, n = 20 and γ = 0.4 are set as discussed in EXP3.1 and EXP3.2.
EXP3.4. To study the correlations between the execution time and memory usage of OPAM with the simulation time T , we create nine sets of synthetic subjects by setting T to 2s, 3s, ..., 10s. The other parameters, e.g., the number of tasks, the ratio of aperiodic tasks, and the range factor, n = 20, γ = 0.4, and µ = 2, are set as discussed in EXP3.1 and EXP3.2.
EXP4. To answer RQ4, EXP4 compares priority assignments optimized by OPAM and those defined by engineers. We apply OPAM to the six industrial subjects (see Section 7.2), which include priority assignments defined by practitioners. Note that we focus here on the ESAIL subject in collaboration with our industry partner, LuxSpace; the other five subjects are from the literature (Di Alesio et al., 2015) and hence we can only collect feedback from practitioners for ESAIL.

Evaluation metrics
Multi-objective evaluation metrics. In order to fairly compare the results of search algorithms, based on existing guidelines (Li et al., 2020) for assessing multi-objective search algorithms, we use complementary quality indicators: Hypervolume (HV) (Zitzler and Thiele, 1999), Pareto Compliant Generational Distance (GD+) (Ishibuchi et al., 2015), and Spread (∆) (Deb et al., 2002). To compute the GD+ and ∆ quality indicators, following the usual procedure (Li et al., 2020), we create a reference Pareto front as the union of all the non-dominated solutions obtained from all runs of the algorithms being compared. Identifying the optimal (ideal) Pareto front is typically infeasible for a complex optimization problem (Li et al., 2020). Key features of the three quality indicators are described below.
- HV is defined to measure the volume in the objective space that is covered by members of a Pareto front generated by a search algorithm (Zitzler and Thiele, 1999). The higher the HV values, the more optimal the search outputs.
- GD+ is defined to measure the distance between the points on a Pareto front obtained from a search algorithm and the nearest points on a reference Pareto front (Ishibuchi et al., 2015). GD+ modifies Generational Distance (GD) (Veldhuizen and Lamont, 1998) to account for the dominance relations when computing the distances. The lower the GD+ values, the more optimal the search outputs.
- ∆ is defined to measure the extent of spread among the points on a Pareto front computed by a search algorithm (Deb et al., 2002). We note that OPAM aims at obtaining a wide variety of equally-viable priority assignments on a Pareto front (see Section 6). The lower the ∆ values, the more evenly spread out the search outputs.
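To make the first two indicators concrete, the sketch below computes HV for a two-objective front and GD+ against a reference front. It assumes that both objectives are maximized (so the GD+ distance only counts components in which a point falls short of a reference point), that HV is measured from a user-chosen reference point that every solution dominates, and that the function names are hypothetical. ∆ is not covered here.

```python
import math


def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective front, measured from point `ref`.

    Assumption: both objectives are maximized.
    """
    pts = sorted((p for p in front if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: -p[0])
    hv, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:
            hv += (x - ref[0]) * (y - best_y)
            best_y = y
    return hv


def d_plus(a, z):
    """Distance from a to z counting only objectives where a is worse."""
    return math.sqrt(sum(max(zi - ai, 0.0) ** 2 for ai, zi in zip(a, z)))


def gd_plus(front, reference):
    """Mean, over the front, of the d_plus distance to the nearest reference."""
    return sum(min(d_plus(a, z) for z in reference) for a in front) / len(front)
```

A point that is at least as good as some reference point in every objective contributes zero to GD+, which is what makes the indicator consistent with the dominance relation.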
Interpretable metrics. The two external fitness functions described in Section 6 mainly aim at effectively guiding search, and the computed fitness values are not intuitive for practitioners to interpret. Hence, to assess the usefulness of OPAM from a practitioner perspective, we measure (1) the safety margins from tasks' completion times to their deadlines across our experiments and (2) the number of constraint violations in a priority assignment. In addition, we measure the execution time and memory usage of OPAM.
Statistical comparison metrics. To statistically compare our experiment results, we use the Mann-Whitney U-test (Mann and Whitney, 1947) and Vargha and Delaney's $\hat{A}_{12}$ effect size (Vargha and Delaney, 2000), which have been frequently applied for evaluating search-based algorithms (Arcuri et al., 2010; Hemmati et al., 2013; Shin et al., 2018). The Mann-Whitney U-test determines whether two independent samples are likely to belong to the same distribution. We set the level of significance, α, to 0.05. Vargha and Delaney's $\hat{A}_{12}$ measures probabilistic superiority, i.e., effect size, between search algorithms. Two algorithms are considered to be equivalent when the value of $\hat{A}_{12}$ is 0.5.
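The effect size can be computed directly from its definition: the probability that a value drawn from the first sample exceeds a value drawn from the second, with ties counted as one half. The sketch below does this by pairwise comparison and also returns the corresponding Mann-Whitney U statistic; it does not compute the p-value, and the function names are hypothetical.

```python
def a12(xs, ys):
    """Vargha-Delaney effect size: P(x > y) + 0.5 * P(x == y)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


def mann_whitney_u(xs, ys):
    """U statistic of the first sample, related to A12 by U = A12 * m * n."""
    return a12(xs, ys) * len(xs) * len(ys)
```

For instance, `a12([4, 5, 6], [1, 2, 3])` evaluates to 1.0 because every value in the first sample exceeds every value in the second, whereas two identical samples give 0.5.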

Parameter tuning and implementation
Parameters for coevolutionary search. For the coevolutionary search parameters, we set the population size to 10, the crossover rate to 0.8, and the mutation rate to 1/|J|, where |J| denotes the number of tasks. We apply these parameter values for both the evolution of task-arrival sequences and priority assignments (see Section 6). These values are determined based on existing guidelines (Arcuri and Fraser, 2011; Sayyad et al., 2013) and previous work (Lee et al., 2020b). We determine the number of coevolution cycles (see Section 6) based on an initial experiment. We applied OPAM to the six industrial subjects and ran OPAM 50 times for each subject. From the experiment results, we observed that there is no notable difference in Pareto fronts generated after 1000 cycles. Hence, we set the number of coevolution cycles to 1000 in our experiments, i.e., EXP1, EXP2, and EXP3 described in Section 7.4.
Parameters for evaluating fitness functions. To evaluate external fitness functions, we use a set of task-arrival sequences that are generated independently from the coevolution process (see Section 6.6). We use an adaptive random search (Chen et al., 2010) to generate a set E of task-arrival sequences, which varies task arrival times within the specified inter-arrival time ranges of aperiodic tasks. We set the size of E to 10. From our initial experiment, we observed that this is sufficient to compute the external fitness functions of OPAM within a reasonable time, i.e., less than 15s. We note that E contains two default sequences of task arrivals as follows: (seq. 1) aperiodic tasks always arrive at their maximum inter-arrival times and (seq. 2) aperiodic tasks always arrive at their minimum inter-arrival times. With those two sequences of task arrivals as initial elements in E, the adaptive random search finds other sequences of task arrivals to maximize the diversity of elements in E.
If a system contains only periodic tasks, the simulation time is often set as the least common multiple (LCM) of their periods to account for all possible arrivals (Peng et al., 1997). However, as the six industrial subjects include aperiodic tasks, this is not applicable. For the experiments with the six industrial subjects, we set the simulation time to the maximum of the LCM of periodic tasks' periods and the maximum inter-arrival time among aperiodic tasks. By doing so, all possible arrival patterns of periodic tasks are examined and any aperiodic task arrives at least once during simulation. Recall from Section 6.4 that OPAM varies arrival times of aperiodic tasks to find worst-case sequences of task arrivals.
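The rule for choosing the simulation time amounts to a short computation. The sketch below assumes integer periods and inter-arrival times expressed in the same unit (e.g., ms); the function name is hypothetical.

```python
from functools import reduce
from math import gcd


def simulation_time(periods, max_inter_arrivals):
    """max(LCM of periodic tasks' periods, largest maximum inter-arrival time).

    periods:            integer periods of the periodic tasks
    max_inter_arrivals: integer maximum inter-arrival times of aperiodic tasks
    """
    hyperperiod = reduce(lambda a, b: a * b // gcd(a, b), periods, 1)
    longest_gap = max(max_inter_arrivals, default=0)
    return max(hyperperiod, longest_gap)
```

With periods of 10, 20, and 50 the LCM is 100, so a single aperiodic task with a maximum inter-arrival time of 250 would extend the simulation time to 250, whereas one with a maximum of 60 would leave it at 100.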
We note that the parameters mentioned above can probably be further tuned to improve the performance of our approach. However, since with our current setting, we were able to convincingly and clearly support our conclusions, we do not report further experiments on tuning those values.
Implementation. We implemented OPAM by extending jMetal (Durillo and Nebro, 2011), which is a metaheuristic optimization framework supporting NSGAII and GA. We conducted our experiments using the high-performance computing cluster (Varrette et al., 2014) at the University of Luxembourg. To account for randomness, we repeated each run of OPAM 50 times for all experiments. Each run of OPAM was executed on a different node (equipped with five 2.5GHz cores and 20GB memory) of the cluster, and took less than 16 hours.

Figure 9 shows the best Pareto fronts obtained with 50 runs of OPAM and RS, for the six industrial study subjects described in Section 7.2. The fitness values presented in the figures are computed based on each subject's set E of task-arrival sequences (see Section 7.6), which is created independently from OPAM and RS. Figures 9a, 9c, 9d, 9e, and 9f indicate that OPAM finds significantly better solutions than RS for ICS, UAV, GAP, HPSS, and ESAIL. Regarding CCS (see Figure 9b), it is difficult to conclude anything based only on visual inspection. Hence, we compared Pareto fronts obtained by OPAM and RS using the three quality indicators HV, GD+, and ∆, described in Section 7.5.

Figure 10 depicts distributions of HV (Figure 10a), GD+ (Figure 10b), and ∆ (Figure 10c) for the six industrial subjects. The boxplots in the figures present the distributions (25%-50%-75%) of the quality values obtained from 50 runs of OPAM and RS. The quality values are computed based on the Pareto fronts obtained by the algorithms and each subject's set E of task-arrival sequences (see Section 7.6). In the figures, statistical comparisons of the two corresponding distributions are summarized using p-values and $\hat{A}_{12}$ values, as described in Section 7.5, under each subject name.

RQ1.
As shown in Figures 10a and 10b, OPAM obtains better distributions of HV and GD+ compared to RS for all six subjects. All the differences are statistically significant as the p-values are below 0.05. Regarding ∆, as depicted in Figure 10c, OPAM yields higher diversity in Pareto front solutions than RS for the following subjects: UAV, GAP, and HPSS. For ICS, CCS, and ESAIL, OPAM and RS obtain similar ∆ values. From Figures 10a and 10b, and Table 2, we also observe that the higher the number of aperiodic tasks in a subject, the larger the differences in HV and GD+ between OPAM and RS. Hence, for these two quality indicators, OPAM outperforms RS more significantly for more complex search problems. Note that the number of aperiodic tasks is one of the main factors that drive the degree of uncertainty in task arrivals.
Given the Pareto priority assignments obtained by OPAM and RS, we further assessed the quality values of the solutions by evaluating them with different sets of task-arrival sequences. As described in Section 7.4, we created six test sets of task-arrival sequences for each subject by varying the sequence generation methods and the number of task-arrival sequences in a set (see $T^{10}_a$, $T^{10}_w$, $T^{10}_r$, $T^{500}_a$, $T^{500}_w$, and $T^{500}_r$ described in Section 7.4). Table 3 reports the average quality values measured by HV, GD+, and ∆ based on 50 runs of OPAM and RS with the different test sets of task-arrival sequences. The results indicate that OPAM significantly outperforms RS in most comparison cases. Specifically, out of a total of 108 comparisons, OPAM outperforms RS 87 times (see the blue-colored cells related to OPAM in Table 3). Regarding ∆, RS outperforms OPAM for the CCS subject (see the gray-colored cells related to RS in Table 3). As shown in Table 2, CCS has only 3 aperiodic tasks and RS was therefore able to find better solutions with respect to ∆ for such a simple subject.
The answer to RQ1 is that OPAM significantly outperforms RS with respect to HV and GD+. In particular, OPAM performs considerably better than RS when more aperiodic tasks are involved.

RQ2.
To compare OPAM and SEQ, we first visually inspect the best Pareto fronts obtained from 50 runs of OPAM and SEQ for the six study systems described in Section 7.2 by varying the test sets of task-arrival sequences for each subject (see $T^{10}_a$, $T^{10}_w$, $T^{10}_r$, $T^{500}_a$, $T^{500}_w$, and $T^{500}_r$ described in Section 7.4), which are created independently from OPAM and SEQ. Overall, we observed that OPAM finds significantly better priority assignments in most cases. For example, Figure 11 depicts the best Pareto fronts obtained by OPAM and SEQ when the fitness values are computed based on each subject's test set $T^{500}_a$ of 500 task-arrival sequences, which are generated with adaptive random search. The results clearly show that OPAM outperforms SEQ with respect to producing more optimal Pareto fronts for ICS, CCS, UAV, HPSS, and ESAIL. For GAP, the visual inspection is not sufficient to provide any conclusions. Hence, we further compare OPAM and SEQ based on the quality indicators described in Section 7.5. Table 4 compares the quality values measured by HV, GD+, and ∆ for the six study subjects. To fairly compare the priority assignments obtained by OPAM and SEQ, we assess them with the test sets of task-arrival sequences for each subject (see $T^{10}_a$, $T^{10}_w$, $T^{10}_r$, $T^{500}_a$, $T^{500}_w$, and $T^{500}_r$ described in Section 7.4). Table 4 reports the average quality values computed based on 50 runs of OPAM and SEQ. In Table 4, the statistical comparisons of the two corresponding distributions are reported using p-values and $\hat{A}_{12}$ values.
As shown in Table 4, we compared OPAM and SEQ 108 times by varying the study subjects, the quality indicators, the number of task-arrival sequences, and the task-arrival sequence generation methods. Out of 108 comparisons, OPAM significantly outperforms SEQ 63 times. Specifically, out of 36 HV comparisons, OPAM obtains better HV values than SEQ 28 times. For ICS (6 HV comparisons), the differences in HV values between OPAM and SEQ are not statistically significant. In only one HV comparison for CCS, SEQ outperforms OPAM (see the gray-colored cell related to HV and CCS in Table 4). To interpret these results, one must recall from Table 2 that ICS and CCS have only three aperiodic tasks that impact the degree of uncertainty in task arrivals and therefore represent simple cases. Out of 36 GD+ comparisons, OPAM outperforms SEQ 32 times. SEQ outperforms OPAM only two times for CCS. Hence, overall, the results indicate that OPAM outperforms SEQ, in terms of generating more optimal Pareto fronts, when the subjects feature a considerable degree of uncertainty in task arrivals and therefore make our search problem more complex. Otherwise, differences are not statistically or practically significant. Regarding ∆, which focuses on the diversity of solutions on the Pareto front, SEQ outperforms OPAM 24 times out of 36 comparisons (see the gray-colored cells related to ∆ in Table 4). However, since OPAM produces enough alternative priority assignments spreading across Pareto fronts (as visible from the solutions obtained by OPAM in Figure 11), these differences in ∆ have limited implications in practice.
The answer to RQ2 is that OPAM significantly outperforms SEQ with respect to HV and GD+ in the presence of more than a few aperiodic tasks and therefore higher uncertainty in terms of task arrivals. OPAM therefore generates solutions on a Pareto front that is closer to the unknown, optimal one. In other words, coevolution is a suitable and successful strategy for finding better priority assignments in complex systems.
RQ3. Table 5 reports the average execution times and memory usage required to run OPAM for the six industrial subjects, over 50 runs. As shown in Table 5, finding optimal priority assignments for ESAIL requires the largest execution time (≈15.5h) and memory usage (≈2.9GB), compared to the other subjects. We note that such execution time and memory usage are acceptable as OPAM can be executed offline in practice.
Figures 12 and 13 show, respectively, the execution times and memory usage from EXP3.1 (a), EXP3.2 (b), EXP3.3 (c), and EXP3.4 (d), described in Section 7.4. The boxplots in the figures show distributions (25%-50%-75%) obtained from 50 × 10 runs of OPAM for a set of 10 synthetic subjects, which are created with the same experimental setting. Regarding the execution time of OPAM, Figures 12a and 12d show that it is linear both in the number of tasks and in the simulation time. As for the memory usage of OPAM, the results in Figures 13a and 13d indicate that it is also linear both in the number of tasks and in the simulation time. In contrast, the results depicted in Figures 12b, 12c, 13b, and 13c indicate that neither the execution time nor the memory usage of OPAM is correlated with the following two parameters: ratio of aperiodic tasks and range factor. Therefore, we expect OPAM to scale well as the number of tasks and the simulation time increase.
The answer to RQ3 is that the execution time and memory usage of OPAM are linear in the number of tasks and simulation time, thus scaling to industrial systems. Further, across our experiments, OPAM takes at most 15.5h using 2.9GB of memory to optimize priority assignments, an acceptable result since this is done offline.
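One simple way to check a linear trend of this kind is to fit a least-squares line to the measurements and inspect the coefficient of determination. The sketch below is only an illustration: the helper name is hypothetical, and the numbers are placeholder values constructed to lie exactly on a line, not measurements from Figures 12 or 13.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept, plus R squared."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With constant y the line fits perfectly, so report R squared as 1.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


# Placeholder data (NOT from the paper): time = 2 * tasks + 1, in arbitrary units.
num_tasks = [10, 20, 30, 40, 50]
exec_time = [21.0, 41.0, 61.0, 81.0, 101.0]
slope, intercept, r_squared = fit_line(num_tasks, exec_time)
```

An R squared close to 1 is consistent with a linear relationship, whereas a value close to 0 would match the absence of correlation reported for the ratio of aperiodic tasks and the range factor.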
RQ4. Figure 14 compares, with respect to external fitness (see the fs() and fc() fitness functions and the set E of sequences of task arrivals described in Section 6.6), the Pareto solutions obtained by OPAM against the priority assignments defined by engineers for the six industrial subjects: ICS (Figure 14a) [...] objectives: the magnitude of safety margins and the extent to which constraints are satisfied. Table 6 summarizes safety margins from the task executions of ESAIL when using one of our priority assignments optimized by OPAM and the one defined by engineers at LuxSpace. Note that we focus on ESAIL as it is not possible to access the engineers who developed the other five industrial subjects reported in the literature (Locke et al., 1990; Traore et al., 2006; Peraldi-Frati and Sorel, 2008; Mikučionis et al., 2010; Anssi et al., 2011). For comparison, we chose the bottom-left solution in Figure 14f since it is optimal for the constraint fitness, which is the same as the fitness value of the priority assignment defined by engineers, and the differences in safety margin fitness among our solutions are negligible.
As shown in Table 6, our optimized priority assignment significantly outperforms the one defined by engineers. Our solution increases safety margins, on average, by 5.33% compared to the engineers' solution. For aperiodic tasks, our solution decreases safety margins by 0.01% (3.1ms difference) when the safety margins being compared are the maximum margins observed in both solutions (see the maximum safety margins, 59710.3ms obtained by engineers' solution and 59707.2ms obtained by OPAM, in Table 6). Such a small decrease is, however, negligible in the context of ESAIL as the maximum safety margin obtained by our solution is still large, i.e., ≈1 min. For periodic tasks, we note that our solution increases safety margins by 208.09% when the safety margins being compared are the minimum margins observed in both solutions (see the minimum safety margins, -44.5ms obtained by engineers' solution and 48.1ms obtained by OPAM, in Table 6). Note that the minimum safety margin of -44.5ms obtained with the engineers' solution indicates that a task violates its deadline. In the context of ESAIL, which is a mission-critical system, such gain in safety margins in the executions of periodic tasks is important because the hard deadlines of periodic tasks are more critical than the soft deadlines of aperiodic tasks.
Table 6: Comparing safety margins from the task executions of ESAIL when using our optimized priority assignment and the one defined by engineers.
Investigating practitioners' perceptions of the benefits of OPAM is necessary to adopt OPAM in practice. To do so, we draw on the qualitative reflections of three software engineers at LuxSpace, with whom we have been collaborating on this research. They have had four to seven years of experience developing satellite systems at LuxSpace, with more than 50 years of collective experience in companies. All the reflections are based on observations made throughout our interactions.
The engineers at LuxSpace deemed OPAM to be an improvement over their current practice as it allows them to perform domain-specific trade-off analysis among Pareto solutions and is useful in practice to support decision making with respect to their task design. Encouraged by the promising results, we are now applying OPAM to new systems in collaboration with LuxSpace.
The answer to RQ4 is that OPAM helps optimize priority assignments such that they outperform those manually defined by engineers based on domain expertise. Our results show that OPAM, compared to current practice, increases safety margins, on average, by 5.33%.
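The relative changes reported for the minimum and maximum safety margins can be recomputed from the Table 6 values quoted in the text. The sketch below assumes that the change is expressed relative to the absolute value of the engineers' margin, which reproduces the reported 208.09%; the helper name is hypothetical.

```python
def relative_change_pct(baseline: float, new: float) -> float:
    """Percentage change from `baseline` to `new`, relative to |baseline|."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / abs(baseline) * 100.0


# Minimum safety margins of periodic tasks (ms): engineers vs. OPAM.
periodic_min_change = relative_change_pct(-44.5, 48.1)

# Maximum safety margins of aperiodic tasks (ms): engineers vs. OPAM.
aperiodic_max_diff_ms = 59710.3 - 59707.2
aperiodic_max_change = relative_change_pct(59710.3, 59707.2)

print(f"periodic minimum margin:  {periodic_min_change:+.2f}%")
print(f"aperiodic maximum margin: {aperiodic_max_change:+.2f}% "
      f"({aperiodic_max_diff_ms:.1f}ms lower)")
```

Dividing by the absolute value of the baseline keeps the sign of the result meaningful when the baseline margin is negative, i.e., when the baseline corresponds to a deadline miss.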

Threats to Validity
To mitigate the main threats that arise from not accounting for random variation, we compared OPAM against RS under identical parameter settings. We present all the underlying parameters and provide the full package of our experiments to facilitate replication. Also, we ran OPAM 50 times for each study subject and compared results using statistical analysis, i.e., Mann-Whitney U-test and Vargha and Delaney's Â12.
We note that there are prior studies that aim at optimizing priority assignments such as OPA (Audsley, 1991) and RPA (Davis and Burns, 2007). However, to our knowledge, none of the existing works offers ways to analyze trade-offs among equally viable priority assignments with respect to safety margins and the satisfaction of constraints. Nevertheless, we attempted to compare OPAM with an extension of an existing method, e.g., RPA (Davis and Burns, 2007). To do so, we first applied an exhaustive schedulability analysis technique to the ESAIL subject, our motivating case study, in order to verify whether the ESAIL tasks are schedulable for a given priority assignment. Note that existing priority assignment techniques are built on such schedulability analysis methods, which are therefore a prerequisite. We chose UPPAAL (Behrmann et al., 2004), a model checker, for schedulability analysis as it has been used in real-time system studies (Mikučionis et al., 2010; Yu et al., 2010; Yalcinkaya et al., 2019). However, our experiment results using UPPAAL for ESAIL showed that it was not able to complete the analysis task, even after 5 days of execution, for a single priority assignment. We were therefore not able to perform experimental comparisons with existing priority assignment methods. Since this evaluation is not the main focus of this article, we point the reader to the UPPAAL specification of ESAIL available online (Lee et al., 2021).
Recall from Section 6.2 that OPAM assigns tasks' WCETs to their execution times when it simulates the worst-case executions of tasks while varying task arrival times. In many real-time systems studies (Briand et al., 2005; Guan et al., 2009; Lin et al., 2009; Anssi et al., 2011; Zeng et al., 2014; Di Alesio et al., 2015; Dürr et al., 2019), static WCETs are often used instead of varying task execution times for the purpose of real-time analysis. For example, practitioners typically use WCETs to estimate the lowest bound of CPU utilization required to properly apply the rate monotonic scheduling policy (Fineberg and Serlin, 1967) to their systems. Similarly, OPAM assumes that near-worst-case schedule scenarios can be simulated by assigning tasks' WCETs to their execution times and varying tasks' arrival times using search. A near-worst-case schedule scenario entails that the magnitude of deadline misses is maximized when tasks execute as per this scenario. Under this working assumption, we were able to empirically evaluate the sanity, coevolution, scalability, and usefulness aspects of OPAM (see Section 7). The results indicate that OPAM is a promising and useful tool. However, the formal proof of whether or not the WCET assumption holds in the system model described in Section 3 requires complex analysis, accounting for varying task arrival times, triggering relationships, resource dependencies, and multiple cores. When task execution times need to be varied during simulation, engineers can adapt OPAM by utilizing Monte-Carlo simulation (Kroese et al., 2014) to account for such variations.
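The WCET working assumption can be illustrated with a deliberately simplified sketch: a single-core, fixed-priority, preemptive simulation in discrete ticks, in which every job runs for exactly its WCET and only arrival times and priorities vary. This is not the simulator described in Section 6; the class, the function names, the convention that a larger number means a higher priority, and the task parameters are all hypothetical, and triggering relationships, resource dependencies, and multiple cores are ignored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    task: str
    priority: int      # assumption: larger number = higher priority
    arrival: int       # arrival time in ticks
    wcet: int          # execution time is fixed to the task's WCET
    deadline: int      # relative deadline in ticks
    remaining: int = 0
    finish: Optional[int] = None


def min_safety_margins(jobs: List[Job], horizon: int) -> Dict[str, int]:
    """Return, per task, the smallest (deadline - completion time) over its jobs."""
    for job in jobs:
        job.remaining = job.wcet
        job.finish = None
    for t in range(horizon):
        ready = [j for j in jobs if j.arrival <= t and j.remaining > 0]
        if not ready:
            continue  # the core idles during this tick
        running = max(ready, key=lambda j: j.priority)  # preemptive choice
        running.remaining -= 1
        if running.remaining == 0:
            running.finish = t + 1
    margins: Dict[str, int] = {}
    for job in jobs:
        # Unfinished jobs are charged the horizon, an optimistic bound.
        end = job.finish if job.finish is not None else horizon
        margin = job.arrival + job.deadline - end
        margins[job.task] = min(margin, margins.get(job.task, margin))
    return margins


def build_jobs(prio_a: int, prio_b: int) -> List[Job]:
    """Two hypothetical tasks: A arrives at ticks 0 and 5, B arrives at tick 0."""
    return [
        Job("A", prio_a, arrival=0, wcet=2, deadline=4),
        Job("A", prio_a, arrival=5, wcet=2, deadline=4),
        Job("B", prio_b, arrival=0, wcet=3, deadline=6),
    ]


margins_a_high = min_safety_margins(build_jobs(prio_a=2, prio_b=1), horizon=20)
margins_b_high = min_safety_margins(build_jobs(prio_a=1, prio_b=2), horizon=20)
```

In this toy setting, a negative margin corresponds to a deadline miss, and swapping the two priorities changes which task, if any, misses its deadline. To vary execution times, one option would be to replace the fixed `wcet` with values sampled per run and repeat the simulation, in the spirit of the Monte-Carlo adaptation mentioned above.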
The main threat to external validity is that our results may not generalize to other systems. We mitigate potential biases and errors in our experiments by drawing on real industrial subjects from different domains and several synthetic subjects. Specifically, we selected two subjects from the aerospace domain, two from the automotive domain, and two from the avionics domain. The positive feedback obtained from LuxSpace and the encouraging results from our industrial case studies indicate that OPAM is a scalable and practical solution. Furthermore, we believe OPAM introduces a promising avenue for addressing the problem of priority assignment by applying coevolutionary algorithms, even for systems that use other scheduling policies, e.g., priority inheritance. In order for OPAM to support different scheduling policies, the main requirement is to replace the existing simulator (described in Section 6) with a new simulator supporting the desired scheduling policy. In our approach, the coevolution part of OPAM is separated from the scheduling policy, which is contained in the simulator. Hence, we deem the expected changes for the coevolution part of OPAM to be minimal. Future studies are nevertheless necessary to investigate how OPAM can be adapted to find near-optimal priority assignments for other real-time systems in different contexts.

Conclusion
We developed OPAM, a priority assignment method for real-time systems, that aims to find equally viable priority assignments that maximize the magnitude of safety margins and the extent to which engineering constraints are satisfied. OPAM uses a novel approach, based on multi-objective, competitive coevolutionary search, that simultaneously evolves different species, i.e., populations of priority assignments and stress test scenarios, that compete with one another with opposite objectives, the former trying to minimize chances of deadline misses while the latter attempts to maximize them. We evaluated OPAM on a number of synthetic systems as well as six industrial systems from different domains. The results indicate that OPAM is able to find significantly better solutions than both those manually defined by engineers based on expert knowledge and those obtained by our baselines: random search and sequential search. Further, OPAM scales linearly with the number of tasks in a system and the time required to simulate task executions. Execution times on our industrial systems are practically acceptable.
In the future, we will continue to study the problem of optimal priority assignment by accounting for (1) priority assignments that change dynamically, (2) WCET value ranges that account for non-deterministic computation times, (3) interrupt handling routines that execute differently compared to real-time tasks, and (4) hybrid scheduling policies that combine multiple standard policies. We also plan to develop a real-time task modeling language to specify task characteristics such as resource dependencies, triggering relationships, engineering constraints, and behaviors of real-time tasks and to facilitate real-time system analysis, e.g., optimal priority assignment and schedulability analysis. In addition, we would like to incorporate additional analysis capabilities into OPAM in order to verify whether or not a system satisfies the required properties, e.g., schedulability of tasks and absence of deadlocks, for a given priority assignment. For example, statistical model checking (Legay et al., 2010) may allow us to verify whether tasks meet their deadlines for a given priority assignment with a probabilistic guarantee. In the long term, we plan to more conclusively validate the usefulness of OPAM by applying it to additional case studies in different application domains.