
1 Introduction

Often in parallel and distributed computing (PDC) environments, the applications are expected to undergo variations in workload, and the underlying computational resource is considered to be non-dedicated at runtime. Therefore, appropriate initial resource allocation algorithms are required for an efficient mapping of applications to machines. In addition to the traditional performance metrics (execution time, speedup, efficiency, and others), there is a need for the study of a metric of the robustness of resource allocations to guarantee a desired level of performance. Performance modeling and evaluation is often utilized to understand the behavior of concurrent and parallel computing and communication systems by identifying features of the system that are sensitive from a performance point of view. When compared to direct experiments and simulations, numerical models and the corresponding analyses are easier to reproduce, do not incur any setup or installation costs, do not impose any prerequisites for learning a simulation framework, and are not limited by the complexity of the underlying infrastructure or simulation libraries. To the best of our knowledge, performance modeling for evaluating response times of static resource allocations in parallel and distributed computing systems and the related robustness analysis remained an open problem until the authors introduced the first solution by utilizing the performance evaluation process algebra (PEPA) [5] for analyzing the robustness of static resource allocations [22,23,24].

The main contribution of this work is to study a number of static resource allocations via a PEPA-based numerical analysis of performance models of the parallel execution of applications with varying workloads on non-dedicated parallel computing resources with varying machine availability. The robustness of the resource allocations is evaluated against the compound effect, defined as the combination of the impacts of variations in application workload and in machine availability on system performance. Note that the terms processor availability and machine availability are used interchangeably throughout the paper. The prior validation of our PEPA models, confirming that the results of our numerical analysis agree with the experimental results reported in the existing literature [1], is illustrated in Fig. 1. However, in our prior validation work [23], only the variability in problem characteristics was considered, to mimic the experiments in [1].

Fig. 1.

A comparative analysis of the numerical results of performance modeling with existing simulation results used for validation of approach in [23].

In this work, the robustness value for each resource allocation is calculated as the minimum probability that the machines finish before a desired makespan goal. Based on the analysis of the results, a number of mappings yield equal execution performance. However, a significant improvement in the robustness value (up to 143%) among the mappings yielding equal parallel execution times has been observed. Thus, this notable 2.43-fold difference in robustness strongly indicates the benefit of selecting one mapping over another for guaranteeing the best execution performance. The work with PEPA can also be extended to modeling a single utility function that includes metrics such as robustness, power consumption, and others. In general, this work is applicable to various types of computing and communication environments.

The rest of the paper is organized as follows. A description of the robustness of resource allocations in parallel and distributed computing systems and of the performance modeling techniques for numerical analysis of performance in parallel computing systems are given in Sect. 2. The design and the use of PEPA for the numerical evaluation of the analytical models of resource allocations in parallel computing systems are presented in Sect. 3. A comparison of the numerical results of robustness of a number of resource allocations obtained via analytical modeling using the PEPA tool, is discussed in Sect. 4. Conclusions and possible future work are summarized in Sect. 5.

2 Background and Related Work

In general, the mapping problem, defined as finding the best allocation of independent tasks (or applications) onto a set of parallel processors, is known to be NP-complete [6,7,8]. A number of research efforts have been made towards achieving robust mapping, or resource allocation, techniques in parallel and distributed computing systems. Key work in this area is reviewed in this section, together with a survey of work on the performance evaluation of computer and communication systems using analytical and numerical modeling.

2.1 Robustness of Resource Allocation and Application Scheduling

The initial work on robust scheduling originated from job-shop application scheduling frameworks [9]. A standard branch-and-bound approach was used to solve the NP-hard robust scheduling problem (RSP) to obtain a robust schedule for N independent jobs on a single machine [10]. A number of approaches have been developed to obtain an initial robust resource allocation by utilizing optimization techniques such as stochastic mixed integer programming [11], iterative integer programming [12], and others. In addition to designing robust resource allocations, robustness metrics have also been formulated to study the performance guarantee of available static resource allocations against possible inadequate or variable computational environmental factors. A general methodology, called the Feature Perturbation Impact Analysis (FePIA) procedure, for developing robustness metrics for resource allocation has been presented by Ali et al. [1]. The authors define a resource allocation to be robust with respect to specific system performance features against perturbations (uncertainties) in specified system parameters if degradation in these features is constrained when limited perturbations occur [1, 2]. To investigate the robustness of scheduling techniques at the application level, complementing studies conducted at the system level for a holistic view of robustness, research has also been conducted on the robustness of a number of dynamic loop scheduling (DLS) algorithms, which are effective for the dynamic scheduling of scientific applications on large-scale parallel and distributed systems, in the presence of varying processor loads (where robustness is quantified using the flexibility metric) and processor failures (where robustness is quantified using the resilience metric) [3, 4].

2.2 Process Algebra for Performance Evaluation

Analytical and numerical modeling for performance evaluation allows derivation of an expression of the performance feature of interest in terms of the input parameters of the model. In the case of a predictive analysis of a computing system, analytical models generally provide the best insight into the effects of various perturbation parameters on system performance, and are easier to replicate for a comparative analysis of different systems. Markovian models have been shown to be an effective tool for performance analysis of computer and communication systems, where the system components are modeled as Markov processes, and the overall performance (for example, throughput, resource utilization, and others) is evaluated upon the numerical analysis of these Markov processes [15]. Process algebras are abstract languages used for the specification and design of such systems.

PEPA, a stochastic process algebra, has been used for performance modeling and analysis of a wide range of concurrent systems. In a recent research study related to the scheduling of pipeline applications on grid resources, the PEPA workbench [16] has been used to solve the performance models of a scheduling system to obtain relevant performance information required for enhancing the execution performance of pipeline applications executing on the allocated grid resources (processors and network links) [13, 14]. The PEPA workbench is used to calculate the performance feature, namely the throughput of executing pipeline applications, which is obtained when employing the modeled mapping for scheduling the pipeline applications onto grid resources [13, 14].

Another important performance measure is the response time, which is prominently used for analyzing the performance of a resource allocation or a task scheduling system in parallel and distributed computing, where applications are bound by time constraints (such as an execution deadline). A research direction towards evaluating response time profiles has been given in the work performed on evaluating PEPA models via an ordinary differential equation (ODE) analysis [14]. The functionality for evaluating response time profiles via passage time analysis has been implemented in the Imperial PEPA Compiler (IPC) tool [18], and in the PEPA workbench [16]. However, to the best of our knowledge, performance modeling for evaluating the response times of resource allocations in parallel and distributed computing systems and their related robustness analysis remained an open problem until the authors introduced the first solution by utilizing PEPA for analyzing the robustness of static resource allocations in [22, 23].

3 Methodology: Performance Modeling Using PEPA for Robustness Analysis

The target applications for this study are considered to be independent parallel applications waiting in a job queue for execution. Definitions of the notations used in the following description of the methodology are given in Table 1.

Table 1. Table of notations

Each application receives a data set generated by three heterogeneous sensors that produce the workloads (\(\lambda _1\), \(\lambda _2\), and \(\lambda _3\)). The underlying computational system consists of parallel machines (that contain K heterogeneous processors, where K varies from machine to machine). Each machine has an associated availability factor (\(\eta _i\)), which is the computational availability of the allocated machine for executing an application. Further, a set of possible resource allocations is considered to be available for an initial mapping of applications to machines based on an expected time to compute (ETC) matrix. In general, in a parallel and distributed computing environment, it is realistic to assume that the ETC values of applications on all the available machines are known a priori. Often, these estimates are derived from application profiling and machine benchmarking, from previous executions of an application on a machine, or are provided by the user [19, 20]. All applications in the job queue are assumed to start executing at time \(t = 0\) s. In this work, the perturbation parameter considered for the robustness analysis of resource allocations is defined as the compound effect of perturbations generated from runtime variations in both the application workload and the processor availability of a machine.
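To make the setup concrete, the sketch below estimates per-machine finishing times from a small, hypothetical ETC matrix and availability factors, following the convention used later in Eqs. 3 and 4, where the availability factor scales the compute time. All names and numbers here are illustrative, not values from this study.

```python
# ETC[i][j]: expected time to compute application i on machine j (hypothetical)
ETC = [
    [10.0, 14.0],
    [ 8.0, 11.0],
    [12.0,  9.0],
]
eta = [1.0, 0.8]       # availability factor per machine (hypothetical)
mapping = [0, 1, 1]    # application i -> machine index

def machine_finish_times(etc, availability, mapping):
    """Sum the availability-scaled compute times of the applications
    mapped to each machine; all applications start at t = 0 s."""
    finish = [0.0] * len(availability)
    for app, machine in enumerate(mapping):
        finish[machine] += etc[app][machine] * availability[machine]
    return finish

print(machine_finish_times(ETC, eta, mapping))  # ~ [10.0, 16.0]
```

The per-machine sums directly give the quantities whose distributions are later analyzed for robustness.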

Given A: a set of parallel applications, \(a_i \in A\): a parallel application, \(\lambda _i\): the workload of application \(a_i\), \(\beta _i^{max}\): a user-defined makespan goal for \(a_i\), M: a set of parallel machines, \(M_j \in M\): the machine allocated to \(a_i\), \(\hat{\lambda _i}\): a perturbation parameter defined as the workload variation from the initial workload (\(\lambda _i\)) of an application \(a_i\), \(\hat{\eta _j}\): a perturbation parameter defined as the machine availability variation from the initial machine availability (\(\eta _j\)) for an application \(a_i\), and \(F_i(M_j(\hat{\eta _j}), \hat{\lambda _i})\): the finishing time of application \(a_i\) on machine \(M_j\), the robustness (\(\psi \)) of a mapping is formulated as in Eq. 1.

$$\begin{aligned} \psi = \min _{\forall i \in A}\Pr [F_i(M_j(\hat{\eta _j}), \hat{\lambda _i}) \le \beta _i^{max}] \end{aligned}$$
(1)
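As an illustration of Eq. 1, once a finishing-time probability \(\Pr[F_i \le \beta_i^{max}]\) is available for each application, the robustness is simply their minimum. The following is a minimal sketch with hypothetical probability values, not the PEPA-based passage time computation used in this work.

```python
def robustness(finish_probabilities):
    """Eq. 1: psi is the minimum, over all applications, of the
    probability that the application finishes by its makespan goal.
    In this work these probabilities come from PEPA passage time
    analysis; the values used below are hypothetical."""
    return min(finish_probabilities)

# Hypothetical Pr[F_i <= beta_i_max] for three applications:
psi = robustness([0.92, 0.75, 0.88])
print(psi)  # -> 0.75
```

The mapping with the largest such minimum is the most robust one, which is the selection criterion applied in Sect. 4.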

The goal of the robustness analysis is to find a resource allocation that maximizes the robustness of the execution of the applications on the allocated parallel machines. An example PEPA model of a mapping system for two applications (\(a_1, a_2\)) and five processors (\(P_0, P_1, P_2, P_3, P_4\)) distributed among two machines (\(M_1, M_2\)) has been described below. For reference, a detailed description of the PEPA language and the language operators can be found in [5, 17].

(PEPA model listing of the example mapping system)

where \(\mathcal {L}=\{compute_1, compute_2\}\).

The two applications (\(a_1\) and \(a_2\)) engage in their compute activities. In the above model, the first two statements model the two applications when they are not allocated to any processor. \(\infty \) is a predefined symbol in PEPA that denotes an unknown rate for an activity; it is used when a system is carrying out some action (or sequence of actions) whose rate is unavailable at the given time. Later, when the processors are modeled, a processor engages in the compute activities of all the applications mapped to it. At this point, the unknown activity rate (\(\infty \)) is replaced with actual activity rates (\(r_1\) and \(r_2\) in our example). The rate (\(r_1\) or \(r_2\)) of the compute activity is calculated as a function of the speed of the processors in the machine allocated to the application and the workload of that application.

The PEPA model is provided as an input file to the PEPA workbench [16]. Each application and machine PEPA component in the model is translated into an underlying mathematical Markovian model by the PEPA workbench. A more detailed description of the functionality of the PEPA workbench for our study can be found in [21]. The robustness of the modeled resource allocation is obtained as the probability of attaining a predefined makespan value, which is calculated by a passage time analysis [18] of the computational activities of all the machine components in the generated Markovian model. The passage time analysis generates a cumulative distribution function (CDF) of the passage time (\(T_p\)) from a source state (\(S_s\)) into a non-empty set of target states (\(S_T\)), such that,

$$\begin{aligned} T_p = \inf \{u>0: S_s(u) \in S_T|S_s(0) = \text {initial state}\} \end{aligned}$$
(2)

The CDF is generated by a convolution of the state holding times over all possible paths from a state \(i \in S_s\) into any of the states in the set \(S_T\) [18]. For the passage time analysis, the stop time is analogous to the user-specified makespan goal, \(\beta _i^{max}\). The solution is obtained as a CDF of the probability of machine finishing times for the modeled resource allocation. The robustness of the resource allocation, as formulated in Eq. 1, is obtained from the generated CDFs as the minimum probability of achieving a user-defined makespan goal.
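When the path through the underlying CTMC reduces to a chain of exponentially distributed holding times, the passage time CDF described above can be approximated numerically; the sketch below estimates such a CDF by Monte Carlo sampling with assumed rates. The PEPA tools compute these CDFs analytically, so this is only an illustration of the concept.

```python
import bisect
import random

def passage_time_cdf(holding_rates, n_samples=100_000, seed=42):
    """Estimate the CDF of the passage time through a linear chain of
    states, each with an exponentially distributed holding time.
    holding_rates are assumed, illustrative activity rates."""
    rng = random.Random(seed)
    # Each sampled passage time is the sum of the holding times
    # along the path from the source state to the target state.
    samples = sorted(
        sum(rng.expovariate(rate) for rate in holding_rates)
        for _ in range(n_samples)
    )
    def cdf(t):
        # Fraction of sampled passage times that do not exceed t.
        return bisect.bisect_right(samples, t) / len(samples)
    return cdf

cdf = passage_time_cdf([0.5, 0.25])  # two states, hypothetical rates
print(round(cdf(10.0), 2))           # Pr[T_p <= 10], close to 0.84
```

Evaluating the estimated CDF at the stop time plays the role of evaluating \(\Pr[F_i \le \beta_i^{max}]\) in the robustness computation.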

4 Results and Analysis: Robustness Analysis via Numerical Modeling of Resource Allocations

In this study, a number of PEPA models have been generated for several feasible mappings of applications onto parallel heterogeneous machines. The variations in the workload values are sampled from a uniform probability distribution for all the applications. For this study, a Python function has been used, with different parameter values resulting in different mappings for a given number of applications and machines. The variations in application workload and machine availability can be sampled from any probability distribution model that can capture the static features of real workloads and machine characteristics at a particular point in time.
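The exact Python routine and its arguments are not named in the text above, so the following is only a hypothetical reconstruction of how random mappings and uniformly sampled workloads could be generated; the function and parameter names are ours, not the paper's.

```python
import random

def random_mapping(n_apps, n_machines, seed):
    """Hypothetical reconstruction: assign each application to a
    machine uniformly at random; different seeds yield different
    mappings for the same numbers of applications and machines."""
    rng = random.Random(seed)
    return [rng.randrange(n_machines) for _ in range(n_apps)]

def sample_workloads(n_apps, low, high, seed):
    """Sample application workloads from a uniform distribution,
    as described for the workload variations in this study."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_apps)]

mapping = random_mapping(16, 4, seed=1)              # 16 applications, 4 machines
workloads = sample_workloads(16, 10.0, 20.0, seed=1)  # illustrative bounds
```

Fixing the seed makes each generated mapping reproducible, so the same set of candidate mappings can be re-modeled in PEPA.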

In PEPA, a choice operator permits a model component to exist in one of the many possible states of the component [5]. For example, for a component \(P := P1 + P2\), P can only exist in state P1 or P2. In this work, the left hand side of the \(+\) operator models the ideal execution scenario in the absence of perturbations in the sensor workload and the machine availabilities, where \(\hat{\lambda _i} = \lambda _i, \forall i \in \{1,2,3\}\) and \(\hat{\eta _j} = \eta _j\), for all machines. The right hand side of the \(+\) operator models the perturbed execution scenario generated by a compound effect of variations in application workload and machine availabilities. The compound effect is modeled as simultaneous equal variations in one or more sensor workload(s) across all applications, where \(\hat{\lambda _i} \ne \lambda _i\), and equal variations in processor availability across all applications, where \(\hat{\eta _j} \ne \eta _j\). The rates of the compute activities are calculated as a function of \(\lambda _i\) and the actual computation times \(T_{ij}\) of each application on the machine where it is mapped. Further, \(T_{ij}\) is calculated as a function of the estimated variation in application workload (\(\hat{\lambda _i}\)) and the estimated variation in the machine availabilities (\(\hat{\eta _j}\)), as defined in Table 1. The values of initial rates (\(r_1, r_2, \cdots \)), which remain constant during runtime, are calculated as a function of the initial machine availability factor (\(\eta _j\)), as described later in Sect. 4.1. The values of perturbed rates (\(p_1, p_2, \cdots \)), which vary during runtime, are calculated as a function of the varied application workload (\(\hat{\lambda _i}\)) and the varied machine availability factor (\(\hat{\eta _j}\)), as described later in Sect. 4.2.

The models generated for this study represent a resource allocation system that has a higher load imbalance factor at runtime when compared to the analysis in our prior work [21,22,23,24]. Herein, the authors have analyzed the robustness of several mappings to study the impact of heightened load imbalance in computational systems with (i) 16 applications and 4 machines, (ii) 20 applications and 5 machines (for validation against our prior work), and (iii) 32 applications and 8 machines. Due to the state space explosion limitation of the CTMC analysis in PEPA, the authors successfully modeled only systems of no more than 8 machines. Research work is ongoing to explore different implementations of PEPA that will improve the scalability of the numerical modeling and analysis of the robustness of resource allocations for larger PDC systems and applications.

4.1 Deriving PEPA Activity Rates in an Ideal Computing Environment (\(\hat{\lambda _i} = \lambda _i\) and \(\hat{\eta _j} = \eta _j\))

The calculation of the initial rate is given in Eq. 3, where \(T_{ij}\) is the actual time to compute an application i on machine j with the initial availability (\(\eta _j\)) that remains constant at runtime, and \({\lambda _i}\) is calculated as a function of the initial sensor loads \(\lambda _1, \lambda _2, \lambda _3\).

$$\begin{aligned} r_i = \frac{\lambda _i}{T_{ij}} \;\;\;\; \forall i,j, \; \text{ where } \; T_{ij} = \lambda _i \times \eta _j \end{aligned}$$
(3)

\(T_{ij}\) is calculated as the product of the runtime workload of the application (\(\lambda _i\)) and the machine availability factor (\(\eta _j\)). The rate \(r_i\) is calculated as the ratio of the application workload \(\lambda _i\) to \(T_{ij}\). In the ideal execution scenario, \(\hat{\lambda _i} = \lambda _i, \forall i \in \{1,2,3\}\). Therefore, \(T_{ij}\) is equal to the initial ETC values and, consequently, the rates (\(r_1, r_2, \cdots \)) are calculated using only the initial machine availability (\(\eta _j\)).

4.2 Deriving PEPA Activity Rates in a Perturbed Computing Environment (\(\hat{\lambda _i} \ne \lambda _i\) and \(\hat{\eta _j} \ne \eta _j\))

The calculation of the perturbed rate is given in Eq. 4, where \(T_{ij}\) is the actual time to compute an application i on machine j with the varied runtime availability (\(\hat{\eta _j}\)), and \(\hat{\lambda _i}\) is calculated as a function of the varying sensor loads \(\hat{\lambda _1}, \hat{\lambda _2}, \hat{\lambda _3}\), and \(\lambda _i\) is calculated as a function of the initial sensor loads \(\lambda _1, \lambda _2, \lambda _3\).

$$\begin{aligned} p_i = \frac{\lambda _i}{T_{ij}} \;\;\;\; \forall i,j, \; \text{ where } \; T_{ij} = \hat{\lambda _i} \times \hat{\eta _j} \end{aligned}$$
(4)

\(T_{ij}\) is calculated as the product of the estimated runtime application workload (\(\hat{\lambda _i}\)) and the runtime machine availability factor (\(\hat{\eta _j}\)). Hence, \(p_i\) is calculated as the ratio of the initial application workload \(\lambda _i\) to \(T_{ij}\).
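Equations 3 and 4 translate directly into code. The sketch below computes the ideal and perturbed activity rates from illustrative workload and availability values; the variable names are ours.

```python
def ideal_rate(workload, availability):
    """Eq. 3: r_i = lambda_i / T_ij, where T_ij = lambda_i * eta_j;
    in the ideal scenario this simplifies to 1 / eta_j."""
    t_ij = workload * availability
    return workload / t_ij

def perturbed_rate(workload, workload_hat, availability_hat):
    """Eq. 4: p_i = lambda_i / T_ij, where T_ij uses the perturbed
    workload (lambda_i hat) and availability (eta_j hat)."""
    t_ij = workload_hat * availability_hat
    return workload / t_ij

r = ideal_rate(12.0, 0.8)              # 12 / (12 * 0.8) = 1.25
p = perturbed_rate(12.0, 15.0, 0.6)    # 12 / (15 * 0.6) ~ 1.33
```

A perturbed rate lower than the ideal rate corresponds to a slowdown caused by the compound effect of the workload and availability variations.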

4.3 Numerical Analysis of Performance Modeling of Resource Allocations Using the PEPA Workbench

The model file is compiled and solved using the PEPA workbench tool. The state space derived using the Markovian analysis provided by the PEPA workbench represents the continuous time Markov chain (CTMC) processes for each component of the modeled resource allocation.

Once the state space of all the components (CTMC processes of applications and machines) of the PEPA model is generated, the tool allows the modeler to specify the type of Markovian analysis to be used for solving the generated Markov models to derive performance measures [5, 21]. In this work, we choose the passage time analysis, which solves the Markov models using the timing information associated with the activity rates to derive performance measures, such as makespan and response time [18]. In this study, the passage time analysis of the Markov models for the resource allocations yields CDFs of the machine finishing times, \(F_i(M_j(\hat{\eta _j}), \hat{\lambda _i})\), as passage times between the states associated with applications assigned to a machine. An example of the CDF of the finishing time of machine \(M_1\) in one of our randomly generated resource allocation models is shown in Fig. 2. A comparison of the finishing times of each machine with respect to the four mappings (selected from a larger set of generated mappings due to space constraints) is illustrated in Fig. 3 for two different execution scenarios, (i) 16 applications and 4 heterogeneous machines, and (ii) 20 applications and 5 heterogeneous machines.

Fig. 2.

Cumulative distribution function (CDF) of the finishing time of machine \(M_1\) for a resource allocation mapping 20 applications onto 5 machines.

Fig. 3.

A comparative analysis of the finishing times of each machine derived from the execution of 4 different mappings of (a) 16 applications onto 4 heterogeneous machines, and (b) 20 applications onto 5 heterogeneous machines

A key benefit resulting from the performance modeling and analysis in this work is to identify the most robust resource allocation among the ones delivering equal execution performance. The robustness metric \(\psi _x\) is calculated using Eq. 5. For this study, the makespan goal is set as \(\beta _i^{max} = 45\) s.

$$\begin{aligned} \psi _x = \min _{\forall i,j \, \in \, Mapping_x} \Pr [F_i(M_j(\hat{\eta _j}), \hat{\lambda _i}) \le 45] \end{aligned}$$
(5)

The robustness value for each machine is calculated as the probability that the machine finishes before the makespan goal. A comparison of the robustness values of all the machines for each of the four mappings of (i) 16 applications to 4 heterogeneous machines, and (ii) 20 applications to 5 heterogeneous machines is illustrated in Fig. 4.
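Given the per-machine probabilities of finishing before the makespan goal, Eq. 5 reduces each mapping to a single robustness value, from which the most robust of the equally fast mappings can be chosen. The probabilities below are illustrative placeholders, not the values from Fig. 4.

```python
def mapping_robustness(machine_finish_probs):
    """Eq. 5: psi_x is the minimum, over the machines of a mapping,
    of the probability of finishing before the makespan goal (45 s)."""
    return min(machine_finish_probs)

# Two hypothetical mappings with equal makespan but different robustness:
psi_a = mapping_robustness([0.95, 0.35, 0.90, 0.88])
psi_b = mapping_robustness([0.91, 0.85, 0.89, 0.93])
best = "b" if psi_b > psi_a else "a"
print(best, psi_b / psi_a)  # mapping b is about 2.4x more robust
```

A single weakly performing machine dominates the robustness of a mapping, which is why mappings with identical makespans can differ so widely in \(\psi_x\).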

Fig. 4.

A comparative analysis of the robustness values obtained for each machine for the four mappings of (a) 16 applications onto 4 heterogeneous machines, and (b) 20 applications onto 5 heterogeneous machines. The makespan goal is set as \(\beta _i^{max} = 45\) s.

The results illustrated in Figs. 5a and 5b indicate that for both test cases of resource allocations modeled, one for a PDC system with 16 applications and 4 heterogeneous machines, and the other for a PDC system with 20 applications and 5 heterogeneous machines, resource allocations promising better parallel execution time than the others can be identified. Moreover, the fastest resource allocations that deliver equal execution makespan also differ vastly in their degrees of robustness. For example, two of the mappings in Fig. 5b deliver equal execution performance in terms of system makespan. However, one of them is 2.43 times more robust than the other; that is, its robustness value increases significantly, by 143%, over that of the other. For highly critical applications, this substantial improvement in robustness enables a more informed decision for selecting the most appropriate mapping that can withstand the runtime perturbations in application and system parameters in a parallel computing environment and can guarantee the best execution performance. For example, in this modeling study, the more robust of the two mappings is the better choice for an initial allocation in terms of achieving a set makespan goal in the presence of runtime perturbations caused by a compound effect of variations in application workload and machine availability.

Fig. 5.

A comparison between the robustness values and the performance in terms of the system makespan of the four mappings of (a) 16 applications onto 4 heterogeneous machines, and (b) 20 applications onto 5 heterogeneous machines.

5 Conclusions and Future Work

The analytical study performed in this work to obtain the robustness of static resource allocations, which are analytically modeled and numerically solved using PEPA, provides a useful tool that can be incorporated in the design phase of a resource allocation system in a parallel and distributed computing environment. Although the analysis is limited to static resource allocations, a key benefit of this study is the finding that, in the case of resource allocations promising equal execution performance, the numerical evaluation of the PEPA models enables the selection of the most robust resource allocation in the presence of runtime perturbations caused by a compound effect arising from dynamic variations in application workload and machine availability. For each of the several modeled resource allocations, a numerical analysis of the PEPA models yields the execution performance as the parallel execution time. The robustness value for each mapping is calculated as the minimum probability that the machines finish before a desired makespan goal. The results presented in Sect. 4 indicate that two or more different mappings yield equal execution performance. However, a significant improvement (up to 143%) in the robustness value among the mappings yielding equal parallel execution times has been observed. Thus, this substantial increase in robustness strongly suggests the selection of one mapping over the other for guaranteeing the best execution performance.

To the best of our knowledge, this work is the first effort towards modeling the execution of applications on heterogeneous machines by simultaneously incorporating variations in both application workload and machine availabilities, considered together as a compound perturbation. To facilitate and ease the reproducibility of our research, we provide a Singularity container of the PEPA Workbench that has been validated to produce identical results to the non-containerized version of the PEPA Workbench [25]. The PEPA model and build recipe for the Singularity container are available at https://github.com/williamssanders/pepa, and the container is publicly available at https://www.singularity-hub.org/collections/2351.

In the future, the authors plan to continue this study towards performance modeling of resource allocations and numerical evaluation of their robustness on larger scale PDC systems and data sets from real scientific applications. To facilitate a scalable evaluation, the authors plan to investigate a number of other existing implementations of PEPA that are not limited by state space explosion. Further, the authors also plan to explore an integration of the PEPA models into a runtime scheduler/controller in a model-based framework, where the embedded models can be re-evaluated with minimal overhead when a system parameter changes at runtime.