Abstract
In real-time systems with highly variable task execution times, simplistic task models are insufficient to accurately model and analyze the system. Variability can be tackled using distributions rather than a single value, but the proper characterization depends on the degree of variability. Self-similarity is one of the deepest kinds of variability: it characterizes a workload that is not only highly variable but also bursty on many time scales. This paper identifies the situations in which this source of indeterminism can appear in a real-time system: the combination of variability in task inter-arrival times and execution times. Although self-similarity is not claimed for all systems with variable execution times, it is not unusual in some applications with real-time requirements, such as video processing, networking, and gaming.
The paper shows how to properly model and analyze self-similar task sets and how improper modeling can mask deadline misses. It derives an analytical expression for the dependence of the deadline miss ratio on the degree of self-similarity and proves its negative impact on real-time system performance through system modeling and simulation. This study of the nature and impact of self-similarity on soft real-time systems can help to reduce its effects, to choose proper scheduling policies, and to avoid its causes at system design time.
Notes
Note that the aggregate process of \(W_t\) can be obtained as \(W_{t}^{(m)}=\int_{(t-1)Tm}^{tTm}A(u)\,du\).
Assuming that all deadlines are the same is a weak condition. For different values of \(D_i\), we can obtain the same results if we select, for example, the minimum deadline \(D=\min_i D_i\) (a tight condition) or the maximum deadline \(D=\max_i D_i\) (a loose condition).
References
Abdelzaher TF, Sharma V, Lu C (2004) A utilization bound for aperiodic tasks and priority driven scheduling. IEEE Trans Comput 53(3):334–350
Abeni L, Buttazzo G (1999) QoS guarantee using probabilistic deadlines. In: Proc of the Euromicro conference on real-time systems
Abeni L, Buttazzo G (2004) Resource reservation in dynamic real-time systems. Real-Time Syst 37(2):123–167
Anantharam V (1999) Scheduling strategies and long-range dependence. Queueing Syst 33(1–3):73–89
Beran J (1994) Statistics for long-memory processes. Chapman and Hall, London
Beran J, Sherman R, Taqqu M, Willinger W (1995) Long-range dependence in variable-bit-rate video traffic. IEEE Trans Commun 43(2):1566–1579
Boxma O, Zwart B (2007) Tails in scheduling. SIGMETRICS Perform Eval Rev 34(4):13–20
Brichet F, Roberts J, Simonian A, Veitch D (1996) Heavy traffic analysis of a storage model with long range dependent on/off sources. Queueing Syst 23(1):197–215
Crovella M, Bestavros A (1997) Self-similarity in world wide web traffic: evidence and possible causes. IEEE/ACM Trans Netw 5(6):835–846
Díaz J, García D, Kim K, Lee C, Bello LL, López J, Min LS, Mirabella O (2002) Stochastic analysis of periodic real-time systems. In: Proc of the 23rd IEEE real-time systems symposium, pp 289–300
Erramilli A, Narayan O, Willinger W (1996) Experimental queueing analysis with long-range dependent packet traffic. IEEE/ACM Trans Netw 4(2):209–223
Erramilli A, Roughan M, Veitch D, Willinger W (2002) Self-similar traffic and network dynamics. Proc IEEE 90(5):800–819
Gardner M (1999) Probabilistic analysis and scheduling of critical soft real-time systems. PhD thesis, University of Illinois, Urbana-Champaign
Garrett MW, Willinger W (1994) Analysis, modeling and generation of self-similar VBR video traffic. In: ACM SIGCOMM
Harchol-Balter M (2002) Task assignment with unknown duration. J ACM 49(2):260–288
Harchol-Balter M (2007) Foreword: Special issue on new perspective in scheduling. SIGMETRICS Perform Eval Rev 34(4):2–3
Harchol-Balter M, Downey AB (1997) Exploiting process lifetime distributions for dynamic load balancing. ACM Trans Comput Syst 15(3):253–285
Hernandez-Orallo E, Vila-Carbo J (2007) Network performance analysis based on histogram workload models. In: Proceedings of the 15th international symposium on modeling, analysis, and simulation of computer and telecommunication systems (MASCOTS), pp 331–336
Hernandez-Orallo E, Vila-Carbo J (2010) Analysis of self-similar workload on real-time systems. In: IEEE real-time and embedded technology and applications symposium (RTAS). IEEE Computer Society, Washington, pp 343–352
Hernández-Orallo E, Vila-Carbó J (2010) Network queue and loss analysis using histogram-based traffic models. Comput Commun 33(2):190–201
Hughes CJ, Kaul P, Adve SV, Jain R, Park C, Srinivasan J (2001) Variability in the execution of multimedia applications and implications for architecture. SIGARCH Comput Archit News 29(2):254–265
Leland W, Ott TJ (1986) Load-balancing heuristics and process behavior. SIGMETRICS Perform Eval Rev 14(1):54–69
Leland WE, Taqqu MS, Willinger W, Wilson DV (1994) On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Trans Netw 2(1):1–15
Liu CL, Layland JW (1973) Scheduling algorithms for multiprogramming in a hard-real-time environment. J ACM 20(1):46–61
Mandelbrot B (1965) Self-similar error clusters in communication systems and the concept of conditional stationarity. IEEE Trans Commun 13(1):71–90
Mandelbrot BB (1969) Long run linearity, locally Gaussian processes, h-spectra and infinite variances. Int Econ Rev 10:82–113
Norros I (1994) A storage model with self-similar input. Queueing Syst 16(3):387–396
Norros I (2000) Queueing behavior under fractional Brownian traffic. In: Park K, Willinger W (eds) Self-similar network traffic and performance evaluation. Wiley, New York, Chap 4
Park K, Willinger W (2000) Self-similar network traffic: An overview. In: Park K, Willinger W (eds) Self-similar network traffic and performance evaluation. Wiley, New York, Chap 1
Paxson V, Floyd S (1995) Wide area traffic: the failure of Poisson modeling. IEEE/ACM Trans Netw 3(3):226–244
Rolls DA, Michailidis G, Hernández-Campos F (2005) Queueing analysis of network traffic: methodology and visualization tools. Comput Netw 48(3):447–473
Rose O (1995) Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems. In: Conference on local computer networks
Roy N, Hamm N, Madhukar M, Schmidt DC, Dowdy L (2009) The impact of variability on soft real-time system scheduling. In: RTCSA ’09: Proceedings of the 2009 15th IEEE international conference on embedded and real-time computing systems and applications. IEEE Computer Society, Washington, pp 527–532
Sha L, Abdelzaher T, Årzén KE, Cervin A, Baker T, Burns A, Buttazzo G, Caccamo M, Lehoczky J, Mok AK (2004) Real time scheduling theory: A historical perspective. Real-Time Syst 28(2):101–155
Taqqu MS, Willinger W, Sherman R (1997) Proof of a fundamental result in self-similar traffic modeling. SIGCOMM Comput Commun Rev 27(2):5–23
Tia T, Deng Z, Shankar M, Storch M, Sun J, Wu L, Liu J (1995) Probabilistic performance guarantee for real-time tasks with varying computation times. In: Proc of the real-time technology and applications symposium, pp 164–173
Vila-Carbó J, Hernández-Orallo E (2008) An analysis method for variable execution time tasks based on histograms. Real-Time Syst 38(1):1–37
Willinger W, Taqqu M, Erramilli A (1996) A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks. In: Stochastic networks: Theory and applications, pp 339–366
Willinger W, Taqqu MS, Sherman R, Wilson DV (1997) Self-similarity through high-variability: statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Trans Netw 5(1):71–86
Acknowledgement
This work was developed under a grant from the European Union (FRESCOR-FP6/2005/IST/5-03402).
Appendix: Background on self-similarity
The analysis of stationary time series data or stochastic processes can reveal the property of self-similarity. One example is the CPU utilization measured over a sampling period. Self-similarity has been studied extensively in network traffic. For a detailed discussion of self-similarity, see Beran (1994).
A.1 Definition of self-similarity
A phenomenon that is self-similar looks the same at different scales on a dimension. This means that at different time scales (microseconds, milliseconds, seconds) the statistical properties are similar.
More formally, let \(\{X_t : t=0,1,2,\ldots\}\) (\(\{X_t\}\) for short) be a discrete stationary stochastic process representing the evolution in time of a statistical distribution (for example, the workload of a system). Let \(\sigma^2\) be the variance and let \(r(k)=\operatorname{Cov}(X_t,X_{t+k})/\sigma^2\) be the autocorrelation function of \(\{X_t\}\). We define \(\{X_{t}^{(m)}\}\) as the m-aggregated process of \(\{X_t\}\), which is obtained by averaging the data in \(X_t\) over non-overlapping blocks of size m:
\[X_{t}^{(m)}=\frac{1}{m}\sum_{i=(t-1)m+1}^{tm}X_i,\qquad t=1,2,\ldots\]
and \(r^{(m)}(k)\) is defined as the autocorrelation function of \(\{X_{t}^{(m)}\}\). Aggregating amounts to multiplying the sampling period by a factor m. An important effect of process aggregation is to smooth the traffic rate in each period, so the variation is reduced. The question of how this variation changes with the aggregation factor is central to the study of the self-similarity of a process.
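The smoothing effect of aggregation can be illustrated with a minimal sketch. The helper names `aggregate` and `variance` are hypothetical, and the input is an independent exponential series chosen only for illustration (it is not a workload from the paper). For such an uncorrelated series, the variance of the m-aggregated process is expected to fall roughly as 1/m.

```python
import random

def aggregate(x, m):
    """Return the m-aggregated series: means of consecutive,
    non-overlapping blocks of size m (an incomplete last block is dropped)."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance(x):
    """Population variance of a sequence of numbers."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)

# Small deterministic example: blocks (1,2,3) and (4,5,6); the 7 is dropped.
print(aggregate([1, 2, 3, 4, 5, 6, 7], 3))  # [2.0, 5.0]

# Illustrative assumption: an i.i.d. exponential series with a fixed seed.
rng = random.Random(42)
series = [rng.expovariate(1.0) for _ in range(20000)]
for m in (1, 10, 100):
    print(m, variance(aggregate(series, m)))
```

For a long-range dependent series the same loop would show a much slower decrease of the variance with m, which is what the Hurst parameter quantifies.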
A process is distributionally self-similar if the processes \(\{X_t\}\) and \(\{X_{t}^{(m)}\}\) have the same distribution, up to a scaling factor. A self-similar process has the property that, when it is aggregated, the new process has the same autocorrelation function as the original one. Formally, the process \(\{X_t\}\) is called second-order self-similar with Hurst parameter \(H=1-\left|\frac{\beta}{2}\right|\) if:
If the second condition holds only as \(m\to\infty\), the process is asymptotically second-order self-similar. If \(0<H<0.5\), the process is Short-Range Dependent (SRD), and if \(0.5<H<1\), the process is Long-Range Dependent (LRD).
The variance of the m-aggregated process can be obtained as:
\[\operatorname{Var}\left(X_{t}^{(m)}\right)=\sigma^{2}m^{2H-2}\]
LRD processes exhibit correlations over a wide range of time scales, while SRD processes have correlation functions that decay exponentially fast. For SRD processes, this implies that time aggregation quickly results in white noise, characterized by the absence of any significant temporal correlations.
The Hurst parameter can be evaluated in several ways (Beran 1994). In this paper, we use the variance-time plot, which is the graph of \(\log(\operatorname{Var}(X^{(m)}_{t}))\) versus \(\log(m)\). From this graph, we obtain the Hurst parameter by fitting a least-squares line with slope \(\beta=2H-2\) through the resulting points, ignoring those for small m.
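The variance-time procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of block sizes, and the `min_m` cutoff for "small m" are all assumptions. The estimate uses the relation stated above, slope \(\beta=2H-2\), hence \(H=1+\beta/2\).

```python
import math
import random

def aggregate(x, m):
    """Means of consecutive, non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance(x):
    """Population variance of a sequence of numbers."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)

def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx

def hurst_variance_time(x, block_sizes, min_m=1):
    """Estimate H from the variance-time plot.

    Points with m < min_m are ignored (hypothetical cutoff for 'small m');
    block sizes that leave fewer than two blocks are skipped.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        agg = aggregate(x, m)
        if m >= min_m and len(agg) >= 2:
            log_m.append(math.log10(m))
            log_var.append(math.log10(variance(agg)))
    beta = least_squares_slope(log_m, log_var)  # beta = 2H - 2
    return 1.0 + beta / 2.0

# Sanity check on uncorrelated Gaussian noise, for which H should be near 0.5.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(50000)]
print(hurst_variance_time(noise, [2 ** k for k in range(1, 9)], min_m=4))
```

A trace with long-range dependence would yield a flatter variance-time line and therefore an estimate of H closer to 1.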
Since a self-similar process has observable bursts at a wide range of scales, it can exhibit long-range dependence: values at any instant are correlated with values at all future instants. Although the terms self-similarity and long-range dependence are not exactly equivalent, we use them interchangeably throughout the paper.
One of the advantages of using self-similar models is that the degree of variation on multiple time scales can be expressed using only a single parameter: the Hurst parameter.
A.2 Fractional Brownian motion
A common way to model self-similar processes is using a fractional Brownian model. The fractional Brownian motion (denoted fBm) model can be viewed as an extension of the standard Brownian motion models that have been used in heavy traffic analysis. Formally, the normalized fractional Brownian motion \(\{Z_H(t) : t\ge 0\}\) is a continuous zero-mean Gaussian process with stationary increments and variance \(|t|^{2H}\). The correlation of the increments is characterized by the Hurst index, H. Unlike standard Brownian motion, the fractional one has a long-range dependence property when \(H>1/2\). This process is self-similar: \(Z_H(at)\) has the same distribution as \(|a|^{H}Z_H(t)\). Using the normalized fBm \(Z_H(t)\), we can define a new fBm \(A^{*}(t)\) with mean \(\mu\) and variance \(\sigma^2\):
\[A^{*}(t)=\mu t+\sigma Z_H(t)\]
In Sect. 4 we see that the proposed “on-off” workload model behaves statistically as this fractional Brownian motion.
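As an illustration of the fBm definition, the sketch below draws an exact sample path on an integer time grid by applying a Cholesky factor of the increment covariance matrix to white Gaussian noise. This is one standard simulation approach and is not claimed to be the generator used in the paper. The names `fgn_autocov`, `cholesky`, and `sample_fbm` are hypothetical, and the covariance of the unit-variance increments (fractional Gaussian noise) is the usual \(\gamma(k)=\frac{1}{2}\left(|k+1|^{2H}-2|k|^{2H}+|k-1|^{2H}\right)\).

```python
import math
import random

def fgn_autocov(k, hurst):
    """Autocovariance at lag k of unit-variance fractional Gaussian noise."""
    k = abs(k)
    h2 = 2.0 * hurst
    return 0.5 * ((k + 1) ** h2 - 2.0 * k ** h2 + abs(k - 1) ** h2)

def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_fbm(n, hurst, rng):
    """Return Z_H(0), Z_H(1), ..., Z_H(n) for normalized fBm (O(n^3) cost)."""
    cov = [[fgn_autocov(i - j, hurst) for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Correlated increments: low * white has covariance matrix cov.
    increments = [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(n)]
    path = [0.0]
    for inc in increments:
        path.append(path[-1] + inc)
    return path

path = sample_fbm(64, 0.75, random.Random(0))
print(len(path), path[0])  # 65 0.0
```

The cubic cost makes this approach suitable only for short paths; it is meant to make the covariance structure explicit rather than to be efficient.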
A.3 Heavy-tailed distributions
A distribution is heavy-tailed if
\[P[X>x]\sim x^{-\alpha},\qquad x\to\infty,\quad 0<\alpha<2\]
That is, a distribution is heavy-tailed if its asymptotic shape is hyperbolic. Heavy-tailed distributions have a number of properties that differ from those of exponential or normal distributions. If α≤2, then the distribution has an infinite variance; if α≤1, then the distribution has an infinite mean. Thus, as α decreases, a larger portion of the probability mass lies in the tail of the distribution. In practical terms, a random variable that follows a heavy-tailed distribution can take extremely large values with non-negligible probability.
The simplest heavy-tailed distribution is the Pareto distribution:
\[F(x)=P[X\le x]=1-\left(\frac{d}{x}\right)^{\alpha},\qquad x\ge d,\quad \alpha>0\]
where d is the minimal value of the distribution and α is known as the Pareto index.
The problem with using the Pareto distribution is its infinite variance. This poses problems when simulating a system with heavy-tailed distributions. Specifically, Crovella and Bestavros (1997) show that simulations become infeasible for α<1.5. A practical approach to solving this problem is to bound the Pareto distribution (Harchol-Balter 2002). A Bounded-Pareto distribution is characterized by three parameters: the Pareto index α, the minimal value d, and the largest value p:
\[f(x)=\frac{\alpha d^{\alpha}}{1-(d/p)^{\alpha}}\,x^{-\alpha-1},\qquad d\le x\le p\]
It is easy to see that if p is very high, the distribution is very similar to the unbounded one. Therefore, for large values of p it can be assumed that it is heavy-tailed (Harchol-Balter 2002).
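Both distributions can be sampled by inverse-transform sampling, as in the minimal sketch below. It assumes the Pareto CDF \(F(x)=1-(d/x)^{\alpha}\) for \(x\ge d\) and the Bounded-Pareto CDF \(F(x)=\frac{1-(d/x)^{\alpha}}{1-(d/p)^{\alpha}}\) for \(d\le x\le p\). The function names and the parameter values in the usage lines are hypothetical and are not taken from the paper's experiments.

```python
import random

def sample_pareto(alpha, d, rng):
    """Draw one Pareto(alpha, d) value by inverting F(x) = 1 - (d/x)^alpha."""
    u = rng.random()  # uniform in [0, 1), so 1 - u is never zero
    return d / (1.0 - u) ** (1.0 / alpha)

def bounded_pareto_inverse_cdf(u, alpha, d, p):
    """Inverse CDF of the Bounded-Pareto distribution on [d, p]."""
    return d / (1.0 - u * (1.0 - (d / p) ** alpha)) ** (1.0 / alpha)

def sample_bounded_pareto(alpha, d, p, rng):
    """Draw one Bounded-Pareto(alpha, d, p) value."""
    return bounded_pareto_inverse_cdf(rng.random(), alpha, d, p)

# Illustrative parameters only: alpha = 1.5, d = 1, p = 1000.
rng = random.Random(7)
unbounded = [sample_pareto(1.5, 1.0, rng) for _ in range(10000)]
bounded = [sample_bounded_pareto(1.5, 1.0, 1000.0, rng) for _ in range(10000)]
print(max(unbounded), max(bounded))
```

With the bound in place, every sample stays at or below p, so sample means and variances remain finite and stable, while the body of the distribution is unchanged when p is large.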
Finally, note that self-similarity as well as heavy-tailed distributions are convenient mathematical idealizations and can never be fully validated from finite data sets. However, these idealizations are powerful mathematical tools for modeling important aspects of time series.
Cite this article
Hernández-Orallo, E., Vila-Carbó, J. On the nature and impact of self-similarity in real-time systems. Real-Time Syst 48, 294–319 (2012). https://doi.org/10.1007/s11241-012-9146-0
DOI: https://doi.org/10.1007/s11241-012-9146-0