On the Job Distribution in Random Brokering
for Computational Grids
Vandy Berten¹,² and Joël Goossens²
¹ Research Fellow of the FNRS (Fonds National de la Recherche Scientifique, Belgium)
² Université Libre de Bruxelles, Belgium
{vandy.berten,joel.goossens}@ulb.ac.be
Abstract. This paper analyses the way jobs are distributed in a computational grid environment where the brokering is done in such a way that each Computing Element has a probability of being chosen proportional to its number of CPUs. We give the asymptotic behaviour of several metrics (queue sizes, slowdown, ...), or, in some cases, an approximation of this behaviour. We study both the unsaturated and the saturated case, under several stochastic distributions.
1 Introduction
We shall see that "ranked brokers", used by several popular grid systems, can have unexpected behaviours and, to the best of our knowledge, these behaviours have not been studied so far. This paper (which summarises a more complete work [1]) focuses mainly on a particular case of ranked brokering, namely random brokering. Due to space limitations, we do not give proofs of our results; those proofs can be found in [1].
On a randomly brokered grid, when a job arrives in the system, it is sent to a Computing Element (CE) with a probability proportional to that CE's number of CPUs. Once a job has been dispatched towards a CE, it has to be scheduled by this CE. In this work, we mainly consider the FCFS (First Come, First Served) scheduling rule.
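To make this dispatching rule concrete, here is a small illustrative Python sketch (ours, not taken from [1]; the CE sizes are arbitrary) that draws a CE with probability proportional to its number of CPUs and checks that each CE receives roughly a fraction $c_i/C$ of the jobs.

    import random

    # Arbitrary CE sizes (number of CPUs per Computing Element); not taken from [1].
    cpu_counts = [4, 8, 16, 32]            # c_1 .. c_N
    C = sum(cpu_counts)                    # total number of CPUs in the grid

    def broker(rng):
        """Pick a CE index with probability proportional to its number of CPUs."""
        return rng.choices(range(len(cpu_counts)), weights=cpu_counts, k=1)[0]

    rng = random.Random(0)
    hits = [0] * len(cpu_counts)
    for _ in range(100_000):
        hits[broker(rng)] += 1

    # Each CE should receive a fraction close to c_i / C of the jobs.
    print([round(h / 100_000, 3) for h in hits])
    print([round(c / C, 3) for c in cpu_counts])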
1.1 Model of Computation
In the grids we consider, there is a central Resource Broker (RB), to which each CE is connected, and a client sends its jobs to that central RB. Each job $j$ has mainly two parameters: a length (or execution time) $j_\ell$, and a width (or number of parallel processes) $j_w$. The job $j$ will therefore need $j_w$ CPUs during $j_\ell$ units of time. We assume that a processor executes at most one process at a time, that there is no preemption (and consequently no migration), and that our system is greedy³. We also suppose that jobs are not spread across several CEs.
³ A system is said to be greedy (sometimes called expedient) if it never leaves any resource idle intentionally. If a system is greedy, a resource is idle only if there is no eligible job waiting for that resource.
1.2 Mathematical Model
We will use in this paper the notations defined in [1]. Briefly, our system is composed of $N$ CEs, called $C_i$ ($i \in [1 \ldots N]$). $c_i$ refers to the number of CPUs of $C_i$, and $C \triangleq \sum_{i=1}^{N} c_i$ (the symbol $\triangleq$ means "is by definition"). $\nu(t)$ is the total amount of work received in $[0, t]$ divided by the product of the total number of CPUs ($C$) and the total duration ($t$); in other words, it is the total amount of work received divided by the total amount of work that the system could provide. The interval $[0, t]$ is called the observation period.
$f_1(t) \sim_t f_2(t)$ means that $\lim_{t \to \infty} \frac{f_1(t)}{f_2(t)} = 1$.
We assume that the arrival of jobs is a random process with an average delay $\lambda^{-1}$ between two successive arrivals. The execution time has a distribution with mean $E[j_\ell] = \mu^{-1}$, and jobs are independent. We define $\lambda_i \triangleq \lambda\,\frac{c_i}{C}$, $\rho_i \triangleq \frac{\lambda_i}{\mu}$, $\rho \triangleq \frac{\lambda}{\mu}$, and the system load $\nu \triangleq \frac{\rho_i}{c_i} = \frac{\rho}{C}$.
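As a quick numerical illustration of these definitions (the values of $c_i$, $\lambda$ and $\mu$ below are arbitrary), one can check that $\rho_i/c_i$ is the same for every CE and equals $\rho/C$:

    # Arbitrary values, only to make the definitions above concrete.
    cpu_counts = [4, 8, 16, 32]                  # c_i
    C = sum(cpu_counts)                          # C = 60 CPUs
    lam, mu = 30.0, 0.6                          # arrival rate and 1/mean execution time

    lam_i = [lam * c / C for c in cpu_counts]    # lambda_i = lambda * c_i / C
    rho_i = [l / mu for l in lam_i]              # rho_i = lambda_i / mu
    nu = lam / (mu * C)                          # system load, also rho_i / c_i for every i
    print(nu, [round(r / c, 4) for r, c in zip(rho_i, cpu_counts)])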
2 Sequential Jobs
In this section, we analyse the case where $j_w = 1$ for all $j$, i.e., every job runs on a single CPU.
2.1 Queue Size
ν < 1. We focus here on a single (arbitrary) $C_i$, where the arrival is a Poisson process with rate $\lambda_i$ and the execution time has an exponential distribution with mean $\mu^{-1}$. Such a system is well known and has been abundantly studied in the literature: it is an M/M/$c_i$ queueing system. Notice that this is a rather naïve approximation; in practice, grids are generally not as predictable as an M/M/c system.
Let $J_i$ be the number of jobs in $C_i$ (running and waiting jobs) and $Q_i$ be the number of jobs in the queue. Knowing $P[J_i = n]$ (see for instance [5, page 371, section 8.5.2]) and the relationship between $J_i$ and $Q_i$, we show in [1] that
Theorem 1. If $\nu < 1$, the average queue size of $C_i$ is
$$E[Q_i] \;\sim_t\; b\,\frac{\nu^{c_i+1}\,c_i^{c_i}}{c_i!\,(\nu-1)^2} \qquad \text{where} \qquad b \;\triangleq\; \left[\,\sum_{k=0}^{c_i-1}\frac{(\nu c_i)^k}{k!} \;+\; \frac{(\nu c_i)^{c_i}}{c_i!}\,\frac{1}{1-\nu}\,\right]^{-1}$$
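Theorem 1 is easy to evaluate numerically; the following Python sketch (our own illustration, not taken from [1]) is a direct transcription of the expression above, and for $c_i = 1$ it reduces to the classical M/M/1 value $\nu^2/(1-\nu)$:

    from math import factorial

    def expected_queue_size(nu, c):
        """Average queue size of Theorem 1 for one CE with c CPUs at load nu < 1."""
        a = nu * c                                           # offered load (in Erlangs)
        b = 1.0 / (sum(a**k / factorial(k) for k in range(c))
                   + a**c / factorial(c) / (1.0 - nu))
        return b * nu * a**c / (factorial(c) * (1.0 - nu)**2)

    # For c = 1 the formula reduces to the classical M/M/1 value nu^2 / (1 - nu).
    print(expected_queue_size(0.8, 1), 0.8**2 / (1 - 0.8))
    print(expected_queue_size(0.8, 16))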
ν > 1. For that case, we no longer assume that the system is M/M/c; we just need to know the average execution time and the average inter-arrival delay. We will focus on the average queue size at time $t$; that is, if $P_n(t)$ is the probability that there are $n$ jobs in the queue at time $t$, we consider $\sum_{n=0}^{\infty} n\,P_n(t)$. With some subtle manipulations, we prove in [1] that:
Theorem 2. If $\nu > 1$, we have
$$E[Q_i(t)] \;\sim_t\; \lambda_i\,t\,\frac{\nu-1}{\nu}$$
Fig. 1. Queue size observed in simulation and theoretically expected, for non-saturated systems (left side, with Theorem 1) and saturated systems (right side, with Theorem 2).
Experimental Results. Figure 1 (see [1] for details) highlights that our predictions (continuous lines) are very close to what we obtained by simulation (dots). We observe an "inversion" around $\nu = 1$: when $\nu < 1$, $E[Q_i] < E[Q_j]$ if $c_i > c_j$, whereas for $\nu > 1$ we have the opposite.
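A toy version of such a simulation fits in a few lines of Python. The sketch below (our own illustration, with arbitrary parameters and far simpler than the experiments behind Figure 1) simulates a single saturated M/M/$c_i$ CE and compares the observed queue size with the linear growth predicted by Theorem 2:

    import heapq, random

    def queue_size_at(lam_i, mu, c_i, horizon, seed=1):
        """Very rough FCFS M/M/c simulation of one CE; returns the number of
        waiting (not running) jobs seen at the last arrival before `horizon`."""
        rng = random.Random(seed)
        running = []                          # heap of completion times
        waiting = 0
        t = rng.expovariate(lam_i)            # first arrival
        while t < horizon:
            # Free every CPU whose job finished before this arrival; each freed
            # CPU immediately starts a waiting job (FCFS), if any.
            while running and running[0] <= t:
                done = heapq.heappop(running)
                if waiting:
                    waiting -= 1
                    heapq.heappush(running, done + rng.expovariate(mu))
            if len(running) < c_i:
                heapq.heappush(running, t + rng.expovariate(mu))
            else:
                waiting += 1
            t += rng.expovariate(lam_i)
        return waiting

    lam_i, mu, c_i, t = 12.0, 1.0, 8, 2000.0      # nu = lam_i / (mu * c_i) = 1.5
    print(queue_size_at(lam_i, mu, c_i, t))        # observed queue size
    print(lam_i * t * (1.5 - 1.0) / 1.5)           # Theorem 2 prediction: 8000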
2.2 Used CPUs
It is intuitively easy to see that, for $\nu > 1$, the average number of used CPUs on $C_i$ is $c_i$, and that for $\nu < 1$, $\nu c_i$ CPUs are used on average on $C_i$. A formal proof is given in [1].
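In other words, the average number of busy CPUs on $C_i$ is $\min(\nu, 1)\,c_i$; the following one-liner (illustrative values only) makes this explicit:

    def avg_used_cpus(nu, c_i):
        """Average number of busy CPUs on C_i (Section 2.2): nu*c_i below
        saturation, c_i above."""
        return min(nu, 1.0) * c_i

    print(avg_used_cpus(0.7, 16), avg_used_cpus(1.3, 16))    # 11.2 and 16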
2.3 Slowdown
The slowdown for a particular job is classically defined as
$$\frac{\text{waiting time} + \text{execution time}}{\text{execution time}}$$
ν < 1. We have an approximation in the case where job lengths have a shifted exponential distribution with a small $\alpha$ (see [1]).
Lemma 1. If $\nu < 1$, with shifted exponential distribution for job length, the average slowdown $E[SD_i]$ is asymptotically close to⁴
$$\frac{b\,(\nu c_i)^{c_i}}{c_i \cdot c_i!\,(1-\nu)^2}\;\frac{e^{\frac{\alpha}{1-\alpha}}}{1-\alpha}\;\Gamma\!\left[0, \frac{\alpha}{1-\alpha}\right] \;+\; 1$$
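The expression of Lemma 1 can be evaluated directly; the sketch below (an illustration of ours, with arbitrary parameter values) uses scipy.special.exp1, which computes exactly the $\Gamma[0, z]$ of footnote 4:

    from math import exp, factorial
    from scipy.special import exp1          # exp1(z) == Gamma[0, z] (see footnote 4)

    def slowdown_unsaturated(nu, c, alpha):
        """Evaluation of the approximation of Lemma 1 (shifted exponential lengths)."""
        a = nu * c
        b = 1.0 / (sum(a**k / factorial(k) for k in range(c))
                   + a**c / factorial(c) / (1.0 - nu))
        z = alpha / (1.0 - alpha)
        return (b * a**c / (c * factorial(c) * (1.0 - nu)**2)
                * exp(z) / (1.0 - alpha) * exp1(z) + 1.0)

    print(slowdown_unsaturated(nu=0.8, c=8, alpha=0.05))     # arbitrary parameters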
ν > 1. In the system we are studying, we measure the slowdown of completed jobs, and then compute the average over all measured jobs. However, at the end of our observation period, especially if $\nu \gg 1$, a lot of jobs are still in the queue and are therefore not taken into account in our average.
⁴ $\Gamma[0, z] = \int_z^{\infty} \frac{e^{-\tau}}{\tau}\,d\tau$ is the incomplete Euler gamma function.
Lemma 2. If $\nu > 1$, the average slowdown for jobs leaving the system between $0$ and $t$, in the case of constant execution time $\mu^{-1}$, tends asymptotically on $t$ towards
$$\lambda t\,\frac{\nu-1}{2\nu^2 C}$$
Lemma 3. If $\nu > 1$, the average slowdown for jobs leaving the system between $0$ and $t$, in the case of shifted exponential distribution with parameter $\alpha$, tends asymptotically on $t$ towards
$$\frac{\lambda t\, e^{\frac{\alpha}{1-\alpha}}}{1-\alpha}\;\Gamma\!\left[0, \frac{\alpha}{1-\alpha}\right]\,\frac{\nu-1}{2\nu^2 C}$$
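Both saturated-case expressions are straightforward to evaluate; the following illustration (arbitrary values for $\nu$, $C$, $\mu$, $\alpha$ and $t$, chosen by us) computes the predicted average slowdowns of Lemmas 2 and 3:

    from math import exp
    from scipy.special import exp1

    # Arbitrary saturated system: nu = 1.5, C = 60 CPUs, mu = 1, observation period t.
    nu, C, mu, t = 1.5, 60, 1.0, 1000.0
    lam = nu * mu * C                                        # from nu = lam / (mu * C)

    sd_constant = lam * t * (nu - 1.0) / (2.0 * nu**2 * C)   # Lemma 2
    alpha = 0.1
    z = alpha / (1.0 - alpha)
    sd_shifted = (lam * t * exp(z) / (1.0 - alpha)
                  * exp1(z) * (nu - 1.0) / (2.0 * nu**2 * C))  # Lemma 3
    print(sd_constant, sd_shifted)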
(Experimental results can be found in [1].)
3 Parallel Jobs
In the previous section, we imposed the constraint that jobs required only one
processor during their execution. In this section, we will be more general and
relax this constraint: a job can require several CPUs, and uses them from the
beginning up to the end of its execution.
We need here some more notation: $w_k$ is the probability for a job to need $k$ CPUs, and $W$ stands for $\sum_k k\,w_k$. The system load is redefined as $\nu \triangleq \frac{\lambda W}{\mu C}$. For more details, see the full version of this paper [1]. Notice that the case separation $\nu < 1$ and $\nu > 1$ here changes into $\nu < \tilde\nu_i$ and $\nu > \tilde\nu_i$, where $\tilde\nu_i$ is the point of saturation.
Lemma 4. If $\nu > \tilde\nu_i$, the average queue size of $C_i$ is close to $t\,\lambda_i\,\frac{\nu - \tilde\nu_i}{\nu}$.
Lemma 5. Whatever the job width distribution, if the job length is fixed, the average number of used CPUs on a CE having $c$ CPUs is
$$\sum_{k=1}^{c} k\,P_k$$
where the $P_k$ are solutions of the system
$$\begin{cases}\displaystyle P_k \;=\; \gamma(c-k+1)\sum_{j=c-k+1}^{c} P_j\,\frac{\sum_{\ell=c-j+1}^{k} w_\ell\,\beta(k-\ell)}{\gamma(c-j+1)}\\[3mm] \displaystyle \sum_{k=1}^{c} P_k \;=\; 1\end{cases}$$
with $\beta(k) = \sum_{i=1}^{k} w_i\,\beta(k-i)$ and $\gamma(k) = \sum_{i=k}^{c} w_i$.
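The system of Lemma 5 can be solved numerically. The sketch below is our own illustration (it assumes the usual convention $\beta(0) = 1$, which the lemma leaves implicit): it builds the linear map defined by the first equation, obtains the normalised fixed point by power iteration, and compares the resulting average CPU usage with the closed form of Lemma 6 below for equidistributed widths.

    import numpy as np

    def average_used_cpus(w, c):
        """Solve the system of Lemma 5 numerically and return sum_k k*P_k.
        `w[k]` is the probability that a job needs k CPUs (w[0] is unused)."""
        beta = [0.0] * (c + 1)
        beta[0] = 1.0                                        # assumed convention
        for k in range(1, c + 1):
            beta[k] = sum(w[i] * beta[k - i] for i in range(1, k + 1))
        gamma = [sum(w[i] for i in range(k, c + 1)) for k in range(c + 1)]

        # A[k-1, j-1] is the coefficient of P_j in the equation giving P_k.
        A = np.zeros((c, c))
        for k in range(1, c + 1):
            for j in range(c - k + 1, c + 1):
                s = sum(w[l] * beta[k - l] for l in range(c - j + 1, k + 1))
                A[k - 1, j - 1] = gamma[c - k + 1] * s / gamma[c - j + 1]

        # P is the non-negative fixed point of A with sum(P) = 1 (power iteration).
        P = np.full(c, 1.0 / c)
        for _ in range(10_000):
            P = A @ P
            P /= P.sum()
        return sum((k + 1) * P[k] for k in range(c))

    c = 6
    w = [0.0] + [1.0 / c] * c                                # equidistributed widths 1..c
    print(average_used_cpus(w, c))                           # numerical solution
    print(3 * c * (c + 1) / (2 * (1 + 2 * c)))               # closed form of Lemma 6 below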
Lemma 6. In the case of equidistributed job width distribution between $1$ and $c$ (the CE size), if the job length is fixed, the average number of used CPUs is
$$\frac{3c(c+1)}{2(1+2c)}$$
Lemma 7. In case of equidistributed job width distribution between $1$ and $c_i$ (the CE size), if the job length is fixed, the point of saturation $\tilde\nu_i$ is
$$\frac{3(c_i+1)}{2(1+2c_i)}$$
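Note that $\tilde\nu_i$ is simply the average usage of Lemma 6 divided by $c_i$, and that it tends to $3/4$ for large CEs:

    # Lemma 7 is just Lemma 6 divided by c_i; it tends to 3/4 for large CEs.
    for c in (2, 8, 32, 128, 1024):
        print(c, 3 * (c + 1) / (2 * (1 + 2 * c)))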
Lemma 8. If $\nu > \tilde\nu_i$, the average slowdown for jobs leaving the system between $0$ and $t$ ($MSD_i(t)$), in the case of constant execution time $\mu^{-1}$, tends (approximately) asymptotically on $t$ towards
$$\lambda t\,\frac{(\nu-1)\,W}{2\,(\nu-1+\tilde\nu_i)\,\nu\,C}$$
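As with the previous lemmas, this expression is easy to evaluate; the values below are purely illustrative (our own choice), with $\lambda$ derived from $\nu = \lambda W / (\mu C)$:

    # Purely illustrative values; lambda is derived from nu = lambda * W / (mu * C).
    nu, nu_tilde, C, mu, W, t = 1.2, 0.75, 60, 1.0, 4.0, 1000.0
    lam = nu * mu * C / W
    msd = lam * t * (nu - 1.0) * W / (2.0 * (nu - 1.0 + nu_tilde) * nu * C)
    print(msd)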
4 Conclusion and Future Work
Our work was a first step towards a more complex analysis of general rank-based brokering. As we have shown by plotting our simulation observations together with our theoretical predictions or approximations, we acquired a good understanding of the job brokering characteristics and behaviour in the specific case we observed.
A second step in our work would be to look at more complex cases: brokering based on the number of free CPUs, on the queue size, or on an estimation of the waiting time, for other job length and inter-arrival distributions, etc. These new constraints will more than probably make our analysis more difficult, for instance because they introduce a feedback from the Computing Elements to the Resource Broker. We believe that we have now built the tools we need for this further study.
Acknowledgements
The authors would like to thank Prof. G. Louchard and R. Devillers (ULB, Computer Science Department) for their significant contributions.
References
1. Berten, V., and Goossens, J. On the job distribution in random brokering for computational grids. Tech. Rep. 518, Université Libre de Bruxelles, May 2004. http://homepages.ulb.ac.be/~vberten/Papers/RandomBrokering-Full.ps
2. Buyya, R. High Performance Cluster Computing, vol. 1, Architectures and Systems. Prentice Hall PTR, 1999.
3. Ernemann, C., Hamscher, V., Schwiegelshohn, U., Streit, A., and Yahyapour, R. On Advantages of Grid Computing for Parallel Job Scheduling. In Proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CC-GRID 2002), May 2002.
4. Krallmann, J., Schwiegelshohn, U., and Yahyapour, R. On the design and evaluation of job scheduling algorithms. Job Scheduling Strategies for Parallel Processing (1999), 17–42.
5. Nelson, R. Probability, Stochastic Processes, and Queueing Theory. Springer-Verlag, 1995.