Scheduling
Determines the precise start time of each task.
The start times must satisfy the original dependencies of
the sequencing graph.
Scheduling determines the concurrency of the resulting
implementation.
Area/latency trade-off points can be derived, and resources
may be bounded to satisfy design requirements.
A spectrum of solutions may be obtained by scheduling a
sequencing graph with different resource constraints.
Parallelism increases area because more resources are used,
but it reduces latency (delay).
Model for scheduling problems:
X1 = a + da;
U1 = u - (3*a*u*da) - (3*y*da);
Y1 = y + u*da;
C = X1 < a;
[Figure: data-flow graph of the model above, producing X1, Y1, U1 and C]
[Figure: Sequencing graph, with source V0, operations 1 to 11 and sink Vn]
Scheduling rules
The latency of the schedule is the number of cycles needed to
execute the entire schedule, i.e. the difference between the start
times of the sink and source vertices: $t_n - t_0$.
The start time of an operation is at least as large as the start
time of each of its direct predecessors plus that predecessor's
execution delay,
i.e. $t_i \ge t_j + d_j$ for all $i, j$ such that $(v_j, v_i) \in E$.
Scheduling without Resource constraints
Used when dedicated resources are used, e.g. when
operations differ in their types.
Used when resource binding is done prior to
scheduling and resource conflicts are solved by
serializing the operations that share the same
resource.
Used to derive bounds on latency for constrained
problems. A lower bound on latency can be
computed.
Problem formulation for Scheduling
The input to the scheduling problem is a sequencing graph with:
1. $D = \{d_i;\ i = 0, 1, \ldots, n\}$, the set of operation
execution delays. The execution delays of the source and
sink vertices are both zero, i.e. $d_0 = d_n = 0$. Delays are also
assumed to be data independent.
2. $T = \{t_i;\ i = 0, 1, \ldots, n\}$, the set of start times of
the operations.
ASAP (As Soon As Possible) Algorithm
ASAP (Gs(V, E)) {
    Schedule v0 by setting $t_0^S = 1$;
    repeat {
        Select a vertex vi whose predecessors are all scheduled;
        Schedule vi by setting $t_i^S = \max_{j:(v_j, v_i) \in E} (t_j^S + d_j)$;
    } until (vn is scheduled);
    return ($t^S$);
}
Assume all operations have unit execution delays.
ASAP algorithm example
The ASAP algorithm first sets $t_0^S = 1$.
The vertices whose predecessors have all
been scheduled are then v1, v2, v6, v8, v10.
Their start time is set to $t_0^S + d_0 = 1 + 0 = 1$.
The start time of the sink is $t_n^S = 5$.
Therefore the latency is 5 - 1 = 4.
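The ASAP walk-through above can be sketched in Python. This is only an illustrative sketch: the edge list is an assumption reconstructed from the worked example (the figure itself is not reproduced in the text), vertex 0 stands for the source V0, vertex 12 stands for the sink Vn, and the names `EDGES`, `DELAY` and `asap` are hypothetical.

```python
# Assumption: edge list reconstructed from the worked example, since the
# figure is not available. Vertex 0 is the source V0, 12 is the sink Vn.
EDGES = [(0, 1), (0, 2), (0, 6), (0, 8), (0, 10),
         (1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5),
         (8, 9), (10, 11), (5, 12), (9, 12), (11, 12)]
SOURCE, SINK = 0, 12
# Unit delays for the operations, zero delay for source and sink.
DELAY = {v: 1 for v in range(1, 12)}
DELAY[SOURCE] = DELAY[SINK] = 0


def asap(edges, delay, source):
    """Return the ASAP start time of every vertex (the source starts at 1)."""
    preds = {}
    for j, i in edges:
        preds.setdefault(j, [])
        preds.setdefault(i, []).append(j)
    start = {source: 1}
    while len(start) < len(preds):
        for v, ps in preds.items():
            # Schedule v once all of its predecessors have a start time.
            if v not in start and all(p in start for p in ps):
                start[v] = max(start[p] + delay[p] for p in ps)
    return start


t_asap = asap(EDGES, DELAY, SOURCE)
latency = t_asap[SINK] - t_asap[SOURCE]
print(t_asap[SINK], latency)  # 5 4
```

Under these assumptions the sink starts at step 5 and the latency is 4, in agreement with the example.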
[Figure: Sequencing graph]
ALAP (As Late As Possible) Algorithm
ALAP (Gs(V, E), $\bar{\lambda}$) {
    Schedule vn by setting $t_n^L = \bar{\lambda} + 1$;
    repeat {
        Select a vertex vi whose successors are all scheduled;
        Schedule vi by setting $t_i^L = \min_{j:(v_i, v_j) \in E} t_j^L - d_i$;
    } until (v0 is scheduled);
    return ($t^L$);
}
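A minimal Python sketch of the ALAP procedure follows. It assumes the same reconstructed edge list as the running example (vertex 0 is V0, vertex 12 is Vn), unit delays, and a latency bound of 4; the delay subtracted is that of the operation being scheduled. The names `EDGES`, `DELAY` and `alap` are hypothetical.

```python
# Assumption: edge list reconstructed from the worked example.
EDGES = [(0, 1), (0, 2), (0, 6), (0, 8), (0, 10),
         (1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5),
         (8, 9), (10, 11), (5, 12), (9, 12), (11, 12)]
SOURCE, SINK = 0, 12
DELAY = {v: 1 for v in range(1, 12)}
DELAY[SOURCE] = DELAY[SINK] = 0


def alap(edges, delay, sink, latency_bound):
    """Return the ALAP start time of every vertex for a given latency bound."""
    succs = {}
    for i, j in edges:
        succs.setdefault(j, [])
        succs.setdefault(i, []).append(j)
    start = {sink: latency_bound + 1}
    while len(start) < len(succs):
        for v, ss in succs.items():
            # Schedule v once all of its successors have a start time.
            if v not in start and all(s in start for s in ss):
                start[v] = min(start[s] for s in ss) - delay[v]
    return start


t_alap = alap(EDGES, DELAY, SINK, latency_bound=4)
print([t_alap[v] for v in range(1, 12)])
# [1, 1, 2, 3, 4, 2, 3, 3, 4, 3, 4]
```

The difference between an operation's ALAP and ASAP start times is its mobility; for instance v8 may start as late as step 3 here.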
[Figure: ALAP graph]
Scheduling with Resource constraints :
The integer linear programming model
The start time of each operation is unique.
Sequencing relations must be satisfied.
The resource bounds must be met at every schedule time
step.
Example
$x_{il}$ parameters: $x_{il} = 1$ if operation $v_i$ starts at time step $l$, and 0 otherwise.
$x_{0,1} = 1$; $x_{1,1} = 1$; $x_{2,1} = 1$; $x_{3,2} = 1$; $x_{4,3} = 1$; $x_{5,4} = 1$
Vertex V0 starts at time step 1, therefore $x_{0,1} = 1$. Similarly, V1
starts at time step 1, V2 at time step 1, V3
at time step 2, V4 at time step 3 and V5
at time step 4.
Vertices V1, V2, V3, V4 and V5 have a mobility of 0,
i.e. they have only one possible start time.
Vertices V6 and V7 have a mobility of 1, hence their
$x_{il}$ parameters satisfy
$x_{6,1} + x_{6,2} = 1$;
$x_{7,2} + x_{7,3} = 1$
Vertices V8, V9, V10 and V11 have a mobility of 2,
hence their $x_{il}$ parameters satisfy
$x_{8,1} + x_{8,2} + x_{8,3} = 1$;
$x_{9,2} + x_{9,3} + x_{9,4} = 1$;
$x_{10,1} + x_{10,2} + x_{10,3} = 1$;
$x_{11,2} + x_{11,3} + x_{11,4} = 1$;
$x_{n,5} = 1$
Using the sequencing constraints we get a set of sequencing
conditions as follows.
$2x_{7,2} + 3x_{7,3} - x_{6,1} - 2x_{6,2} - 1 \ge 0$
If V7 starts at time step 2, then V6 must start at time step 1;
if V7 starts at time step 3, then V6 can start at time step 1
or step 2.
$2x_{9,2} + 3x_{9,3} + 4x_{9,4} - x_{8,1} - 2x_{8,2} - 3x_{8,3} - 1 \ge 0$
$2x_{11,2} + 3x_{11,3} + 4x_{11,4} - x_{10,1} - 2x_{10,2} - 3x_{10,3} - 1 \ge 0$
$4x_{5,4} - 2x_{7,2} - 3x_{7,3} - 1 \ge 0$
Resource constraints:
two multipliers available
1. At time step 1
$x_{1,1} + x_{2,1} + x_{6,1} + x_{8,1} \le 2$
Selected: V1 and V2 ($x_{1,1} + x_{2,1} = 2$)
2. At time step 2
$x_{3,2} + x_{6,2} + x_{7,2} + x_{8,2} \le 2$
Selected: V3 and V6 ($x_{3,2} + x_{6,2} = 2$)
3. At time step 3
$x_{7,3} + x_{8,3} \le 2$
Resource constraints:
two ALUs available
1. At time step 1
$x_{10,1} \le 2$
2. At time step 2
$x_{9,2} + x_{10,2} + x_{11,2} \le 2$
Selected: V11
3. At time step 3
$x_{4,3} + x_{9,3} + x_{10,3} + x_{11,3} \le 2$
Selected: V4
4. At time step 4
$x_{5,4} + x_{9,4} + x_{11,4} \le 2$
Selected: V5 and V9
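The three constraint families of the ILP model (unique start time, sequencing, resource bounds) can be checked for a candidate schedule with a short Python sketch. It is not an ILP solver. The dependency list is an assumption reconstructed from the example, V8 at step 3 and V10 at step 1 are inferred from the selections above, and the names `check` and `SELECTED` are hypothetical.

```python
# Assumption: dependencies reconstructed from the sequencing conditions and
# start times of the example (the figure is not available).
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
MULTIPLIER_OPS = {1, 2, 3, 6, 7, 8}
ALU_OPS = {4, 5, 9, 10, 11}
STEPS = range(1, 5)


def check(start, bound=2):
    """Return (unique, sequencing, resources) flags for a candidate schedule."""
    # Binary variables: x[i, l] = 1 if operation i starts at step l.
    x = {(i, l): int(start[i] == l) for i in start for l in STEPS}
    unique = all(sum(x[i, l] for l in STEPS) == 1 for i in start)
    # Start time expressed through the binary variables: sum of l * x[i, l].
    step_of = {i: sum(l * x[i, l] for l in STEPS) for i in start}
    sequencing = all(step_of[i] - step_of[j] - 1 >= 0 for j, i in EDGES)
    resources = all(sum(x[i, l] for i in ops) <= bound
                    for ops in (MULTIPLIER_OPS, ALU_OPS) for l in STEPS)
    return unique, sequencing, resources


# Schedule selected in the example (V8 and V10 positions are inferred).
SELECTED = {1: 1, 2: 1, 3: 2, 6: 2, 7: 3, 8: 3,
            10: 1, 11: 2, 4: 3, 5: 4, 9: 4}
print(check(SELECTED))  # (True, True, True)
```

Moving V6 to step 1 in `SELECTED` would put three multiplications in the same step and make the resource flag false.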
[Figure: scheduled sequencing graph]
Heuristic scheduling algorithms
List scheduling
LIST (Gs(V, E), a) {
    l = 1;
    repeat {
        for each resource type k = 1, 2, ..., nres {
            Determine the candidate operations $U_{l,k}$;
            Determine the unfinished operations $T_{l,k}$;
            Select $S_k \subseteq U_{l,k}$ such that $|S_k| + |T_{l,k}| \le a_k$;
            Schedule the $S_k$ operations at step l by setting $t_i = l$ for all $v_i \in S_k$;
        }
        l = l + 1;
    } until (vn is scheduled);
    return (t);
}
The candidate operations $U_{l,k}$ are those operations of
type k whose predecessors have all been scheduled
early enough that they are
completed at step l.
The unfinished operations $T_{l,k}$ are the set of operations
of type k that started at earlier cycles and whose
execution is not finished at step l.
A priority list of operations is used to choose among
the candidates, based on a heuristic urgency measure.
A common priority list labels each operation with the weight of its
longest path to the sink and ranks the operations in decreasing order,
so that the most urgent operations are scheduled first.
List scheduling under resource constraints
Labeled graph
Assumptions: all operations have unit delay;
a1 = 2 multipliers,
a2 = 2 ALUs.
At the 1st step, for k = 1, $U_{1,1}$ = {V1, V2, V6, V8};
the selected operations are {V1, V2} because their labels are
maximal.
For k = 2, $U_{1,2}$ = {V10}, which is selected and scheduled.
At the 2nd step, for k = 1, $U_{2,1}$ = {V3, V6, V8};
the selected operations are {V3, V6} because their labels are maximal.
For k = 2, $U_{2,2}$ = {V11}, which is selected and scheduled.
At the 3rd step, for k = 1, $U_{3,1}$ = {V7, V8}, which are selected and
scheduled.
For k = 2, $U_{3,2}$ = {V4}, which is scheduled.
At the 4th step, {V5, V9} are selected and scheduled.
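The walk-through above can be reproduced with a small Python sketch of list scheduling under resource bounds. The dependency list is an assumption reconstructed from the example, the priority label is taken to be the longest-path weight to the sink, ties are broken by vertex order, and the names `labels` and `list_schedule` are hypothetical.

```python
# Assumption: dependencies reconstructed from the worked example.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
KIND = {1: "mul", 2: "mul", 3: "mul", 6: "mul", 7: "mul", 8: "mul",
        4: "alu", 5: "alu", 9: "alu", 10: "alu", 11: "alu"}


def labels(edges, delay):
    """Priority label: weight of the longest path from a vertex to the sink."""
    succs = {v: [] for v in delay}
    for j, i in edges:
        succs[j].append(i)
    memo = {}

    def longest(v):
        if v not in memo:
            memo[v] = delay[v] + max((longest(s) for s in succs[v]), default=0)
        return memo[v]

    return {v: longest(v) for v in delay}


def list_schedule(edges, kind, delay, a):
    """Schedule operations step by step with at most a[k] units of type k."""
    preds = {v: [] for v in kind}
    for j, i in edges:
        preds[i].append(j)
    label = labels(edges, delay)
    start, step = {}, 1
    while len(start) < len(kind):
        for k, limit in a.items():
            # T_{l,k}: operations of type k started earlier and still running.
            busy = sum(1 for v in start
                       if kind[v] == k and start[v] + delay[v] > step)
            # U_{l,k}: unscheduled operations whose predecessors have finished.
            ready = [v for v in kind if kind[v] == k and v not in start
                     and all(p in start and start[p] + delay[p] <= step
                             for p in preds[v])]
            ready.sort(key=lambda v: -label[v])  # most urgent first
            for v in ready[:max(0, limit - busy)]:
                start[v] = step
        step += 1
    return start


unit = {v: 1 for v in KIND}
schedule = list_schedule(EDGES, KIND, unit, {"mul": 2, "alu": 2})
print(sorted(schedule.items()))
```

With two multipliers and two ALUs this sketch places V1, V2, V10 at step 1, V3, V6, V11 at step 2, V4, V7, V8 at step 3 and V5, V9 at step 4. The `delay` argument is kept general so that multi-cycle operations can be tried as well.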
List scheduling to determine minimum resources
• List scheduling can be applied to minimize the resource usage under a
latency constraint $\bar{\lambda}$.
• At the beginning, one resource per type is assumed, i.e., a is a
vector with all entries set to 1: $a = [1, 1]^T$.
• The slack of an operation (its ALAP start time under the latency
bound minus the current step) is used to rank the operations.
• The lower the slack, the higher the urgency in the list.
• Operations with zero slack are always scheduled; otherwise
the latency bound would be violated.
• Scheduling such operations may require additional resources,
i.e., updating a.
• The remaining operations are scheduled only if they do not
require additional resources.
• Let $a = [1, 1]^T$ in the beginning.
• At the first step, for k = 1,
$U_{1,1}$ = {v1, v2, v6, v8}.
The two operations with zero slack,
(v1, v2), are scheduled.
Thus the vector becomes $a = [2, 1]^T$.
• For k = 2, $U_{1,2}$ = {v10}, which is
selected and scheduled.
• At the second step, for k = 1,
$U_{2,1}$ = {v3, v6, v8}. There are two
operations with zero slack,
{v3, v6}, which are scheduled.
• For k = 2, $U_{2,2}$ = {v11}, which is
selected and scheduled.
• At the third step, for k = 1,
$U_{3,1}$ = {v7, v8}, which are
selected.
• For k = 2, $U_{3,2}$ = {v4}, which is
selected and scheduled.
• At the fourth step, $U_{4,2}$ =
{v5, v9}. Both operations have
zero slack.
• They are selected and
scheduled.
• a is updated to $a = [2, 2]^T$.
• Hence two resources of each
type are required.
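The resource-minimizing variant can be sketched in Python as follows. This is a sketch under stated assumptions: unit delays, the dependency list reconstructed from the example, and latest start times (`LATEST`) taken from an ALAP schedule with a latency bound of 4. The function name `min_resource_list_schedule` is hypothetical.

```python
# Assumption: dependencies reconstructed from the worked example.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
KIND = {1: "mul", 2: "mul", 3: "mul", 6: "mul", 7: "mul", 8: "mul",
        4: "alu", 5: "alu", 9: "alu", 10: "alu", 11: "alu"}
# Assumption: ALAP start times for a latency bound of 4 with unit delays.
LATEST = {1: 1, 2: 1, 3: 2, 4: 3, 5: 4, 6: 2,
          7: 3, 8: 3, 9: 4, 10: 3, 11: 4}


def min_resource_list_schedule(edges, kind, latest, a):
    """List scheduling that grows a[k] only when zero-slack operations need it."""
    a = dict(a)  # do not modify the caller's resource vector
    preds = {v: [] for v in kind}
    for j, i in edges:
        preds[i].append(j)
    start, step = {}, 1
    while len(start) < len(kind):
        for k in a:
            # Candidates: unscheduled operations whose predecessors finished.
            ready = [v for v in kind if kind[v] == k and v not in start
                     and all(p in start and start[p] < step for p in preds[v])]
            ready.sort(key=lambda v: latest[v] - step)  # smallest slack first
            urgent = sum(1 for v in ready if latest[v] - step <= 0)
            a[k] = max(a[k], urgent)  # zero-slack operations must be scheduled
            # Remaining candidates are taken only if a unit is still free.
            for v in ready[:a[k]]:
                start[v] = step
        step += 1
    return start, a


schedule, resources = min_resource_list_schedule(
    EDGES, KIND, LATEST, {"mul": 1, "alu": 1})
print(resources)  # {'mul': 2, 'alu': 2}
```

Starting from one unit of each type, the sketch ends with two multipliers and two ALUs, matching the walk-through.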
Example: a1 = 3 multipliers and a2 = 1 ALU.
The execution delays of the multiplier and the ALU are 2
and 1 respectively.
Multiplier      ALU    Start time
v1, v2, v6      v10    1
-               v11    2
v3, v7, v8      -      3
-               -      4
-               v4     5
-               v5     6
-               v9     7
Multiprocessor Scheduling and Hu's Algorithm
Denote the labels by {$\alpha_i$; i = 1, 2, ..., n} and let
$\alpha = \max_i \alpha_i$.
Let p(j) be the number of vertices
with label equal to j.
In the example, p(0) = 1, p(1) = 3, p(2) = 4,
p(3) = 2, p(4) = 2.
Assume a = 3 resources.
The first iteration of Hu's
algorithm selects U =
{V1, V2, V6, V8, V10} and
schedules operations
{V1, V2, V6} at the first time
step, because their labels
($\alpha_1 = 4$, $\alpha_2 = 4$, $\alpha_6 = 3$)
are not smaller than any other
label of unscheduled vertices
in U.
At the second iteration U =
{V3, V7, V8, V10} and
{V3, V7, V8} are scheduled at
the second time step.
Operations {V4, V9, V10} are scheduled at the
third step,
{V5, V11} at the fourth and
{Vn} at the last.
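The iterations above can be sketched in Python. The sketch assumes unit-delay operations of a single resource type, a dependency list reconstructed from the example, and labels equal to the longest-path distance to the sink (only the labels of V1, V2 and V6 are stated explicitly above; the others are inferred and agree with the p(j) counts). Ties are broken by vertex number. The names `ALPHA` and `hu` are hypothetical.

```python
# Assumption: dependencies reconstructed from the worked example.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
# Assumption: labels = distance to the sink (consistent with the p(j) counts).
ALPHA = {1: 4, 2: 4, 3: 3, 6: 3, 4: 2, 7: 2, 8: 2, 10: 2, 5: 1, 9: 1, 11: 1}


def hu(edges, alpha, a):
    """Schedule unit-delay operations on `a` identical resources by label."""
    preds = {v: [] for v in alpha}
    for j, i in edges:
        preds[i].append(j)
    start, step = {}, 1
    while len(start) < len(alpha):
        # U: unscheduled vertices whose predecessors finished in earlier steps.
        ready = [v for v in sorted(alpha) if v not in start
                 and all(p in start and start[p] < step for p in preds[v])]
        ready.sort(key=lambda v: -alpha[v])  # largest label first
        for v in ready[:a]:
            start[v] = step
        step += 1
    return start


start = hu(EDGES, ALPHA, a=3)
steps = {}
for v, l in sorted(start.items()):
    steps.setdefault(l, []).append(v)
print(steps)  # {1: [1, 2, 6], 2: [3, 7, 8], 3: [4, 9, 10], 4: [5, 11]}
```

The sink Vn is not modeled in this sketch; it would follow at step 5.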
Heuristic Scheduling Algorithms:
Force-directed Scheduling
• The time frame of an operation is the time interval
where it can be scheduled.
• Time frames: {$[t_i^S, t_i^L]$; i = 0, 1, ..., n}.
• The operation probability is a function that is zero
outside the corresponding time frame and is equal
to the reciprocal of the frame width inside it.
• The probabilities of the operations at time l are {$p_i(l)$; i =
0, 1, ..., n}.
Operations whose time frame is one unit wide are bound to
start in one specific time step.
For the remaining operations, the larger the width, the lower
the probability that the operation is scheduled in any given step
inside the corresponding time frame.
The type distribution is the sum of the probabilities of the
operations implementable by a specific resource type in the set
{1, 2, ..., nres} at any time step of interest.
The type distribution at time l is {$q_k(l)$; k = 1, 2, ..., nres}.
A distribution graph is a plot of an operation-type distribution
over the schedule steps.
Operation v1 has zero mobility. Hence $p_1(1) = 1$ and $p_1(2) =
p_1(3) = p_1(4) = 0$.
Similar considerations apply to operation v2. Operation v6 has
mobility 1. Its time frame is [1, 2].
Hence $p_6(1) = p_6(2) = 0.5$ and $p_6(3) = p_6(4) = 0$.
Operation v8 has mobility 2.
Its time frame is [1, 3].
Hence $p_8(1) = p_8(2) = p_8(3) = 1/3 \approx 0.3$ and $p_8(4) = 0$.
Thus the type distribution for the multiplier (k = 1) at step 1
is $q_1(1) = 1 + 1 + 0.5 + 0.3 = 2.8$.
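The operation probabilities and the multiplier type distribution can be computed with a few lines of Python. This is only a sketch: the time frames of v1, v6 and v8 are stated above, v2 is taken to match v1, and the frames of v3 and v7 are assumptions taken from the ILP example. The names `FRAMES`, `probability` and `type_distribution` are hypothetical.

```python
# Time frames [t_S, t_L] of the multiplier operations.
# Assumption: frames of v3 and v7 are taken from the ILP example.
FRAMES = {1: (1, 1), 2: (1, 1), 3: (2, 2), 6: (1, 2), 7: (2, 3), 8: (1, 3)}


def probability(frame, step):
    """Reciprocal of the frame width inside the frame, zero outside it."""
    lo, hi = frame
    return 1 / (hi - lo + 1) if lo <= step <= hi else 0.0


def type_distribution(frames, step):
    """Sum of the probabilities of all operations of one type at a step."""
    return sum(probability(f, step) for f in frames.values())


for step in range(1, 5):
    print(step, round(type_distribution(FRAMES, step), 2))
# 1 2.83
# 2 2.33
# 3 0.83
# 4 0.0
```

The values differ slightly from 2.8, 2.3 and 0.8 because the text rounds the probability 1/3 to 0.3.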
Forces can be categorized into two classes.
1. Self-forces: the set of forces relating an operation to the different
possible control steps where it can be scheduled.
2. Predecessor/successor forces: the forces related to the operation
dependencies.
Consider the operation v6. Its type is multiply (i.e., k = 1).
v6 can be scheduled in the first two schedule steps, and its
probability is $p_6 = 0.5$ in those steps and zero elsewhere.
The type distribution is $q_1(1) = 2.8$ and $q_1(2) = 2.3$.
When the operation is assigned to step 1,
its probability variations are 1 - 0.5 for step 1 and
0 - 0.5 for step 2:
self-force = 2.8 * (1 - 0.5) + 2.3 * (0 - 0.5) = 0.25.
The force is positive because the concurrency of the
multiplications at step 1 is higher than at step 2.
When the operation is assigned to step 2,
self-force = 2.8 * (0 - 0.5) + 2.3 * (1 - 0.5) = -0.25.
The assignment of operation v6 to step 2 implies the
assignment of operation v7 to step 3.
Therefore the force of v7 related to step 3,
$q_1(2)(0 - p_7(2)) + q_1(3)(1 - p_7(3)) = 2.3 \cdot (0 - 0.5) + 0.8 \cdot (1 - 0.5) = -0.75$,
is the successor force of v6.
The total force on v6 at step 2 is the sum of its self-force and
successor force: -0.25 - 0.75 = -1.
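The total forces on v6 at steps 1 and 2 are 0.25 and -1,
respectively; assigning v6 to step 1 leaves the time frame of v7
unchanged, so there is no successor force in that case.
Scheduling v6 at step 1 would thus increase the
concurrency as compared to scheduling v6 at step 2.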
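The self-force and successor-force arithmetic for v6 can be re-derived with a short Python sketch. It uses the rounded multiplier distribution values quoted in the text (2.8, 2.3, 0.8) and the time frames [1, 2] for v6 and [2, 3] for v7; the helper name `force` is hypothetical.

```python
# Multiplier type distribution q_1(l), using the rounded values from the text.
Q1 = {1: 2.8, 2: 2.3, 3: 0.8}


def force(q, frame, step):
    """Force of fixing an operation with the given time frame at `step`."""
    lo, hi = frame
    p = 1 / (hi - lo + 1)  # uniform probability inside the time frame
    # Sum over the frame of q(l) times the change in probability at l.
    return sum(q[l] * ((1.0 if l == step else 0.0) - p)
               for l in range(lo, hi + 1))


self_1 = force(Q1, (1, 2), 1)   # v6 assigned to step 1
self_2 = force(Q1, (1, 2), 2)   # v6 assigned to step 2
succ_2 = force(Q1, (2, 3), 3)   # v7 forced to step 3 when v6 is at step 2
total_2 = self_2 + succ_2
print(round(self_1, 2), round(self_2, 2), round(succ_2, 2), round(total_2, 2))
# 0.25 -0.25 -0.75 -1.0
```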