
An Energy-efficient Task Scheduler for
Multi-core Platforms with per-core
DVFS Based on Task Characteristics
Ching-Chi Lin
Institute of Information Science, Academia Sinica
Department of Computer Science and Information Engineering, National Taiwan University
You-Cheng Syu, Pangfeng Liu
Department of Computer Science and Information Engineering, National Taiwan University
Graduate Institute of Networking and Multimedia, National Taiwan University
Chao-Jui Chang, Jan-Jan Wu
Institute of Information Science, Academia Sinica
Research Center for Information Technology Innovation, Academia Sinica
Po-Wen Cheng, Wei-Te Hsu
Information and Communications Research Laboratories, Industrial Technology Research Institute
Introduction

Modern processors support DVFS on a
per-core basis.
◦ Dynamic Voltage and Frequency Scaling (DVFS)

For the same core, increasing computing
power means higher power consumption.
Challenge

Find a good balance between
performance and power consumption.
Two Scenarios

Batch mode
◦ A set of computation-intensive tasks with the
same arrival time.

Online mode
◦ Two types of tasks with different priorities:
▪ Interactive and non-interactive
◦ Tasks can arrive at any time.
Example: Judge System

Online mode
◦ Users submit their code/answers, and wait for
their scores.
▪ Interactive: user requests, such as score querying.
▪ Non-interactive: processing user submissions.

Batch mode
◦ Re-judge and validate all submitted
code/answers.
Our Contribution

Present task scheduling strategies that
solve three important issues
simultaneously:
◦ The assignment of tasks to cores
◦ The execution order of tasks on a core
◦ The processing frequency for the execution of
each task.
Our Contribution (Cont.)

For batch mode, we propose the Workload
Based Greedy algorithm.

For online mode, we propose the Least
Marginal Cost heuristic.

Models

Task Model
◦ Assume the number of CPU cycles required to
complete a task, Lk, is known.
◦ The arrival time of a task:
▪ Batch mode: 0.
▪ Online mode: known.
Models (Cont.)

Processing frequency
◦ Only a set of discrete processing frequencies,
pi, is available.
◦ The core frequency remains the same while
executing a task.
Models (Cont.)

Power and Performance
◦ For a task jk, the energy ek and time tk are
$e_k = L_k \, E(p_k)$
$t_k = L_k \, T(p_k)$
◦ E(pk) and T(pk) are the energy and time
required to execute one cycle with frequency
pk.
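
As a minimal sketch of this model, the two formulas can be evaluated from per-cycle lookup tables. The frequency levels and the values in `E` and `T` below are made-up numbers in arbitrary units, not measurements from the talk, and the function names are hypothetical.

```python
# Hypothetical per-cycle tables for one core, keyed by frequency level.
# T[p]: time per cycle at frequency p, E[p]: energy per cycle at frequency p.
# The values are illustrative only.
T = {1: 1.0, 2: 0.5}
E = {1: 1.0, 2: 3.0}


def task_energy(cycles, freq):
    """e_k = L_k * E(p_k)."""
    return cycles * E[freq]


def task_time(cycles, freq):
    """t_k = L_k * T(p_k)."""
    return cycles * T[freq]
```

With these assumed tables, running a task at the higher frequency halves its time but triples its energy, which is the trade-off the scheduler has to balance.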
Task Scheduling in Batch Mode

Two categories:
◦ Tasks with deadline
◦ Tasks without deadline

Two environments:
◦ Single core
◦ Multi-core

Four combinations in total.
Tasks with Deadline

[Objective] Every task must meet its
deadline, and the overall energy
consumption is less than E*.

This problem is NP-complete on both single-core
and multi-core platforms.
◦ Proof: by reduction from the Partition problem.
Tasks without Deadline

[Objective] Minimize the cost function C
$C = \sum_{k=1}^{n} C_k$
$= \sum_{k=1}^{n} \left( C_{k,\mathrm{energy}} + C_{k,\mathrm{time}} \right)$
$= \sum_{k=1}^{n} \left\{ R_e L_k E(p_k) + R_t \sum_{i=1}^{k} L_i T(p_i) \right\}$
◦ Re : the cost of a joule of energy
◦ Rt : the cost of a second
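
To make the cost function concrete, here is a small sketch that evaluates C for one core, given tasks in execution order. It assumes the time term of task k is its completion time (waiting plus execution), as the inner sum over i = 1..k suggests. The function name, the tuple layout, and the table values in the usage note are hypothetical.

```python
def schedule_cost(tasks, Re, Rt, E, T):
    """Total cost of one core's schedule.

    tasks: list of (cycles, freq) pairs in execution order.
    Re, Rt: cost of one unit of energy and one unit of time.
    E, T: per-cycle energy and time tables keyed by frequency.
    """
    total = 0.0
    finish = 0.0  # completion time of the current task, including waiting
    for cycles, freq in tasks:
        finish += cycles * T[freq]
        total += Re * cycles * E[freq] + Rt * finish
    return total
```

For example, with the assumed tables `E = {1: 1.0, 2: 3.0}` and `T = {1: 1.0, 2: 0.5}`, the schedule `[(2, 1), (4, 2)]` with `Re = Rt = 1` costs 4 for the first task and 16 for the second.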
Tasks without Deadline: Single Core

Rewrite cost function C into
$C = \sum_{k=1}^{n} C(k, p_k) \, L_k$

Minimize C(k, pk) for every task in order
to minimize C.

Define C(k) = min{C(k, pk)}
◦ C(k) is a non-increasing function of k.
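
The slide does not spell out C(k, pk). One form consistent with the cost function defined earlier can be obtained by swapping the order of the double sum: the execution time of task k is counted once for itself and once for every task behind it, n - k + 1 times in total.

```latex
\begin{align*}
\sum_{k=1}^{n} R_t \sum_{i=1}^{k} L_i T(p_i)
  &= \sum_{k=1}^{n} (n-k+1)\, R_t\, L_k\, T(p_k) \\
C &= \sum_{k=1}^{n} \bigl[ R_e E(p_k) + (n-k+1)\, R_t\, T(p_k) \bigr] L_k \\
C(k, p_k) &= R_e E(p_k) + (n-k+1)\, R_t\, T(p_k)
\end{align*}
```

Under this form, the weight (n - k + 1) shrinks as k grows, so C(k, p) does not increase with k for any fixed frequency p, and neither does the minimum C(k).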
Minimizing the Cost

Since $\min(C) = \sum_{k=1}^{n} C(k) \, L_k$ and C(k) is
non-increasing, the tasks are in non-decreasing
order of Lk in an optimal solution.

Choose pk for each sorted task with the
minimum C(k, pk).
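
A minimal sketch of the single-core procedure follows. It assumes C(k, p) = Re·E(p) + (n − k + 1)·Rt·T(p), which is derived from the cost function rather than stated on the slide. The function name and the table values in the usage note are hypothetical.

```python
def schedule_single_core(lengths, Re, Rt, E, T):
    """Return [(cycles, freq), ...] in execution order for one core.

    Tasks run shortest first; the k-th task (1-based) gets the frequency
    minimizing the assumed per-cycle cost Re*E[p] + (n-k+1)*Rt*T[p].
    """
    n = len(lengths)
    plan = []
    for k, cycles in enumerate(sorted(lengths), start=1):
        weight = n - k + 1  # this task plus the tasks queued behind it
        freq = min(E, key=lambda p: Re * E[p] + weight * Rt * T[p])
        plan.append((cycles, freq))
    return plan
```

With the assumed tables `E = {1: 1.0, 2: 3.0}`, `T = {1: 1.0, 2: 0.5}` and `Re = 1`, `Rt = 3`, the tasks `[4, 2, 6]` are ordered 2, 4, 6. The first two run at the higher frequency because more tasks wait behind them, and the last one runs at the lower frequency.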

Tasks without Deadline: Multi-Core

Two cases
◦ Homogeneous multi-core
▪ Same T and E for every core.
◦ Heterogeneous multi-core
▪ Different T and E across cores.

Same idea
◦ Minimize total cost by minimizing C(k) for
every task on all cores.
Workload Based Greedy

Sort the tasks according to Lk in
descending order.

Start from the task with the largest Lk:
◦ Find the position k on core j with the minimum
Cj(k) among all cores, and assign the task to
the corresponding position.
◦ Compute pk for the task.

Repeat until all tasks are scheduled.
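
The steps above can be sketched as follows. This is an illustrative reading, not the authors' implementation. It assumes each core's queue is filled from the back (the longest tasks run last), so that a new task placed at the front of core j has m = queue length + 1 tasks depending on its run time, with per-cycle cost Re·Ej(p) + m·Rt·Tj(p). The function name and data layout are hypothetical.

```python
def workload_based_greedy(lengths, cores, Re, Rt):
    """Assign tasks to cores; return one execution-order list per core.

    lengths: cycle counts L_k of the tasks.
    cores: list of (E, T) table pairs, one per core, keyed by frequency.
    Each queue entry is (cycles, freq).
    """
    queues = [[] for _ in cores]
    for cycles in sorted(lengths, reverse=True):  # largest L_k first
        best = None  # (per-cycle cost, core index, frequency)
        for j, (E, T) in enumerate(cores):
            m = len(queues[j]) + 1  # tasks whose completion includes this run
            for p in E:
                cost = Re * E[p] + m * Rt * T[p]
                if best is None or cost < best[0]:
                    best = (cost, j, p)
        _, j, p = best
        queues[j].insert(0, (cycles, p))  # runs before tasks already placed
    return queues
```

Because tasks are taken in descending order and placed at the front, each core ends up running its tasks shortest first, which matches the single-core result. Ties are broken in favor of the lowest core index, an arbitrary choice made here.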
Workload Based Greedy Example

[Figure: the sorted tasks J1, J2, J3, … (in descending order) are assigned one by one to positions in the execution order of Core0, Core1, and Core2.]
Task Scheduling in Online Mode

[Objective] Minimize the total cost for
every time interval during the execution
of tasks.
◦ Time interval: the time between two
consecutive arrival events.
Some Assumptions

Two categories of tasks:
◦ Interactive tasks
◦ Non-interactive tasks
◦ Interactive tasks have higher priority than
non-interactive tasks.

Tasks can arrive at any time.

Multi-core environment.

Least Marginal Cost

For every newly arrived task:
◦ For each core, compute the minimum cost of
inserting the task and the corresponding position.
◦ Insert the task into that position on the
core with the minimum cost among all
cores.

Notice that interactive tasks have higher
priority than non-interactive tasks.
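
The insertion step can be sketched as below. The slide does not define the marginal cost precisely, so this sketch makes several assumptions: the marginal cost is the new task's own energy and completion-time cost plus the delay it adds to every task queued behind it; already queued tasks keep their frequencies; each queue holds interactive tasks ahead of non-interactive ones; all cores share the same E and T tables; and the task currently running is ignored. All names are hypothetical.

```python
def marginal_cost(queue, pos, cycles, p, Re, Rt, E, T):
    """Assumed cost increase from inserting a task at index pos of queue."""
    start = sum(c * T[f] for c, f, _ in queue[:pos])  # wait before it starts
    run = cycles * T[p]
    delayed = len(queue) - pos  # tasks pushed back by this insertion
    return Re * cycles * E[p] + Rt * (start + run) + Rt * run * delayed


def least_marginal_cost(queues, cycles, interactive, Re, Rt, E, T):
    """Insert the task where its marginal cost is lowest.

    queues: per-core lists of (cycles, freq, interactive), modified in place.
    Returns (core index, position, frequency).
    """
    best = None
    for j, queue in enumerate(queues):
        n_int = sum(1 for _, _, inter in queue if inter)
        # Priority rule: interactive tasks stay ahead of non-interactive ones.
        if interactive:
            positions = range(0, n_int + 1)
        else:
            positions = range(n_int, len(queue) + 1)
        for pos in positions:
            for p in E:
                cost = marginal_cost(queue, pos, cycles, p, Re, Rt, E, T)
                if best is None or cost < best[0]:
                    best = (cost, j, pos, p)
    _, j, pos, p = best
    queues[j].insert(pos, (cycles, p, interactive))
    return j, pos, p
```

A fuller version would also account for the remaining time of the running task and could re-select frequencies for the tasks that are pushed back; both are left out to keep the sketch short.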
Evaluation

Conduct experiments to compare the
overall cost of our scheduling
strategies with that of other strategies.

Environment:
◦ 24 physical servers, each with two 4-core X5460
CPUs with hyperthreading, 16 GB of memory,
and a 250 GB disk.
Evaluation: Batch Mode

Input: 12 benchmarks from SPEC2006int
◦ train and ref inputs
Experimental Results: Batch Mode

[Figure: bar charts comparing WBG, OLB, and PS; surviving panel labels: Time, Energy.]

Workload Based Greedy (WBG)
Opportunistic Load Balancing (OLB)
Power-Saving (PS)
◦ The total cost reduction is about 27% and
20% compared to OLB and PS, respectively.
Evaluation: Online Mode

Input: trace from an online judging system.
◦ 768 non-interactive tasks.
◦ 50,525 interactive tasks.
◦ Length of trace: half an hour.
Experimental Results: Online Mode

[Figure: bar charts comparing LMC, OLB, and OD; surviving panel labels: Time, Energy.]

Least Marginal Cost (LMC)
Opportunistic Load Balancing (OLB)
On-Demand (OD)
◦ The total cost reduction is about 17% and
24% compared to OLB and OD, respectively.
Conclusion

We propose energy-efficient scheduling
algorithms for multi-core systems with
DVFS features.
◦ For batch mode and online mode.
◦ The experimental results show significant cost
reductions.

We will integrate our work into our
existing judging system.
Questions?