A Business-Driven
Cloudburst Scheduler
for Bag-of-Task Applications
Francisco Brasileiro, Ricardo Araújo,
David Candeia Maia, Raquel Lopes
[email protected], [email protected]
[email protected], [email protected]
Federal University of Campina Grande, Brazil
Department of Systems and Computing
Distributed Systems Lab
Uppsala, April 12-16th 2010
EGEE 5th User Forum
Outline
• Motivation
• Problem Statement
• Business-driven heuristics for cloudbursting
• Evaluation
• Implementation
• Conclusions
Motivation
• Many e-Science applications can be easily parallelised
– They fall in the so-called bag-of-tasks (BoT) class of applications
• They have few QoS requirements
– In particular, they can be executed on opportunistic
infrastructures, since fault tolerance mechanisms are
trivially implemented
• Yet the research cycle could be sped up if
applications completed faster
– Can we leverage the availability of resources in cloud
computing providers to speed up the execution of such
applications?
– How much should one pay for that?
Computation infrastructure
Resources available to the BoT user:
• Free resources from opportunistic desktop grids
(e.g. Condor, OurGrid, XtremWeb)
• Resources acquired from a cloud computing provider
(e.g. AWS EC2 on-demand instances)
• Local resources, possibly used in an opportunistic
way with a fairly small additional cost
Research question
• The BoT user can choose among free desktop grid resources,
resources acquired from a cloud computing provider, and
local resources
• BoT user: "Where shall I run my application?"
A business-driven approach
• Running the application will incur costs, except
when it is executed on the idle time of the local
resources or on the best-effort grid infrastructure
• Completing the execution of the application by a
given time yields utility
– Utility is described by a monotonically decreasing
utility function that associates a utility with each
value of the application's makespan
• A solution to the problem should maximise the
profit, where:
Profit = Utility - Cost
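The profit definition can be written down directly. The following is a minimal sketch, assuming that only cloud instance-hours are charged and that the utility function is supplied by the caller; the names `profit`, `utility_fn` and `price_per_instance_hour` are hypothetical and do not come from the talk.

```python
from typing import Callable


def profit(makespan_hours: float,
           cloud_instance_hours: float,
           utility_fn: Callable[[float], float],
           price_per_instance_hour: float) -> float:
    """Profit = Utility - Cost.

    Assumption: grid and idle local resources are free, so the only
    cost is the number of cloud instance-hours times the hourly price.
    """
    utility = utility_fn(makespan_hours)
    cost = cloud_instance_hours * price_per_instance_hour
    return utility - cost
```

A shorter makespan raises the utility term, while buying more instance-hours raises the cost term; the scheduler's job is to balance the two.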
Examples of utility functions
Let t_r be the time at which the application is ready for submission,
and let t_d - t_r be the largest makespan for which executing the
application still yields some utility
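The exact curves shown on this slide are not available in the text, so the following is only a sketch of two monotonically decreasing utility functions that drop to zero once the makespan reaches t_d - t_r. The linear and exponential shapes, the `max_utility` scale and the `rate` parameter are all assumptions made for illustration.

```python
import math


def linear_decay_utility(makespan: float, max_makespan: float,
                         max_utility: float = 100.0) -> float:
    """Utility falls linearly from max_utility to 0 at max_makespan (t_d - t_r)."""
    if makespan >= max_makespan:
        return 0.0  # no utility is gained past the largest useful makespan
    return max_utility * (1.0 - makespan / max_makespan)


def exponential_decay_utility(makespan: float, max_makespan: float,
                              max_utility: float = 100.0,
                              rate: float = 1.0) -> float:
    """Utility decays exponentially with makespan and is cut to 0 at max_makespan."""
    if makespan >= max_makespan:
        return 0.0
    return max_utility * math.exp(-rate * makespan / max_makespan)
```

Both functions are non-increasing in the makespan, which is the only property the approach requires of a utility function.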
A family of heuristics for cloudbursting
• From time to time, observe the system's past
behaviour
• Calculate the system throughput (number of
tasks processed per unit of time)
• Maximise the profit function:
– Assuming that the current throughput will be
maintained
– Considering the system “acceleration”
• The output of the maximisation procedure is the
number of cloud computing instances that should
be acquired/released for the next period
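One decision step of such a heuristic can be sketched as follows. This is an illustration of the "current throughput will be maintained" variant only, not the authors' implementation: the function name, its parameters, the exhaustive search up to `max_instances`, and the rounding of cloud usage up to whole hours are all assumptions.

```python
import math
from typing import Callable


def choose_cloud_instances(tasks_remaining: int,
                           elapsed_hours: float,
                           grid_throughput: float,
                           per_instance_throughput: float,
                           price_per_instance_hour: float,
                           utility_fn: Callable[[float], float],
                           max_instances: int = 50) -> int:
    """Return the number of cloud instances that maximises projected profit.

    Throughputs are in tasks per hour. The observed grid throughput is
    assumed to stay constant for the rest of the execution.
    """
    best_n, best_profit = 0, -math.inf
    for n in range(max_instances + 1):
        throughput = grid_throughput + n * per_instance_throughput
        if throughput <= 0:
            continue  # no progress possible with this configuration
        remaining_hours = tasks_remaining / throughput
        makespan = elapsed_hours + remaining_hours
        # Assumption: instances are billed per started hour.
        cost = n * math.ceil(remaining_hours) * price_per_instance_hour
        projected = utility_fn(makespan) - cost
        if projected > best_profit:
            best_n, best_profit = n, projected
    return best_n
```

Calling this once per period and comparing the result with the number of instances currently held gives the number to acquire or release. A variant that accounts for "acceleration" would replace the constant `grid_throughput` with an extrapolated value.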
Evaluation methodology
• We have built a discrete-event simulator to evaluate the proposed
heuristics
– It works with the notion of a turn, whose length is equal to the minimal
time window for which resources can be acquired from a cloud
computing provider (e.g. 1 hour for AWS EC2 on-demand instances)
– At each turn it decides how many resources need to be acquired from
the cloud provider for the next turns in order to maximise the profit
• The simulator also performs the cloudburst scheduling with full
knowledge about the future, leading to an optimal solution
– The profit yielded by the optimal solution is used to compute the
efficiency of the schedule provided by the heuristics
• E(h) = P(h)/Po, where E(h) is the efficiency of heuristic h, P(h) is the profit achieved
by heuristic h, and Po is the optimal profit for the scenario evaluated
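The efficiency metric is a plain ratio. A minimal sketch follows; the function name and the guard against a non-positive optimal profit are assumptions, since the slides do not say how such scenarios are handled.

```python
def efficiency(heuristic_profit: float, optimal_profit: float) -> float:
    """E(h) = P(h) / Po for one evaluated scenario."""
    if optimal_profit <= 0:
        # Assumption: the ratio is only meaningful when the optimal
        # schedule actually yields a positive profit.
        raise ValueError("optimal profit must be positive")
    return heuristic_profit / optimal_profit
```

A value of 1.0 means the heuristic matched the schedule computed with full knowledge of the future.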
Evaluation scenarios
• Three different heuristics
– Conservative, derivative, midpoint derivative
• Two utility functions
– Decay and exponential
• Three different grid sizes
• Four machine availability traces
• AWS EC2 on-demand instances pricing model
• Three levels of task heterogeneity for BoT
applications
– Homogeneous (10 minutes per task), U[5,15], U(0,20]
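The three heterogeneity levels can be reproduced with a few lines of code. This is a sketch under the assumption that U[a,b] denotes a uniform distribution over task runtimes in minutes; the function name and the level labels used as keys are hypothetical.

```python
import random
from typing import List


def sample_task_minutes(level: str, n_tasks: int,
                        rng: random.Random) -> List[float]:
    """Draw task runtimes (in minutes) for one heterogeneity level."""
    if level == "homogeneous":
        return [10.0] * n_tasks
    if level == "U[5,15]":
        return [rng.uniform(5.0, 15.0) for _ in range(n_tasks)]
    if level == "U(0,20]":
        # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1],
        # which excludes zero-length tasks and allows exactly 20 minutes.
        return [20.0 * (1.0 - rng.random()) for _ in range(n_tasks)]
    raise ValueError(f"unknown heterogeneity level: {level}")
```

All three levels share a mean of 10 minutes per task, so they differ only in the spread of task runtimes.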
Evaluation results
[Figure: efficiency plotted against the number of machines in the grid, with panels for the exponential and the decay utility functions]
Implementation
• The best heuristic has been implemented in
the OurGrid grid middleware
• The new user interface allows users to
perform cloudbursting using both the AWS
EC2 cloud computing provider and
private/public cloud providers based on
Eucalyptus
Implementation
[Figure sequence: OurGrid components shown are Peers, Workers, the Broker and the Cloud Provider Peer]
• OurGrid Broker
– The user sets up a “Cloud Provider Peer”
Conclusions
• We have shown that cloudbursting is a
feasible approach to speed up the execution
of BoT applications
• Simple heuristics perform very well
• The software is not yet available in the latest
release of OurGrid, but can be provided upon
request sent to [email protected]
• The use of the system by real users will help
us to improve its design
Thanks for your attention!
• I will be glad to answer your questions
• For more information about this project visit
http://redmine.lsd.ufcg.edu.br/projects/ourgrid
• For more information about the OurGrid
middleware visit http://www.ourgrid.org/
• For more information about other projects
developed by LSD/UFCG, visit
http://redmine.lsd.ufcg.edu.br/projects