Use of Performance Prediction Techniques for Grid Management

Agent-Based Grid Load-Balancing
Daniel P. Spooner
University of Warwick, UK
Junwei Cao
NEC Europe Ltd., Germany
Outline






Grid Computing & Middleware
Agent-Based Methodology
Distributed Service Discovery
Grid Load-Balancing
Experimental Results
Conclusions & Future Work
Grid Computing



Computational Grids - blueprint for a new computing infrastructure
Data/Service/Community Grids - large-scale resource sharing in VOs
Grid/Web Services - distributed system and application integration
Grid Middleware





Resource Management & Scheduling
Information Services (Monitoring & Discovery)
Data Management and Access
Application Programming Environments
Security, Accounting, QoS ……
Condor, LSF, Ninf, Nimrod, ……
Globus, Legion, DPSS, ……
Java/Jini, CORBA, Web Services, ……
Grid Resource Management
Key challenges:




Cross-domain
Large-scale
Dynamic
QoS support
GRM
Agent-Based Methodology
An agent is:






A local grid manager
A user agent
A broker
A service provider
A service requestor
A matchmaker
Local Management
Performance Prediction


Task models
Hardware models
FIFO Algorithm
Genetic Algorithm


Heuristic & evolutionary
Near-optimal on makespan, deadlines and idle time
[Figure: chart with rows Processor 1 to Processor 8, annotated 2^n - 1]
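
The FIFO side of local management can be illustrated with a minimal sketch. It assumes each task needs exactly one processor and that a predicted runtime is already available from the performance models; the `Task` fields and function names are hypothetical and are not taken from the slides. The sketch only shows how makespan and idle time follow from predicted runtimes, not how the actual scheduler or the GA works.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    predicted_runtime: float  # seconds; assumed output of a performance model


def fifo_schedule(tasks, n_processors):
    """Place tasks in arrival order on whichever processor frees up first."""
    free_at = [0.0] * n_processors  # time at which each processor becomes idle
    placements = []
    for task in tasks:
        proc = min(range(n_processors), key=lambda p: free_at[p])
        start = free_at[proc]
        end = start + task.predicted_runtime
        free_at[proc] = end
        placements.append((task.name, proc, start, end))
    return placements


def makespan(placements):
    """Latest completion time over all placed tasks."""
    return max((end for _, _, _, end in placements), default=0.0)


def idle_time(placements, n_processors):
    """Processor-seconds not spent running any task before the makespan."""
    busy = sum(end - start for _, _, start, end in placements)
    return makespan(placements) * n_processors - busy
```

A GA-based scheduler would search over task orderings and allocations to reduce these same quantities, which is why both are computed from the placement list rather than inside the scheduler.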
Service Discovery




Pure data-pull
No advertisement
Full discovery
Efficient when services change more quickly

Pure data-push
Full advertisement
No discovery
Efficient when requests arrive more frequently

[Figure: numbered message flows between U/A, A and R/A nodes for the two strategies, using the messages TaskInfo, PullSerInfo, PushSerInfo and Return]
Centralised, not applicable for grid computing!
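
The trade-off between the two pure strategies can be made concrete with a deliberately simplified message-count model. The cost model is an assumption made for illustration: every request under data-pull queries each resource once, and every service change under data-push sends one update to the central agent. The function names are hypothetical.

```python
def pull_messages(n_requests, n_resources):
    """Pure data-pull: each request queries every resource for service info."""
    return n_requests * n_resources


def push_messages(n_changes_per_resource, n_resources):
    """Pure data-push: each service change is advertised once."""
    return n_changes_per_resource * n_resources


def cheaper_strategy(n_requests, n_changes_per_resource, n_resources):
    """Return which pure strategy sends fewer messages under this model."""
    pull = pull_messages(n_requests, n_resources)
    push = push_messages(n_changes_per_resource, n_resources)
    if pull < push:
        return "pull"
    if push < pull:
        return "push"
    return "either"
```

Under this model, pull wins when service information changes more often than requests arrive, and push wins in the opposite case, which matches the two "efficient when" bullets.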
Optimisation Strategies




Configurable data-pull or data-push
Agent hierarchy
Multi-step advertisement & multi-step discovery
Efficient when frequencies of request arrivals and service changes are almost the same
Distributed! Balancing between advertisement and discovery!
[Figure: hierarchy of agents (A) with a User attached]
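
One possible reading of multi-step advertisement and discovery in an agent hierarchy is sketched below. The `Agent` class, the scalar `capacity` standing in for service information, and the hop counting are all assumptions for illustration; a real agent would match requests against predicted performance and deadlines rather than a single number.

```python
class Agent:
    def __init__(self, name, capacity, parent=None):
        self.name = name
        self.capacity = capacity  # hypothetical scalar describing the local service
        self.parent = parent      # upper agent in the hierarchy, or None at the top
        self.known = {}           # service info advertised by other agents

    def advertise(self, steps=1):
        """Push local service info up to `steps` levels up the hierarchy."""
        target = self.parent
        while target is not None and steps > 0:
            target.known[self.name] = self.capacity
            target = target.parent
            steps -= 1

    def discover(self, required, hops=0):
        """Return (agent name, hops) for the first match, or (None, hops)."""
        if self.capacity >= required:
            return self.name, hops
        for name, capacity in self.known.items():
            if capacity >= required:
                return name, hops + 1
        if self.parent is not None:
            return self.parent.discover(required, hops + 1)
        return None, hops
```

Raising `steps` moves work from discovery to advertisement: more agents already hold the information, so a request is resolved in fewer hops at the price of more advertisement messages.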
Agent Implementation
[Figure: agent structure. Labelled components: Resource Monitoring, Service Info. Provider, Task Management, FIFO/GA Scheduling, FIFO Scheduling, Performance Prediction (Hw Model, Task Model), Service Info Database Management, Task Execution, Match Maker, Advertisement, Discovery. Labelled data: Task, User Deadline, Results, Sched. Time, Current Makespan. Links lead To Another Agent.]
Load Balancing Metrics





Total makespan
Average advance time of task execution completions (required deadline - actual task completion time)
Average processor utilisation rate (busy time / total makespan)
Load balancing level (1 - mean square deviation of processor utilisation rates / average processor utilisation rate)
Total number of network packets for both advertisement and discovery
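
The three ratio-based metrics above can be written out as a short sketch. The function names are hypothetical. "Mean square deviation" is interpreted here as the square root of the mean squared deviation from the average, which is an assumption chosen so that the ratio in the load balancing level is dimensionless.

```python
from math import sqrt


def average_advance_time(deadlines, completions):
    """Mean of (required deadline - actual completion time); negative means late."""
    return sum(d - c for d, c in zip(deadlines, completions)) / len(deadlines)


def utilisation_rates(busy_times, total_makespan):
    """Per-processor utilisation: busy time / total makespan."""
    return [busy / total_makespan for busy in busy_times]


def load_balancing_level(rates):
    """1 - (deviation of utilisation rates / average utilisation rate)."""
    mean = sum(rates) / len(rates)
    # Assumed definition: root of the mean squared deviation from the mean.
    deviation = sqrt(sum((r - mean) ** 2 for r in rates) / len(rates))
    return 1 - deviation / mean
```

With this definition, identical utilisation on every processor gives a level of 1, and the level falls as utilisation becomes more uneven.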
Experiment Design
Tasks:
sweep3d
fft
improc
closure
jacobi
memsort
cpi
[Figure: 12 agents, S1 to S12, each labelled (platform, 16). Platforms shown: SGIOrigin2000 (2 agents), SunUltra10 (2), SunUltra5 (3), SunUltra1 (3), SunSPARCstation2 (2)]
Experiment 1: FIFO
Experiment 2: GA
Experiment 3: GA
Task Execution
[Figure: ε (s) against Experiment Number (1 to 3), with series S1 to S12 and Total; vertical axis from -1200 to 200]
Both GA and agents contribute towards the improvement in task executions.
Resource Utilisation
[Figure: υ (%) against Experiment Number (1 to 3), with series S1 to S12 and Total; vertical axis from 0 to 90]
Less powerful S11 & S12 benefit mainly from the GA.
More powerful S1 & S2 benefit mainly from agents.
Load Balancing
[Figure: β (%) against Experiment Number (1 to 3), with series S1 to S12 and Total; vertical axis from 0 to 100]
The GA contributes more to local grid load balancing.
Agents contribute more to global grid load balancing.
Total Makespan
[Figure: t (mins) against Agent Number (4 to 20), comparing Centralised and Distributed; vertical axis from 0 to 160]
The centralised pure data-pull can always achieve the best results.
Distributed agents with the hierarchical model can also achieve reasonably good results.
Network Packets
[Figure: packets against Agent Number (4 to 20), comparing Centralised and Distributed; vertical axis from 0 to 25000]
The network overhead for the pure data-pull strategy to achieve better results is very high.
Distributed agent-based service advertisement and discovery can scale well.
Conclusions



A multi-agent paradigm provides a clear high-level abstraction of a grid resource management system.
Distributed service advertisement and discovery strategies can be used to improve agent performance.
The agent-based framework is scalable, flexible and extensible for further enhancements.