SOOS: Resource Discovery and Modelling in

S(o)OS Project - CASTNESS'11
Roma, January 17-18, 2011
System Level Resource Discovery
and Management for Multi Core
Environment
Javad Zarrin
© 2005, it - instituto de telecomunicações. All rights reserved.
Outline
• Introduction
• Challenges
  • Resource Description
  • Resource Discovery
  • Resource Management
• Current SDPs
• Proposed Solution
• Simulation & Results
  • COTSon
  • HPL
• Conclusion
System Level Resource Discovery & Management For Multi Core Environment
2 17 January 2011, CASTNESS’11
Introduction
Resource Discovery in S(o)OS Project – Scenario
Network topology for a cluster composed of n heterogeneous nodes
with n CPUs (n cores per CPU), n >= 100
[Figure labels: Core n, Private Cache – L1, Shared Cache – L2]
Introduction
• Problems?
  • Memory latency, bandwidth bottleneck, interconnection network
  • Using all available resources in an efficient manner
• How to define resources as services?
• What is a resource?
  • Core
  • Chip
  • Board
  • Memory Chip
  • Pluggable Device
• Which parameters are relevant?
[Figure labels: Board and Memory Parameters, Chip Parameters, Core Parameters, Shared Cache – L2]
Challenges - Resource Description
• How to describe a resource?
  • Describing a huge number of heterogeneous resources (cores)
    in an adequate and efficient manner.
  • The heterogeneous resources in the network need to be defined by
    a set of strict parameters. These parameters describe the
    characteristics and performance factors of the corresponding
    resources as services on the network.
  • Example parameters: clock rate, MIPS, GFLOPS, cache size,
    SPEC benchmark, etc.
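A minimal sketch of such a parameter set, assuming a flat record per core. The class name `CoreDescriptor`, its field names and the `satisfies` helper are hypothetical and only illustrate how the example parameters could be carried as a service description; the slides do not define a concrete schema.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CoreDescriptor:
    """Hypothetical description of one core advertised as a service."""
    core_id: str
    clock_rate_mhz: float
    mips: float
    gflops: float
    cache_size_kb: int
    spec_score: float  # placeholder for a SPEC benchmark result

    def satisfies(self, min_req: dict) -> bool:
        """True if every field named in min_req meets its minimum value."""
        values = asdict(self)
        return all(values[name] >= minimum for name, minimum in min_req.items())


# Illustrative values only, not measurements.
core = CoreDescriptor("node1/cpu0/core3", 800.0, 1600.0, 3.2, 128, 11.5)
print(core.satisfies({"clock_rate_mhz": 700, "cache_size_kb": 128}))
```

Keeping the descriptor immutable means it can be compared and cached safely once a core has announced it.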
Challenges - Resource Discovery
• Massive amount of resources
  • Discovering all the existing cores on the local chip or across a
    large-scale network is costly due to the excessive
    information exchange
• Scalable search for required resources
  • Rate of discovery
  • Parallel search algorithms
  • Packet propagation
Challenges - Resource Management
• Smart Resource Management
• What is the best resource for a specific requirement?
• What is the metric?
• Fault tolerance
Service Discovery Protocols
The Proposed Solution
• Architecture: combination of distributed and centralized
• Search: informed (heuristic) search methods
• Message Propagation: Unicast, Anycast
• Announcement: Pull (reactive, query-based) in the network; Push
  (proactive, announcement-based) within a node
• Scalable (consistency and service validation)
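The split between push inside a node and pull across the network can be sketched as follows. This is an illustration under assumptions, not the project's implementation: `NodeRegistry`, `announce` and `query` are hypothetical names, and the parameter keys are made up for the example.

```python
class NodeRegistry:
    """Hypothetical per-node directory of the node's own cores."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.cores = {}  # core_id -> latest announced parameters

    def announce(self, core_id, params):
        """Push: a local core proactively reports its current parameters."""
        self.cores[core_id] = dict(params)

    def query(self, min_req):
        """Pull: answer a query from the network using only local state."""
        return sorted(
            core_id
            for core_id, params in self.cores.items()
            if all(params.get(key, 0) >= value for key, value in min_req.items())
        )


registry = NodeRegistry("node1")
registry.announce("core0", {"clock_mhz": 800, "cache_kb": 128})
registry.announce("core1", {"clock_mhz": 1000, "cache_kb": 256})
print(registry.query({"clock_mhz": 900}))
```

With this arrangement only queries cross the network, while the frequent per-core updates stay local to the node.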
The Proposed Solution – RD Mechanism
[Figure: a network of QMS nodes and an RCT, with numbered steps 1 to 5. Labels shown in the figure:]
• If queue(i).length(i) > threshold then generate.query(minReq)
• resourceQuery(minReq)
• resource(m).setrank = query(z).getorigin.getrank(m)
• reply(RO)
• Search in the next neighboring tiers
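The figure labels suggest a query that is triggered by a queue-length threshold and then widened tier by tier until a suitable resource replies. The sketch below illustrates that flow under stated assumptions: it uses a plain breadth-first expansion rather than the informed heuristic search named earlier, the topology and parameter dictionaries are invented, and `tiered_search` and `maybe_query` are hypothetical names.

```python
def tiered_search(neighbors, resources, origin, min_req, max_tiers=3):
    """Query outward from origin one tier at a time; return (node, tier) or None.

    neighbors: node -> list of adjacent nodes (assumed topology)
    resources: node -> dict of parameters that node can offer
    """
    def matches(node):
        offer = resources.get(node, {})
        return all(offer.get(key, 0) >= value for key, value in min_req.items())

    visited = {origin}
    tier = [origin]
    for depth in range(1, max_tiers + 1):
        next_tier = []
        for node in tier:
            for peer in neighbors.get(node, []):
                if peer not in visited:
                    visited.add(peer)
                    next_tier.append(peer)
        hits = [node for node in next_tier if matches(node)]
        if hits:
            return hits[0], depth  # corresponds to reply(RO) to the origin
        tier = next_tier  # search in the next neighboring tier
    return None


def maybe_query(queue_length, threshold, **search_args):
    """Generate a query only when the local queue exceeds the threshold."""
    if queue_length > threshold:
        return tiered_search(**search_args)
    return None


# Invented four-node topology for illustration.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
resources = {"B": {"gflops": 1}, "C": {"gflops": 2}, "D": {"gflops": 8}}
print(tiered_search(neighbors, resources, "A", {"gflops": 4}))
```

Stopping at the first tier that yields a match keeps the number of propagated queries small, which is the cost the discovery challenge is concerned with.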
The Proposed Solution
• Service Cost, Cost Table and Resource Ranking Algorithms
• Performance Parameters and Metrics
  • Memory, Cache
  • Clock Rate
  • GFLOPS
• Alternatives:
  • Real-time benchmarking
  • Micro benchmarks (MHPC, SMB, MIBA)
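One way a ranking could be derived from such performance parameters is a weighted sum of normalised metrics. The slides do not specify the ranking algorithm, so the function `rank_cores`, the weights and the sample values below are all assumptions made for illustration.

```python
def rank_cores(cores, weights):
    """Return {core_id: rank}, where rank 1 has the highest weighted score.

    cores: core_id -> {metric: value}; weights: metric -> weight.
    Each metric is divided by its maximum so that units do not dominate.
    """
    maxima = {
        metric: max(params[metric] for params in cores.values()) or 1.0
        for metric in weights
    }
    scores = {
        core_id: sum(w * params[m] / maxima[m] for m, w in weights.items())
        for core_id, params in cores.items()
    }
    # Ties are broken by core id so the ordering is deterministic.
    ordered = sorted(scores, key=lambda core_id: (-scores[core_id], core_id))
    return {core_id: position for position, core_id in enumerate(ordered, start=1)}


# Hypothetical cores and weights.
cores = {
    "c1": {"clock_mhz": 800, "cache_kb": 128, "gflops": 2.0},
    "c2": {"clock_mhz": 1000, "cache_kb": 256, "gflops": 3.0},
}
weights = {"clock_mhz": 0.3, "cache_kb": 0.2, "gflops": 0.5}
print(rank_cores(cores, weights))
```

Static parameters like these are cheap to collect; the benchmarking alternatives listed above would replace the `gflops` entry with a measured value.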
Simulation & Results – Simulation Tools
• COTSon
  HP Labs' COTSon is a full-system simulation framework based
  on AMD's SimNow. COTSon allows for simulating complete
  computing systems, ranging from a single node to a large
  cluster of hundreds of multicore nodes.
• High Performance Linpack Benchmark (HPL)
  "HPL is a software package that solves a (random) dense linear
  system in double precision (64 bits) arithmetic on
  distributed-memory computers. It can thus be regarded as a
  portable as well as freely available implementation of the High
  Performance Computing Linpack Benchmark."
  Alternative: NAMD
Simulation & Results – Simulation
• Objective of simulation
  To compare the performance of running HPL on a simulated
  cluster using the proposed RD with the performance of the
  same run using SNMP
• Sample Resource Cost Table

  Core ID   Latency   Frequency   Cache size   Rank
  #1        17        800 MHz     128 KB       12
  #2        26        1 GHz       256 KB       7
Simulation & Results – Simulation Architecture
[Figure: simulation architecture. Labels: COTSon Control, Control Script, XML-RPC, Database, Host, Control Daemon, Q-Mediator network, SimNow-Node1 to SimNow-Node4 (each with Core1, Core2 and Memory), BSD, HDD; numbered markers 1 to 4.]
Results
[Figure: time in seconds (axis 0 to 900) against problem size N (axis 0 to 35000). Series: Proposed Solution and SNMP, each for #nodes = 2, 3 and 4.]
Results
[Figure: throughput in GFLOPS (axis 0 to 60) against problem size N (axis 0 to 35000). Series: Proposed Solution and SNMP, each for #nodes = 2, 3 and 4.]
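The time and throughput plots are two views of the same runs: for a dense solve of size N, the operation count is 2/3 N^3 plus lower-order terms, so throughput follows from N and the elapsed time. The sketch below assumes 2 N^2 for the lower-order term, which is the conventional LINPACK count and may not match exactly what HPL reports; `hpl_gflops` is a hypothetical helper and the inputs are illustrative, not values read from the plots.

```python
def hpl_gflops(n, seconds):
    """Approximate GFLOPS for solving an n x n dense system in `seconds`."""
    # Assumed operation count: 2/3 n^3 + 2 n^2 floating-point operations.
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / seconds / 1e9


# Illustrative input: N = 1000 solved in one second.
print(hpl_gflops(1000, 1.0))
```

Because the cubic term dominates, halving the run time at a fixed N roughly doubles the reported GFLOPS.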
Conclusion & Future Work
According to the simulation results, we can conclude that:
• The proposed method is scalable: it shows better results as the
  problem size and the cluster size increase.
• The proposed resource discovery mechanism enhanced the overall
  performance of the cluster of multi-core nodes.
• This work is still at a preliminary stage; we will extend it to be
  more efficient and better adapted to multi-core environments.
Thank You