Ivy
Eva Wu
Table of Contents
 Ivy – Design, Architecture
 Distributed Manager Algorithms
 Performance, Problems, and Potential Improvements
Ivy
 First DSM system – ran on Apollo workstations
 Implemented at Yale University in the mid-to-late 80’s
 Implements a multiprocessor cache coherence protocol in software
 Single writer and multiple readers
 Page management implementation
 Centralized manager
 Fixed distributed manager
 Dynamic distributed manager
Ivy Architecture
 Ownership of a page moves across nodes
 Sequential consistency
 Must invalidate before writing a page
 Simulates FIFO
Granularity
 Size of the unit of transfer
 Advantage of large blocks – fewer transfers, due to locality of reference
 Disadvantage – false sharing
 Ivy uses a 1 Kbyte page as the unit of access
Centralized Manager
 Only the manager knows all the copies
 Contains a table (Info) with one entry for each page
 Owner – the processor with the most recent write access
 Copy_set – list of all the processors that have copies of the page
 Lock – for synchronizing requests
 Each processor has a page table (PTable) that tracks its own access rights
 Access
 Lock
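To make the data structures concrete, here is a minimal Python sketch of the manager's Info table and a processor's PTable; the names InfoEntry, PTableEntry, and NUM_PAGES are illustrative assumptions, not Ivy's actual code.

```python
from dataclasses import dataclass, field
from threading import Lock

NUM_PAGES = 4  # illustrative size, not Ivy's

@dataclass
class InfoEntry:                # one entry per page, kept only at the manager
    owner: int = 0              # processor with the most recent write access
    copy_set: set = field(default_factory=set)  # processors holding copies
    lock: Lock = field(default_factory=Lock)    # serializes requests for the page

@dataclass
class PTableEntry:              # one entry per page, kept at every processor
    access: str = "nil"         # "nil", "read", or "write"
    lock: Lock = field(default_factory=Lock)

info = {p: InfoEntry() for p in range(NUM_PAGES)}      # the manager's Info table
ptable = {p: PTableEntry() for p in range(NUM_PAGES)}  # one processor's PTable
```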
Read
1. Page fault for p1 in C
2. C sends a read request to the manager
3. Manager sends read forward to A; manager adds C to copy_set
4. A sends p1 to C; p1 in C is marked read-only
5. C sends read confirmation to the manager
Write
1. Page fault for p1 on B
2. B sends write request to the manager
3. Manager sends invalidate to all processors in copy_set (C)
4. C sends invalidate confirm to the manager
5. Manager clears copy_set and sends write forward to A
6. A sends p1 to B and clears access
7. B sends write confirmation to the manager
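The same steps can be phrased as manager-side pseudocode. This is a sketch under the assumption that messages are delivered by a hypothetical send(dest, msg) function; the confirmation replies are compressed into comments.

```python
def read_fault(info, page, requester, send):
    entry = info[page]
    with entry.lock:
        entry.copy_set.add(requester)                  # record the new reader
        send(entry.owner, ("read_forward", page, requester))
        # The owner then sends the page to the requester, who marks it
        # read-only and sends a read confirmation back to the manager.

def write_fault(info, page, requester, send):
    entry = info[page]
    with entry.lock:
        for proc in entry.copy_set:                    # invalidate every copy
            send(proc, ("invalidate", page))
        entry.copy_set = set()                         # clear the copy_set
        send(entry.owner, ("write_forward", page, requester))
        entry.owner = requester                        # ownership moves to the writer
        # The old owner sends the page to the requester and clears its access;
        # the requester sends a write confirmation back to the manager.
```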
Eventcount
 Process synchronization mechanism – based on shared virtual memory
 Four primitive operations: init(), read(), await(value), and advance()
 Operations are atomic
 Any process can use an eventcount after initialization
 Eventcount operations become local once the page holding the count resides on the processor
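A single-machine sketch of the four primitives, using a threading.Condition in place of the shared-virtual-memory page that actually holds the count in Ivy (the method is spelled await_ because await is a reserved word in modern Python):

```python
import threading

class EventCount:
    """Single-node sketch of the four eventcount primitives."""
    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def init(self):                    # init(): reset the count to zero
        with self._cond:
            self._value = 0

    def read(self):                    # read(): return the current count
        with self._cond:
            return self._value

    def await_(self, value):           # await(value): block until count >= value
        with self._cond:
            self._cond.wait_for(lambda: self._value >= value)

    def advance(self):                 # advance(): atomically increment, wake waiters
        with self._cond:
            self._value += 1
            self._cond.notify_all()
```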
Improved Centralized Manager
 The owner, instead of the manager, keeps the copy_set of a page
 PTable: access, lock, and copy_set
 Manager still answers where the page owner is
 Copy_set is sent along with the data
 Owner is responsible for invalidation
 Decentralized synchronization – the centralized manager is no longer the hot spot
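A sketch of the owner-side write handling under this scheme, again with a hypothetical send(); the key change from the earlier sketch is that the copy_set lives in the owner's PTable and travels with the page data.

```python
from dataclasses import dataclass, field

@dataclass
class OwnerPTableEntry:          # PTable entry extended with the page's copy_set
    access: str = "nil"
    copy_set: set = field(default_factory=set)

def owner_handle_write_forward(ptable, page, requester, send):
    """The owner, not the manager, invalidates copies and ships the copy_set."""
    entry = ptable[page]
    for proc in entry.copy_set:                      # owner-driven invalidation
        send(proc, ("invalidate", page))
    send(requester, ("page_data", page, entry.copy_set))  # copy_set rides along
    entry.copy_set = set()
    entry.access = "nil"                             # old owner gives up the page
```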
Fixed Distributed Manager
 Every processor has a predetermined set of pages to manage
 One manager per processor
 Each manager is responsible for the pages given by a fixed mapping function H
 Page fault on p
 Faulting processor asks processor H(p) where the true page owner is
 Processor H(p) finds true page owner using centralized manager algorithm
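A common choice for H is the page number modulo the number of processors; this concrete H is an illustrative assumption, not necessarily the function Ivy used.

```python
def H(page: int, num_procs: int) -> int:
    """Fixed mapping from a page to its managing processor."""
    return page % num_procs

# With 4 processors, page 10 is always managed by processor 2:
assert H(10, 4) == 2
```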
Broadcast Distributed Manager
 No manager
 PTable: access, lock, copy_set, and owner
 Owner behaves similar to a manager and keeps the copy_set
 Requesting processor sends a broadcast message
 Disadvantage: all processors have to process each broadcast request
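A sketch of the read-fault path, modeling the broadcast as a loop over all processors (which is exactly the listed disadvantage); a write fault works the same way, with the owner also shipping its copy_set. Names here are illustrative.

```python
class Processor:
    def __init__(self, pid, num_pages, default_owner=0):
        self.pid = pid
        self.ptable = {p: {"access": "nil", "owner": default_owner,
                           "copy_set": set()} for p in range(num_pages)}

def broadcast_read(page, requester, processors):
    for proc in processors:                 # drawback: everyone sees the message
        entry = proc.ptable[page]
        if entry["owner"] == proc.pid:      # only the true owner replies
            entry["copy_set"].add(requester.pid)
            requester.ptable[page]["access"] = "read"  # copy arrives read-only

procs = [Processor(pid, num_pages=1) for pid in range(5)]
procs[0].ptable[0]["access"] = "write"      # P0 owns page 0
broadcast_read(page=0, requester=procs[1], processors=procs)
assert 1 in procs[0].ptable[0]["copy_set"]
```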
Broadcast Distributed Manager Read
[Diagram: the faulting processor P1 broadcasts a request for page 0 to all processors; the owner adds P1 to the copy set and sends P1 a copy of page 0.]
Broadcast Distributed Manager Write
[Diagram: the faulting processor broadcasts a write request for page 0 to all processors; the owner replies with page 0 together with its copy set, and the requester becomes the new owner.]
Dynamic Distributed Manager
 Manager = owner
 A page does not have a fixed owner or manager
 Each processor keeps track of the probable owner (probOwner)
probOwner
 Value is either the true owner or a “probable” owner of the page
 Page fault – the faulting processor sends a request to the processor in its probOwner field
 If correct, the request proceeds as in the centralized manager algorithm
 If incorrect, the receiver forwards the request to its own “probable” owner
 Initially, all probOwners are set to a default processor
 probOwner is updated when a processor
 Receives an invalidation request
 Relinquishes ownership
 Forwards a page fault request
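A sketch of probOwner forwarding for one page, with message hops modeled as a loop. The update rule shown here (pointing every traversed node at the owner just found) is a simplification: in the full algorithm each of the update cases above follows its own rule, e.g. pointing at the requester on a write fault.

```python
class PageEntry:
    def __init__(self, prob_owner, is_owner=False):
        self.prob_owner = prob_owner   # who this processor *thinks* owns the page
        self.is_owner = is_owner       # True only at the real owner

def locate_owner(entries, start):
    """Follow probOwner links from `start` until the true owner is reached."""
    hops, cur = [], start
    while not entries[cur].is_owner:
        hops.append(cur)
        cur = entries[cur].prob_owner  # forward the request one hop
    for h in hops:                     # simplified update: point every hop at
        entries[h].prob_owner = cur    # the owner just found, shortening chains
    return cur

# Chain of probable owners: P0 -> P1 -> P2, where P2 is the true owner.
entries = {0: PageEntry(1), 1: PageEntry(2), 2: PageEntry(2, is_owner=True)}
assert locate_owner(entries, 0) == 2
assert entries[0].prob_owner == 2      # next fault from P0 goes straight to P2
```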
Dynamic Distributed Manager Read
[Diagram: a read request for page 0 is forwarded along probOwner links from processor to processor; a long chain of links may be traversed before the true owner is found.]
Dynamic Distributed Broadcasts
 Improves the dynamic distributed manager algorithm by adding a periodic broadcast message
 The true owner is announced after every M page faults
 M steadily increases as the number of processors grows
 The program converges when M is very large
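A toy sketch of the periodic broadcast: a fault counter triggers an owner announcement every M faults, resetting all probOwner fields. M = 3 here is purely illustrative.

```python
class FaultCounter:
    def __init__(self, m):
        self.m = m                        # broadcast period M (illustrative)
        self.count = 0

    def record_fault(self, true_owner, prob_owner):
        self.count += 1
        if self.count % self.m == 0:      # after every M faults: broadcast
            for pid in prob_owner:
                prob_owner[pid] = true_owner  # everyone learns the true owner

counter = FaultCounter(m=3)
prob_owner = {0: 1, 1: 2, 2: 2}           # possibly stale probOwner fields
for _ in range(3):
    counter.record_fault(true_owner=2, prob_owner=prob_owner)
assert all(owner == 2 for owner in prob_owner.values())
```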
Dynamic Distributed Broadcasts Read
[Diagram: a read request follows probOwner links to the current owner of page 0, which then broadcasts its identity to all processors, updating their probOwner fields.]
Dynamic Distributed Copy Set
 Copy-set data is stored as a tree
 Root: the owner
 Bi-directional edges
 Directed from the root: copy_set
 Directed from the leaves: probOwner
 Read fault: follow probOwner links to the owner
 Write fault: invalidate all copies, starting at the owner and propagating through the copy_sets
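A sketch of a write-fault invalidation walking the copy-set tree: the owner is the root, and each holder forwards the invalidation only to the readers it personally served, so no single node stores the full copy list. Processor IDs and the page layout are illustrative.

```python
def invalidate_tree(node, pages):
    """Write-fault invalidation: fan out from the owner through copy_sets."""
    for child in pages[node]["copy_set"]:   # each node invalidates only its readers
        invalidate_tree(child, pages)
    pages[node]["access"] = "nil"           # drop this node's copy
    pages[node]["copy_set"] = set()

# Owner P0 gave copies to P1 and P2; P1 in turn gave a copy to P3.
pages = {
    0: {"access": "write", "copy_set": {1, 2}},
    1: {"access": "read",  "copy_set": {3}},
    2: {"access": "read",  "copy_set": set()},
    3: {"access": "read",  "copy_set": set()},
}
invalidate_tree(0, pages)
assert all(entry["access"] == "nil" for entry in pages.values())
```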
Double Fault
 Read first, then write – the same page faults twice
 Solution – sequence numbers
 A process can send its page sequence number to the owner; the owner then decides whether a transfer is needed
 Only avoids the redundant page transfer
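A sketch of the sequence-number check, assuming each page copy carries the sequence number it was shipped with; on the second fault the owner compares numbers and skips the redundant data transfer when they match.

```python
def needs_transfer(owner_seq, requester_seq):
    """Owner-side check: ship page data only if the requester's copy is stale."""
    return requester_seq != owner_seq

# The read fault shipped the page at sequence number 7 and nothing was
# written since, so the write fault needs only ownership, not the data:
assert needs_transfer(owner_seq=7, requester_seq=7) is False
```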
Performance
 Works well when there is little sharing
 Cannot handle false sharing
 Sequential consistency requires large amounts of communication
 Ping-pong effect
Potential Improvements
 Allow multiple writers by letting some processes keep private copies
 Do not share the entire page to reduce false sharing
Questions?