Internet-Based TSP Computation
with Javelin++
Michael Neary & Peter Cappello
Computer Science, UCSB
Introduction
Goals
• Service parallel applications that are:
– Large: too big for a cluster
– Coarse-grain: to hide communication latency
• Simplicity of use
– Design focus: decomposition [composition] of computation.
• Scalable high performance
– despite large communication latency
• Fault-tolerance
– 1000s of hosts, each dynamically [dis]associates.
Introduction
Some Related Work
Introduction
Some Applications
• Search for extra-terrestrial life
• Computer-generated animation
• Computer modeling of drugs for:
– Influenza
– Cancer
– Reducing chemotherapy’s side-effects
• Financial modeling
• Storing nuclear waste
Outline
• Architecture
• Model of Computation
• API
• Scalable Computation
• Experimental Results
• Conclusions & Future Work
Architecture
Basic Components
Clients
Brokers
Hosts
Architecture
Broker Discovery
[Figure: brokers (B), the Broker Naming System, and a host (H)]
Architecture
Broker Discovery
[Figure: brokers (B), the Broker Naming System, and a host (H); message shown: PING (BID?)]
Architecture
Network of Broker-Managed Host Trees
• Each broker manages a tree of hosts
• Brokers form a network
• Client contacts broker
• Client gets host trees
Scalable Computation
Deterministic Work-Stealing Scheduler
[Figure: each HOST holds a task container with the operations addTask( task ), getTask( ), and stealTask( )]
Scalable Computation
Deterministic Work-Stealing Scheduler
Task getWork( )
{
    if ( my deque has a task )
        return task;
    else if ( any child has a task )
        return child's task;
    else
        return parent.getWork( );
}
[Figure: CLIENT and HOSTS]
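The getWork( ) pseudocode above can be sketched in runnable Java. This is a minimal single-threaded illustration, not the Javelin++ implementation: the class name HostNode, the use of strings as tasks, and the convention that an owner takes its newest task while a thief takes the oldest are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: HostNode is an illustrative name, not a Javelin++ class.
final class HostNode {
    final String name;
    final HostNode parent;                            // null at the root of the tree
    final List<HostNode> children = new ArrayList<>();
    private final Deque<String> tasks = new ArrayDeque<>(); // tasks are plain strings here

    HostNode(String name, HostNode parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    void addTask(String task) { tasks.addLast(task); }

    // Assumed convention: the owner takes its newest task, a thief the oldest.
    String getTask()   { return tasks.pollLast(); }
    String stealTask() { return tasks.pollFirst(); }

    // Same order as the slide: own deque, then children in a fixed order, then the parent.
    String getWork() {
        String task = getTask();
        if (task != null) return task;
        for (HostNode child : children) {
            task = child.stealTask();
            if (task != null) return task;
        }
        return parent == null ? null : parent.getWork();
    }

    public static void main(String[] args) {
        HostNode root = new HostNode("root", null);
        HostNode idle = new HostNode("idle", root);
        root.addTask("t1");
        System.out.println(idle.getWork()); // the idle host obtains t1 through its parent
    }
}
```

The search order is fixed rather than random, which is what makes the scheme deterministic: an idle host always looks in the same places in the same sequence.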
Models of Computation
• Master-slave
– As far as we know, covers all proposed commercial applications
• Branch-&-bound optimization
– A generalization of master-slave.
Models of Computation
Branch & Bound
[Figure sequence: branch-and-bound search over a tree of node costs; bound values shown on successive slides]
• Step 1: UPPER = ∞, LOWER = 0
• Step 2: UPPER = ∞, LOWER = 2
• Step 3: UPPER = ∞, LOWER = 3
• Step 4: UPPER = 4, LOWER = 4
• Step 5: UPPER = 3, LOWER = 3
• Step 6: UPPER = 3, LOWER = 6
• Step 7: UPPER = 3, LOWER = 7
Models of Computation
Branch & Bound
• Tasks created dynamically
• Upper bound is shared
• To detect termination: scheduler detects tasks that have been:
– Completed
– Killed (“bounded”)
[Figure: search tree with node costs]
API
public class Host implements Runnable
{
    . . .
    public void run()
    {
        // keep taking nodes from the scheduler until it returns none
        while ( (node = jDM.getWork()) != null )
        {
            if ( isAtomic() )
                compute(); // search space; return result
            else
            {
                child = node.branch(); // put children in child array
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        jDM.addWork( child[i] );
                    //else child is killed implicitly
            }
        }
    }
}
API
private void compute()
{
    . . .
    boolean newBest = false;
    while ( (node = stack.pop()) != null )
    {
        if ( node.isComplete() )
        {
            if ( node.getCost() < UpperBound )
            {
                newBest = true;
                UpperBound = node.getCost();
                jDM.propagateValue( UpperBound );
                best = node; // remember the best complete node found so far
            }
        }
        else
        {
            child = node.branch();
            for (int i = 0; i < node.numChildren; i++)
                if ( child[i].setLowerBound() < UpperBound )
                    stack.push( child[i] );
                //else child is killed implicitly
        }
    }
    if ( newBest )
        jDM.returnResult( best );
}
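To make the stack-based search in compute() concrete, here is a self-contained sequential sketch applied to a small TSP instance. It is an illustration under stated assumptions, not the Javelin++ code: the names TspBranchAndBound and Node are hypothetical, the lower bound is simply the cost of the partial tour (valid only for non-negative distances), and there is no scheduler or bound propagation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; class and field names are illustrative, not the Javelin++ API.
final class TspBranchAndBound {
    // A partial tour: visited cities as a bitmask, the last city, cost so far, cities visited.
    static final class Node {
        final int visited, last, cost, count;
        Node(int visited, int last, int cost, int count) {
            this.visited = visited;
            this.last = last;
            this.cost = cost;
            this.count = count;
        }
    }

    // Returns the cost of a shortest tour that starts and ends at city 0.
    static int solve(int[][] d) {
        int n = d.length;
        int upperBound = Integer.MAX_VALUE;          // no complete tour known yet
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(new Node(1, 0, 0, 1));            // start at city 0
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (node.count == n) {                   // complete: close the tour
                int total = node.cost + d[node.last][0];
                if (total < upperBound) upperBound = total;
            } else {
                for (int c = 1; c < n; c++) {        // branch on each unvisited city
                    if ((node.visited & (1 << c)) != 0) continue;
                    int lower = node.cost + d[node.last][c]; // assumed bound: cost so far
                    if (lower < upperBound)
                        stack.push(new Node(node.visited | (1 << c), c, lower, node.count + 1));
                    // else the child is killed implicitly
                }
            }
        }
        return upperBound;
    }

    public static void main(String[] args) {
        int[][] d = {
            { 0, 10, 15, 20 },
            { 10, 0, 35, 25 },
            { 15, 35, 0, 30 },
            { 20, 25, 30, 0 }
        };
        System.out.println(solve(d)); // prints 80
    }
}
```

A tighter lower bound would kill more children earlier; the cost-so-far bound is used here only because it is easy to verify.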
Scalable Computation
Weak Shared Memory Model
• Slow propagation of bound affects performance, not correctness.
[Figure: the bound is propagated among hosts]
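One way to see why a late-arriving bound costs time but not correctness is to model each host's local copy of the upper bound as a value that only ever decreases. The sketch below is an assumption-laden illustration (LocalBound is a hypothetical name, and the transport that delivers bounds between hosts is omitted entirely).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: LocalBound is an illustrative name, not part of Javelin++.
final class LocalBound {
    private final AtomicInteger upper = new AtomicInteger(Integer.MAX_VALUE);

    // Apply a bound received from another host, keeping the smaller value.
    // Returns true if the local copy improved.
    boolean offer(int candidate) {
        int before = upper.getAndAccumulate(candidate, Math::min);
        return candidate < before;
    }

    int get() { return upper.get(); }

    // A node is killed only if its lower bound cannot beat the local copy.
    boolean shouldPrune(int lowerBound) { return lowerBound >= upper.get(); }

    public static void main(String[] args) {
        LocalBound bound = new LocalBound();
        bound.offer(12);   // an old update arrives first
        bound.offer(10);   // the better value arrives later
        System.out.println(bound.get());
    }
}
```

A stale local copy is never smaller than the true best value, so a host with stale data prunes less and does extra work, but it never kills a node that could still lead to a better solution. Because taking the minimum is order-independent, updates may also arrive in any order.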
Scalable Computation
Fault Tolerance via Eager Scheduling
When:
• All tasks have been assigned
• Some results have not been reported
• A host wants a new task
Re-assign a task!
• Eager scheduling tolerates faults & balances the load.
– Computation completes, if at least 1 host communicates with client.
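The re-assignment rule above can be sketched as a small scheduler. This is a minimal illustration, not the Javelin++ scheduler: the name EagerScheduler, integer task ids, and the policy of re-assigning the longest-pending task first are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; EagerScheduler is an illustrative name.
final class EagerScheduler {
    private final Deque<Integer> unassigned = new ArrayDeque<>();
    private final Set<Integer> pending = new LinkedHashSet<>(); // assigned, result not yet reported

    EagerScheduler(int taskCount) {
        for (int id = 0; id < taskCount; id++) unassigned.add(id);
    }

    // Called when a host wants a new task. Returns null only when every result is in.
    synchronized Integer nextTask() {
        Integer id = unassigned.poll();
        if (id != null) {
            pending.add(id);
            return id;
        }
        if (pending.isEmpty()) return null;
        // All tasks are assigned but some results are missing: re-assign one.
        id = pending.iterator().next();
        pending.remove(id);
        pending.add(id);   // move to the back so re-assignments rotate
        return id;
    }

    // Records a result; returns false for a duplicate report of the same task.
    synchronized boolean reportResult(int id) { return pending.remove(id); }

    synchronized boolean isDone() { return unassigned.isEmpty() && pending.isEmpty(); }

    public static void main(String[] args) {
        EagerScheduler s = new EagerScheduler(2);
        s.nextTask();                       // task 0 goes to a host that then fails
        s.nextTask();                       // task 1
        System.out.println(s.nextTask());   // task 0 is handed out again
    }
}
```

The same mechanism covers slow hosts as well as failed ones: a fast host that runs out of work simply repeats a task whose result is still outstanding, and the first result to arrive wins.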
Scalable Computation
Fault Tolerance via Eager Scheduling
• Scheduler must know which:
– Tasks have completed
– Nodes have been killed
• Performance balance
– Centralized schedule info
– Decentralized computation
[Figure: search tree with node costs]
Experimental Results
[Figure: speedup vs. number of processors, both axes from 0 to 100; curves: ideal, graph22, graph24]
Experimental Results
Example of a “bad” graph
[Figure: search tree with node costs]
Conclusions
• Javelin 2 relieves the designer/programmer of managing a
set of [Inter-]networked processors that is:
– Dynamic
– Faulty
• A wide set of applications is covered by:
– Master-slave model
– Branch & bound model
• Weak shared memory performs well.
• Use multicast (?) for:
– Code distribution
– Propagating values
Future Work
• Improve support for long-lived computation:
– Do not require that the client run continuously.
• A dag model of computation
– with limited weak shared memory.
Future Work
Jini/JavaSpaces Technology
• “Continuously” disperse tasks among brokers via a physics model
[Figure: a TaskManager (aka Broker) surrounded by hosts (H)]
Future Work
Jini/JavaSpaces Technology
• TaskManager uses persistent JavaSpace
– Host management: trivial
– Eager scheduling: simple
• No single point of failure
– Fat tree topology
Future Work
Advanced Issues
• Privacy of data & algorithm
• Algorithms
– New computational complexity model
“Minimize” communication between machines
– N-body problem, …
• Accounting: Associate specific work with specific host
– Correctness
– Compensation (how to quantify?)
• Create international open source organization
– System infrastructure
– Application codes
Models of Computation
Branch & Bound
[Figure: search tree with node costs; UPPER = 3, LOWER = 0]
Architecture
Broker Name Service (BNS)
[Figure: BNS, BROKER, and HOST, with the numbered steps below]
1. Register with BNS
2. Get broker list
3. Ping brokers on list
4. Connect to selected broker
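The four steps above can be sketched with an in-memory stand-in for the BNS. Everything named here is hypothetical (BrokerDiscovery, NameService, the ping function), and the slides do not say how a broker is selected; picking the responding broker with the smallest round-trip time is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.ToLongFunction;

// Hypothetical sketch; none of these names come from Javelin++.
final class BrokerDiscovery {
    // In-memory stand-in for the Broker Name Service.
    static final class NameService {
        private final List<String> brokers = new ArrayList<>();
        void register(String broker) { brokers.add(broker); }           // step 1
        List<String> brokerList() { return new ArrayList<>(brokers); }  // step 2
    }

    // Steps 3 and 4: ping every broker on the list and select one to connect to.
    // ping returns a round-trip time in milliseconds, or a negative value for no reply.
    // Assumed policy: choose the responding broker with the smallest round-trip time.
    static Optional<String> select(List<String> brokers, ToLongFunction<String> ping) {
        String best = null;
        long bestRtt = Long.MAX_VALUE;
        for (String broker : brokers) {
            long rtt = ping.applyAsLong(broker);
            if (rtt >= 0 && rtt < bestRtt) {
                best = broker;
                bestRtt = rtt;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        NameService bns = new NameService();
        bns.register("broker-a");
        bns.register("broker-b");
        // Simulated pings: broker-b answers faster than broker-a.
        Optional<String> chosen =
            select(bns.brokerList(), b -> b.equals("broker-b") ? 5L : 20L);
        System.out.println(chosen.orElse("none"));
    }
}
```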