Chapter 1 - Department of Computer Science and Engineering

A Low-Cost Parallel Queuing System for Computationally Intensive Problems
Sean Martin, Bei Yuan, Judy Fredrickson, Fred Harris, Jr.*
University of Nevada, Reno
Background - Crossing Number Problem

• My involvement started in graduate school
  • I was in Computer Science, my fiancée was in Mathematics at Clemson University
  • She was under Rich Ringeisen
  • Her MS work led to a 1988 Congressus paper, “Crossing Numbers of Permutation Graphs”
  • Helping her with the code got me “hooked” on the problem and I ended up taking Graph Theory from Ringeisen a couple of years later.
Background - Crossing Number Problem

• My work
  • A GA for the Rectilinear MCN Problem
  • 1993 Cumberland Conference
  • 1996 Ars Combinatoria Paper
  • Found drawings of K12 and K13 better than the formulas by Richard Guy
  • Richard Guy said if the rectilinear formula was not a tight bound, the normal one would not be either.
Background - Crossing Number Problem
• Could you develop an algorithm for calculating the MCN for non-rectilinear graphs?
• My wife and I worked on and finally developed a computational algorithm for solving the non-rectilinear Minimum Crossing Number Problem.
• This was presented at the 1996 Kalamazoo Conference
Background - Crossing Number Problem
• This algorithm was then implemented by one of my students
  • Umid Tadjiev
  • Developed a static parallel partitioning of it
  • Presented our work at the 1997 SIAM Conference on Parallel Processing for Scientific Computing.
Motivation
• We still have not found out whether the formula by Richard Guy is exact.
• The difficulty is that this problem, and others like it, are computationally expensive.
• My goal has been to build a tool that would allow us to expand our knowledge of the MCN problem (and others as well).
Parallel Cluster Computation
• Computer clusters are affordable
• Parallel processing is now feasible for computationally intensive problems
  • Exhaustive searches
  • Graph algorithms
• Can we build a tool that will harness this power and allow researchers to use it with little (or no) knowledge of the parallel programming details?
Development/Testing Cluster
• Our idea for a computational engine
  • A group of networked workstations
  • Use many machines as one “Supercomputer”
  • Work is distributed across all machines
  • Low cost makes it an affordable resource
• College of Engineering Computing Center Lab
  • 44 Pentium 4 machines running Windows XP and
  • 44 Pentium 4 workstations running Linux (RH 9.0)
Run-Time Cluster
• Cortex, a much larger and faster cluster
  • Processors (128 total)
    • 30 dual processor Pentium III
    • 34 dual processor Pentium IV Xeon
  • Interconnect
    • Ethernet for NFS
    • Myrinet 2 for communication
      – 2 Gigabit bi-directional low-latency network
  • Misc.
    • 2 GB RAM per CPU
    • More than ½ Terabyte of storage
The Problem to Avoid: Load (un)Balancing
• Unbalanced search tree
• Processes 2 & 3 sit idle while process 1 works toward a solution
• A work queue system helps balance the workload
The Solution: A Generic Work Queue System
• Almost all of the problems we have been looking at can be broken up into jobs (or subjobs).
• We decided to build a queue of jobs (work) that can be distributed across a cluster to harness the parallel computation power available.
• One of the goals:
  • Little knowledge of parallel programming or message passing needed by user.
Queuing System Design Goals
• Master/Slave architecture
  • Master creates initial jobs for slaves
  • Master then monitors messages and keeps the work load balanced
• Central and distributed work queues
  • Queue sizes can be altered (while running) for optimization
• Master signals termination when master queue is empty and all slaves are idle
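The termination rule and the master's response to a work request can be sketched as follows. This is a minimal single-process illustration, not the system's actual code: the names MasterState, should_terminate, and handle_work_request are hypothetical, and real message passing between machines is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using Job = std::vector<int>;  // a job as an array of integers

// Hypothetical master-side bookkeeping (illustrative names only).
struct MasterState {
    std::deque<Job> central_queue;  // jobs currently held by the master
    std::vector<bool> slave_idle;   // slave_idle[i] is true when slave i has no work

    // Termination rule: master queue empty and every slave idle.
    bool should_terminate() const {
        return central_queue.empty() &&
               std::all_of(slave_idle.begin(), slave_idle.end(),
                           [](bool idle) { return idle; });
    }

    // Answer a work request with up to `batch` jobs from the central queue.
    std::vector<Job> handle_work_request(std::size_t slave, std::size_t batch) {
        assert(slave < slave_idle.size());
        std::vector<Job> out;
        while (!central_queue.empty() && out.size() < batch) {
            out.push_back(std::move(central_queue.front()));
            central_queue.pop_front();
        }
        slave_idle[slave] = out.empty();  // the slave stays idle if nothing was available
        return out;
    }
};
```

The batch size here plays the role of a tunable queue parameter; how the real system sizes its replies is not stated in the slides.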
Queuing System
[Diagram: the Master holds the Central Queue; Slave 1, Slave 2, ..., Slave n each hold a Distributed Queue; Share Work and Work Request messages pass between the master and the slaves.]
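One way a slave could decide when to send the two message types is to compare its local queue length with a minimum and a maximum size. The sketch below assumes that policy; the slides do not spell out the exact rule, and SlaveQueue, Action, next_action, and take_surplus are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using Job = std::vector<int>;

enum class Action { None, RequestWork, ShareWork };

// Hypothetical slave-side queue with adjustable thresholds.
struct SlaveQueue {
    std::deque<Job> local;   // the slave's distributed (local) queue
    std::size_t min_size;    // below this, ask the master for work
    std::size_t max_size;    // above this, share work with the master

    Action next_action() const {
        if (local.size() < min_size) return Action::RequestWork;
        if (local.size() > max_size) return Action::ShareWork;
        return Action::None;
    }

    // Remove the jobs beyond max_size so they can be sent to the master.
    std::vector<Job> take_surplus() {
        std::vector<Job> surplus;
        while (local.size() > max_size) {
            surplus.push_back(std::move(local.back()));
            local.pop_back();
        }
        return surplus;
    }
};
```

Because min_size and max_size are plain fields, they can be changed while the queue is in use, which matches the goal of altering queue sizes at run time.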
User Requirements
• Define a job
• Define the master and slave functions
• Then optionally
  • Determine queue max and min sizes
    • Can be ascertained empirically during development
  • Adjust granularity as needed based upon performance (message passing behavior)
Define a Job
• A job is just a C/C++ data structure
  • We used an array of integers
  • If the job is not of built-in data types then the user must define types and overload operators
• Our system is designed to work with almost any job
Example Job
• MCN
  • Region lists
  • Adjacency matrix
  • Several integers to keep track of best and current solutions
Integer Array: Job Size, Current MCN, # Vertices, # Regions, Region List, Adjacency Matrix
• Job is enqueued as an array of integer values
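The integer-array layout above can be illustrated with a pack/unpack pair. This is a sketch under assumptions: the slides give only the field order, so the encoding of each region (its length followed by its vertex ids) and the row-major adjacency matrix are guesses, and McnJob, pack, and unpack are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory form of an MCN job.
struct McnJob {
    int current_mcn;
    int num_vertices;
    std::vector<std::vector<int>> regions;  // assumed: each region is a list of vertex ids
    std::vector<int> adjacency;             // assumed: row-major num_vertices x num_vertices
};

// Layout: [job size, current MCN, #vertices, #regions, region list..., adjacency matrix...]
std::vector<int> pack(const McnJob& job) {
    std::vector<int> out;
    out.push_back(0);  // placeholder for the job size, filled in below
    out.push_back(job.current_mcn);
    out.push_back(job.num_vertices);
    out.push_back(static_cast<int>(job.regions.size()));
    for (const std::vector<int>& region : job.regions) {
        out.push_back(static_cast<int>(region.size()));  // assumed length prefix
        out.insert(out.end(), region.begin(), region.end());
    }
    out.insert(out.end(), job.adjacency.begin(), job.adjacency.end());
    out[0] = static_cast<int>(out.size());
    return out;
}

McnJob unpack(const std::vector<int>& in) {
    assert(in.size() >= 4 && in[0] == static_cast<int>(in.size()));
    McnJob job;
    job.current_mcn = in[1];
    job.num_vertices = in[2];
    std::size_t pos = 4;
    for (int r = 0; r < in[3]; ++r) {
        const std::size_t len = static_cast<std::size_t>(in[pos++]);
        job.regions.emplace_back(in.begin() + pos, in.begin() + pos + len);
        pos += len;
    }
    job.adjacency.assign(in.begin() + pos, in.end());
    return job;
}
```

Keeping the job as a flat array of one built-in type means it can be copied or sent as a single buffer, which is presumably why no operator overloading was needed in the MCN case.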
User Defined Functions
• Master Function: we called it master_create_jobs( )
  • Creates initial jobs (from user data)
  • Number of jobs created can be application dependent or based on number of processes
  • May return a meaningful value such as a lower bound
• We return the initial MCN (using Guy’s formula)
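The initial value mentioned above can be computed directly. A minimal sketch, assuming the commonly cited form of Guy's formula for the complete graph, Z(n) = (1/4) floor(n/2) floor((n-1)/2) floor((n-2)/2) floor((n-3)/2); the function name guy_formula is hypothetical and is not part of the system.

```cpp
#include <cassert>

// Guy's formula for the crossing number of the complete graph K_n:
//   Z(n) = (1/4) * floor(n/2) * floor((n-1)/2) * floor((n-2)/2) * floor((n-3)/2)
long long guy_formula(int n) {
    assert(n >= 0);
    if (n < 5) return 0;  // K_1 through K_4 can be drawn without crossings
    const long long a = n / 2;
    const long long b = (n - 1) / 2;
    const long long c = (n - 2) / 2;
    const long long d = (n - 3) / 2;
    return a * b * c * d / 4;  // the product is always divisible by 4
}
```

For n = 5, 6, 7, and 8 this gives 1, 3, 9, and 18, which agrees with the MCN values reported in the results.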
User Defined Functions
• Slave Function: we called it work( )
  • Unpacks job into local data structure to process
  • The code for this function determines the granularity of the work being done
  • This function adds jobs it creates onto the local queue
  • It may return a meaningful value such as a current best solution (updating the MCN)
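The shape of such a slave function can be sketched on a small TSP-style search. Everything beyond the name work( ) is an assumption made for illustration: the job layout (cost so far followed by the cities visited), the extra parameters, and the single-process driver solve are hypothetical, and costs are assumed non-negative so that pruning is valid.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <deque>
#include <vector>

using Job = std::vector<int>;  // assumed layout: [cost so far, city0, city1, ...]
using Matrix = std::vector<std::vector<int>>;

// Expands one partial tour by a single city (the granularity of this sketch),
// adds the children to the local queue, and returns the best tour cost known.
int work(const Job& job, const Matrix& cost, std::deque<Job>& local_queue, int best) {
    assert(job.size() >= 2);
    const std::size_t n = cost.size();
    const int so_far = job[0];
    const int last = job.back();
    if (so_far >= best) return best;   // prune: cannot beat the current best
    if (job.size() == n + 1) {         // every city visited: close the tour
        const int total = so_far + cost[last][job[1]];
        return total < best ? total : best;
    }
    for (int next = 0; next < static_cast<int>(n); ++next) {
        bool visited = false;
        for (std::size_t i = 1; i < job.size(); ++i)
            if (job[i] == next) visited = true;
        if (visited) continue;
        Job child = job;
        child[0] = so_far + cost[last][next];
        child.push_back(next);
        local_queue.push_back(child);
    }
    return best;
}

// Single-process stand-in for the queuing system: run work() until no jobs remain.
int solve(const Matrix& cost) {
    std::deque<Job> queue;
    queue.push_back({0, 0});  // cost 0, tour starts at city 0
    int best = INT_MAX;
    while (!queue.empty()) {
        const Job job = queue.back();
        queue.pop_back();
        best = work(job, cost, queue, best);
    }
    return best;
}
```

Expanding several cities per call instead of one would make each job coarser and reduce the number of jobs placed on the queue.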
Results
• The system is able to create and manage a large number of jobs and messages
  • Test runs generated more than 128 million jobs
• The system works for different problems
  • Solved Minimum Crossing Number Problem for K6, K7, and K8
  • Solved TSP for several graphs
MCN Results
Graph Size    MCN    # of Jobs Created/Processed
K5            1      3
K6            3      71
K7            9      25,844
K8            18     128,737,926
Future Work
• Find the MCN of larger vertex sets
  • Currently being used to solve MCN problem for growing N
• Develop a job to find MCN of bipartite and other graphs
• Add ability to save queues to disk
• Develop a GUI (for ease of use)
Future Work
• Is a Stack of jobs better than a Queue?
  • The number of jobs generated is different because one does a depth-first search and the other a breadth-first search.
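The effect of the container choice can be seen on a toy branch-and-bound search. The sketch below is not the MCN or TSP code: it picks one entry from each row of a small table to minimize the sum, and run_search, SearchStats, and the job layout are hypothetical. Costs are assumed non-negative so that the bound is valid.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <deque>
#include <vector>

struct SearchStats {
    int best;           // best complete sum found
    long jobs_created;  // total jobs ever placed in the container
};

// A job is {next_row, partial_sum}. With use_stack the newest job is taken
// first (depth-first); otherwise the oldest is taken first (breadth-first).
SearchStats run_search(const std::vector<std::vector<int>>& rows, bool use_stack) {
    std::deque<std::vector<int>> jobs;
    jobs.push_back({0, 0});
    SearchStats stats{INT_MAX, 1};
    while (!jobs.empty()) {
        std::vector<int> job;
        if (use_stack) { job = jobs.back(); jobs.pop_back(); }
        else           { job = jobs.front(); jobs.pop_front(); }
        const std::size_t row = static_cast<std::size_t>(job[0]);
        const int sum = job[1];
        if (sum >= stats.best) continue;                          // bound: cannot improve
        if (row == rows.size()) { stats.best = sum; continue; }   // complete solution
        for (int cost : rows[row]) {
            jobs.push_back({job[0] + 1, sum + cost});
            ++stats.jobs_created;
        }
    }
    return stats;
}
```

On the table {{3, 1, 4}, {1, 5, 9}, {2, 6, 5}} both orders find the minimum sum 4, but the queue creates 40 jobs and the stack 28, because the depth-first order reaches complete solutions early and can prune sooner. Whether the same holds for MCN jobs is the open question on the slide.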
A Time Saving Region Restriction for Calculating the MCN of Kn
Judy Fredrickson
Talk 114 - Wednesday 4:00pm
Minimum Crossing Number
• Classic graph theory problem
  • Given a number of vertices n, what is the minimum number of crossings (Kn) if every vertex has an edge to every other vertex?
• Proven for n ≤ 10
• Involves a very large search space
5 Vertex Graph
MCN (K5)
Traveling Salesman Problem
“…given a finite number of ‘cities’ along with the cost of travel between each pair of them, find the cheapest way of visiting all the cities and returning to your starting point.” -- Traveling Salesman Problem Home Page
• Problem size grows exponentially
• Difficult to solve problems of any significant size with brute force
Traveling Salesman Problem
     A  B  C  D  E
A    0  5  1  1  5
B    5  0  5  1  1
C    1  5  0  5  1
D    1  1  5  0  5
E    5  1  1  5  0
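For a matrix this small, brute force is still practical and gives a reference answer. A minimal sketch, assuming cities A through E are mapped to indices 0 through 4 and the tour starts at A; brute_force_tsp and kSlideMatrix are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <numeric>
#include <vector>

// Tries every ordering of cities 1..n-1 with city 0 fixed as the start.
int brute_force_tsp(const std::vector<std::vector<int>>& cost) {
    const int n = static_cast<int>(cost.size());
    assert(n >= 2);
    std::vector<int> order(n - 1);
    std::iota(order.begin(), order.end(), 1);  // 1, 2, ..., n-1 in ascending order
    int best = INT_MAX;
    do {
        int total = cost[0][order.front()];
        for (std::size_t i = 0; i + 1 < order.size(); ++i)
            total += cost[order[i]][order[i + 1]];
        total += cost[order.back()][0];        // return to the starting city
        best = std::min(best, total);
    } while (std::next_permutation(order.begin(), order.end()));
    return best;
}

// The 5-city cost matrix from the slide (A..E mapped to 0..4).
const std::vector<std::vector<int>> kSlideMatrix = {
    {0, 5, 1, 1, 5},
    {5, 0, 5, 1, 1},
    {1, 5, 0, 5, 1},
    {1, 1, 5, 0, 5},
    {5, 1, 1, 5, 0},
};
```

Here brute_force_tsp(kSlideMatrix) returns 5, the tour A, C, E, B, D, A using only cost-1 edges. The loop visits (n-1)! orderings, which is why larger instances need the pruning and parallelism described in the talk.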
Traveling Salesman Problem
Results TSP
• TSP
  • Created over a million jobs with a fairly small problem size
  • ~30,000 jobs sent to master by slaves
  • Relatively few requests for work
  • Granularity could be less fine, more work done per job