Parallel Computer Architecture

Introduction
Parallel Processing
EE 613
Part 1. Foundations
• Sequential vs. parallel computing
• Example – counting 3s
• Basic architectural features – machine
model
• Key ideas – latency, bandwidth, speedup,
efficiency
• Dependences
Hidden Parallelism
• Separate wires and caches for instructions
and data
• Pipelined instruction execution
• Multiple instructions issued
• Parallel arithmetic circuits
• Hidden from the programmer – exploited
automatically by the hardware
Other Parallelism
• Multi-core – opportunity with work
• Supercomputers – experience pool
• Clusters – well known
• Servers – programs developed
• Grids
Sequential vs. Parallel Solutions
- Single Instruction Stream
• Sequential solution does not scale, as shown
in the sum of numbers example
• Efficient parallel solution requires using a
different algorithm
• Sum of numbers example
– Iterative sum x0 + x1 + … + xn
– Pair-wise sum
Sequential vs. Parallel Algorithms
for Sum of Numbers
• Sequential sum of numbers
– Algorithm (add the next number to the sum)
– One computer possible
• Parallel sum of numbers
– New algorithm
– Assign pairs of numbers to different
computers
– n/2 computers possible
Sequential vs. Parallel Solutions
- Multiple Instruction Streams
• Thread (of execution)
– Has hardware needed to execute instructions
– Shares memory with other threads
• Some issues
– Race condition
– Use of locks – performance
– Cache coherence
• Example – counting 3s
Performance of 4th Solution
• Scales well from 1 to 2 processors
• Scales well from 2 to 4 processors
• Little gain from 4 to 8 processors
– Limited by L2 memory bandwidth
CSE524 Parallel Algorithms
Lawrence Snyder