Supercomputing in Plain English: Shared Memory Multithreading

Topics



Parallel Computing
Shared Memory
OpenMP
Parallel Computing and Shared Memory
What Is Parallel Computing?

Parallel Computing is the use of multiple processing units or computers
working together to solve a common problem.

The processing units may work on different tasks, or on the same task but
on different pieces of the problem's data (as in the sketch below).
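To make the "different pieces of the data" idea concrete, here is a minimal
sketch in C with OpenMP (an illustrative example, not from the original
slides): the loop iterations are divided among the threads, so each
processing unit applies the same operation to its own piece of the arrays.
It would be compiled with an OpenMP-enabled compiler, for example
gcc -fopenmp.

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void) {
    const int n = 1000000;
    float *a = malloc(n * sizeof(float));
    float *b = malloc(n * sizeof(float));
    if (a == NULL || b == NULL) return 1;

    for (int i = 0; i < n; i++) b[i] = (float)i;

    #pragma omp parallel for            /* iterations split across threads */
    for (int i = 0; i < n; i++)
        a[i] = 2.0f * b[i];             /* same task, different piece of data */

    printf("ran with up to %d threads, a[n-1] = %f\n",
           omp_get_max_threads(), a[n-1]);
    free(a);
    free(b);
    return 0;
}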
Parallel Computing


Each processor works on its own section of the problem.
Processors can exchange and share information (for example by combining
partial results, as in the sketch below).
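One way to picture "exchanging and sharing information" is the hedged
sketch below (again C with OpenMP, not taken from the slides): each thread
sums its own section of the data, and the partial results are then shared
and combined through a reduction.

#include <stdio.h>

int main(void) {
    const int n = 1000000;
    double sum = 0.0;

    /* Each thread works on its own section of the iterations and keeps a
       private partial sum; the reduction combines those partial sums into
       the single shared result at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += 1.0 / (i + 1.0);

    printf("sum = %f\n", sum);
    return 0;
}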
Why do Parallel Computing?


Limits of single-CPU computing:
 Performance: serial execution is too slow.
 Available memory: the problem needs more memory than a single processor
 can access.
Parallel computing allows one to:
 Solve problems that don't fit on a single CPU.
 Solve problems that can't be solved in a reasonable time.
Why do Parallel Computing?

We can solve...
 Larger problems
 The same problem faster
Shared Memory
GENERAL CHARACTERISTICS:
Shared memory parallel computers vary widely, but generally have in
common the ability for all processors to access all memory as a global
address space.

Multiple processors can operate independently
but share the same memory resources.
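Here is a small sketch of what a global address space looks like to the
programmer (an assumed example, not from the slides): in OpenMP the array
below is shared, so every thread reads and writes the very same memory,
while each thread's ID is private to it.

#include <stdio.h>
#include <omp.h>

int main(void) {
    int a[8] = {0};                       /* one array in shared memory */

    #pragma omp parallel num_threads(8) shared(a)
    {
        int tid = omp_get_thread_num();   /* private: each thread has its own */
        a[tid] = tid;                     /* shared: all threads write into
                                             the same global address space */
    }

    for (int i = 0; i < 8; i++)
        printf("a[%d] = %d\n", i, a[i]);
    return 0;
}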
General Characteristics Cont.

Changes in a memory location effected by one
processor are visible to all other processors.

Historically, shared memory machines have been classified as Uniform
Memory Access (UMA) and Non-Uniform Memory Access (NUMA), based upon
memory access times.
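To illustrate the first point above, the sketch below (assumed, not part of
the slides) has one thread update a shared memory location; after the
synchronization at the end of the 'single' region, every other thread sees
the change.

#include <stdio.h>
#include <omp.h>

int main(void) {
    int flag = 0;                          /* a location in shared memory */

    #pragma omp parallel shared(flag)
    {
        #pragma omp single                 /* exactly one thread runs this */
        flag = 42;
        /* implicit barrier at the end of 'single': the update is now
           visible to all the other threads */
        printf("thread %d sees flag = %d\n", omp_get_thread_num(), flag);
    }
    return 0;
}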
Uniform Memory Access (UMA)

Equal access and access times to memory for all processors.

If one processor updates a location in shared memory, all the other
processors know about the update.
Non-Uniform Memory Access (NUMA)

Often made by physically linking two or more multiprocessors.
One processor can directly access the memory of another processor.
Not all processors have equal access time to all memories.
Memory access across the link is slower.
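One common programming consequence, sketched below under the assumption of
a Linux-style "first touch" page placement policy (this example is not from
the slides): initializing an array in parallel, with the same loop schedule
as the later compute loop, tends to place each thread's pages in its local,
faster memory rather than across the slower link.

#include <stdlib.h>

int main(void) {
    const long n = 10000000;
    double *a = malloc(n * sizeof(double));
    if (a == NULL) return 1;

    /* First touch: each page is typically placed on the NUMA node of the
       thread that first writes it, so we initialize in parallel... */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++)
        a[i] = 0.0;

    /* ...and reuse the same static schedule here, so each thread mostly
       accesses memory that is local to it. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++)
        a[i] += 1.0;

    free(a);
    return 0;
}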
The Jigsaw Puzzle Analogy
Serial Computing
Suppose you want to do a jigsaw puzzle
that has, say, a thousand pieces.
We can imagine that it’ll take you a
certain amount of time. Let’s say
that you can put the puzzle together in
an hour.
Shared Memory Parallelism
If Scott sits across the table from you,
then he can work on his half of the
puzzle and you can work on yours.
Shared Memory Parallelism
Once in a while, you’ll both reach into
the pile of pieces at the same time
(you’ll contend for the same resource),
which will cause a little bit of
slowdown.
And from time to time you’ll have to
work together (communicate) at the
interface between his half and yours.
The speedup will be nearly 2-to-1: y'all might take 30 minutes instead
of an hour.
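The contention in the analogy has a direct shared-memory counterpart; here
is a rough sketch (an assumed example, not from the slides) in which the
"pile of pieces" is a shared counter and only one thread at a time may
reach into it, so the threads occasionally wait on each other.

#include <stdio.h>

int main(void) {
    int pile = 1000;                  /* shared pile of puzzle pieces */
    int taken = 0;

    #pragma omp parallel reduction(+:taken)
    {
        int got = 1;
        while (got) {
            got = 0;
            #pragma omp critical      /* only one thread in the pile at once */
            {
                if (pile > 0) { pile--; got = 1; }
            }
            if (got) taken++;         /* "place" the piece we grabbed */
        }
    }

    printf("pieces taken: %d, pile left: %d\n", taken, pile);
    return 0;
}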
The More the Merrier?
Now let’s put Jane and Sam on
the other two sides of the
table.
The More the Merrier?
Each of you can work on a part of the
puzzle, but there’ll be a lot more
contention for the shared resource (the
pile of puzzle pieces) and a lot more
communication at the interfaces. So
y’all will get noticeably less than a
4-to-1 speedup, but you’ll still have
an improvement, maybe something
like 3-to-1: the four of you can get it
done in 20 minutes instead of an hour.
More People?
What happens if we now put Sally and Sue and Jane and Bill on the corners
of the table?
Diminishing Returns
There’s going to be a whole lot of
contention for the shared resource,
and a lot of communication at the
many interfaces. So the speedup you
get will be much less than you’d like;
you’ll be lucky to get 5-to-1.
So we can see that adding more and more workers onto a shared resource
eventually yields diminishing returns.
What activity from yesterday demonstrated these diminishing returns?
Advantages of Shared Memory

Global address space provides a user-friendly
programming perspective to memory

Data sharing between tasks is both fast and uniform
due to the proximity of memory to CPUs
Disadvantages of Shared Memory

The primary disadvantage is the lack of scalability between memory and
CPUs. Adding more CPUs can geometrically increase traffic on the shared
memory-CPU path.

The programmer is responsible for the synchronization constructs that
ensure "correct" access of global memory (see the sketch below).
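A hedged sketch of that responsibility (not from the slides): the update of
the shared counter below is a race condition unless the programmer adds a
synchronization construct such as OpenMP's 'atomic' (or 'critical').

#include <stdio.h>

int main(void) {
    long count = 0;

    #pragma omp parallel for
    for (long i = 0; i < 1000000; i++) {
        #pragma omp atomic        /* without this, threads race on 'count'
                                     and the final value is usually wrong */
        count++;
    }

    printf("count = %ld\n", count);   /* 1000000 with the atomic in place */
    return 0;
}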