Software Group Compiler Technology
Task, thread and processor
— OpenMP 3.0 and beyond
Guansong Zhang,
IBM Toronto Lab
© 2006 IBM Corporation
Compiler technology
Overview
The purpose of the talk
– Introduce the latest improvements to the OpenMP standard
• Task
• Still under discussion; don't take the syntax as final
– Considerations for future OpenMP development
• Thread affinity
The changing world
Hardware improvement
– Development of multicore systems
• Soon we will have more processors than we know how to program
• IBM to Build World's First Cell Broadband Engine Based Supercomputer
• Intel: Quad core to turbocharge chips
• Terra Soft to Build Cell-Based Super Out of PS3 Beta Iron
OpenMP standard
– The C/C++ and Fortran standards are merged into 2.5
Other language committees
– C++ memory model: atomic access
More changes in the OpenMP world
New players
– Microsoft just joined the OpenMP ARB
• …, and Visual C++® 2005 supports the full standard.
OpenMP is also supported by the Xbox 360™ platform
– GCC
• The GOMP project is developing an implementation of
OpenMP for the C, C++, and Fortran 95 compilers in the
GNU Compiler Collection
Workshare and task pool
What is a workshare
Workshare and task pool (cont.)
What is a task
Not a workshare
– But still “sharing/cooperating”
between threads
Compared with a workshare
– Units of work can be generated dynamically
– A unit can wait for another generated unit
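For contrast with the task model, the sketch below shows a conventional workshare: the set of work units (loop iterations) is fixed when the construct is entered, and the runtime only divides them among the threads. The function name `dot` is an assumption made for illustration and does not come from the talk.

```c
#include <assert.h>

/* Workshare sketch: the iteration space 0..n-1 is known on entry, so the
 * runtime can split it among threads; no new units are created later. */
double dot(const double *a, const double *b, int n) {
    double sum = 0.0;
    assert(n >= 0);
    /* Each thread accumulates a private partial sum; the reduction clause
     * combines the partial sums when the loop ends. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result.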
Task examples
Pointer chasing

    #pragma omp parallel
    {
      #pragma omp single
      {
        while (p) {
          #pragma omp task
          process(p);
          p = p->next;
        }
      }
    }

Recursive algorithm

    int fib(int n) {
      int x, y;
      if (n < 2)
        return n;
      #pragma omp taskgroup
      {
        #pragma omp task common(x)
        x = fib(n-1);
        #pragma omp task common(y)
        y = fib(n-2);
      }
      return x + y;
    }
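The two examples above use proposal syntax that the talk itself flags as still under discussion. As a hedged sketch, the same two patterns can be written with the clauses that, to my understanding, the published OpenMP 3.0 specification adopted: `shared` in place of `common`, and `taskwait` in place of the `taskgroup` block. The names `node_t` and `process_list`, and the body of `process`, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Hypothetical per-node work: doubles the stored value in place. */
static void process(node_t *p) {
    p->value *= 2;
}

/* Pointer chasing: one thread walks the list and creates a task per node;
 * the other threads in the team pick the tasks up. */
void process_list(node_t *head) {
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (node_t *p = head; p != NULL; p = p->next) {
                /* firstprivate captures the current node for this task. */
                #pragma omp task firstprivate(p)
                process(p);
            }
        }
    } /* all tasks have finished by the end of the parallel region */
}

/* Recursive algorithm: each call spawns two child tasks and waits for both. */
int fib(int n) {
    int x, y;
    assert(n >= 0);
    if (n < 2)
        return n;
    #pragma omp task shared(x)
    x = fib(n - 1);
    #pragma omp task shared(y)
    y = fib(n - 2);
    /* Wait for both child tasks before reading x and y. */
    #pragma omp taskwait
    return x + y;
}
```

Called outside a parallel region, `fib` still returns the right value because the tasks are simply executed by the single encountering thread. A practical version would stop creating tasks below some cutoff value of `n`, since the tasks become too small to be worth scheduling.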
Task schedule
More flexible scheduling
– Can a task be multi-threaded?
• When a task is encountered, the thread always goes for the new task
Advantage
– The idea is to provide one more level of abstraction
• Task-centric view
• Try to avoid thread starvation
• Potential cache reuse
Disadvantage
– Threadprivate
• No threadprivate data
– Thread id
• HPC users may need the thread id to localize data access
– Locks
• Lock ownership becomes confusing
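The thread id concern can be illustrated with a pattern common in HPC codes, where each thread indexes its own slot of a shared array by its id. This is a minimal sketch under stated assumptions: the function name `sum_with_thread_slots`, the helper `thread_id`, and the fixed team size of 4 are all hypothetical. The pattern is safe inside a workshare, but once work is packaged as tasks the code no longer controls which thread runs a given task, so an id-based slot stops being a reliable notion of "my data".

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

enum { SLOTS = 4 }; /* must match the num_threads(4) request below */

/* Returns the OpenMP thread id, or 0 when compiled without OpenMP. */
static int thread_id(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

long sum_with_thread_slots(const int *data, size_t n) {
    long partial[SLOTS] = {0};
    long total = 0;
    assert(data != NULL || n == 0);

    #pragma omp parallel num_threads(4)
    {
        /* Inside a workshare the id is stable, so each thread owns one slot. */
        int id = thread_id();
        #pragma omp for
        for (long i = 0; i < (long)n; i++) {
            partial[id] += data[i];
        }
    }

    /* Combine the per-thread slots serially. */
    for (int t = 0; t < SLOTS; t++) {
        total += partial[t];
    }
    return total;
}
```

Adjacent slots share a cache line here, so a tuned version would pad them; the sketch leaves that out to keep the id-indexing pattern visible.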
Emerging architectures
Performance numbers
[Figure: performance charts, panels "Stride 1" and "Stride 2"]
Thread affinity
Nested parallelism
• Organize threads into multiple levels (already in the previous OpenMP standard)
Thread grouping
• Balance the number of threads available against the parallelism in the code
Thread mapping
• Associate each OpenMP thread with physical/logical processors
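Nested parallelism, the first of the three items above, can be sketched with existing OpenMP calls. The function name `run_two_level_team` is an assumption made for illustration. Whether the inner regions actually receive extra threads depends on the implementation and its settings, and this sketch does nothing about thread grouping or mapping to processors.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Requests an outer team of `outer` threads, each of which requests an inner
 * team of `inner` threads. Returns how many times the inner region body ran,
 * which is at most outer * inner. */
int run_two_level_team(int outer, int inner) {
    int inner_executions = 0;
    assert(outer > 0 && inner > 0);

#ifdef _OPENMP
    /* Allow up to two active levels of parallelism (an OpenMP 3.0 call). */
    omp_set_max_active_levels(2);
#endif

    #pragma omp parallel num_threads(outer)
    {
        #pragma omp parallel num_threads(inner)
        {
            /* The counter is shared by both levels, so update it atomically. */
            #pragma omp atomic
            inner_executions++;
        }
    }
    return inner_executions;
}
```

The runtime may grant fewer threads than requested at either level, so the returned count is an upper-bounded observation rather than a fixed value.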
How to represent a thread group
Three candidate representations: environment variable, explicit index, descriptor handle
User interface
– Environment variable: user code is untouched
– Explicit index: simple data type; possibly multiple changes in the source
– Descriptor handle: new internal type; allows centralized thread group programming
Modularity (procedure calls, library functions)
– Environment variable: no support
– Explicit index: pass level, CPU array and array size
– Descriptor handle: pass a group-type variable, which may be used as an execution context (MPI)
Nested parallelism (thread-to-thread affinity)
– Environment variable: fixed in advance; no dynamic adjustment according to user input
– Explicit index: specify the number of threads at different levels
– Descriptor handle: specify thread composition
Mapping threads
– Environment variable: implementation defined
– Explicit index: supported, through virtual CPU numbers
– Descriptor handle: supported, through omp_get_procs()
Heterogeneous system
– Environment variable: no support (?)
– Explicit index: supported; different kinds of CPU with the same numbering scheme?
– Descriptor handle: different groups
What the future holds
Performance is still our goal
– “OpenMP is about performance.”
• Quoted from NASA scientists
OpenMP needs to broaden itself for a wider market
– C/C++ will become more interesting
– People would like to see non-numeric programs in OpenMP
Partition the OpenMP interface into different layers
– TASK, WORKSHARE vs. THREAD vs. PROC
– MPI has more than 300 calls; most people only use 6-8
– Keep the layered approach while we extend OpenMP?
Summary
Start a parallel region
Split into two nested parallel regions
– This is the chance to bind threads to the right processors
Start a task region
– For independent work
• E.g. game objects
Start a workshare
– For computation-intensive calculation
• E.g. graphics rendering
Q&A