PetaBricks
A Language and Compiler for Algorithmic Choice
Jason Ansel Cy Chan Yee Lok Wong Marek Olszewski
Qin Zhao Alan Edelman Saman Amarasinghe
MIT - CSAIL
June 16, 2009
Jason Ansel (MIT)
PetaBricks
June 16, 2009
1 / 47
Introduction
Motivating Example
Outline
1. Introduction
   Motivating Example
   Language & Compiler Overview
   Why choices
2. PetaBricks Language
   Key Ideas
   Compilation Example
   Other Language Features
3. Results
   Benchmarks
   Scalability
   Variable Accuracy
4. Conclusion
   Final thoughts
Algorithmic choice
[Figure: the sorting algorithms available to the autotuner (N-way Mergesort, Insertionsort, Radixsort, Quicksort) and the hybrid compositions selected when optimized for Xeon (1 core), Xeon (8 cores), Niagara (8 cores), and Core 2 (2 cores). Each composition switches algorithms at input-size cutoffs (the @ labels, e.g. @75, @98, @150, @600, @1295, @1461, @2400) and uses Mergesort with N = 2, 4, 8, or 16 ways. One build is labeled "STL Algorithm" with a cutoff of @15.]
The PetaBricks language
The case for autotuning is obvious
How should the programmer represent choice?
We present the PetaBricks programming language and compiler:
  Choice as a fundamental language construct
  Autotuning performed by the compiler
  Automatically parallelized
Sort in PetaBricks
transform Sort
from A[n]
to B[n]
{
  from (A a) to (B b) {
    tunable WAYS;
    /* Mergesort */
  } or {
    /* Insertionsort */
  } or {
    /* Radixsort */
  } or {
    /* Quicksort */
  }
}
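The Sort transform lists alternatives and leaves the composition to the compiler. As a rough illustration of what one composition could look like, the Python sketch below hand-writes a single hybrid: insertion sort below a cutoff, 2-way merge sort above it. The function names and the cutoff of 75 are assumptions made for this example; this is not code generated by PetaBricks.

```python
def insertion_sort(xs):
    """Sort a list in place with insertion sort and return it."""
    for i in range(1, len(xs)):
        key = xs[i]
        j = i - 1
        while j >= 0 and xs[j] > key:
            xs[j + 1] = xs[j]
            j -= 1
        xs[j + 1] = key
    return xs


def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def hybrid_sort(xs, cutoff=75):
    """2-way merge sort that switches to insertion sort for small inputs.

    `cutoff` stands in for a tuned parameter; 75 is only a placeholder.
    """
    # max(cutoff, 1) guarantees the recursion stops at single elements.
    if len(xs) <= max(cutoff, 1):
        return insertion_sort(list(xs))
    mid = len(xs) // 2
    return merge(hybrid_sort(xs[:mid], cutoff), hybrid_sort(xs[mid:], cutoff))
```

An autotuner would vary both the cutoff and the algorithm used on each side of it, rather than fixing them by hand as done here.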
Introduction
Language & Compiler Overview
The PetaBricks compiler
Sort is compiled into an autotuning binary
Trained on target architecture
Structured genetic tuner
Trained with full number of threads
Under 1 minute for Sort
Results fed back into the compiler
Final binary created
Sort algorithm timings
[Figure: time (s) versus input size (0 to 1750) for InsertionSort, QuickSort, MergeSort, RadixSort, and Autotuned]
On an 8-way Xeon E7340 system
Timings on different architectures
                          Trained on
Run on        Mobile   Xeon 1-way   Xeon 8-way   Niagara
Mobile        -        1.09x        1.67x        1.47x
Xeon 1-way    1.61x    -            2.08x        2.50x
Xeon 8-way    1.59x    2.14x        -            2.35x
Niagara       1.12x    1.51x        1.08x        -
Introduction
Why choices
Early compilers
[Diagram: constrained input language (no choices) -> Parsing -> Code Gen]
Early computers (and compilers) were weak
Parsing and code generation dominated compilation
Needed a constrained input language to simplify compilation
Current compilers
[Diagram: constrained input language (no choices) -> Parsing -> Exposing Choices -> Decisions -> Code Gen]
Current computers are much more powerful
Compilers can do a lot more
Input language is still constraining
Compilation dominated by exposing choices
Input language specifies only one:
  Algorithmic choice
  Iteration order choice
  Parallelism strategy choice
  Data layout choice
Compiler must perform heroic analysis to reconstruct other choices
PetaBricks compiler
[Diagram: rich input language (w/ choices) -> Parsing -> Exploring Choices & Making Decisions -> Code Gen]
We propose explicit choices in the language
The programmer defines the space of legal:
  Algorithmic choices
  Iteration orders (including parallel)
  Data layouts
Allow compilers to focus on exploring choices
Compiler no longer needs to reconstruct choices
Future-proof programs
The result: programs can adapt to their environment
Choices make programs less brittle
Programs change with architecture, available cores, inputs, etc.
PetaBricks Language
Key Ideas
Algorithmic choice in the language
Algorithmic choice is the key aspect of PetaBricks
Programmer can define multiple rules to compute the same data
Compiler reuses rules to create hybrid algorithms
Can express choices at many different granularities
Synthesized outer control flow
Outer control flow synthesized by compiler
Another choice that the programmer should not make
  By rows?
  By columns?
  Diagonal? Reverse order? Blocked?
  Parallel?
Instead programmer provides explicit producer-consumer relations
Allows compiler to explore choice space
PetaBricks Language
Compilation Example
Simple example program
transform RollingSum
from A[n]
to B[n]
{
  // rule 0: use the previously computed value
  B.cell(i) from (A.cell(i) a,
                  B.cell(i-1) leftSum) {
    return a + leftSum;
  }

  // rule 1: sum all elements to the left
  B.cell(i) from (A.region(0, i) in) {
    return sum(in);
  }
}
[Figures: for each rule, a diagram of the cells of A and B involved in computing B.cell(i)]
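To make the two RollingSum rules concrete, here is a small Python sketch that evaluates each rule on its own. It assumes the region read by rule 1 includes element i, so that both rules produce the same running sum; the function names are invented for this example.

```python
def rolling_sum_rule0(a):
    """Rule 0: B[i] = A[i] + B[i-1]. Each cell needs its left neighbor,
    so the cells must be computed left to right."""
    b = [0] * len(a)
    for i in range(len(a)):
        if i == 0:
            b[i] = a[0]  # rule 0 is not applicable at i = 0, so use rule 1 here
        else:
            b[i] = a[i] + b[i - 1]
    return b


def rolling_sum_rule1(a):
    """Rule 1: B[i] = sum of A[0..i] (assumed inclusive). Cells are
    independent, so they could be computed in any order or in parallel."""
    return [sum(a[: i + 1]) for i in range(len(a))]
```

Rule 0 does less total work but is sequential; rule 1 does more work but has no dependencies between output cells. That trade-off is the kind of choice left to the compiler and autotuner.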
Applicable regions
Compilation process: Applicable regions -> Choice grids -> Choice dependency graph
// rule 0: use the previously computed value
B.cell(i) from (A.cell(i) a,
                B.cell(i-1) leftSum) {
  return a + leftSum;
}
Applicable where 1 ≤ i < n
// rule 1: sum all elements to the left
B.cell(i) from (A.region(0, i) in) {
  return sum(in);
}
Applicable where 0 ≤ i < n
Choice grids
Divide data space into symbolic regions with common sets of choices
In this simple example:
  A: Input (no choices)
  B: [0, 1) = rule 1
  B: [1, n) = rule 0 or rule 1
[Figure: B split at index 1 into [0, 1), labeled R1, and [1, n), labeled R0 or R1]
Applicable regions map rules → symbolic data
Choice grids map symbolic data → rules
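The inversion from applicable regions (rule to data) into a choice grid (data to rules) can be sketched in Python as follows. The real compiler works on symbolic bounds such as n; this sketch assumes a concrete n and one-dimensional half-open intervals, and the function name is hypothetical.

```python
def choice_grid(applicable, n):
    """Split [0, n) into intervals and list the rules usable on each one.

    `applicable` maps a rule name to a half-open interval (lo, hi).
    Returns a list of ((lo, hi), sorted rule names).
    """
    # Every interval endpoint is a potential boundary between grid regions.
    points = {0, n}
    for lo, hi in applicable.values():
        points.update((max(lo, 0), min(hi, n)))
    cuts = sorted(points)

    grid = []
    for lo, hi in zip(cuts, cuts[1:]):
        # A rule is a choice for this interval if it covers all of it.
        rules = sorted(
            r for r, (rlo, rhi) in applicable.items() if rlo <= lo and hi <= rhi
        )
        grid.append(((lo, hi), rules))
    return grid
```

For RollingSum with n = 8, `choice_grid({"rule0": (1, 8), "rule1": (0, 8)}, 8)` yields one interval where only rule 1 applies and one where either rule may be used, matching the grid on the slide.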
Choice dependency graph
Compilation process: Applicable regions -> Choice grids -> Choice dependency graph
[Figure: graph with nodes A.region(0, n), B.region(0, 1) (choices: r1), and B.region(1, n) (choices: r0, r1); edges are labeled (r1,<=),(r0,=) and (r0,=,-1)]
Adds dependency edges between symbolic regions
Edges annotated with directions and rules
Many compiler passes on this IR to:
  Simplify complex dependency patterns
  Add choices
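One use of such a graph is to derive an order in which regions can be computed. The Python sketch below is an interpretation, not the compiler's IR: it writes the region-level dependencies of RollingSum by hand (taking the union over both rules) and orders them with the standard-library `graphlib` module (Python 3.9+). The dictionary layout is an assumption for illustration.

```python
from graphlib import TopologicalSorter

# Region-level dependencies read off the two RollingSum rules.
# Key: a region; value: the regions that must be available first.
deps = {
    "A.region(0, n)": set(),               # input, nothing to compute
    "B.region(0, 1)": {"A.region(0, n)"},  # only rule 1 applies, reads A
    # Rules 0 and 1 both read A; rule 0 also reads B.cell(i-1),
    # which for i = 1 lies in B.region(0, 1).
    "B.region(1, n)": {"A.region(0, n)", "B.region(0, 1)"},
}

# A valid schedule visits every region after the regions it depends on.
order = list(TopologicalSorter(deps).static_order())
```

The dependency of B.region(1, n) on itself under rule 0 (offset -1) is left out of the dictionary because it does not order regions; it constrains the iteration direction inside the region to left to right.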
Code generation
1. PetaBricks source code is compiled
2. An autotuning binary is created
3. Autotuning occurs, creating a choice configuration file
4. Choices are fed back into the compiler to create a final binary
[Diagram: boxes for PetaBricks Source Code, PetaBricks Compiler, Autotuning Binary, Choice Configuration File, and Final Binary]
Autotuning
Based on two building blocks:
  A genetic tuner
  An n-ary search algorithm
Flat parameter space
Compiler generates a dependency graph describing this parameter space
Entire program tuned from bottom up
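The slide names an n-ary search without giving details. The following Python sketch shows one generic form of the idea for a single integer parameter such as a cutoff: sample evenly spaced candidates, keep the neighborhood of the best one, and repeat. It is an assumed illustration, not the PetaBricks tuner, and `cost` is a hypothetical callback that a real tuner would implement by timing the program.

```python
def nary_search(cost, lo, hi, n=4):
    """Look for an integer in [lo, hi] that minimizes `cost`.

    Each round evaluates up to n + 1 evenly spaced candidates and narrows
    the range to the neighbors of the best one. The result is the true
    minimum only if `cost` is unimodal on the range.
    """
    while hi - lo > n:
        step = (hi - lo) / n
        points = sorted({lo + round(k * step) for k in range(n + 1)})
        best = min(range(len(points)), key=lambda k: cost(points[k]))
        lo = points[max(best - 1, 0)]
        hi = points[min(best + 1, len(points) - 1)]
    # The remaining range is small enough to check exhaustively.
    return min(range(lo, hi + 1), key=cost)
```

With measured run times the cost is noisy rather than unimodal, so a practical tuner would repeat measurements; the genetic tuner mentioned on the slide is not sketched here.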
Parallel Runtime Library
Task-based parallel runtime
Thread-local deques of runnable tasks
Uses a work-stealing algorithm similar to that of Cilk
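The work-stealing discipline can be illustrated with a single-threaded Python simulation. This is a sketch of the general technique, not the PetaBricks runtime: a real runtime uses threads and synchronized deques, and the class and function names below are invented for the example.

```python
from collections import deque
import random


class Worker:
    """A worker with its own double-ended queue of runnable tasks."""

    def __init__(self, wid):
        self.wid = wid
        self.tasks = deque()

    def push(self, task):
        self.tasks.append(task)  # owner adds new work at the bottom

    def pop(self):
        # Owner takes the newest task (bottom of the deque).
        return self.tasks.pop() if self.tasks else None

    def steal(self):
        # A thief takes the oldest task (top of the deque).
        return self.tasks.popleft() if self.tasks else None


def run(workers, seed=0):
    """Round-robin simulation: each worker runs its own task or steals one.

    Returns a log of (worker id, task) pairs in execution order.
    """
    rng = random.Random(seed)
    log = []
    while any(w.tasks for w in workers):
        for w in workers:
            task = w.pop()
            if task is None:
                victims = [v for v in workers if v is not w and v.tasks]
                if not victims:
                    continue
                task = rng.choice(victims).steal()
            log.append((w.wid, task))
    return log
```

Owners and thieves work at opposite ends of a deque, which keeps them from contending for the same task except when the deque is nearly empty.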
PetaBricks Language
Other Language Features
More PetaBricks features
Automatic consistency checking
The tunable keyword
Call external code
Custom training data generators
Matrix versions for iterative algorithms
Rule priorities
where clause (for limiting applicable regions)
Template transforms
Results
Benchmarks
Eigenvector Solve
Bisection
QR decomposition
Divide and conquer
[Figure: time (s) versus input size (0 to 1000) for Bisection, DC, QR, Cutoff 25, and Autotuned]
Matrix Multiply
Basic
Recursive decompositions
Strassen's algorithm
Iteration order (blocking)
Transpose
[Figure: time (s) versus input size (1 to 10000), both on logarithmic axes, for Basic, Blocking, Transpose, Recursive, Strassen 256, and Autotuned]
Results
Scalability
Scalability
[Plot: Speedup, 1 to 8, versus Number of Threads, 1 to 8. Series: Autotuned Matrix Multiply, Autotuned Sort, Autotuned Poisson, Autotuned Eigenvector Solve]
Results
Variable Accuracy
Variable accuracy
Most algorithms produce exact solutions
Large class of algorithms can produce approximate solutions
Iterative convergence
Grid coarsening
Others
Compiler/autotuner should be aware of variable accuracy
Compiler can examine optimal frontier of algorithms
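One way to read "optimal frontier" is as the set of candidates that no other candidate beats on both time and accuracy. The sketch below illustrates that reading in Python; the tuple layout and the function name are assumptions made for illustration, and the slides do not say how PetaBricks represents the frontier internally.

```python
def optimal_frontier(candidates):
    """Return the candidates that are not dominated in (time, accuracy).

    Each candidate is (name, time_seconds, accuracy); lower time and
    higher accuracy are better.
    """
    frontier = []
    best_accuracy = float("-inf")
    # Visit candidates from fastest to slowest; on equal time, the more
    # accurate one comes first.
    for name, time_s, acc in sorted(candidates, key=lambda c: (c[1], -c[2])):
        # A slower candidate is only worth keeping if it is more accurate
        # than everything faster than it.
        if acc > best_accuracy:
            frontier.append((name, time_s, acc))
            best_accuracy = acc
    return frontier
```

A tuner that must meet a given accuracy target then only needs to consider the fastest frontier entry whose accuracy is at least that target.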
Results
Variable Accuracy
Poisson’s equation
A variable accuracy benchmark
Accuracy level expressed as a template parameter
Autotuner exploits variable accuracy in a general way
Choices:
Direct solve
Jacobi iteration
Successive over relaxation
Multigrid
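To make two of the iterative choices concrete, here is a minimal Python sketch of Jacobi and SOR sweeps. It assumes a 1D model problem, -u'' = f with zero boundary values, chosen only for brevity; the slides do not state the discretization used by the benchmark, and the function names, `omega` and `tol` are illustrative.

```python
def jacobi_step(u, f, h):
    """One Jacobi sweep: every point is updated from the old neighbours."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return new


def sor_step(u, f, h, omega=1.5):
    """One SOR sweep: a Gauss-Seidel update over-relaxed by omega."""
    new = u[:]
    for i in range(1, len(new) - 1):
        gs = 0.5 * (new[i - 1] + new[i + 1] + h * h * f[i])
        new[i] = (1.0 - omega) * new[i] + omega * gs
    return new


def residual_norm(u, f, h):
    """Max-norm of the residual f - A u at interior points."""
    return max(
        abs(f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h))
        for i in range(1, len(u) - 1)
    )


def iterate(step, f, h, tol, max_iters=100000):
    """Apply `step` from a zero guess until the residual is below tol.

    Returns (u, sweeps). The tolerance plays the role of the accuracy knob.
    """
    u = [0.0] * len(f)
    for sweeps in range(1, max_iters + 1):
        u = step(u, f, h)
        if residual_norm(u, f, h) < tol:
            return u, sweeps
    return u, max_iters
```

Calling `iterate(jacobi_step, f, h, tol)` and `iterate(sor_step, f, h, tol)` with the same inputs shows the trade-off: a looser `tol` needs fewer sweeps, and the two methods need different numbers of sweeps for the same `tol`.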
Results
Variable Accuracy
Choices in Multigrid
[Figure: multigrid cycle diagram. Axes: Time, Grid Size (16, 32, 64, 128). Labels: SOR Iteration, Direct Solve]
SOR is an iterative algorithm
Multigrid changes grid coarseness to speed up convergence
Many standard shapes: V-Cycle, W-Cycle, etc.
Direct solver
Different shapes = different algorithms
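The idea that a cycle's shape is itself a choice can be sketched with a recursive multigrid routine whose parameters decide where to stop coarsening and how often to recurse. This is a minimal Python sketch on an assumed 1D model problem (-u'' = f, zero boundary values, grid with a power-of-two number of intervals); the parameter names `direct_below`, `sweeps` and `recursions` are hypothetical and are not taken from PetaBricks.

```python
def smooth(u, f, h, sweeps):
    """In-place Gauss-Seidel sweeps (SOR with omega = 1) for -u'' = f."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def residual(u, f, h):
    """r = f - A u at interior points, zero on the boundary."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def direct_solve(f, h):
    """Exact tridiagonal solve (Thomas algorithm) with zero boundary values."""
    n = len(f) - 1                      # number of intervals
    if n < 2:
        return [0.0] * len(f)
    cp = [0.0] * (n + 1)                # modified super-diagonal
    dp = [0.0] * (n + 1)                # modified right-hand side
    for i in range(1, n):
        # Row i of the scaled system: -u[i-1] + 2 u[i] - u[i+1] = h^2 f[i]
        denom = 2.0 + cp[i - 1]
        cp[i] = -1.0 / denom
        dp[i] = (h * h * f[i] + dp[i - 1]) / denom
    u = [0.0] * (n + 1)
    for i in range(n - 1, 0, -1):       # back substitution
        u[i] = dp[i] - cp[i] * u[i + 1]
    return u


def cycle(u, f, h, direct_below=4, sweeps=2, recursions=1):
    """One multigrid cycle; recursions=1 gives a V shape, 2 a W shape."""
    n = len(u) - 1
    if n <= direct_below:
        return direct_solve(f, h)
    smooth(u, f, h, sweeps)                       # pre-smoothing
    r = residual(u, f, h)
    half = n // 2
    rc = [0.0] * (half + 1)                       # full-weighting restriction
    for j in range(1, half):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    ec = [0.0] * (half + 1)                       # coarse-grid error estimate
    for _ in range(recursions):
        ec = cycle(ec, rc, 2.0 * h, direct_below, sweeps, recursions)
    for j in range(half):                         # linear interpolation
        u[2 * j] += ec[j]
        u[2 * j + 1] += 0.5 * (ec[j] + ec[j + 1])
    smooth(u, f, h, sweeps)                       # post-smoothing
    return u
```

Changing `direct_below`, `sweeps` or `recursions` changes the shape of the cycle and therefore its cost and accuracy per cycle, which is the sense in which different shapes are different algorithms.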
Results
Variable Accuracy
Autotuned V-cycle shapes for different accuracy requirements
[Figure: four autotuned cycle shapes, one per accuracy level (10^1, 10^3, 10^5, 10^7). Axis: Grid Size, 16 to 2048]
Results
Variable Accuracy
Dynamic programming technique for autotuning Multigrid
[Figure: panels labeled Grid size i ⇒ Grid size 2i]
Partition accuracy space into discrete levels
Base space of candidate algorithms on optimal algorithms from coarser level
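The two points above suggest a bottom-up table: for each grid size and each discrete accuracy level, keep the cheapest candidate, where recursive candidates are built only from the winners at the next coarser size. The Python sketch below shows that structure under loudly stated assumptions: the cost formulas are invented for illustration (a real tuner would time actual runs), and the accuracy levels are expressed as an illustrative list of digit counts.

```python
import math

ACCURACY_DIGITS = [1, 3, 5, 7]    # illustrative discrete accuracy levels


def direct_cost(n):
    """Made-up cost of a direct solve, assumed to satisfy every level."""
    return float(n) ** 3


def sor_cost(n, digits):
    """Made-up cost of plain SOR: grows with grid size and accuracy."""
    return float(n) ** 2 * digits


def recursive_cost(n, digits, sub_cost, sub_digits):
    """Made-up cost of repeated cycles through a tuned half-size solver.

    Each cycle pays for smoothing on size n plus one coarse solve, and is
    assumed to gain min(sub_digits, 2) digits of accuracy.
    """
    cycles = math.ceil(digits / min(sub_digits, 2))
    return cycles * (4.0 * n + sub_cost)


def autotune(max_size, min_size=2):
    """Bottom-up table: best[(n, digits)] = (cost, description)."""
    best = {}
    n = min_size
    while n <= max_size:
        for digits in ACCURACY_DIGITS:
            options = [(direct_cost(n), "direct"), (sor_cost(n, digits), "sor")]
            if n > min_size:
                # Candidates reuse only the winners at the coarser size.
                for sub in ACCURACY_DIGITS:
                    sub_cost, _ = best[(n // 2, sub)]
                    cost = recursive_cost(n, digits, sub_cost, sub)
                    options.append((cost, "recurse(digits=%d)" % sub))
            best[(n, digits)] = min(options)
        n *= 2
    return best
```

Because each size consults only the finished table for the size below it, the search grows with the number of sizes times the square of the number of accuracy levels, instead of enumerating every possible cycle shape.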
Results
Variable Accuracy
Poisson’s Equation
[Plot: Time (s), 1e-05 to 10000 (log scale), versus Input Size, 1 to 1000 (log scale). Series: Direct, Jacobi, SOR, Multigrid, Autotuned]
Conclusion
Final thoughts
Related work
Languages
Sequoia
Libraries & domain specific tuners
STAPL
ATLAS
FFTW
SPARSITY
SPIRAL
...
Conclusion
Final thoughts
For more information
PetaBricks makes programs future-proof by allowing them to adapt to new architectures
We plan to release PetaBricks at the end of summer
Sign up for our mailing list to be notified
For more information see:
http://projects.csail.mit.edu/petabricks/
Questions?
Conclusion
Final thoughts
Thank you!