PQL: A Purely-Declarative Java Extension for Parallel Programming

Christoph Reichenbach (1), Yannis Smaragdakis (2,3), Neil Immerman (2)
(1) Goethe University Frankfurt
(2) University of Massachusetts, Amherst
(3) University of Athens
WRITING PARALLEL PROGRAMS IS HARD
• locking
• races
• side effect order
• consistency models
• distributing computations
...
PQL/Java
EASIER PARALLELISM
Approach      Problems                                  User actions
map-reduce    embarrassingly parallel + aggregation     split computation
fork-join     divide-and-conquer                        (recursively) divide up problem
PLINQ         SQL-like, over containers                 tag parallel steps
Pregel        graph algorithms                          split into graph computations, -mutations
Frameworks for manual parallelisation
Casual parallelism: fully automatic
CASUAL PARALLELISM
• Pitfalls:
– Side effects
– Order dependency
Declarative language
Specify the ‘what’, not the ‘how’
PQL/JAVA
• Declarative extension to Java:
Parallel Query Language
• Fully automatic parallelisation
• Processes and builds Java containers
Java for sequential code, PQL for parallel code
PQL EXAMPLE
[Figure: each doc in DocRepository is tested for whether it contains all search_terms]
Set<Document> all_matches = query (Set.contains(doc)):
    DocRepository.getAll().contains(doc)
    && forall x: doc.contains(search_terms[x]);
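For comparison, the same selection can be sketched in plain Java with parallel streams. This is only an illustration of what the query computes, not PQL's generated code: the class name DocSearchSketch is hypothetical, and a document is modelled as a Set<String> of its words, which is an assumption since the slides do not show the Document type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in: a document is modelled as the set of words it contains.
final class DocSearchSketch {

    /** Returns every document that contains all of the search terms. */
    static Set<Set<String>> allMatches(List<Set<String>> docs, String[] searchTerms) {
        return docs.parallelStream()                        // candidates: every doc in the repository
                .filter(doc -> Arrays.stream(searchTerms)   // forall x:
                        .allMatch(doc::contains))           //   doc.contains(search_terms[x])
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> d1 = new HashSet<>(Arrays.asList("parallel", "java", "query"));
        Set<String> d2 = new HashSet<>(Arrays.asList("java", "locking"));
        // Only d1 contains both "java" and "query".
        System.out.println(allMatches(Arrays.asList(d1, d2), new String[] {"java", "query"}));
    }
}
```

Unlike the PQL version, the Java sketch has to name the parallel mechanism (parallelStream) explicitly.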
WHAT RICH LANGUAGE GIVES US CASUAL PARALLELISM?
• Embarrassingly parallel:
Executable in O(1) with enough CPUs
• Result from Descriptive Complexity:
This language is precisely First-Order Logic (a)
Using O(n^3) cores may be a bit much...
(a) if we assume a polynomial number of CPUs
MAKING FIRST-ORDER LOGIC MORE USEFUL
• Assert or compute results:
– Finite set comprehension
– SQL-style queries (minus aggregation, ordering)
Assert: ∀x.∃y.a[x] = b[y]
Compute:
x      ∃y.a[x] = b[y]
0      true
1      false
2      false
3      true
...    ...
representation: ⇒ {0, 3, ...}
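The difference between asserting and computing can be sketched in plain Java. The class and method names below are hypothetical, and parallel streams stand in for whatever PQL actually generates: holdsForAll checks ∀x.∃y.a[x] = b[y], while matchingIndices builds the set of all x for which the inner formula holds.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ExistsQuerySketch {

    /** Inner formula for one x: exists y with a[x] == b[y]. */
    private static boolean existsMatch(int[] a, int[] b, int x) {
        return Arrays.stream(b).anyMatch(v -> a[x] == v);
    }

    /** Assert: forall x. exists y. a[x] == b[y]. */
    static boolean holdsForAll(int[] a, int[] b) {
        return IntStream.range(0, a.length).parallel()
                .allMatch(x -> existsMatch(a, b, x));
    }

    /** Compute: the set { x | exists y. a[x] == b[y] }. */
    static Set<Integer> matchingIndices(int[] a, int[] b) {
        return IntStream.range(0, a.length).parallel()
                .filter(x -> existsMatch(a, b, x))
                .boxed()
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        int[] a = {5, 7, 9, 2};
        int[] b = {2, 5};
        System.out.println(holdsForAll(a, b));      // false: 7 and 9 do not occur in b
        System.out.println(matchingIndices(a, b));  // the indices 0 and 3
    }
}
```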
ADDING REDUCTION
reduce(sum) x over i: x == a[i]
• log-parallel performance
• user-supplied reductors
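A rough Java analogue of a reduction with a user-supplied reductor, assuming the reductor is associative and comes with an identity value (the class name ReductionSketch and the method signature are illustrative, not PQL's API). A parallel stream may combine partial results in a tree, which is where a logarithmic number of combining steps comes from when enough cores are available.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

final class ReductionSketch {

    /**
     * Reduces all elements of a with the given reductor.
     * The reductor must be associative and identity must be its neutral element,
     * otherwise the parallel result is not well defined.
     */
    static int reduce(int[] a, int identity, IntBinaryOperator reductor) {
        return Arrays.stream(a).parallel().reduce(identity, reductor);
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 4, 1, 5};
        System.out.println(reduce(a, 0, Integer::sum));              // 14
        System.out.println(reduce(a, Integer.MIN_VALUE, Math::max)); // 5
    }
}
```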
PQL OVERVIEW
• !, ~, +, -, ..., ?:, ==, instanceof, &&, ||, ->
• forall, exists
• Java expressions as constants
• m[k], m.get(k), c.length, c.size(), s.contains(e)
• Container construction:
– query (Set.contains(int x)): ...
– query (Array[x] == float f): ...
– query (Map.get(String s) == int i [default v]): ...
• reduce(sumInt) int x [ over y ]: ...
MORE PQL EXAMPLES
• Check sortedness of list
assert forall Node n: sorted_list.contains(n)
    -> n.prev.value <= n.value;
• Set intersection together with filtering
Set<Item> intersection =
    query (Set.get(Item element)):
        set0.contains(element)
        && set1.contains(element)
        && !element.is_dead;
• Employee bonus table
query (Map.get(employee) == double bonus):
    employees.contains(employee)
    && bonus == employee.dept.bonus_factor
        * (reduce(sumDouble) v:
            exists Bonus b: employee.bonusSet.contains(b)
            && v == b.bonus_base);
• Vector dot product
dot_product = reduce(add) x over y: x == a[y] * b[y];
• Invert map
query (Map.find(value) == keyset default new PSet()):
    keyset == query (Set.contains(key)):
        m.get(key) == value;
...
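To make two of these queries concrete, here is a plain-Java sketch of the dot product and the map inversion using parallel streams. The names are hypothetical and the code only mirrors what the queries compute; in particular, the "default new PSet()" behaviour for missing values is not reproduced, so a lookup of an absent value returns null here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class MoreExamplesSketch {

    /** reduce(add) x over y: x == a[y] * b[y], over the indices valid in both arrays. */
    static double dotProduct(double[] a, double[] b) {
        return IntStream.range(0, Math.min(a.length, b.length)).parallel()
                .mapToDouble(y -> a[y] * b[y])
                .sum();
    }

    /** Maps each value of m to the set of keys that m maps to that value. */
    static <K, V> Map<V, Set<K>> invert(Map<K, V> m) {
        return m.entrySet().parallelStream()
                .collect(Collectors.groupingBy(
                        e -> e.getValue(),
                        Collectors.mapping(e -> e.getKey(), Collectors.toSet())));
    }

    public static void main(String[] args) {
        System.out.println(dotProduct(new double[] {1, 2, 3}, new double[] {4, 5, 6})); // 32.0

        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 1);
        System.out.println(invert(m)); // 1 maps to {a, c}, 2 maps to {b}
    }
}
```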
REALISTIC PQL EXAMPLE
[Figure: the docs in DocRepository ("far out in the uncharted backwaters of the ...", "it was a bright cold day in april .", ...) are counted into a results table]
results:
far    1
out    1
in     2
the    1
...
query (Map.get(int word_id) == int wcount default 0):
    wcount == reduce(sum) 1 over doc:
        DocRepository.getAll().contains(doc)
        && exists i: doc.words[i] == word_id;
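A plain-Java sketch of the same table, for comparison. The query counts, for each word_id, the number of documents in which it occurs at least once, which is why "in" gets 2 and "the" gets 1 in the figure. The class name is hypothetical and each document is modelled simply as its int[] words array, which is an assumption about the Document type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class WordCountSketch {

    /** For each word id: the number of documents that contain it at least once. */
    static Map<Integer, Integer> documentCounts(List<int[]> docs) {
        return docs.parallelStream()
                // distinct(): a document contributes at most 1 per word (the "exists i" part)
                .flatMap(words -> Arrays.stream(words).distinct().boxed())
                .collect(Collectors.groupingBy(Function.identity(),
                        Collectors.summingInt(w -> 1)));   // reduce(sum) 1 over doc
    }

    public static void main(String[] args) {
        // Hypothetical word ids: 0 = far, 1 = out, 2 = in, 3 = the, 4 = it
        int[] doc1 = {0, 1, 2, 3, 3};
        int[] doc2 = {4, 2};
        Map<Integer, Integer> counts = documentCounts(Arrays.asList(doc1, doc2));
        System.out.println(counts);
        // "default 0" corresponds to getOrDefault on the Java side
        System.out.println(counts.getOrDefault(99, 0));
    }
}
```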
IMPLEMENTATION
• Extension to javac 1.6:
– PQL to relations
– Access path selection / Query scheduling
– Optimisation
– Code generation
• Run-time library support:
– parallel execution
– dynamic re-compilation (new)
EXAMPLE
reduce(max) int x: a[x] > 0
[Gen. relational IL]
Query ordering
Optimisation
Code generation
Translation into relational IL:
Int(x)
ArraySub(a, x, t0)
GT(t0, 0)
Unordered!
EXAMPLE
reduce(max) int x: a[x] > 0
Gen. relational IL
[Query ordering]
Optimisation
Code generation
(superscripts: w = the variable is written at this step, r = the variable is read)
Order #1: Must iterate over 2^32 values!
Int(x^w)
ArraySub(a^r, x^r, t0^w)
GT(t0^r, 0)
Order #2: Iterate over a.length values
ArraySub(a^r, x^w, t0^w)
Int(x^r)
GT(t0^r, 0)
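The effect of the two orders can be illustrated with a small Java sketch that counts how many candidate values of x each order has to try. Everything here is hypothetical scaffolding: the Outcome class, the -1 sentinel for "no match", and the restricted candidate domain passed to intFirst (the real Order #1 would range over all 2^32 int values).

```java
final class QueryOrderingSketch {

    /** The answer plus the number of candidate bindings of x that were tried. */
    static final class Outcome {
        final int maxIndex;     // -1 if no index qualifies (sentinel chosen for this sketch)
        final long candidates;

        Outcome(int maxIndex, long candidates) {
            this.maxIndex = maxIndex;
            this.candidates = candidates;
        }
    }

    /** Order #1: Int(x) writes x first, so every value of the candidate domain is tried. */
    static Outcome intFirst(int[] a, int domainLo, int domainHi) {
        int max = -1;
        long candidates = 0;
        for (int x = domainLo; x < domainHi; x++) {
            candidates++;
            if (x < 0 || x >= a.length) continue;   // ArraySub(a, x, t0) has no match for this x
            int t0 = a[x];
            if (t0 > 0) max = Math.max(max, x);     // GT(t0, 0)
        }
        return new Outcome(max, candidates);
    }

    /** Order #2: ArraySub writes x by walking the array, so only a.length values are tried. */
    static Outcome arrayFirst(int[] a) {
        int max = -1;
        long candidates = 0;
        for (int x = 0; x < a.length; x++) {
            candidates++;
            int t0 = a[x];
            if (t0 > 0) max = x;                    // x only grows, so the last hit is the maximum
        }
        return new Outcome(max, candidates);
    }

    public static void main(String[] args) {
        int[] a = {0, 7, -3, 4, 0};
        Outcome o1 = intFirst(a, -1000, 1000);
        Outcome o2 = arrayFirst(a);
        System.out.println(o1.maxIndex + " after " + o1.candidates + " candidates");
        System.out.println(o2.maxIndex + " after " + o2.candidates + " candidates");
    }
}
```

Both orders give the same answer; only the amount of work differs, which is the criterion the query-ordering stage uses to pick between them on the slide.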
EXAMPLE
reduce(max) int x: a[x] > 0
Gen. relational IL
Query ordering
[Optimisation]
Code generation
Before:
ArraySub(a^r, x^w, t0^w)
Int(x)
GT(t0^r, 0)
After:
ArraySub(a^r, x^w, t0^w)
GT(t0^r, 0)
EXAMPLE
reduce(max) int x: a[x] > 0
Gen. relational IL
Query ordering
Optimisation
[Code generation]
Loop over the array:
for (x = 0; x < a.length; x++) {
    t_0 = a[x];
    if (t_0 > 0) {
        // signal success at x
    }
}
Worker over a sub-range:
void runWorker(int start, int stop) {
    for (x = start; x < stop; x++) {
        t_0 = a[x];
        if (t_0 > 0) {
            // signal success at x
        }
    }
}
PARALLEL EXECUTION MODEL: TREE JOIN
[Figure: Core 0 to Core 7 each execute runWorker; their results are joined by max in a tree]
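A minimal Java sketch of this execution model for reduce(max) int x: a[x] > 0, under several assumptions: the class name, the thread pool, the even chunking, and the -1 sentinel for "no match" are all illustrative choices, and the joins run on the calling thread here purely to show the tree shape, whereas the figure suggests they are distributed over the cores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TreeJoinSketch {

    /** Largest x in [start, stop) with a[x] > 0, or -1 if there is none. */
    static int runWorker(int[] a, int start, int stop) {
        int best = -1;
        for (int x = start; x < stop; x++) {
            int t0 = a[x];
            if (t0 > 0) best = x;   // x only grows, so the last hit is the maximum
        }
        return best;
    }

    static int parallelMaxIndex(int[] a, int workers) {
        if (workers < 1) throw new IllegalArgumentException("need at least one worker");
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // One runWorker call per worker, each on its own slice of the array.
            int chunk = (a.length + workers - 1) / workers;
            List<Future<Integer>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int start = Math.min(a.length, w * chunk);
                final int stop = Math.min(a.length, start + chunk);
                futures.add(pool.submit(() -> runWorker(a, start, stop)));
            }
            List<Integer> level = new ArrayList<>();
            for (Future<Integer> f : futures) level.add(f.get());

            // Tree join: combine neighbours pairwise until a single value remains.
            while (level.size() > 1) {
                List<Integer> next = new ArrayList<>();
                for (int i = 0; i < level.size(); i += 2) {
                    int left = level.get(i);
                    int right = i + 1 < level.size() ? level.get(i + 1) : left;
                    next.add(Math.max(left, right));
                }
                level = next;
            }
            return level.get(0);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] a = new int[100];
        a[10] = 5;
        a[73] = 2;
        System.out.println(parallelMaxIndex(a, 8)); // 73
    }
}
```

With 8 workers the join loop runs three times (8 to 4 to 2 to 1 values).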
PERFORMANCE
• Benchmarks:
– bonus: Salary computation
– threegrep: String pattern search
– wordcount: Word frequency aggregation in documents
– webgraph: One-hop self-references in web graphs
• Hardware:
– Intel Xeon 6×2 threads, 2.67 GHz, 24 GB RAM
– Sun UltraSPARC 16×4 threads, 1.17 GHz, 32 GB RAM
• Methodology:
– For each configuration: 3 warmup runs, 10 eval runs
• Running stock Sun JVM 1.6 with 2 GB Heap
PERFORMANCE RESULTS: WORDCOUNT ON INTEL XEON
[Figure: performance plot]
PERFORMANCE RESULTS: WORDCOUNT ON ULTRASPARC
[Figure: performance plot]
PERFORMANCE RESULTS: WEBGRAPH ON INTEL XEON
[Figure: performance plot]
PERFORMANCE RESULTS: WEBGRAPH ON ULTRASPARC
[Figure: performance plot]
PERFORMANCE RESULTS: BONUS ON INTEL
[Figure: performance plot]
EXISTING APPROACHES FOR JAVA
• SQL via JDBC: Similar queries, separate heap
• Hadoop: Java Map-Reduce framework
COMPARISON TO SQL AND HADOOP
Communication overhead
(At 1/10 of the usual benchmark size)
CONCISENESS
Total lines of code (including Java boilerplate):
benchmark    manual    manual-parallel    Hadoop    SQL    PQL
bonus        9         50                 130       48     8
threegrep    9         46                 60        21     6
webgraph     13        50                 105       39     4
wordcount    8         98                 93        38     4
PQL implementations are concise
SUMMARY
PQL/Java adds casual parallelism to Java through:
• Declarative query semantics
• Automatic parallelisation
• Strong parallel performance
Available at http://creichen.net/pql
Backup Slides
PERFORMANCE RESULTS: INTEL
[Figure: performance plots]
PERFORMANCE RESULTS: SPARC
[Figure: performance plots]