Parallel Programming in Contemporary Programming Languages
James Hoekzema
Why
The optimal environment is not always available.
• Code needs to run on machines with different specs
• The operating system does not support certain libraries
• The environment is a restricted virtual machine
• Licensing for algorithms is not always cheap
• High-end components are expensive
• “Fast Enough” is better than nothing at all
What
JavaScript - Using parallel programming and concepts in the browser
Go - Built-in features for running work in parallel alongside a server and for scaling it
LabVIEW - Graphical programming for smart compiler parallelism
JavaScript (in the Browser)
Why
Networking is slow and unpredictable.
• Different providers, different regions, different service levels
• The connection is outside of your control
Scaling servers is costly.
• On a server, more people connecting means slower service all around
• The user is providing computation power to you, use it!
Much wider support than plugins like Java or Flash.
• Plugins require downloading, updating, and loading up different virtual machines
• Browsers are less prone to versioning that affects performance
What
Asynchronous Single-thread
• JS only runs when needed, so networking is nonblocking
• Tasks are put into a queue for execution during “downtime”
Web Workers
• Simplified JS environment on another thread, communicates through a message passing interface
• Can be bound to a webpage or to a domain (Shared Worker)
• Data is deep copied unless a buffer is listed as transferable, in which case ownership moves to the receiver
WebGL
• Essentially OpenGL for a web page, runs on the Canvas element
• Allows access to GPU computing (or at least accelerated graphics computing)
Example (Asynchronous)
var fact = [];

function ajax(method, url, callback) {
    //create request
    var xhr = new XMLHttpRequest();
    //set callback function
    xhr.onload = function(){
        callback(JSON.parse(xhr.responseText));
    };
    //specify request
    xhr.open(method, url);
    //send request
    xhr.send();
}

function factorial(n) {
    if(n > 1){
        return n * factorial(n - 1);
    }else{
        return 1;
    }
}

ajax('GET', 'primes.json', function(primes){
    //executes after factorial
    var result = 0;
    var len = Math.min(primes.length, fact.length);
    for(var i = 0; i < len; i++){
        result += primes[i] / fact[i];
    }
    console.log(result);
});

//executes after request, before response
for(var i = 0; i < 20; i++){
    fact.push(factorial(i + 1));
}
primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, ...]
fact:   [1, 2, 6, 24, 120, 720, 5040, 40320, ...]
------------------------------------------------
result: 4.73863870268622
Example (Web Workers)
//main page script
var worker1 = new Worker('worker.js');
var worker2 = new Worker('worker.js');

worker1.postMessage({
    type: 'sum',
    data: [2, 3, 5, 7, 11, 13, 17, 19, 23]
});
worker2.postMessage({
    type: 'sum',
    data: [29, 31, 37, 41, 43, 47, 53, 59]
});

var sum = 0, count = 0;
//both workers report to the same handler
worker1.onmessage =
worker2.onmessage = function(event) {
    count++;
    sum += event.data;
    if(count == 2){
        console.log(sum);
    }
}

//worker.js
onmessage = function(event) {
    if(event.data.type == 'sum'){
        postMessage(sum(event.data.data));
    }
    //other parallel methods
}

function sum(arr) {
    var len = arr.length, sum = 0;
    for(var i = 0; i < len; i++){
        sum += arr[i];
    }
    return sum;
}

result: 440
Polynate
Example (WebGL)
Go (Golang)
Why
Golang is built for Concurrency (and thus parallelism)
• Simply prefixing a function call with “go” spins up a new goroutine
• You can choose how many processors the program uses (GOMAXPROCS, set at run time)
• Many of the standard library features already support concurrency
Golang’s Standard Library is well-suited for servers
• A simple server can be written in less than 10 lines of code
• Go makes JSON easily translatable into and out of data structures
• Go provides an intuitive templating engine
What
“go <function call>”
• This call spins up a goroutine that can run in parallel on any processor, or concurrently on a shared one
• Not the same as MPI as it is not made for distributed computing
Channels
• Channels allow you to pass data around between functions and routines
• Channels can have multiple access points, so you can have several routines “listening” on a channel and when data comes through, one will grab it and switch to being busy
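A minimal sketch of several routines listening on one channel, assuming a fixed number of workers and a stand-in task. The names sumSquares and square are hypothetical and chosen only for illustration. Each value sent on data is received by exactly one worker, which is busy until it sends its result and returns to listening.

```go
package main

import (
	"fmt"
	"sync"
)

// square is a stand-in for real work.
func square(n int) int { return n * n }

// sumSquares fans the inputs out to `workers` goroutines that all listen
// on one channel, then adds up whatever comes back on the results channel.
// It assumes workers >= 1.
func sumSquares(inputs []int, workers int) int {
	data := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each value on data is claimed by exactly one worker.
			for n := range data {
				results <- square(n)
			}
		}()
	}

	// Feed the workers, then close data so their range loops end.
	go func() {
		for _, n := range inputs {
			data <- n
		}
		close(data)
	}()

	// Close results once every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4, 5, 6}, 3)) // prints 91
}
```

Summing is used here because the order in which workers finish is not fixed, and a sum does not depend on that order.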
Channels
[Diagram: Main connected to Routine 1, Routine 2, and Routine 3 through the data and result channels]
1. Main makes 2 channels, data and result
2. Main makes routines and passes them both channels
3. Main sends data over the data channel and listens to the result channel
4. Routine 1 claims first piece of data, no longer listening
5. Routine 2 claims second piece of data, no longer listening
6. Routine 3 claims third piece of data, no longer listening
7. The routines finish and send their results back on the result channel
8. Main receives and processes the results
Example Code
package main

import "fmt"

// nToTheN takes one value n from data, computes n^n, and sends it on result.
func nToTheN(data chan int, result chan int) {
    n := <-data
    res := 1
    for i := 0; i < n; i++ {
        res *= n
    }
    result <- res
}

func main() {
    data := make(chan int, 3)
    result := make(chan int, 3)

    go nToTheN(data, result)
    go nToTheN(data, result)
    go nToTheN(data, result)

    for n := 1; n < 4; n++ {
        data <- n
    }

    fmt.Println(<-result)
    fmt.Println(<-result)
    fmt.Println(<-result)
}

Output (the order can vary with scheduling):
1
4
27
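Results come back on the result channel in whatever order the routines finish, so the printed order is not guaranteed. One way to recover input order, sketched here under the assumption that each input carries its position, is to send small structs instead of bare ints. The job and answer types and the powers function are hypothetical names, and nToTheN is reshaped into a plain function for this sketch.

```go
package main

import "fmt"

// job carries an input together with its position in the input slice.
type job struct {
	index int
	n     int
}

// answer carries a result together with the position it belongs to.
type answer struct {
	index int
	value int
}

// nToTheN computes n^n by repeated multiplication.
func nToTheN(n int) int {
	res := 1
	for i := 0; i < n; i++ {
		res *= n
	}
	return res
}

// worker claims one job and reports its answer with the same index.
func worker(jobs <-chan job, answers chan<- answer) {
	j := <-jobs
	answers <- answer{index: j.index, value: nToTheN(j.n)}
}

// powers starts one goroutine per input and places each answer
// at the index of the input that produced it.
func powers(inputs []int) []int {
	jobs := make(chan job, len(inputs))
	answers := make(chan answer, len(inputs))
	for range inputs {
		go worker(jobs, answers)
	}
	for i, n := range inputs {
		jobs <- job{index: i, n: n}
	}
	out := make([]int, len(inputs))
	for range inputs {
		a := <-answers
		out[a.index] = a.value
	}
	return out
}

func main() {
	fmt.Println(powers([]int{1, 2, 3})) // prints [1 4 27]
}
```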
LabVIEW
Why
LabVIEW automatically parallelizes your code
• If you put two loops next to each other, LabVIEW runs them in parallel
• You can force parallelism in some places like for loops
LabVIEW is built for instrumentation (particularly signal processing)
• You can read signals and process them at the same time without having to worry about delays or order
LabVIEW comes with a graphical interface builder
• By default, every subVI has a “Front Panel” that shows controls and views
What
Graph/Flowchart Based Programming
• Graphical programming where subVIs are connected by wires
• Independent parts of the graph/flowchart can thus be run simultaneously
Physical Hardware Integration
• LabVIEW provides FPGA, GPU (CUDA), etc. subVIs to use in your code
• You can literally “draw” the structure, like a systolic array (grid)
Front Panel
Examples (Basic Parallelism)
Execution of independent tasks
Execution of independent branches
Execution of parallel data operations
Example (Parallel loops and FPGA)
Example (Pipelining)
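LabVIEW diagrams are graphical and do not survive as text, so the following is only a textual analogy written in Go, not LabVIEW code. It sketches pipelining as a chain of stages connected by channels, where each stage works on a new sample while the next stage handles the previous one. The stage names acquire, scale, and collect, and the gain value, are hypothetical.

```go
package main

import "fmt"

// acquire stands in for reading samples from an instrument.
func acquire(samples []int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, s := range samples {
			out <- s
		}
	}()
	return out
}

// scale is a stand-in processing stage that multiplies each sample by gain.
// It runs in its own goroutine, so it overlaps with acquire.
func scale(in <-chan int, gain int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for s := range in {
			out <- s * gain
		}
	}()
	return out
}

// collect drains the last stage into a slice.
func collect(in <-chan int) []int {
	var out []int
	for s := range in {
		out = append(out, s)
	}
	return out
}

func main() {
	fmt.Println(collect(scale(acquire([]int{1, 2, 3, 4}), 10))) // prints [10 20 30 40]
}
```

Because each stage is a single goroutine reading from a single channel, samples leave the pipeline in the order they entered it.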
Questions?
What languages would you like to see next time?
Mobile/Tablet?
Cloud?
GPU?
Go deeper on these 3?