Computer Sciences Department University of Wisconsin

Operating Systems Should Manage Accelerators
Sankaralingam Panneerselvam
Michael M. Swift
Computer Sciences Department
University of Wisconsin, Madison, WI
HotPar 2012

Accelerators
• Specialized execution units for common tasks
  • Data parallel tasks, encryption, video encoding, XML parsing, network processing etc.
• Tasks offloaded to accelerators
  • High performance, energy efficient (or both)
• Examples: wire-speed processor, APU, GPU, DySER, crypto accelerator

Why accelerators?
• Moore’s law and Dennard scaling
  • Single core to multi-core
  • Dark silicon: more transistors but limited power budget [Esmaeilzadeh et al., ISCA ’11]
• Accelerators
  • Trade off area for specialized logic
  • Performance up by 250x and energy efficiency by 500x [Hameed et al., ISCA ’10]
• Heterogeneous architectures

Heterogeneity everywhere
• Focus on diverse heterogeneous systems
  • Processing elements with different properties in the same system
• Our work
  • How can the OS support this architecture?
    1. Abstract heterogeneity
    2. Enable flexible task execution
    3. Multiplex shared accelerators among applications

Outline
• Motivation
• Classification of accelerators
• Challenges
• Our model - Rinnegan
• Conclusion

IBM wire-speed processor
[Figure: chip layout showing CPU cores, accelerator complex, and memory controller]

Classification of accelerators
• Accelerators classified based on their accessibility
  • Acceleration devices
  • Co-Processor
  • Asymmetric cores

Classification

Properties | Acceleration devices | Co-Processor | Asymmetric cores
Systems | GPU, APU, Crypto accelerators in Sun Niagara chips etc. | C-cores, DySER, IBM wire-speed processor | NVIDIA’s Kal-El processor, Scalable cores like WiDGET
Location | Off/On-chip device | On-chip device | Regular core that can execute threads
Accessibility: Control | Device drivers | ISA instructions | Thread migration, RPC
Accessibility: Data | DMA or zero copy | System memory | Cache coherency
Resource contention | Multiple applications might want to make use of the device | Sharing of resource across multiple cores could result in contention | Multiple threads might want to access the special core

1) Application must deal with different classes of accelerators with different access mechanisms
2) Resources to be shared among multiple applications

Challenges
• Task invocation
  • Execution on different processing elements
• Virtualization
  • Sharing across multiple applications
• Scheduling
  • Time multiplexing the accelerators

Task invocation
• Current systems decide statically where to execute the task
• Programmer problems where the system can help
  • Data granularity
  • Power budget limitation
  • Presence/availability of accelerator
[Figure: AES vs. crypto accelerator]

Virtualization
• Data addressing for accelerators
  • Accelerators need to understand virtual addresses
  • OS must provide virtual-address translations
• Process preemption after launching a computation
  • System must be aware of computation completion
[Figure: CPU cores and accelerator units (crypto, XML) operating on virtual memory; example request "crypt 0x240"]
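
The addressing requirement above can be illustrated with a minimal sketch. It assumes a hypothetical per-process page table (`PageTable`) with 4 KiB pages that the OS consults on the accelerator's behalf. The class names, the page size, and the fault behavior are illustrative assumptions, not part of Rinnegan.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class TranslationFault(Exception):
    """Raised when a virtual page has no mapping (hypothetical)."""


class PageTable:
    """Hypothetical per-process table: virtual page number -> physical frame number."""

    def __init__(self):
        self._frames = {}

    def map(self, vpn, pfn):
        self._frames[vpn] = pfn

    def translate(self, vaddr):
        # Split the virtual address into a page number and an offset in that page.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self._frames:
            raise TranslationFault(hex(vaddr))
        return self._frames[vpn] * PAGE_SIZE + offset


def physical_ranges(table, vaddr, length):
    """Translate a virtual buffer into the (physical address, size) pieces
    an accelerator would need to access."""
    ranges = []
    end = vaddr + length
    while vaddr < end:
        # Neighbouring virtual pages need not be physically contiguous,
        # so translate one page-bounded piece at a time.
        chunk = min(end - vaddr, PAGE_SIZE - vaddr % PAGE_SIZE)
        ranges.append((table.translate(vaddr), chunk))
        vaddr += chunk
    return ranges


table = PageTable()
table.map(0, 5)  # virtual page 0 -> physical frame 5
table.map(1, 9)  # virtual page 1 -> physical frame 9
print(physical_ranges(table, 0x240, PAGE_SIZE))
```

A buffer that crosses a page boundary yields more than one physical range, which is one reason the accelerator cannot simply be handed a single address.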

Scheduling
• Scheduling needed to prioritize access to shared accelerators
• User-mode access to accelerators complicates scheduling decisions
  • OS cannot interpose on every request

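As a rough illustration of prioritizing access to a shared accelerator, the sketch below orders queued requests by priority. `AcceleratorQueue` is a hypothetical name, and the convention that a lower number means higher priority (with first-come, first-served order among equals) is an assumption of this example, not a policy stated in the talk.

```python
import heapq
import itertools


class AcceleratorQueue:
    """Hypothetical kernel-side queue ordering requests to one shared accelerator."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority level

    def submit(self, pid, priority, request):
        # Lower numeric priority is served first (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._seq), pid, request))

    def next_request(self):
        """Return (pid, request) for the next request to run, or None if idle."""
        if not self._heap:
            return None
        _, _, pid, request = heapq.heappop(self._heap)
        return pid, request


queue = AcceleratorQueue()
queue.submit(pid=101, priority=2, request="encrypt A")
queue.submit(pid=202, priority=0, request="encrypt B")
queue.submit(pid=303, priority=2, request="parse XML")
while (item := queue.next_request()) is not None:
    print(item)
```

Requests issued directly from user mode would never pass through such a queue, which is the complication the slide points out.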
Outline
• Motivation
• Classification of accelerators
• Challenges
• Our model - Rinnegan
• Conclusion

Rinnegan
• Goals
  • Applications leverage the heterogeneity
  • OS enforces system-wide policies
• Rinnegan tries to provide support for different classes of accelerators
  • Flexible task execution by accelerator stub
  • Accelerator agent for safe multiplexing
  • Enforcing system policy through accelerator monitor

Accelerator stub
• Entry point for accelerator invocation and a dispatcher
• Abstracts different processing elements
• Binding: selecting the best implementation for the task
  • Static: during compilation
  • Early: at process start
  • Late: at call
[Figure: a process making use of the accelerator calls the stub, which dispatches to either { Encrypt through AES instructions } or { Offload to crypto accelerator }]

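The difference between early and late binding can be sketched as follows. Everything here is hypothetical: `CryptStub`, the two placeholder implementations, the availability callback, and the 4096-byte offload threshold are assumptions made for illustration. Static binding happens at compile time and is not modeled.

```python
def encrypt_with_aes_instructions(data):
    # Placeholder standing in for an implementation using AES instructions.
    return ("aes-instructions", data)


def encrypt_with_crypto_accelerator(data):
    # Placeholder standing in for offloading to a crypto accelerator.
    return ("crypto-accelerator", data)


class CryptStub:
    """Hypothetical stub: one entry point that dispatches to an implementation."""

    def __init__(self, binding, accelerator_available, min_offload_bytes=4096):
        self.binding = binding                              # "early" or "late"
        self.accelerator_available = accelerator_available  # callable returning bool
        self.min_offload_bytes = min_offload_bytes          # assumed size threshold
        self._bound = None
        if binding == "early":
            # Early binding: choose once, at process start.
            self._bound = self._choose(self.min_offload_bytes)

    def _choose(self, size):
        if self.accelerator_available() and size >= self.min_offload_bytes:
            return encrypt_with_crypto_accelerator
        return encrypt_with_aes_instructions

    def __call__(self, data):
        # Late binding: choose again on every call, using current conditions.
        impl = self._bound if self.binding == "early" else self._choose(len(data))
        return impl(data)


stub = CryptStub("late", accelerator_available=lambda: True)
print(stub(b"x" * 8192)[0])
print(stub(b"x" * 16)[0])
```

With late binding the same stub can pick a different implementation for a large and a small input, whereas an early-bound stub keeps whatever it chose at start-up.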
Accelerator agent
• Manages an accelerator
• Roles of an agent
  • Virtualize accelerator
  • Forwarding requests to the accelerator
  • Implement scheduling decisions
  • Expose accounting information to the OS
[Figure: in user space, processes with stubs (one trying to offload a parallel task); in kernel space, accelerator agents (thread migration for an asymmetric core, device driver, resource allocator) above the hardware; a process can communicate directly with the accelerator]

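The agent roles listed above can be sketched in a few lines. `AcceleratorAgent` is a hypothetical name, the device is replaced by a plain callable, requests are forwarded in arrival order, and charging by payload size is an assumed cost model chosen only to make the accounting visible.

```python
from collections import defaultdict, deque


class AcceleratorAgent:
    """Hypothetical agent that multiplexes one accelerator among processes."""

    def __init__(self, run_on_hardware):
        self._run = run_on_hardware          # callable standing in for the device
        self._pending = deque()              # requests, forwarded in arrival order
        self._busy_units = defaultdict(int)  # accounting: work units per process

    def submit(self, pid, payload):
        # Processes only queue requests; the agent alone drives the device.
        self._pending.append((pid, payload))

    def run_next(self):
        """Forward one queued request to the accelerator and charge its owner."""
        if not self._pending:
            return None
        pid, payload = self._pending.popleft()
        result = self._run(payload)
        self._busy_units[pid] += len(payload)  # assumed cost model: payload size
        return pid, result

    def accounting(self):
        """Per-process usage, as the OS might read it."""
        return dict(self._busy_units)


agent = AcceleratorAgent(run_on_hardware=lambda data: data[::-1])
agent.submit(1, b"abcd")
agent.submit(2, b"xy")
agent.submit(1, b"ef")
while agent.run_next() is not None:
    pass
print(agent.accounting())
```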
Accelerator monitor
• Monitors information like utilization of resources
• Responsible for system-wide energy and performance goals
• Acts as an online modeling tool
  • Helps in deciding where to schedule the task
[Figure: a user-space process making use of the accelerator calls the stub; in kernel space, the accelerator monitor works alongside the accelerator agent, which sits above the accelerator hardware]

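One way to picture the monitor's role in placement is the sketch below. It is deliberately narrow: `AcceleratorMonitor` is a hypothetical name, it tracks only utilization (not energy), and the smoothing factor and "pick the least-utilized element" rule are assumptions rather than the model used in Rinnegan.

```python
class AcceleratorMonitor:
    """Hypothetical monitor tracking utilization per processing element."""

    def __init__(self, smoothing=0.5):
        self._smoothing = smoothing  # weight of the newest sample (assumed value)
        self._utilization = {}

    def report(self, element, utilization):
        # Exponentially weighted moving average of samples in the range [0, 1].
        old = self._utilization.get(element, utilization)
        a = self._smoothing
        self._utilization[element] = a * utilization + (1 - a) * old

    def utilization(self, element):
        return self._utilization.get(element, 0.0)

    def suggest(self, candidates):
        """Suggest the least-utilized candidate element for the next task."""
        return min(candidates, key=self.utilization)


monitor = AcceleratorMonitor()
monitor.report("crypto-accelerator", 0.9)
monitor.report("cpu", 0.2)
monitor.report("crypto-accelerator", 0.7)
print(monitor.suggest(["crypto-accelerator", "cpu"]))
```

A stub doing late binding could consult such a suggestion at call time instead of a fixed threshold.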
Programming model
• Based on task-level parallelism
• Task creation and execution
  • Write: different implementations
  • Launch: similar to function invocation
  • Fetching result: supporting synchronous/asynchronous invocation
• Example: Intel TBB/Cilk

application()
{
    task = cryptStub(inputData, outputData);  // launch; returns a task handle
    ...
    waitFor(task);                            // fetch the result
}

TaskHandle cryptStub(input, output)
{
    if (factor1 == true)
        output = implmn_AES(input);
    else if (factor2 == true)                 // or
        output = implmn_CACC(input);
}

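The launch and fetch steps of the pseudocode above can be made concrete with a small runnable sketch. It uses Python's standard `concurrent.futures` thread pool as a stand-in for the available processing elements, and a `Future` as the task handle. The names `crypt_stub`, `wait_for`, and `implmn_aes` mirror the slide but are hypothetical here, and the XOR "cipher" is only a placeholder, not real encryption.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the available processing elements (an assumption of this sketch).
_pool = ThreadPoolExecutor(max_workers=2)


def implmn_aes(data):
    # Placeholder for one implementation of the task (not real encryption).
    return bytes(b ^ 0x5A for b in data)


def crypt_stub(data):
    """Launch the task and return a handle (a Future) immediately."""
    return _pool.submit(implmn_aes, data)


def wait_for(task):
    """Block until the task completes and return its result."""
    return task.result()


# Asynchronous use: launch, do other work, then fetch the result.
task = crypt_stub(b"hello")
other_work = sum(range(10))
ciphertext = wait_for(task)

# Synchronous use: launch and wait in one step.
roundtrip = wait_for(crypt_stub(ciphertext))
print(roundtrip)
```

Because the placeholder transformation is its own inverse, applying it twice returns the original input, which makes the round trip easy to check.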
Summary
[Figure: in user space, a process making use of the accelerator calls the stub; in kernel space, the accelerator agent and accelerator monitor sit alongside other kernel components, above the CPU cores and the accelerator; a process can also communicate directly with the accelerator]

Related work
• Task invocation based on data granularity
  • Merge, Harmony
• Generating device-specific implementations of the same task
  • GPUOcelot
• Resource scheduling: sharing between multiple applications
  • PTask, Pegasus

Conclusions
• Accelerators common in future systems
• Rinnegan embraces accelerators through stub mechanism, agent and monitoring components
• Other challenges
  • Data movement between devices
  • Power measurement

Thank You