High-Level Synthesis for FPGAs:
Challenges and Opportunities
Jason H. Anderson
BOF Session – FPL 2014
1 September 2014, Munich
Dept. of Electrical and Computer Engineering
University of Toronto
Opportunities
Hardware’s Potential
• Implementing computations in FPGA hardware can
have speed/energy advantages over software:
– Lithography simulation: 15X speed-up [Cong & Zou, TRETS’09]
– Linear system solver: 2.2X speed-up, 5X more energy efficient
[Zhang, Betz, Rose, TRETS’12]
– Monte Carlo simulation for photodynamic therapy: 80X faster,
45X more energy efficient [Lo et al., J. Biomed Optics’09]
– Options pricing: 4.6X faster, 25X more energy efficient
[Tse, Thomas, Luk, TVLSI’12]
We Are in the Accelerator Era
[Figure: processor clock frequency trend]
HLS has a role in accelerator design:
1) Raising design productivity for HW engineers
2) Allowing SW engineers to build accelerators
Source: ISSCC 2014 Technology Trends
HLS Enables Use of FPGAs as
Computing Platforms
• Computing with better
– 1) speed
– 2) energy efficiency
• Accessible to software engineers
10X as many software engineers as HW engineers
(US Bureau of Labor Statistics)
FPGAs as computing platforms in
data centres beside x86 machines
Challenges
Quality of the Hardware
• Performance of HLS-generated circuits not
as good as human-designed circuits
• However, HLS-generated circuits are
already better than SW in many cases
Hard to Auto-Synthesize
Syntactic Variance / Constraints
• HLS tool quality of results (QoR) is highly sensitive to the style of the
input code and to constraints
• LegUp HLS example:
for (i = 0; i < 100; i++) {
    if (A[i] & 1)
        sum += A[i];
    else
        sum -= A[i];
}
Cannot loop pipeline
for (i = 0; i < 100; i++) {
    temp1 = sum + A[i];
    temp2 = sum - A[i];
    sum = (A[i] & 1) ? temp1 : temp2;
}
Can loop pipeline
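The two loop forms above compute the same value; only the coding style differs. A minimal C sketch, assuming an int array of 100 elements, that wraps each form in a function and checks that they agree. The function names, the test data, and the helper check_equivalence are illustrative assumptions, not part of LegUp. The second form replaces the if/else with a select, so the loop body is straight-line code; that is presumably what makes it easier for the tool to pipeline.

```c
#include <assert.h>

#define N 100

/* Branching form: the update of sum sits under an if/else. */
int sum_branching(const int A[N]) {
    int sum = 0;
    for (int i = 0; i < N; i++) {
        if (A[i] & 1)
            sum += A[i];
        else
            sum -= A[i];
    }
    return sum;
}

/* Select form: both candidates are computed, then one is chosen. */
int sum_select(const int A[N]) {
    int sum = 0;
    for (int i = 0; i < N; i++) {
        int temp1 = sum + A[i];
        int temp2 = sum - A[i];
        sum = (A[i] & 1) ? temp1 : temp2;
    }
    return sum;
}

/* Hypothetical helper: fills A with 0..N-1, checks that both
   forms agree, and returns the common result. */
int check_equivalence(void) {
    int A[N];
    for (int i = 0; i < N; i++)
        A[i] = i;
    int a = sum_branching(A);
    int b = sum_select(A);
    assert(a == b);
    return a;
}
```

With A[i] = i, odd elements are added (2500 in total) and even elements are subtracted (2450 in total), so check_equivalence() returns 50 for both forms.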
Syntactic Variance / Constraints (2)
Matai et al., “Designing a Hardware in the Loop Wireless Digital Channel Emulator for
Software Defined Radio”, FPT 2012.
Debugging
• Invariably… things go wrong, e.g.:
– Integration of synthesized HW in system
– Silicon issues: timing, reliability (SEUs)
• Today’s HLS:
Visualization
• Today’s HLS:
– HLS is a “black box”
– Output: tens to hundreds of thousands of lines of HDL code
Visualization (2)
“SW-engineer comprehensible”
HW visualization capabilities are needed
that guide HW optimization
Operating System-Like
Functionality
• Manage execution of heterogeneous
x86/FPGA accelerator applications
– Ability to “swap in/out” accelerators in a similar
manner to processes on an OS
– Connects with partial reconfiguration
• Memory management between x86/FPGA
– SW API
– Share/transfer memory
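To make the "SW API" and "share/transfer memory" bullets concrete, here is a purely hypothetical sketch of what a host-side memory API could look like. Every name in it (accel_buf, accel_alloc, accel_write, accel_read, accel_free) is invented for illustration and does not correspond to any real runtime; the "accelerator memory" is simulated with ordinary host memory so the sketch can be compiled and run anywhere.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL API: none of these names come from a real runtime. */
typedef struct {
    void  *dev;   /* stands in for accelerator-side memory */
    size_t size;  /* capacity in bytes */
} accel_buf;

/* Reserve a buffer on the (simulated) accelerator. */
int accel_alloc(accel_buf *b, size_t size) {
    b->dev = malloc(size);
    b->size = b->dev ? size : 0;
    return b->dev ? 0 : -1;
}

/* Transfer host -> accelerator; rejects writes past the buffer. */
int accel_write(accel_buf *b, const void *src, size_t n) {
    if (n > b->size) return -1;
    memcpy(b->dev, src, n);
    return 0;
}

/* Transfer accelerator -> host; rejects reads past the buffer. */
int accel_read(const accel_buf *b, void *dst, size_t n) {
    if (n > b->size) return -1;
    memcpy(dst, b->dev, n);
    return 0;
}

/* Release the buffer. */
void accel_free(accel_buf *b) {
    free(b->dev);
    b->dev = NULL;
    b->size = 0;
}

/* Round trip: host -> "accelerator" -> host; returns 0 on success. */
int accel_roundtrip_demo(void) {
    int in[4] = {1, 2, 3, 4}, out[4] = {0, 0, 0, 0};
    accel_buf b;
    if (accel_alloc(&b, sizeof in) != 0) return -1;
    int rc = accel_write(&b, in, sizeof in);
    if (rc == 0) rc = accel_read(&b, out, sizeof out);
    accel_free(&b);
    assert(rc != 0 || memcmp(in, out, sizeof in) == 0);
    return rc;
}
```

A real x86/FPGA runtime would replace the memcpy calls with DMA transfers or a shared address space; the explicit alloc/write/read/free split is just one possible design.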
Common Benchmarking
• No accepted benchmark suite for HLS
– CHStone circuits don’t stress capabilities of
modern tools
• No accepted benchmarking methodology:
– Push button?
– Constraints, pragmas?
• “Insecurity” among HLS commercial vendors
– Vendors do not permit results to be published
Summary
• Huge potential for HLS:
– Design productivity
– Accessibility to SW engineers
– Power/performance improvements
• Still some challenges to widespread adoption
– Many research opportunities
• The time is right for HLS. I am optimistic about its future.