Progress Report For the Test Problem

FlowMap: An Optimal Technology Mapping
Algorithm for Delay Optimisation in Lookup-Table Based FPGA Designs
Presented by Qiwei Jin
13/07/2017
Overview
• The paper and the authors.
• Some background information.
• The algorithm in detail.
• Results and Conclusion.
• Questions for discussion.
About the Paper
• Originally published in 1992; won the IEEE Circuits and
Systems Society best paper award in 1994.
• 238 citations in total, 33 of them self-citations.
• The first polynomial-time algorithm to find a depth-optimal
LUT mapping for general Boolean networks, in contrast to
conventional library-based technology mapping, which is NP-hard.
• The algorithm is a key component in most commercial
FPGA compilers.
• FlowMap-r and other more sophisticated algorithms were
published by the authors in the same year or later for
both depth and area minimisation.
Jason Cong
• Chairman of the Computer Science
Department, UCLA.
• Was an Assistant Professor in 1994
when this paper was published.
• Was promoted to Associate
Professor in the same year.
• His company Aplus was acquired by
Magma in 2004 for "$13 million in
stock, cash and incentives".
Picture borrowed from Jason Cong’s homepage
Yuzheng Ding
• Keeps a very low profile: no picture,
no home page, not even on Facebook.
• Was a research assistant at UCLA, working
towards a PhD, when this paper was published.
• May have left the university for industry
(Mentor Graphics) after graduation.
• Still working actively with Jason Cong;
latest joint paper published in 2008.
Background
• FPGA (Field-Programmable Gate Array):
Programmable hardware.
Xilinx Virtex 5 FPGA
Background Cont.
For more information, see Wayne Luk’s Custom Computing course
Background Cont.
• FPGAs are essentially a collection of wires and
LUTs (Look-Up Tables) that can be configured
to emulate the behaviour of a digital circuit.
• FPGA designs are described in a Hardware
Description Language (HDL), such as VHDL.
• From the HDL, a netlist of LUTs can be generated
automatically by a mapping algorithm (FlowMap!).
Background Cont.
ASIC gate = 4-Input 1-Output LUT (16 entries in total)
Addr.   Value
0000    0
0001    0
...     ...
1111    1
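A LUT can be pictured as a small truth table addressed by its input bits. The sketch below is illustrative only: it assumes the gate in the figure is a 4-input AND (consistent with the entries shown, although the figure itself is not reproduced here), and the helper names make_lut and lut_read are hypothetical.

```python
def make_lut(func, k=4):
    """Tabulate a k-input Boolean function into 2**k one-bit entries."""
    table = []
    for addr in range(2 ** k):
        # The most significant address bit is treated as the first input.
        bits = [(addr >> (k - 1 - i)) & 1 for i in range(k)]
        table.append(func(*bits))
    return table


def lut_read(table, bits):
    """Look up the output for a sequence of input bits."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return table[addr]


# Assumed example gate: a 4-input AND.
and4 = make_lut(lambda a, b, c, d: a & b & c & d)
```

Any 4-input function fits in the same 16 entries, which is why the mapping only needs to care about how many inputs a LUT has, not which function it implements.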
Background Cont.
• Mappings from an ASIC to FPGAs are not
necessarily one-to-one.
• The question is how to find the optimal
mapping.
Background Cont.
• Trade-offs:
– Area (number of LUTs used)
– Depth (delay of the circuit)
• FlowMap focuses on depth optimisation
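To make the two metrics concrete, here is a small sketch that measures area and depth of a LUT network. The representation (a dict from each LUT output to its list of inputs) and both example mappings are assumptions made for illustration.

```python
def area(luts):
    """Area: the number of LUTs used."""
    return len(luts)


def depth(luts):
    """Depth: the largest number of LUTs on any input-to-output path."""
    memo = {}

    def level(node):
        if node not in luts:          # primary input
            return 0
        if node not in memo:
            memo[node] = 1 + max(level(u) for u in luts[node])
        return memo[node]

    return max(level(node) for node in luts)


# Two hypothetical mappings of the same 3-input function.
one_gate_per_lut = {"g1": ["a", "b"], "g2": ["b", "c"], "g3": ["g1", "g2"]}
packed = {"g3": ["a", "b", "c"]}
```

Here packing the three gates into one 3-input LUT improves both metrics; in general the two goals can pull in different directions.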
Depth Minimisation Example
The Key Idea of Depth Minimisation
• Try to pack as many gates in different levels into a LUT
as possible.
• The number of LUTs used (area) is not the primary concern.
• Conventional library-based mapping of a general network
is equivalent to generating optimal code for expressions
containing common subexpressions, hence NP-hard.*
• Conventional methods decompose the Boolean
network into a forest of trees before processing.
• FlowMap can find a depth-optimal mapping directly from a
Boolean network in polynomial time.
Let’s see how.
* A. Aho, S. C. Johnson, “Optimal Code Generation for
Expression Trees”, Journal of the ACM, 23, 3, 488-501 (1976).
Preliminaries
• Input(T)
• Cut (X, X̄)
• Node Cut Size
• Edge Capacity
• Edge Cut Size
• Whether a cut is K-feasible
• Height of a Cut
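The terms above can be sketched in a few lines, assuming the usual definitions from the paper: input(T) is the set of nodes outside T that feed a node in T; a cut (X, X̄) partitions the network so that the inputs lie in X and the root lies in X̄; the node cut size counts nodes in X adjacent to X̄; a cut is K-feasible if its node cut size is at most K; and the height of a cut is the largest label in X. The example network, the unit edge capacity and all function names are hypothetical, and the auxiliary source node is left out for brevity.

```python
def input_nodes(fanins, T):
    """input(T): nodes outside T that feed at least one node in T."""
    return {u for v in T for u in fanins[v] if u not in T}


def node_cut_size(fanins, x_bar):
    """Number of nodes in X that are adjacent to some node in X-bar."""
    return len(input_nodes(fanins, x_bar))


def edge_cut_size(fanins, x_bar):
    """Number of edges from X to X-bar (unit edge capacity assumed)."""
    return sum(1 for v in x_bar for u in fanins[v] if u not in x_bar)


def is_k_feasible(fanins, x_bar, k):
    """A cut is K-feasible if its node cut size is at most K."""
    return node_cut_size(fanins, x_bar) <= k


def cut_height(label, x):
    """Height of a cut: the largest label among the nodes in X."""
    return max(label[v] for v in x)


# Hypothetical example network: fanins[v] lists the nodes driving v.
fanins = {"a": [], "b": [], "c": [],
          "g1": ["a", "b"], "g2": ["b", "c"], "g3": ["g1", "g2"]}
x_bar = {"g1", "g2", "g3"}          # this cut places all gates in X-bar
x = set(fanins) - x_bar             # X holds the primary inputs
label = {"a": 0, "b": 0, "c": 0}    # primary inputs have label 0
```

In this example the node cut size is 3 but the edge cut size is 4, because input b feeds two gates; the LUT input count depends on the former.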
The FlowMap Algorithm
• 2 Phases
– Node Labelling: compute, for each node t, the optimal
depth of a LUT mapping solution for Nt.
– LUT Mapping: generate the LUT network based on
the labelling from the first phase.
Phase 1: Node Labelling
Phase 1: Node Labelling Cont.
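A rough sketch of the labelling phase, as I understand it from the paper: for each node t in topological order, let p be the largest label among its fanins, collapse t and all label-p nodes in Nt into one sink, and test with max-flow whether a cut of node cut size at most K exists. If it does, label(t) = p; otherwise label(t) = p + 1. This is not the authors' code: the network, the function names and the dict-based flow network are assumptions, and the dict is assumed to list nodes in topological order.

```python
from collections import deque

INF = float("inf")


def cone(fanins, t):
    """N_t: node t together with all of its predecessors."""
    seen, stack = {t}, [t]
    while stack:
        for u in fanins[stack.pop()]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen


def min_height_cut(fanins, nodes, sink_set, k):
    """Return X-bar of a cut with node cut size <= k, or None."""
    cap = {}

    def add_edge(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)              # residual (reverse) edge

    def tail(v):                             # where edges leaving v start
        return "T" if v in sink_set else (v, "out")

    def head(v):                             # where edges entering v end
        return "T" if v in sink_set else (v, "in")

    for v in nodes:
        if v not in sink_set:
            add_edge((v, "in"), (v, "out"), 1)   # node capacity 1
        if not fanins[v]:                        # primary input
            add_edge("S", head(v), INF)
        for u in fanins[v]:
            if tail(u) != head(v):               # drop edges inside the sink
                add_edge(tail(u), head(v), INF)

    for _ in range(k + 1):
        parent = {"S": None}                     # breadth-first search
        queue = deque(["S"])
        while queue and "T" not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if "T" not in parent:                    # max flow <= k: cut found
            return {v for v in nodes if head(v) not in parent}
        v = "T"
        while parent[v] is not None:             # push one unit of flow
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
    return None                                  # flow exceeds k: no such cut


def flowmap_labels(fanins, k):
    """Label every node; fanins is assumed to be in topological order."""
    label, xbar = {}, {}
    for t in fanins:
        if not fanins[t]:
            label[t] = 0                         # primary input
            continue
        nodes = cone(fanins, t)
        p = max(label[u] for u in fanins[t])
        sink = {v for v in nodes if label.get(v) == p} | {t}
        cut = min_height_cut(fanins, nodes, sink, k)
        if cut is not None:
            label[t], xbar[t] = p, cut
        else:
            label[t], xbar[t] = p + 1, {t}
    return label, xbar


# Hypothetical 5-gate network, mapped with K = 3.
fanins = {"a": [], "b": [], "c": [], "d": [], "e": [],
          "g1": ["a", "b"], "g2": ["b", "c"], "g3": ["g1", "g2"],
          "g4": ["d", "e"], "g5": ["g3", "g4"]}
label, xbar = flowmap_labels(fanins, 3)
```

In this sketch each search for an augmenting path is a single breadth-first pass over the edges, and the loop gives up after K + 1 paths, which is one way to read the O(Km) bound for finding a minimum height K-feasible cut.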
Phase 2: LUT Mapping
Phase 2: LUT Mapping Cont.
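A minimal sketch of the mapping phase, assuming the labelling phase has already produced a set X̄ for every gate: start from the primary outputs, turn the X̄ of each pending node into one LUT, and then queue that LUT's inputs unless they are primary inputs. The network and the cuts are hard-coded, hypothetical values for K = 3, and the function name is my own.

```python
def generate_luts(fanins, xbar, outputs):
    """Return {LUT output node: sorted list of LUT inputs}."""
    luts = {}
    pending = list(outputs)
    while pending:
        t = pending.pop()
        if t in luts or not fanins[t]:       # already mapped, or primary input
            continue
        cover = xbar[t]                      # gates packed into this LUT
        ins = {u for v in cover for u in fanins[v] if u not in cover}
        luts[t] = sorted(ins)
        pending.extend(ins)                  # their LUTs are generated next
    return luts


# Hypothetical network and cuts, as the labelling phase might produce them.
fanins = {"a": [], "b": [], "c": [], "d": [], "e": [],
          "g1": ["a", "b"], "g2": ["b", "c"], "g3": ["g1", "g2"],
          "g4": ["d", "e"], "g5": ["g3", "g4"]}
xbar = {"g1": {"g1"}, "g2": {"g2"}, "g3": {"g1", "g2", "g3"},
        "g4": {"g4"}, "g5": {"g5"}}
luts = generate_luts(fanins, xbar, ["g5"])
```

Gates g1 and g2 never get LUTs of their own here because they are covered by the LUT rooted at g3, so five gates map to three LUTs with depth 2.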
The FlowMap Pseudocode
Enhancements
• Maximising Cut Volume During Mapping
• Postprocessing (flow-pack) operations to
reduce the number of K-LUTs
Results
Conclusion
• The paper presents the first polynomial-time
algorithm for depth-optimal LUT technology
mapping.
• Compared to other algorithms, FlowMap is
able to reduce the depth of the LUT network by up
to 7% and the number of LUTs by up to 50%.
Questions
• It is claimed that a minimum height K-feasible
cut can be found in O(Km) time, where K is the
number of inputs of a LUT and m is the number
of edges in the network.
• But it is not clear to me how it is derived.
Questions Cont.
• It would be interesting to see the time taken
to compute the mapping for the test cases
with FlowMap vs. other algorithms.
• The test cases are generally small in size; it
would be more convincing to see some larger
examples.