Using Vivado HLS to Implement an 8K4K Scaler
© Copyright 2013 Xilinx
Introduction
HLS makes complicated video processing easier to implement; it is a
productive design tool for this class of applications.
The purpose of this design is to use HLS to build an 8K4K scaler.
This demonstration shows that HLS can be used to build an algorithm
quickly and verify it on hardware easily, saving much effort.
HLS is both powerful and flexible.
Technology Roadmap
In the digital display market, the next wave of innovation, SHV (Super Hi-Vision)
8K4K, is now emerging.
[Figure: display technology roadmap. Items shown: FHD 60Hz; FHD 240Hz; 3D (Xpol) 60Hz; 3D (R/L) 240Hz; 4Kx2K 60Hz; 4Kx2K 120Hz; 4K2K 3D (R/L) 120Hz; 8Kx4K 60Hz; Super High Vision public viewing; 3D without glasses (9 view points) 60Hz; 3D without glasses (50 view points) 60Hz, prototype, smooth motion parallax; 3D without glasses (100 view points) 60Hz, prototype, natural 3D without fatigue.]
Source: NEDO Strategic Technology Road Map 2009, NHK R&D Open House 2010, Comp. by Xilinx KK
8Kx4K 60Hz: Super High Vision, 21G band satellite test transmission
1080p to 8K4K Up-Converter
A single 1080p input is scaled to populate an 8K4K display.
[Figure: 1080p source -> HDMI -> Xilinx 7 Series -> V-by-One® HS]
Up-converters are mandatory in 8K4K displays
Xilinx Scaler IP Status
One key technology in both SHV and UHD is the scaler.
Unfortunately, the current Xilinx Scaler IP only supports 4K2K processing,
and there is not yet a suitable solution for 8K4K.
HLS is an efficient tool to implement various Image & Video Processing Algorithms
C, C++ or SystemC
Image Processing
• Defective Pixel Correction
• Color Filter Array Interp
• Color Correction Matrix
• Gamma Correction
• Color Space Conversion
• Statistics Module
• App example for AWB
• Noise Reduction
• Edge Enhancement
VHDL or Verilog
Video Processing
• Video Scaler
• On-Screen-Display
• Motion Adaptive Noise Reduction
• Image Characterization
Compression
• H.264/MPEG-4
• MPEG-2
• MPEG-4
• JPEG
• JPEG-2000
• Scalable Video Coding
• Multi-View Coding
• Dolby
• MJPEG
Memory and timing
• Timing Controller
• AXI Interconnect
• AXI Video DMA
Key Image/Video processing IPs
Connectivity
• 10G/1G Ethernet
• SDI
• Ethernet AVB
• Display port
• HDMI/DVI
• Camera Link
Conceptual Design
[Figure: Input Image -> Scaler (Vivado HLS design) -> Output Image]
Design Specification
– FPGA: Kintex7 / 7Z045
– Clock Frequency: > 150 MHz
– Input Data Rate: 1 pixel per clock cycle
– Input Image Size: 1920 x 1080 @ 60 frames per second
– Output Image Size: 8K x 4K @ 60 frames per second
Scaler Algorithm Introduction
Lanczos resampling algorithm.
– Video scaling is a form of 2D filter operation, which can be approximated with the
equation shown below.
– This equation is usually separated into two 1-D filter stages in sequence: a
vertical filter stage and a horizontal filter stage:
V-filter:
H-filter:
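The equations themselves are not reproduced in this text. As a hedged reconstruction, the following sketch writes them in the notation of the C model in this deck. The symbol names (F_i, F_o, T, v, h) and the assumption that the 2D kernel is separable, coef(m,n) = v(m) h(n), are illustrative and not taken from the original slide.

```latex
% Assumed notation: F_i / F_o are the input / output frames, T is the
% intermediate (vertically filtered) frame, r_i = floor(r_o / 4) and
% c_i = floor(c_o / 4) for the 4x up-conversion.
\begin{align}
F_o(r_o, c_o) &= \sum_{m=0}^{V_{taps}-1} \sum_{n=0}^{H_{taps}-1}
    F_i\!\left(r_i + \tfrac{V_{taps}}{2} - m,\; c_i + \tfrac{H_{taps}}{2} - n\right)\, coef(m, n)
    && \text{(2D filter)} \\
T(r_o, c_i) &= \sum_{m=0}^{V_{taps}-1}
    F_i\!\left(r_i + \tfrac{V_{taps}}{2} - m,\; c_i\right)\, v(m)
    && \text{(V-filter)} \\
F_o(r_o, c_o) &= \sum_{n=0}^{H_{taps}-1}
    T\!\left(r_o,\; c_i + \tfrac{H_{taps}}{2} - n\right)\, h(n)
    && \text{(H-filter)}
\end{align}
```

Substituting the V-filter into the H-filter gives back the 2D form whenever coef(m,n) = v(m) h(n), which is why the two 1-D stages can replace the single 2D stage.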
In this Application
In this Design.
– Vtaps = Htaps = 4
– Coefs:
– TDP: ZC706
– Target Frequency: 250 MHz
C Model
C Raw Model:
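/* Output frame: 4320 rows x 7680 columns; input frame: 1080 rows x 1920 columns. */
for (row_o = 0; row_o < 4320; row_o++) {
  for (col_o = 0; col_o < 7680; col_o++) {
    row_i = row_o / 4;
    col_i = col_o / 4;
    acc = 0;
    for (m = 0; m < 4; m++) {
      for (n = 0; n < 4; n++) {
        /* clamp(v, lo, hi) is a hypothetical helper that keeps the
           4x4 window inside the input frame at the borders */
        r = clamp(row_i + 2 - m, 0, 1079);
        c = clamp(col_i + 2 - n, 0, 1919);
        acc += frame_i[r][c] * coef[m][n];
      }
    }
    frame_o[row_o][col_o] = acc;
  }
}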
The C model can be used for simulation, but may not be suitable for
hardware implementation in this case.
– It is only suited to a CPU-cache-memory architecture, because it assumes
random access to the whole input frame.
Expected Scaler HW Structure using HLS
Description.
– read_buffer(): passes the input to the output unchanged; reserved for future algorithm upgrades.
– frame_tmp1 & frame_tmp2 FIFOs: used as data buffers
Vivado HLS Project
C Code: Top level function
An entire image is passed into the function as an array
– Clean, intuitive code
– Designer works at the image level
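The original top-level code is not reproduced in this text. The sketch below shows what an image-level top function with the read_buffer / frame_tmp1 / frame_tmp2 structure could look like. It is a sketch under stated assumptions: the names scaler_top and h_scaler, the pixel_t type, and the shrunken frame size are hypothetical, and the two scaler stages only repeat pixels as placeholders for the real 4-tap filters. DATAFLOW and STREAM are Vivado HLS pragmas; an ordinary C compiler ignores them.

```c
#include <stdint.h>

#define SCALE    4
#define IN_ROWS  6                  /* 1080 in the design; shrunk so the sketch simulates quickly */
#define IN_COLS  8                  /* 1920 in the design */
#define OUT_ROWS (IN_ROWS * SCALE)
#define OUT_COLS (IN_COLS * SCALE)

typedef uint32_t pixel_t;           /* assumption: one 24-bit pixel carried in a 32-bit word */

/* read_buffer(): output = input, as described in the deck. */
static void read_buffer(pixel_t in[IN_ROWS][IN_COLS], pixel_t out[IN_ROWS][IN_COLS])
{
    for (int r = 0; r < IN_ROWS; r++)
        for (int c = 0; c < IN_COLS; c++)
            out[r][c] = in[r][c];
}

/* Placeholder vertical stage: repeats each line SCALE times instead of filtering.
   Every input pixel is read exactly once, in raster order, into a line buffer. */
static void v_scaler(pixel_t in[IN_ROWS][IN_COLS], pixel_t out[OUT_ROWS][IN_COLS])
{
    pixel_t line[IN_COLS];
    for (int r = 0; r < IN_ROWS; r++) {
        for (int c = 0; c < IN_COLS; c++)
            line[c] = in[r][c];
        for (int p = 0; p < SCALE; p++)
            for (int c = 0; c < IN_COLS; c++)
                out[r * SCALE + p][c] = line[c];
    }
}

/* Placeholder horizontal stage: repeats each pixel SCALE times instead of filtering. */
static void h_scaler(pixel_t in[OUT_ROWS][IN_COLS], pixel_t out[OUT_ROWS][OUT_COLS])
{
    for (int r = 0; r < OUT_ROWS; r++)
        for (int c = 0; c < IN_COLS; c++) {
            pixel_t px = in[r][c];
            for (int p = 0; p < SCALE; p++)
                out[r][c * SCALE + p] = px;
        }
}

/* Top level: whole images are passed as arrays; the stages are chained through
   intermediate arrays that the tool is asked to implement as FIFOs. */
void scaler_top(pixel_t frame_i[IN_ROWS][IN_COLS], pixel_t frame_o[OUT_ROWS][OUT_COLS])
{
#pragma HLS DATAFLOW
    pixel_t frame_tmp1[IN_ROWS][IN_COLS];
    pixel_t frame_tmp2[OUT_ROWS][IN_COLS];
#pragma HLS STREAM variable=frame_tmp1
#pragma HLS STREAM variable=frame_tmp2

    read_buffer(frame_i, frame_tmp1);
    v_scaler(frame_tmp1, frame_tmp2);
    h_scaler(frame_tmp2, frame_o);
}
```

Each stage reads its input once and writes its output once in raster order, which is what lets the intermediate arrays behave as FIFOs rather than full frame stores.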
v_scaler Implementation
v_scaler() uses 4 line buffers
– 1920 x 24 bits per line buffer
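The v_scaler() source is not reproduced in this text. The following is a minimal sketch of a 4-tap vertical filter built on 4 line buffers, assuming a single 8-bit channel instead of 24-bit pixels, Q8 fixed-point coefficients indexed by output phase, border handling by row repetition, and a shrunken frame. All of these are illustrative assumptions, not details of the original design.

```c
#include <stdint.h>

#define TAPS     4
#define SCALE    4
#define IN_ROWS  6      /* 1080 in the design; shrunk for quick simulation */
#define IN_COLS  8      /* 1920 in the design */
#define COEF_ONE 256    /* assumption: Q8 fixed point, 256 represents 1.0 */

static uint8_t saturate_u8(int32_t v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* linebuf[0] holds the newest input row, linebuf[TAPS-1] the oldest, so
   linebuf[m] corresponds to input row (row_i + 2 - m) of the C model. */
void v_scaler(uint8_t in[IN_ROWS][IN_COLS],
              uint8_t out[IN_ROWS * SCALE][IN_COLS],
              int16_t coef[SCALE][TAPS])
{
    uint8_t linebuf[TAPS][IN_COLS];

    /* Two extra iterations flush the last rows out of the line buffers. */
    for (int r = 0; r < IN_ROWS + TAPS / 2; r++) {
        for (int c = 0; c < IN_COLS; c++) {
            /* Read each input pixel exactly once; past the last row, repeat it. */
            uint8_t px = (r < IN_ROWS) ? in[r][c] : linebuf[0][c];
            /* Shift the column down by one row; on the first row, fill every
               buffer with that row so the top border is repeated as well. */
            for (int t = TAPS - 1; t > 0; t--)
                linebuf[t][c] = (r == 0) ? px : linebuf[t - 1][c];
            linebuf[0][c] = px;
        }
        if (r < TAPS / 2)
            continue;                       /* window not yet centered on row 0 */

        int row_i = r - TAPS / 2;           /* input row at the window center */
        for (int p = 0; p < SCALE; p++) {
            for (int c = 0; c < IN_COLS; c++) {
                int32_t acc = 0;
                for (int t = 0; t < TAPS; t++)
                    acc += (int32_t)coef[p][t] * linebuf[t][c];
                out[row_i * SCALE + p][c] = saturate_u8((acc + COEF_ONE / 2) / COEF_ONE);
            }
        }
    }
}
```

With this arrangement only four lines of the input are held on chip at any time, instead of a full frame, which is the property that makes the stage suitable for streaming hardware.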
Top-level function RTL ports
The interface of the RTL top level can be easily configured in HLS
Resources vs. Xilinx Scaler IP
Resource utilization of the HLS design (based on 7Z045)
Resource utilization of the Xilinx Scaler IP (K7)
The resource utilization of the HLS design is very impressive!
Timing
Timing Summary
– XC7Z045-1FFG900C
– create_clock -period 4.0 -name sysClk [get_ports ap_clk]
– After implementation in Vivado 2013.02
Xilinx IP
Clock speed after implementation: > 250 MHz. Amazing!
ModelSim RTL simulation
Simulation Environment
– Uses 1920x1080.bmp as input.
– Writes the result to 7680x4320.bmp.
– Simulating one frame is enough.
Test Image: test_1920x1080.bmp
RTL Simulation Result: result_7680x4320.bmp
Simulation Results and Waveform Snapshot
Simulation Results
– Input 1080p60: 16.7 ms per frame.
– Output 8K4K: 133 ms per frame, that is about 7.5 frames/s.
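The 133 ms figure is consistent with a single scaler emitting one output pixel per clock at the 250 MHz target. The helper below is a small sketch of that arithmetic; the function name is hypothetical, and blanking intervals are assumed to be negligible.

```c
#include <stdint.h>

/* Time to emit one frame, in milliseconds, ignoring blanking intervals. */
double frame_time_ms(uint32_t width, uint32_t height,
                     double clock_hz, double pixels_per_clock)
{
    double pixels = (double)width * (double)height;
    return 1000.0 * pixels / (clock_hz * pixels_per_clock);
}
```

For a 7680 x 4320 output at 250 MHz and one pixel per clock, this evaluates to 33,177,600 / 250,000,000 s, or about 132.7 ms per frame.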
HW Implementation
– The number of scalers used can be reduced to only 8 (each running at 250 MHz).
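The count of 8 can be checked against the required output pixel rate. The sketch below assumes each scaler produces one output pixel per clock and that blanking is ignored; the function name is hypothetical.

```c
#include <stdint.h>

/* Number of parallel scalers needed to sustain the output pixel rate,
   assuming one output pixel per clock per scaler (rounded up). */
uint32_t scalers_needed(uint32_t width, uint32_t height,
                        uint32_t fps, uint64_t clock_hz)
{
    uint64_t pixels_per_second = (uint64_t)width * height * fps;
    return (uint32_t)((pixels_per_second + clock_hz - 1) / clock_hz);
}
```

For 7680 x 4320 at 60 frames per second the output rate is 1,990,656,000 pixels/s, so 250 MHz gives ceil(7.96) = 8 scalers, while the same calculation at 150 MHz gives 14.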
Closing & Discussion
HLS is well suited to many video and image processing
algorithms.
There are many other candidate algorithms, such as:
– Noise Reduction
– Image Rotation, Distortion
– Edge Enhancement
– Image Compression such as JPEG, JPEG-LS
……
The 8K4K scaler should not be an end, but a beginning
Reference
ug871-vivado-high-level-synthesis-tutorial
xapp890-zynq-sobel-vivado-hls
xapp793-memory-structures-video-vivado-hls
Thank You!
XILINX CONFIDENTIAL