National Taiwan University
College of Electrical Engineering and Computer Science
Department of Computer Science and Information Engineering
Master Thesis

分散式影像編碼在手機上的實現與有效率的回饋通道
Distributed Video System Realized on Mobile Device with Efficient Feedback Channel

陳群元
Chun-Yuan Chen

Advisor: Ja-Ling Wu, Ph.D.

July 2012 (ROC year 101)
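Acknowledgements
I owe the completion of this thesis to the help of many people; on my own I could never have brought this research to its present state. First, I thank my senior labmate 沈允中. Without his help I could not have grasped the essence of distributed coding in so short a time. He guided me through papers and books to the knowledge I needed and led me toward the scholarly world. My senior 蘇則仲 was also generous with his advice whenever I had questions.
Along the way, I am very grateful for the strong support of my seniors 謝致仁, 林映孜, 胡敏君, 鄭文皇, 林裕訓, 賴瑞欣 and 黃俊翔.
Of course, the person who influenced me most is my advisor, Dr. Ja-Ling Wu (吳家麟). Whenever I felt lost in my studies, he was like a beam of light that guided me toward the next peak.
What moved me most is how many good friends accompanied me. Through every stretch of hard research, the members of the DSP group, 際巧, 明宏, 奕婷, 宇蓓, 志霖 and many other good companions, worked alongside me and filled my graduate life with joy.
Finally, I thank my family. Without their love and care I could not have devoted all my energy to my studies.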
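Chinese Abstract
Distributed coding is a new coding approach. Unlike conventional coding architectures, it has a simpler encoder and a more complex decoder, so it is well suited to encoding on machines with limited computing power. Because of this property, distributed coding is becoming increasingly popular in many applications. Video conferencing and video recording on mobile phones, for example, can use distributed coding to reduce power consumption on the handset, while a powerful server performs the harder decoding.
However, we found a serious problem. Although both encoding and decoding can run quickly, current distributed systems use low-density parity-check codes, so the decoder and encoder must communicate heavily during decoding. This is a considerable problem in practical use. In this thesis I therefore not only implement a distributed system on a mobile device but also speed up the round-trip communication.
Keywords: distributed video coding, low-density parity-check codes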
ABSTRACT
Distributed video coding (DVC) is a new video coding paradigm. Compared with traditional video codecs, DVC has a lightweight encoder and a heavyweight decoder, so the DVC encoder is well suited to devices with low computing power. Because of this property, DVC has become increasingly popular in recent years. For example, DVC can save power in mobile video conferencing and video recording, while the DVC decoder runs on a powerful server.
Unfortunately, even though the DVC encoder and decoder both run fast, the LDPCA code adopted by DVC requires a lot of communication between encoder and decoder, which is a serious problem in practice. In this work, we realize a DVC system on a mobile device and propose an efficient feedback channel.
CONTENTS
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
Chapter 2 DVC system architecture
Chapter 3 DVC system realizes on mobile
  3.1 Encoder
    3.1.1 Capture video sequence with mobile devices
    3.1.2 Realize DVC encoder on mobile
  3.2 Decoder
    3.2.1 DVC decode on remote server
    3.2.2 Transcode result sequence for mobile
  3.3 Feedback channel
    3.3.1 Feedback channel realize in network
Chapter 4 Efficient feedback channel
  4.1 Syndromes distribution
  4.2 Estimate the syndromes size per WZ frame
  4.3 Estimate the syndromes size per bitplane
Chapter 5 Performance Evaluation
  5.1 Test conditions and Benchmarks
  5.2 Decoding complexity analysis
  5.3 Quality and Bitrate evaluation
Chapter 6 Conclusion and Future Work
REFERENCE
LIST OF FIGURES
Figure 2-1. The architecture of DISCOVER video codec.
Figure 3-1. DISCOVER video codec with NLM-SIR architecture.
Figure 3-2. NLM refinement of pixel R[x,y] within a selected block.
Figure 3-3. NLM refinement of pixel R[x,y] taking into account of the temporal similarities.
Figure 4-1. PSNR (top) and bits per frame (bottom) evolution with and without NLM-SIR for the Foreman sequence coded at 15Hz, Q8 and GOP size 8.
Figure 4-2. PSNR (top) and bits per frame (bottom) evolution with and without NLM-SIR for the Soccer sequence coded at 15Hz, Q8 and GOP size 8.
Figure 4-3. Rate-distortion for the Foreman sequences (QCIF, 15Hz).
Figure 4-4. Rate-distortion for the Soccer sequences (QCIF, 15Hz).
Figure 4-5. Rate-distortion for the Coastguard sequences (QCIF, 15Hz).
Figure 4-6. Rate-distortion for the Hall Monitor sequences (QCIF, 15Hz).
LIST OF TABLES
Table 4-1. Eight quantization matrices associated with eight different RD performance points.
Table 4-2. Decoding complexity analysis for all test sequences using Q1, Q4 and Q8 for GOP size 8.
Table 4-3. Decoding time for Foreman sequence.
Table 4-4. Decoding time for Soccer sequence.
Table 4-5. Decoding time for Coastguard sequence.
Table 4-6. Decoding time for Hall Monitor sequence.
Chapter 1 Introduction
Mobile video conferencing and video recording with phone cameras are now commonplace. Although the computing power of mobile devices has grown in recent years, video coding still consumes a large share of it and drains the battery quickly. Conventional video coding is therefore a poor fit for mobile devices. In this thesis, we build a complete video transcoding system based on distributed video coding (DVC) for mobile devices. It performs lightweight encoding on the mobile device and shifts the complexity to a remote server, which substantially reduces power consumption on the mobile side. Consider FaceTime, a popular video call application on the iPhone and iPad, which sends video captured by the mobile camera to another device over the network. The battery runs down quickly during a FaceTime call because the network and the camera draw a lot of power and video coding is a heavy burden for the device. Since the network and the camera must stay on during video communication, the only way to reduce the cost is to use a video coding system better suited to low-power devices. Our DVC video system is intended to reduce the power cost on the mobile device in this way.
Distributed video coding (DVC) is a new video coding paradigm. It departs from the traditional prediction-based video coding standards by exploiting the source statistics at the decoder, which allows simpler encoders. A traditional video codec has a heavyweight encoder and a lightweight decoder; a DVC system has a lightweight encoder and a heavyweight decoder. DVC therefore fits the situation in which the encoder runs on a mobile device and must stay light enough for the device to handle.
The benchmark DISPAC [1] adopts Wyner-Ziv (WZ) coding, which is based on the Slepian-Wolf [2] and Wyner-Ziv [3] theorems. DVC divides the source video into key frames and WZ frames. DISPAC generates side information (SI) from the key frames and uses LDPCA codes [4,5] to correct the errors in the SI. By design, the LDPCA decoder requests more syndromes whenever it fails to decode one unit (e.g., one bitplane of a WZ frame), so a feedback channel is needed to handle the communication between encoder and decoder.
In our work, we build the DVC system on the benchmark [6] and the DVC-to-H.264 transcoder [7]. We realize the DVC encoder on a mobile device, run the DVC decoder on a remote server, and build the feedback channel between them. Finally, we transcode the decoded video into MP4 or 3GP format, which mobile devices can decode.
Unfortunately, communication over the feedback channel takes a lot of time. Because of the way LDPCA works, the encoder and decoder exchange many messages, and each network packet carries only a small amount of syndrome data. These two problems, frequent network transmissions and the overhead of packet headers, make the feedback channel very time-consuming.
To resolve these problems, we group the scattered syndromes, each of which would otherwise travel in its own network packet. For brevity, throughout this thesis "syndromes size" denotes the amount of syndromes the LDPCA decoder requires to correct one bitplane. To group the syndromes, we propose two methods to predict the syndromes size needed to decode each bitplane. The first method estimates the syndromes size per WZ frame by referencing the corresponding WZ frame in the previous group of pictures (GOP). Because of temporal correlation, frames in neighboring GOPs have similar properties, such as motion vectors and side information quality, so the neighboring GOP is a useful reference. The second method estimates the syndromes size per bitplane. From our experimental results and the statistics on the number of requests in [8], we find that within a WZ frame the number of requests in the DC band and in the AC bands follows a similar distribution. In the DC band, the syndromes size grows from one bitplane to the next. In the AC bands, the syndromes size increases with a similar trend, so we can estimate the syndromes size of the AC bands by referencing the DC band.
Chapter 2 DVC architecture
The architecture of the benchmark [6] DVC system, DISPAC, is shown in Fig.1. It is capable of decoding low-motion video in real time. We use this version of the DVC code as the reference for realizing DVC on mobile devices. Notably, the benchmark project runs its encoder and decoder on the same PC, whereas our work realizes the DVC video system in a practical setting on mobile devices.
Fig.1 DVC architecture
On the encoder side, the video sequence is divided into WZ frames and key frames. The key frames are coded with the traditional video codec H.264/AVC. The WZ frames pass through mode selection, a quantizer, the LDPCA encoder and CRC computation.
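The split into key frames and WZ frames can be illustrated with a short sketch. The rule used here (the first frame of every GOP is a key frame, all others are WZ frames) is an assumption made for illustration; the thesis does not spell out the exact rule, and the function name is hypothetical.

```python
def split_gop(num_frames, gop_size):
    """Return (key_frame_indices, wz_frame_indices).

    Assumption: frame i is a key frame when i % gop_size == 0 and a
    WZ frame otherwise. This rule is illustrative, not taken from DISPAC.
    """
    if gop_size < 1:
        raise ValueError("gop_size must be at least 1")
    key_frames = [i for i in range(num_frames) if i % gop_size == 0]
    wz_frames = [i for i in range(num_frames) if i % gop_size != 0]
    return key_frames, wz_frames


if __name__ == "__main__":
    keys, wz = split_gop(num_frames=9, gop_size=4)
    print("key frames:", keys)
    print("WZ frames:", wz)
```

Under this assumed rule, only the key-frame indices would be handed to the H.264/AVC encoder, and the remaining indices would go through the WZ path.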
On the decoder side, the decoder generates side information for each WZ frame. The side information differs from the actual WZ frame, so the LDPCA decoder keeps requesting more syndromes until it has corrected the SI. Because the LDPCA decoder requests additional syndromes during error correction of each WZ frame, there must be a feedback channel between the LDPCA encoder and decoder. The feedback channel carries syndromes to the decoder, and carries a request to the encoder whenever the LDPCA decoder fails to decode a bitplane of a WZ frame. It is the most important component of a practical DVC application, because LDPCA coding cannot perform error correction without it.
Unfortunately, the benchmark project does not implement a feedback channel. The decoder of our reference DVC system accesses the same memory as the encoder during LDPCA decoding. In addition, the conventional video coding is done offline with JM 9.5, a version of the H.264 reference software with high complexity, which is not suitable for practical applications. It makes no sense to run a DVC video system on separate devices without a feedback channel, and it is also very inefficient to encode key frames with JM. To address these problems, we implement the feedback channel over a network connection and replace JM with x264, an efficient implementation of H.264 coding.
Chapter 3 DVC system realized on mobile devices
Fig.3-1 DVC system realized on mobile devices
In this work, I realize the DVC video system shown in Fig.3-1 on mobile devices. On the encoder side, I capture real video with the mobile camera and implement a real feedback channel. To reduce time complexity, I propose two methods to estimate the syndromes size. The estimates reduce the number of requests on the feedback channel and thereby the time spent on it.
Compared with the previous work we reference, we implement many components and utilities to turn the DVC system into a practical application. On the encoder side, we capture video with the mobile camera and realize the DVC encoder on mobile devices. On the decoder side, we run the DVC decoder on a remote server and transcode the decoded video sequence into a format that mobile devices can decode. We also provide a mobile application for viewing the decoded video. For the system as a whole, we implement the feedback channel that handles communication between encoder and decoder over the network. In addition, we propose two methods that make the feedback channel more efficient, which is an important contribution of this work.
3.1 Encoder
Our work is a practical application of a DVC system on mobile devices. To achieve that, we record video with the mobile device's camera and port the DVC encoder to the mobile device.
3.1.1 Capture video sequence with mobile devices
The benchmark we reference takes test video data as the source sequence. In our work, we instead record the video with the mobile device's camera.
3.1.2 Realize DVC encoder on mobile devices
The benchmark DVC project is written in C/C++, and the mobile device we use runs Android. To port the DVC encoder to the mobile device, we use the Android Native Development Kit (NDK) [9]. With the NDK tools, we generate a native code library from the existing DVC encoder code and call it from our Android application. The native code then performs the DVC encoding inside the application.
The porting process is as follows.
First, set up the environment, including the Java platform (JDK), Eclipse, the Android SDK and the NDK.
Second, arrange the C/C++ code project as shown in Fig.3-2.
Fig.3-2 C/C++ project
Third, create the makefile (Fig.3-3). Like any C/C++ project, the port needs a makefile. It is similar to an ordinary one, but it must follow the rules the NDK prescribes.
Fig.3-3 Makefile
Fourth, generate the static libraries (.a) of x264 and the other linked libraries. In this step we use a cross compiler to build static libraries that target Android. We use the cross compiler provided by the Android SDK.
Fifth, create a builder in Eclipse that tells Eclipse the build arguments and the locations of the C/C++ project and the libraries.
Sixth, change the file paths. Keep in mind that file locations on the device are not the same as on the PC.
Finally, build the project to generate the shared library and wrap it as a function callable from Java.
From then on, the DVC encoder runs by calling the generated function in our application.
3.2 Decoder
Because the DVC decoder has high complexity, it is not suitable for mobile devices. In our work, we run the DVC decoder on a remote server. We also need to transcode the decoded sequence into a format that mobile devices can decode, so that the receiving mobile device can display the decoded video.
3.2.1 DVC decode on remote server
We run the DVC decoder on the server, which communicates with the mobile device over the network. We also replace the JM decoder with the more efficient x264 decoder.
3.2.2 Transcode result sequence for mobile
In the benchmark implementation, the final result is a raw sequence. So that the receiving mobile device can decode the video, we transcode the result to MP4 or 3GP, which mobile devices can decode.
Because our benchmark codes only the luminance component of the video, some processing is needed before transcoding. When the DVC decoder finishes, it outputs the luminance sequence. We use the luminance values to create a gray-level picture in RGB form for each frame, and then run FFmpeg to assemble these gray-level RGB pictures into one video in MP4 or 3GP format.
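The luminance-to-gray-RGB step described above can be sketched as follows. This is a minimal illustration under stated assumptions: each luminance sample is an 8-bit value copied unchanged into the R, G and B channels, and the output is packed as RGB triples in raster order. The function name and the packing are hypothetical and are not taken from the thesis implementation.

```python
def luma_to_gray_rgb(luma, width, height):
    """Expand one frame of 8-bit luminance samples into packed RGB bytes.

    Assumption: gray level = luminance value, replicated into R, G and B.
    `luma` is a flat sequence of width * height integers in [0, 255].
    """
    if len(luma) != width * height:
        raise ValueError("luma plane size does not match width * height")
    rgb = bytearray()
    for y in luma:
        if not 0 <= y <= 255:
            raise ValueError("luminance samples must be 8-bit values")
        rgb += bytes((y, y, y))  # same value in all three channels
    return bytes(rgb)


if __name__ == "__main__":
    frame = luma_to_gray_rgb([0, 128, 255, 16], width=2, height=2)
    print(len(frame), "bytes for a 2x2 frame")
```

Each resulting frame could then be written out as an image or raw RGB data for FFmpeg to assemble into the final video.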
3.3 Feedback channel
Notably, the previous work and the benchmark do not implement feedback channel communication. In the previous implementation, the DVC encoder and decoder both run on the same PC, and the decoder reads directly from the LDPCA encoder's buffer in shared memory, so there is no feedback channel. In this work, I implement a real feedback channel that handles sending and requesting over the network.
Because of the way the LDPCA code used for Slepian-Wolf coding works, the LDPCA decoder requests more syndrome bits whenever it fails to decode one unit (e.g., a bitplane of a WZ frame). This is a useful property because it lets the DVC system keep the bitrate close to the Shannon limit, but in practice the DVC encoder and decoder do not run on the same device. In our application, DVC encoding runs on the mobile device and DVC decoding runs on a remote server, so a feedback channel between them must handle syndrome delivery. We realize the feedback channel over a network connection; the process is shown in Fig.3. In Fig.3(a), when the decoder starts to decode one bitplane of a WZ frame, the LDPCA encoder computes the minimum amount of syndromes the LDPCA decoder needs, collects these syndromes into one network packet and sends the packet to the decoder. In Fig.3(b), once the decoder has received the syndromes, the LDPCA decoder tries to decode the bitplane with them and verifies the result with a cyclic redundancy check (CRC). If the bitplane is corrected successfully, the decoder moves on to the next bitplane, until all bitplanes of the WZ frame have been corrected. In Fig.3(c), a request for more syndromes means that the decoder could not decode the bitplane correctly, so the encoder sends more syndromes. The amount of syndromes sent each time depends on the LDPCA bipartite graph. In our work, the encoder sends the same amount of syndromes for every request. After repeating the process in Fig.3 for every bitplane, one WZ frame has been decoded.
Fig.3(a)(b)(c) The process of decoding one bitplane with LDPCA. Because LDPCA is an accumulated form of LDPC, the size of each syndrome transmission (yellow or blue block) depends on the syndromes that one LDPC bipartite graph needs. (a) The LDPCA encoder sends part of the syndromes to the LDPCA decoder. (b) Once the LDPCA decoder receives the syndrome packet from the encoder, it performs LDPCA decoding and a CRC check. If the bitplane has been corrected, the decoder starts to decode the next bitplane; if not, it sends a request for more syndromes. (c) Once the encoder receives the request from the decoder, it sends more syndromes to the decoder.
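The send/request loop of Fig.3 can be sketched as a small simulation. This is an illustration only, not the DISPAC implementation: `required_chunks` is a hypothetical stand-in for the point at which LDPCA decoding and the CRC check would succeed (a real decoder does not know it in advance), and a "chunk" is one fixed-size syndrome increment.

```python
def decode_bitplane(required_chunks, initial_chunks, max_chunks):
    """Simulate the feedback loop for one bitplane.

    required_chunks: hypothetical number of syndrome chunks after which
        decoding plus the CRC check succeeds.
    initial_chunks: chunks the encoder sends before any request (Fig.3(a)).
    max_chunks: total chunks the encoder holds for this bitplane.
    Returns (chunks_sent, requests_made).
    """
    sent = initial_chunks
    requests = 0
    # Fig.3(b): try to decode; on failure request more (Fig.3(c)).
    while sent < required_chunks:
        if sent >= max_chunks:
            raise RuntimeError("all syndromes sent but decoding still fails")
        sent += 1       # encoder sends one more fixed-size increment
        requests += 1   # one round trip on the feedback channel
    return sent, requests


if __name__ == "__main__":
    sent, requests = decode_bitplane(required_chunks=5, initial_chunks=1, max_chunks=10)
    print(f"sent {sent} chunks using {requests} requests")
```

The number of requests returned here is the quantity the feedback channel pays for in round trips, which is why a better starting amount (`initial_chunks`) reduces the time spent on the channel.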
3.3.1 Feedback channel realized over the network
In our work, the DVC encoder runs on the mobile device and the DVC decoder runs on a remote server. Because they do not run on the same device, we create a feedback channel over the network to support the LDPCA mechanism. The Internet protocol suite offers two main transport protocols, TCP and UDP, each with advantages and disadvantages. The Transmission Control Protocol (TCP) provides reliable, ordered delivery of a byte stream. The User Datagram Protocol (UDP) favors reduced latency over reliability. Although UDP is faster than TCP, our DVC project DISPAC does not support error resilience, so we must adopt TCP.
Once a TCP connection is established between the mobile device and the server, we can deliver syndromes from the encoder to the decoder and send requests from the decoder to the encoder. Because of the way LDPCA works, decoding one WZ frame involves many requests, and each time the encoder sends more syndromes, the amount is small. As a result, many network packets are sent, and each carries a syndrome payload that may be even smaller than the packet header. The accumulated network transit delay and packet header overhead lead to a heavy time cost. In our experiment, the feedback channel accounts for up to 90 percent of the total decoding time. The send-request exchange is clearly the bottleneck of the whole system.
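A back-of-the-envelope model makes the two costs named above concrete. It is a rough sketch under simplifying assumptions: every request costs one full round trip, every packet carries one header, and the link has a fixed bandwidth. The 40-byte header is the minimum TCP plus IPv4 header size without options; the round-trip time, payload size and bandwidth in the example are illustrative values, not measurements from the thesis.

```python
def header_share(payload_bytes, header_bytes):
    """Fraction of each packet that is header rather than syndrome data."""
    return header_bytes / (payload_bytes + header_bytes)


def feedback_time(num_requests, round_trip_s, payload_bytes, header_bytes, bandwidth_bps):
    """Approximate seconds spent on the feedback channel.

    Assumptions: one round trip per request, one packet per request,
    constant bandwidth, no retransmissions.
    """
    packet_bits = 8 * (payload_bytes + header_bytes)
    transfer = num_requests * packet_bits / bandwidth_bps
    latency = num_requests * round_trip_s
    return latency + transfer


if __name__ == "__main__":
    # Illustrative numbers: 10-byte syndrome payload, 40-byte header.
    print("header share:", header_share(10, 40))
    print("100 requests:", feedback_time(100, 0.05, 10, 40, 1_000_000), "s")
    print(" 10 requests:", feedback_time(10, 0.05, 100, 40, 1_000_000), "s")
```

In this model the latency term dominates, so sending the same syndromes in fewer, larger packets reduces the total time roughly in proportion to the number of requests saved.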
Chapter 4 Efficient feedback channel
As described in the previous section, network communication accounts for most of the total decoding time. The main problem is that there are too many requests: the high frequency of data transmission causes a large cumulative network delay. To mitigate this, we should reduce the number of requests.
To reduce the number of requests, we can estimate the syndromes size needed to decode each bitplane by referencing previously decoded information, such as earlier syndromes sizes. It is natural to expect that the quality of the side information affects the syndromes size, because with LDPCA, the more similar the side information is to the WZ frame, the fewer syndrome bits are needed. However, the decoder has no way of knowing the difference between the side information and the real WZ frame, so it can only use other values as a proxy for that difference. As the next section shows, unfortunately no value known to the decoder correlates strongly with the syndromes size.
Using the syndromes sizes of already decoded bitplanes or WZ frames, we propose two methods to reduce the communication time. The first method estimates the syndromes size per WZ frame by referencing the corresponding WZ frame in the previous GOP. The second method estimates the syndromes size per bitplane: when the decoder decodes the bitplanes of the AC bands, we estimate the syndromes size for one bitplane by referencing the corresponding bitplane in the DC band of the same WZ frame.
4.1 Syndrome distribution
This section examines which values correlate strongly with the syndromes size needed for one bitplane or one WZ frame. By the nature of LDPCA, the syndromes size needed is related to the quality of the side information. As proxies for side information quality, we consider two candidates: the motion vector value and the residual value. The motion vector value, shown in Formula.1, is the sum of the absolute values of the motion vectors over all pixels. It is related to side information quality because the smaller the motion vectors are, the more similar the side information is to the WZ frame. The residual value, shown in Formula.2, is the sum over all pixels of the absolute difference between the previous and the next reference frame. Of the two, the residual value is the better choice because it is more accurate in cases where the motion vector value and the residual value diverge.
We therefore compute the correlation between the residual value and the syndromes size. The experimental result in Fig.4-1 is calculated for the middle WZ frame in a GOP. Unfortunately, Fig.4-1 shows that the residual value and the syndromes size have no positive correlation. From this observation, we conclude that it is very hard to find a single value that describes how many syndromes the LDPCA decoder needs to decode one WZ frame.
\[ M = \sum_{p \in \text{pixels}} \left| mv_p \right| \]
Formula.1 motion vector value
\[ R = \sum_{p \in \text{pixels}} \left| F_p^{\text{previous}} - F_p^{\text{next}} \right| \]
Formula.2 residual value
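The two candidate values can be computed as in the following sketch. The data layout is an assumption made for illustration: motion vectors are given as one (dx, dy) pair per pixel with |mv| taken as |dx| + |dy|, and frames are flat lists of pixel values. The thesis does not specify which norm it uses for the motion vector, so that choice is hypothetical.

```python
def motion_vector_value(motion_vectors):
    """Formula.1: M = sum over pixels of |mv_p|.

    Assumption: each mv_p is a (dx, dy) pair and |mv_p| = |dx| + |dy|.
    """
    return sum(abs(dx) + abs(dy) for dx, dy in motion_vectors)


def residual_value(previous_frame, next_frame):
    """Formula.2: R = sum over pixels of |F_p(previous) - F_p(next)|."""
    if len(previous_frame) != len(next_frame):
        raise ValueError("reference frames must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(previous_frame, next_frame))


if __name__ == "__main__":
    print("M =", motion_vector_value([(1, -2), (0, 3)]))
    print("R =", residual_value([10, 20, 30], [12, 18, 30]))
```

Both values use only the reference frames and the motion search, so they are available at the decoder without access to the original WZ frame.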
Fig.4-1 Correlation between syndrome bits and the residual value, where the residual value is the sum of absolute differences between the previous and the next reference frame.
4.2 Estimate the syndromes size per WZ frame
Before discussing the methods for estimating the syndromes size, we look at the distribution of the syndromes size over bitplanes. Our experiments show that corresponding WZ frames in neighboring GOPs have similar syndromes size distributions.
The first method for reducing the number of requests estimates the syndromes size per WZ frame. We assume that corresponding WZ frames in neighboring GOPs have similar syndromes sizes, so we estimate the syndromes size by referencing the corresponding WZ frame in the previous GOP.
The first method is given in Formula. 4-1, where $WZ_n$ denotes the WZ frame with index $n$, $WZ_{(n-GOPsize)}$ is the corresponding WZ frame in the previous GOP, $bt$ is the bitplane index, $\{ES^{bt}\}_{WZ_n}$ is the estimated syndrome bits for the $bt$-th bitplane of frame $WZ_n$, and $\{S^{bt}\}_{WZ_{(n-GOPsize)}}$ is the syndrome bits for the $bt$-th bitplane of frame $WZ_{(n-GOPsize)}$.
\[ \{ES^{bt}\}_{WZ_n} = \{S^{bt}\}_{WZ_{(n-GOPsize)}} \]
Formula. 4-1 estimate syndromes size with the previous GOP
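Formula. 4-1 amounts to a table lookup, as the following sketch shows. The `history` dictionary, which maps a frame index to the list of syndromes sizes actually used per bitplane, is a hypothetical data structure introduced here for illustration. Returning `None` when no previous GOP exists is also an assumption; the thesis does not describe how the first GOP is handled.

```python
def estimate_from_previous_gop(history, n, gop_size):
    """Formula 4-1: ES^bt for frame n equals S^bt of frame n - gop_size.

    history: dict mapping frame index -> list of syndromes sizes,
        one entry per bitplane (hypothetical bookkeeping structure).
    Returns a list of estimates per bitplane, or None if frame
    n - gop_size has not been decoded (assumed fallback case).
    """
    reference = history.get(n - gop_size)
    if reference is None:
        return None
    return list(reference)  # copy so callers cannot modify the history


if __name__ == "__main__":
    history = {1: [3, 5, 8]}  # frame 1 needed 3, 5 and 8 units per bitplane
    print(estimate_from_previous_gop(history, n=9, gop_size=8))
    print(estimate_from_previous_gop(history, n=2, gop_size=8))
```

When an estimate is available, the encoder can send that amount in one packet before the decoder makes any request; if the estimate is too small, the ordinary request loop continues from there.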
Fig.4-2 Bit rate per bitplane per band for Qi=8, for one WZ frame
Fig.4-3 Bit rate per bitplane and bit rate estimated from the DC band for Qi=8, for one WZ frame
4.3 Estimate the syndromes size per bitplane
From the experiment shown in Fig.4-2, we find that the syndromes sizes in the DC band and in the AC bands follow similar distributions. When decoding one WZ frame, we can therefore estimate the syndromes size in the AC bands by referencing the DC band of the same WZ frame.
The second method is given in Formula 4-2, where $WZ_n$ denotes the $n$-th WZ frame, $ac_n$ denotes the $n$-th AC band, and $bt$ is the bitplane index within this AC band. $ES^{bt}_{ac_n}$ is the estimated syndrome size for the $bt$-th bitplane in the $n$-th AC band, $dc$ denotes the DC band, and $S^{bt-1}_{dc}$ is the syndrome bits for the $(bt-1)$-th bitplane in the DC band.
\[ \{ES^{bt}_{ac_n}\}_{WZ_n} = \{S^{bt-1}_{dc}\}_{WZ_n} \]
Formula 4-2 Estimate the syndromes size per bitplane
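Formula 4-2 can be read as an index shift into the DC band's measured sizes, sketched below. The list `dc_sizes` (syndromes sizes of the DC band, indexed by bitplane from 0) is a hypothetical structure, and returning `None` when bitplane $bt-1$ does not exist in the DC band is an assumption; the thesis does not state how that boundary case is handled.

```python
def estimate_ac_from_dc(dc_sizes, bt):
    """Formula 4-2: ES^bt for an AC band equals S^(bt-1) of the DC band.

    dc_sizes: syndromes sizes already measured for the DC band of the
        same WZ frame, indexed by bitplane starting at 0 (assumed layout).
    bt: bitplane index within the AC band.
    Returns the estimate, or None if DC bitplane bt - 1 is unavailable.
    """
    ref = bt - 1
    if ref < 0 or ref >= len(dc_sizes):
        return None
    return dc_sizes[ref]


if __name__ == "__main__":
    dc_sizes = [4, 7, 9]
    for bt in range(5):
        print("bt =", bt, "estimate =", estimate_ac_from_dc(dc_sizes, bt))
```

Because the DC band is decoded first within a WZ frame, its sizes are already known when the AC bands are decoded, so this estimate needs no information from other frames.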
Chapter 5 Performance Evaluation
5.1 Test conditions and Benchmarks
The following simulations are performed with our transform-domain Wyner-Ziv (TDWZ) video codec, called DISPAC [1]. The mobile video communication system runs on an HTC Sensation, while the transcoder runs on a server with a Tesla M2050 GPGPU. This section discusses and compares four proposed methods: the basic version, syndrome rearrangement, pre-sending an average amount of syndromes, and parallel syndrome delivery. All four methods are run with LRSS. In detail, we use the 8th quantization table, set the group of pictures (GOP) size to 8, and turn intra mode on to enable intra coding. Notably, the following tables and results are obtained with LRSS, which gives better system performance.
5.2 Decoding complexity analysis
method\video                          foreman       soccer   coastguard   hall
Benchmark without feedback channel    16.32 sec     -        -            -
Table. 5-1 Time consumed without a feedback channel implementation.

method\video             foreman       soccer   coastguard   hall
With feedback channel    330.22 sec    -        -            -
Estimate per WZ frame    -             -        -            -
Estimate per bitplane    -             -        -            -
Table. 5-1 Time consumed in our work
5.3 Quality and Bitrate evaluation
Fig. 5-1 BitRate with all proposed methods, for Qi=8, for Foreman, Soccer, Coast
Guard, and Hall Monitor (QCIF at 15 Hz).
           Q8        Q4        Q2
Bitrate    450.32    161.18    85.79
PSNR       39.29     32.13     28.53
Table.5-2(a) Bitrate and PSNR for the foreman with LRSS, GOP8, without estimate
           Q8        Q4        Q2
Bitrate    564.29    204.38    112.06
PSNR       39.29     32.13     28.53
Table.5-2(b) Bitrate and PSNR for the foreman with LRSS, GOP8, estimate per WZ frame
           Q8        Q4        Q2
Bitrate    475.29    178.56    94.50
PSNR       39.29     32.13     28.53
Table.5-2(c) Bitrate and PSNR for the foreman with LRSS, GOP8, estimate per bitplane
Fig.5-2 Rate-distortion curve for the foreman with LRSS, GOP8
           Q8        Q4        Q2
Bitrate    153.45    66.01     38.89
PSNR       39.47     33.51     30.86
Table.5-3(a) Bitrate and PSNR for the hall monitor with LRSS, GOP8, without estimate
           Q8        Q4        Q2
Bitrate    195.62    84.76     49.76
PSNR       39.47     33.51     30.86
Table.5-3(b) Bitrate and PSNR for the hall monitor with LRSS, GOP8, estimate per WZ frame
           Q8        Q4        Q2
Bitrate    157.72    68.68     39.50
PSNR       39.47     33.51     30.86
Table.5-3(c) Bitrate and PSNR for the hall monitor with LRSS, GOP8, estimate per bitplane
Fig.5-3 Rate-distortion curve for the hall monitor with LRSS, GOP8
Chapter 6 Conclusion and future work
In this work, we implement and propose a number of components and utilities. On the encoder side, we realize the DVC encoder on a mobile device and record the video sequence with the device's camera. For communication between encoder and decoder, we implement the feedback channel over a network connection. To reduce the time spent on the feedback channel, we propose two methods to estimate the syndromes size. On the decoding side, we run the DVC decoder on a remote server. After the decoder has decoded the video, we transcode it so that mobile devices can decode and play the result.
The two methods estimate the syndromes size per WZ frame and per bitplane, respectively. The first method references the corresponding WZ frame in the previous GOP. The second method estimates the syndromes size of the AC bands by referencing the DC band. The first method reduces the time complexity considerably but causes a large increase in bitrate (for Foreman at Q8, from 450.32 to 564.29, about 25 percent). The second method increases the bitrate only slightly (from 450.32 to 475.29, about 6 percent) and still gives a good speed-up on the feedback channel.
In the future, we would like to estimate the syndromes size more precisely and to build a more reliable, loss-tolerant DVC codec system. If the system can handle loss and errors, we may be able to adopt UDP and reduce the communication time further.
Reference
[1] Han-Ping Cheng, Yun-Chung Shen, Ja-Ling Wu, and Kiyoharu Aizawa. High Efficient Distributed Video Coding with Parallelized Design for Cloud Computing.
[2] D. Slepian and J. Wolf. 1973. Noiseless coding of correlated information sources. IEEE Transactions on Information Theory. 19, 4, 471-480.
[3] A. Wyner and J. Ziv. 1976. The rate-distortion function for source coding with side information at the decoder. IEEE Transactions on Information Theory. 22, 1, 1-10.
[4] Yu-Shan Pai, Han-Ping Cheng, Yun-Chung Shen, and Ja-Ling Wu. Fast Decoding for LDPC Based Distributed Video Coding.
[5] David Varodayan, Anne Aaron, and Bernd Girod. Rate-Adaptive Codes for Distributed Source Coding.
[6] Tse-Chung Su, Yun-Chung Shen, and Ja-Ling Wu. 2011. Real-time Decoding for LDPC Based Distributed Video Coding. National Taiwan University.
[7] J.L. Martinez, G. Fernandez-Escribano, H. Kalva, W.A.C. Fernando, and P. Cuenca. 2009. Wyner-Ziv to H.264 Video Transcoder for Low Cost Video Encoding.
[8] Catarina Brites, João Ascenso, José Quintas Pedro, and Fernando Pereira. Evaluating a feedback channel based transform domain Wyner-Ziv video codec.
[9] Android NDK link: http://developer.android.com/tools/sdk/ndk/index.html