AN EFFICIENT IMAGE COMPRESSION ALGORITHM WITH
GEOMETRIC WAVELETS & GEOMETRIC WAVELET
PACKETIZATION USING PSO
A PROJECT REPORT
Submitted by
SANDHIYA.R
51608106043
SENTHAMILSELVAN.S
51608106046
SIVA.M
51608106047
THENMOZHI.K
51608106055
LEEYAN MC LAKPAN
51607106023
In partial fulfillment for the award of the degree
of
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING
THANTHAI PERIYAR GOVERNMENT INSTITUTE OF
TECHNOLOGY, VELLORE – 2
ANNA UNIVERSITY : CHENNAI 600025
APRIL 2012
ANNA UNIVERSITY : CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “…AN EFFICIENT IMAGE
COMPRESSION ALGORITHM WITH GEOMETRIC WAVELETS &
GEOMETRIC WAVELET PACKETIZATION USING PSO.…” is the bonafide
work of “R.SANDHIYA (51608106043), S.SENTHAMILSELVAN (51608106046),
M.SIVA (51608106047), K.THENMOZHI (51608106055), LEEYAN MC
LAKPAN(516)” who carried out the project work under my supervision.

SIGNATURE
Prof.R.Satyabama, M.E.,
HEAD OF THE DEPARTMENT
Electronics and Communication Engg.,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

SIGNATURE
Prof.Nishat Kanvel, M.E.,
SUPERVISOR
Associate Professor/HOD,
Department of MCA,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

Project Viva-Voce Examination held on ……………………….
ANNA UNIVERSITY : CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “…AN EFFICIENT IMAGE
COMPRESSION ALGORITHM WITH GEOMETRIC WAVELETS &
GEOMETRIC WAVELET PACKETIZATION USING PSO.…” is the
bonafide work of “R.SANDHIYA (51608106043)” who carried out the project
work under my supervision.

SIGNATURE
Prof.R.Satyabama, M.E.,
HEAD OF THE DEPARTMENT
Electronics and Communication Engg.,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

SIGNATURE
Prof.Nishat Kanvel, M.E.,
SUPERVISOR
Associate Professor/HOD,
Department of MCA,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.
ANNA UNIVERSITY : CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “…AN EFFICIENT IMAGE
COMPRESSION ALGORITHM WITH GEOMETRIC WAVELETS &
GEOMETRIC WAVELET PACKETIZATION USING PSO.…” is the
bonafide work of “S.SENTHAMILSELVAN (51608106046)” who carried out
the project work under my supervision.

SIGNATURE
Prof.R.Satyabama, M.E.,
HEAD OF THE DEPARTMENT
Electronics and Communication Engg.,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

SIGNATURE
Prof.Nishat Kanvel, M.E.,
SUPERVISOR
Associate Professor/HOD,
Department of MCA,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.
ANNA UNIVERSITY : CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “…AN EFFICIENT IMAGE
COMPRESSION ALGORITHM WITH GEOMETRIC WAVELETS &
GEOMETRIC WAVELET PACKETIZATION USING PSO.…” is the
bonafide work of “M.SIVA (51608106047)” who carried out the project work
under my supervision.

SIGNATURE
Prof.R.Satyabama, M.E.,
HEAD OF THE DEPARTMENT
Electronics and Communication Engg.,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

SIGNATURE
Prof.Nishat Kanvel, M.E.,
SUPERVISOR
Associate Professor/HOD,
Department of MCA,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.
ANNA UNIVERSITY : CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “…AN EFFICIENT IMAGE
COMPRESSION ALGORITHM WITH GEOMETRIC WAVELETS &
GEOMETRIC WAVELET PACKETIZATION USING PSO.…” is the
bonafide work of “K.THENMOZHI (51608106055)” who carried out the
project work under my supervision.

SIGNATURE
Prof.R.Satyabama, M.E.,
HEAD OF THE DEPARTMENT
Electronics and Communication Engg.,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

SIGNATURE
Prof.Nishat Kanvel, M.E.,
SUPERVISOR
Associate Professor/HOD,
Department of MCA,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.
ANNA UNIVERSITY : CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “…AN EFFICIENT IMAGE
COMPRESSION ALGORITHM WITH GEOMETRIC WAVELETS &
GEOMETRIC WAVELET PACKETIZATION USING PSO.…” is the
bonafide work of “LEEYAN MC LAKPAN (51607106023)” who carried out
the project work under my supervision.

SIGNATURE
Prof.R.Satyabama, M.E.,
HEAD OF THE DEPARTMENT
Electronics and Communication Engg.,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.

SIGNATURE
Prof.Nishat Kanvel, M.E.,
SUPERVISOR
Associate Professor/HOD,
Department of MCA,
Thanthai Periyar Govt. Inst. of Tech,
Vellore – 632 002.
ACKNOWLEDGEMENT
I express my humble thanks to the Almighty for having showered his
blessings and divine light to raise me to the height of presenting this
report and completing my project successfully.
It is my bounden duty to convey my sincere thanks and deep sense of
gratitude to our principal Dr. M. Arularasu, Thanthai Periyar Government
Institute of Technology, Vellore, for his kind help throughout my project work.
It is with a deep sense of gratitude that I acknowledge my indebtedness to
Prof. R. Sathyabama, Professor & Head, Department of Electronics and
Communication Engineering, Thanthai Periyar Government Institute of
Technology, Vellore, a perfectionist, for having taught me the value of discipline,
which helped me meet my deadlines well within the scheduled time.
I extend my hearty gratitude to my guide Prof. Nishat Kanvel, M.E.,
Associate Professor, Department of MCA, Thanthai Periyar Government
Institute of Technology, Vellore, for her valuable directions, suggestions and
guidance, with ever-enthusiastic encouragement since the commencement
of the project.
I owe my special thanks to all staff members of the ECE department for their
kind suggestions and cooperation. My heartfelt thanks go to my parents for their
prayers, suggestions and help.
ABSTRACT
Wavelet-based image compression lacks efficiency in terms of processing
speed, localization of each particle in the image, and clarity and resolution of
the image, so it is necessary to seek another type of wavelet compression.
To circumvent these problems, this project turns to geometric wavelets.
With geometric wavelets, image compression is achieved with the help of a binary
space partition scheme and particle swarm optimization. Optimization
plays a vital role in geometric wavelet based image compression. Packetizing
the image eases the compression process by exploiting the similarity among
the best wavelet packets obtained from particle swarm optimization.
With this method of image compression we attain a good quality of
reconstructed image with an appreciable PSNR value. In low bit rate
image coding, for instance, the performance of image compression by the
wavelet transform is not optimal, but in this project good performance is
made possible by using geometric wavelets.
TABLE OF CONTENTS
CHAPTER NO.    TITLE
               ABSTRACT
               LIST OF ABBREVIATIONS
               LIST OF FIGURES
1.             INTRODUCTION
               1.1   IMAGE
               1.2   IMAGE COMPRESSION
               1.3   IMAGE COMPRESSION MODEL
               1.4   BENEFITS OF COMPRESSION
               1.5   IMAGE COMPRESSION TECHNIQUES
2.             WAVELET
               2.1   WAVELET
               2.2   WAVELET ANALYSIS
               2.3   DISCRETE WAVELET TRANSFORM
               2.4   2D WAVELET ANALYSIS
               2.5   INVERSE WAVELET TRANSFORM
3.             GEOMETRIC WAVELET
               3.1   GEOMETRIC WAVELET
               3.2   CONSTRUCTION OF GEOMETRIC WAVELET
4.             PROPOSED IMAGE COMPRESSION TECHNIQUE
               4.1   PROPOSED ALGORITHM
               4.2   BLOCK DIAGRAM
               4.3   CONSTRUCTION OF BSP TREE
               4.4   BSP TREE
               4.5   SELECTION OF PARTITION LINES
               4.6   GEOMETRIC WAVELET COEFFICIENTS
               4.7   EXTRACTION OF SPARSE GEOMETRIC COEFFICIENTS
               4.8   PACKETIZATION
               4.9   ENCODING
               4.10  ENCODING THE TREE STRUCTURE
               4.11  PARTICLE SWARM OPTIMIZATION
               4.12  DECODING
               4.13  PERFORMANCE METRICS
5.             EXPERIMENTAL RESULTS
               5.1   RESULTS
               5.2   GRAPH
               5.3   CONCLUSION
               5.4   REFERENCE
LIST OF ABBREVIATIONS
DWT  - Discrete Wavelet Transform.
DFT  - Discrete Fourier Transform.
DCT  - Discrete Cosine Transform.
FT   - Fourier Transform.
PCA  - Principal Component Analysis.
RDO  - Rate Distortion Optimization.
GWT  - Geometric Wavelet Transform.
BSP  - Binary Space Partition.
PSO  - Particle Swarm Optimization.
WT   - Wavelet Transform.
PSNR - Peak Signal to Noise Ratio.
CR   - Compression Ratio.
MSE  - Mean Square Error.
LIST OF FIGURES
FIGURE NO.   TITLE
1.1          Source Encoder Model
1.2          Source Decoder Model
2.1          Fourier Transform
2.2          Analysis of Different Transform
2.3          Decomposition
2.4          Approximation sub-band signal
2.5          Wavelet Analysis
3.1          Geometric Wavelet Schematic
4.1          Block Diagram of Image Compression Model
4.2          BSP
4.3          BSP Tree
4.4          Example of Greedy Selection
4.5          Final Geometric Wavelet tree
4.6          Line Partitioning
4.7          Concept of PSO
4.8          Flowchart of PSO
5.1          Experimental Results
CHAPTER 1
INTRODUCTION
Compressing an image is significantly different from compressing raw
binary data. Of course, general purpose compression programs can be used to
compress images, but the result is less than optimal. This is because images have
certain statistical properties which can be exploited by encoders specifically
designed for them. Some of the finer details in the image can be removed, which
reduces the amount of bandwidth and memory required.
1.1 IMAGE:
An image is essentially a 2-D signal processed by the human visual
system. The signals representing images are usually in analog form. However,
for processing, storage and transmission by computer applications, they are
converted from analog to digital form. A digital image is basically a
2-dimensional array of pixels. Images form a significant part of data,
particularly in remote sensing, biomedical and video conferencing applications.
As the use of and dependence on information and computers continue to grow, so
too does our need for efficient ways of storing and transmitting large amounts
of data.
1.2 IMAGE COMPRESSION
Image compression addresses the critical aspect of reducing the amount of
data required to represent a digital image. It is a process intended to yield a
compact representation of an image, thereby reducing the image
storage/transmission requirements. Compression is achieved by the removal of one
or more of the three basic data redundancies:
1. Coding Redundancy
2. Interpixel Redundancy
3. Psychovisual Redundancy
Coding redundancy is present when less than optimal code words are
used.
Interpixel redundancy results from correlations between the pixels of an
image.
Psychovisual redundancy is due to data that is ignored by the human
visual system (i.e. visually nonessential information).
Image compression techniques reduce the number of bits required to
represent an image by taking advantage of these redundancies. An inverse
process called decompression (decoding) is applied to the compressed data to
get the reconstructed image.
The objective of compression is to reduce the number of bits as much as
possible, while keeping the resolution and the visual quality of the reconstructed
image as close to the original image as possible.
1.3 IMAGE COMPRESSION MODEL:
Fig1.1 Source Encoder Model: Input image → Transform → Encoding
Fig1.2 Source Decoder Model: Decoding → Inverse transform → Reconstructed image
Input Image:
The input image is a monochrome or polychrome image.
Transform:
It converts the image signal from one domain to another. Many
mathematical transforms are available; existing examples include the wavelet
transform and the ridgelet transform.
Encoding:
Compression is achieved by the process of encoding, which reduces the number of
bits representing the image. The result of encoding is a bit stream.
Decoding:
The bit stream is decoded for the purpose of reconstructing
the image.
Inverse Transform:
The inverse transform is used to reconstruct the original image.
1.4 BENEFITS OF COMPRESSION:
• It provides potential cost savings by sending less data over the
switched telephone network, where the cost of a call is usually based
upon its duration.
• It reduces not only storage requirements but also overall execution time.
• It reduces the probability of transmission errors, since fewer bits are
transferred.
• It provides a level of security against illicit monitoring.
1.5 IMAGE COMPRESSION TECHNIQUES:
The image compression techniques are broadly classified into two
categories depending on whether or not an exact replica of the original image
can be reconstructed from the compressed image.
These are:
1. Lossless technique
2. Lossy technique
1.5.1 LOSSLESS COMPRESSION TECHNIQUE:
In lossless compression techniques, the original image can be
perfectly recovered from the compressed (encoded) image. These are also
called noiseless since they do not add noise to the signal (image). It is also
known as entropy coding since it uses statistics/decomposition techniques to
eliminate/minimize redundancy. Lossless compression is used only for a few
applications with stringent requirements such as medical imaging. The following
techniques are included in lossless compression:
1. Run length encoding
2. Huffman encoding
3. LZW coding
4. Area coding
1.5.1.1 RUN LENGTH ENCODING:
This is a very simple compression method used for sequential data. It is
very useful in the case of repetitive data. This technique replaces sequences of
identical symbols (pixels), called runs, by shorter symbols. The run length code
for a gray scale image is represented by a sequence {Vi, Ri}, where Vi is the
intensity of a pixel and Ri is the number of consecutive pixels with the
intensity Vi, as shown in the example below. If both Vi and Ri are represented by one
byte, the span of 11 pixels in the example is coded using six bytes, yielding a
compression ratio of about 1.8:1.
Run-Length Encoding
82 82 82 82 82 89 89 89 89 90 90
{82,5} {89,4} {90,2}
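The run-length idea above can be sketched in a few lines of Python. This is a minimal illustration, not the report's implementation; the function names `rle_encode` and `rle_decode` are hypothetical, and the one-byte limit on Ri (runs longer than 255) is not handled.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (value, run_length) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            # Same intensity as the previous pixel: extend the current run.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # New intensity: start a new run of length 1.
            runs.append((value, 1))
    return runs


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the pixel sequence."""
    pixels = []
    for value, length in runs:
        pixels.extend([value] * length)
    return pixels


row = [82] * 5 + [89] * 4 + [90] * 2   # the 11-pixel example above
runs = rle_encode(row)                 # three {Vi, Ri} pairs, i.e. six bytes
restored = rle_decode(runs)
```

Decoding the pairs returns the original row exactly, which is what makes the method lossless.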
1.5.1.2 HUFFMAN ENCODING:
This is a general technique for coding symbols based on their statistical
occurrence frequencies (probabilities). The pixels in the image are treated as
symbols. The symbols that occur more frequently are assigned a smaller
number of bits, while the symbols that occur less frequently are assigned a
relatively larger number of bits. Huffman code is a prefix code. This means that
the (binary) code of any symbol is not the prefix of the code of any other
symbol. Most image coding standards use lossy techniques in the earlier stages
of compression and use Huffman coding as the final step.
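As an illustration of how such a prefix code can be built, the sketch below constructs a Huffman code with a priority queue. It is a generic textbook construction, assumed here for illustration; the function name `huffman_code` and the sample pixel row are not from the report.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a {symbol: bit string} prefix code built from symbol counts."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)   # two least frequent subtrees
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]


pixels = [82] * 5 + [89] * 4 + [90] * 2
codes = huffman_code(pixels)
total_bits = sum(len(codes[p]) for p in pixels)
```

The most frequent intensity receives the shortest code word, so the 11 pixels need fewer bits than a fixed-length code would use.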
1.5.2 LOSSY COMPRESSION TECHNIQUE:
Lossy schemes provide much higher compression ratios than lossless
schemes. Lossy schemes are widely used since the quality of the reconstructed
images is adequate for most applications. With this scheme, the decompressed
image is not identical to the original image, but reasonably close to it. Here the
prediction – transformation – decomposition process is completely reversible;
it is the quantization process that results in loss of information. The entropy coding
after the quantization step, however, is lossless. Decoding is the reverse
process. First, entropy decoding is applied to the compressed data to get the
quantized data. Second, dequantization is applied to it, and finally the inverse
transformation is applied to get the reconstructed image.
Major performance considerations of a lossy compression scheme include:
1. Compression ratio
2. Signal-to-noise ratio
3. Speed of encoding & decoding.
Lossy compression techniques include the following schemes:
1. Transformation coding
2. Vector quantization
3. Fractal coding
4. Block Truncation Coding
5. Sub band coding
1.5.2.1 TRANSFORMATION CODING:
In this coding scheme, transforms such as DFT (Discrete Fourier
Transform) and DCT (Discrete Cosine Transform) are used to change the pixels
in the original image into frequency domain coefficients (called transform
coefficients). These coefficients have several desirable properties. One is the
energy compaction property that results in most of the energy of the original
data being concentrated in only a few of the significant transform coefficients.
This is the basis of achieving the compression. Only those few significant
coefficients are selected and the remaining are discarded. The selected
coefficients are considered for further quantization and entropy encoding. DCT
coding has been the most common approach to transform coding. It is also
adopted in the JPEG image compression standard.
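The energy compaction property can be seen on a small example. The sketch below applies an orthonormal 1-D DCT-II to a smooth row of eight pixel values and keeps only the two largest coefficients. It is a simplified illustration under stated assumptions: the sample values and the helper names `dct_ii`, `idct_ii` and `keep_largest` are hypothetical, and JPEG itself works on 2-D 8×8 blocks with quantization rather than plain truncation.

```python
import math


def dct_ii(signal):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct_ii(coeffs):
    """Inverse of dct_ii (the transform matrix is orthonormal)."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal


def keep_largest(coeffs, count):
    """Zero all but the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:count])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]


row = [100, 102, 104, 107, 110, 112, 115, 117]   # smooth, slowly varying pixels
coeffs = dct_ii(row)
energy = sum(c * c for c in coeffs)
kept = keep_largest(coeffs, 2)
kept_energy = sum(c * c for c in kept)
approx = idct_ii(kept)                            # reconstruction from 2 of 8 coefficients
```

For this smooth row, two of the eight coefficients carry almost all of the signal energy, which is why discarding the rest changes the reconstruction only slightly.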
CHAPTER 2
2.1 WAVELET
Often signals are processed in the time-domain, but in order to process
them more easily, other information such as frequency, is required.
Mathematical transforms translate the information of signals into different
domains.
FOURIER TRANSFORM :
Fourier transforms can be used to translate time domain signals into the
frequency domain. It acts as a mathematical prism, breaking up the time signal
into frequencies, as a prism breaks light into different colors. The Fourier
transform converts a signal between the time and frequency domains, such that
the frequencies of a signal are obtained. However the Fourier transform doesn’t
provide information about which frequencies occur at specific times in the
signal as time and frequency are viewed independently.
Fig2.1 The left graph shows a signal plotted in the time domain, the right graph
shows the Fourier transform of the signal.
DRAWBACKS:
• The Fourier transform does not suit brief signals or signals that change
suddenly, i.e. non-stationary signals.
• The Fourier transform does not provide information about which frequencies
occur at specific times in the signal, as time and frequency are viewed
independently.
SHORT-TIME FOURIER TRANSFORM:
The STFT introduced the idea of windows through which different parts of a
signal are viewed. For a given window in time the frequencies can be viewed.
However, Heisenberg's Uncertainty Principle states that as the resolution of the
signal improves in the time domain, by zooming in on different sections, the
frequency resolution gets worse.
2.2 WAVELET ANALYSIS:
Wavelet analysis of the signal provides multi-resolution, as certain parts of
the signal can be resolved well in time, and other parts can be resolved well in
frequency. The power of wavelet analysis lies exactly in this multi-resolution.
Wavelet analysis divides the information of an image into approximation
and detail sub-signals. The approximation sub-signal shows the general trend of
pixel values, and three detail sub-signals show the vertical, horizontal and
diagonal details or changes in the image. If these details are very small then they
can be set to zero without significantly changing the image. The value below
which details are considered small enough to be set to zero is known as the
threshold. The greater the number of zeros, the greater the compression that can be
achieved.
The amount of information retained by an image after compression and
decompression is known as the energy retained and this is proportional to the sum
of the squares of the pixel values. If the energy retained is 100% then the
compression is known as lossless, as the image can be reconstructed exactly. This
occurs when the threshold value is set to zero, meaning that the detail has not been
changed. If any values are changed then energy will be lost and this is known as
lossy compression. Ideally, during compression the number of zeros and the
energy retention will be as high as possible. However, as more zeros are obtained
more energy is lost, so a balance between the two needs to be found.
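The trade-off between zeros and retained energy can be sketched as follows. This is a minimal illustration with made-up coefficient values; the names `hard_threshold`, `energy` and `energy_retained` are hypothetical, and energy is measured here on the coefficients, assuming an orthonormal transform so that it matches the energy of the pixel values.

```python
def hard_threshold(details, threshold):
    """Set every detail coefficient whose magnitude is below the threshold to zero."""
    return [d if abs(d) >= threshold else 0.0 for d in details]


def energy(values):
    """Sum of squares of the values."""
    return sum(v * v for v in values)


def energy_retained(original, modified):
    """Percentage of the original energy that survives in the modified coefficients."""
    return 100.0 * energy(modified) / energy(original)


approximation = [150.0, 152.0]          # general trend (kept untouched)
details = [0.5, -0.2, 6.0, -0.1]        # small values are candidates for zeroing

thresholded = hard_threshold(details, 1.0)
zeros = thresholded.count(0.0)
retained = energy_retained(approximation + details, approximation + thresholded)
lossless = energy_retained(approximation + details,
                           approximation + hard_threshold(details, 0.0))
```

With a threshold of zero nothing is changed and 100% of the energy is retained; raising the threshold produces more zeros at the cost of a small energy loss.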
Fig2.2 The different transforms provide different resolutions of time and
frequency.
2.3 DISCRETE WAVELET TRANSFORM:
An image is represented as a two dimensional array of coefficients, each
coefficient representing the brightness level in that point. The more important
coefficients can be differentiated from the lesser coefficients. Most natural images
have smooth color variations, with the fine details being represented as sharp edges
in between the smooth variations. Technically, the smooth variations in color can
be termed as low frequency variations and the sharp variations as high frequency
variations.
The low frequency components (smooth variations) constitute the base of an
image, and the high frequency components (the edges which give the detail) add
upon them to refine the image, thereby giving a detailed image. Hence, the smooth
variations are more important than the details.
Separating the smooth variations and details of the image can be done in
many ways. One such way is the decomposition of the image using a Discrete
Wavelet Transform (DWT).
Discrete wavelet analysis is computed using filter banks. Filters of
different cut-off frequencies analyze the signal at different scales. Resolution is
changed by the filtering, and the scale is changed by up sampling and down
sampling. Suppose a signal is put through two filters:
(i) A high-pass filter, in which high frequency information is kept and low frequency
information is lost.
(ii) A low-pass filter, in which low frequency information is kept and high frequency
information is lost.
Fig2.3 Decomposition
Then the signal is effectively decomposed into two parts, a detail part
(high frequency), and an approximation part (low frequency). The sub-signal
produced from the low-pass filter will have a highest frequency equal to half that of
the original. According to Nyquist sampling this change in frequency range
means that only half of the original samples need to be kept in order to
perfectly reconstruct the signal; more specifically, this means that down sampling
can be used to remove every second sample. The scale has now been doubled.
The resolution has also been changed: the filtering made the frequency
resolution better, but reduced the time resolution. The DWT is obtained by
collecting together the coefficients of the final approximation sub-signal and all
the detail sub-signals.
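A one-level filter bank followed by down sampling can be sketched with the Haar wavelet, the simplest choice. The Haar filters are an assumption made here for illustration (the report does not fix a mother wavelet at this point), and the function names are hypothetical.

```python
import math


def haar_analysis(signal):
    """One level of the Haar filter bank on an even-length signal.

    Low-pass and high-pass filtering are combined with down sampling by two,
    so each output has half as many samples as the input.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Repeat the analysis step on the approximation sub-signal."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_analysis(approx)
        details.append(detail)
    return approx, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a1, d1 = haar_analysis(signal)
final_approx, all_details = haar_dwt(signal, 3)
```

The DWT coefficients are `final_approx` together with every list in `all_details`; their total count equals the number of input samples.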
Fig2.4 The approximation sub-signal can then be put through a filter bank, and
this is repeated until the required level of decomposition has been reached
2.4 2D WAVELET ANALYSIS:
Images are treated as two dimensional signals that change horizontally
and vertically, so 2D wavelet analysis must be used for images. 2D wavelet
analysis uses the same ‘mother wavelets’ but requires an extra step at every
level of decomposition. The 1D analysis separated the high frequency
information from the low frequency information at every level of
decomposition, so only two sub-signals were produced at each level.
In 2D, the images are considered to be matrices with N rows and M
columns. At every level of decomposition the horizontal data is filtered, then
the approximation and details produced from this are filtered on columns.
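The row-then-column procedure can be sketched for one decomposition level, again assuming Haar filters for simplicity. The sub-band names `ll`, `lh`, `hl` and `hh` (row filter first, column filter second) and the helper names are hypothetical.

```python
import math


def haar_step(values):
    """Low-pass and high-pass Haar filtering with down sampling by two."""
    s = 1.0 / math.sqrt(2.0)
    low = [(values[i] + values[i + 1]) * s for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) * s for i in range(0, len(values), 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_columns(matrix):
    """Apply haar_step to every column and return (low, high) matrices."""
    lows, highs = zip(*(haar_step(col) for col in transpose(matrix)))
    return transpose(lows), transpose(highs)


def haar_2d_level(image):
    """One 2D level: filter the rows first, then the columns of each result."""
    row_low, row_high = zip(*(haar_step(row) for row in image))
    ll, lh = filter_columns(row_low)    # approximation, plus one detail band
    hl, hh = filter_columns(row_high)   # the two remaining detail bands
    return ll, lh, hl, hh


image = [[10, 10, 200, 200] for _ in range(4)]   # N = 4 rows, M = 4 columns
ll, lh, hl, hh = haar_2d_level(image)
```

Each of the four sub-bands has half the rows and half the columns of the input, giving one approximation and three detail sub-signals per level.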
Fig2.5 Wavelet Analysis
2.5 IMAGE RECONSTRUCTION - THE INVERSE DWT OF AN
IMAGE:
Just as a forward transform is used to separate the image data into
various classes of importance, a reverse transform is used to reassemble the
various classes of data into a reconstructed image. A pair of high pass and low
pass filters is used here also. This filter pair is called the Synthesis Filter pair.
The filtering procedure is just the opposite: it starts from the topmost level,
applies the filters column wise first and then row wise, and proceeds to the next
level until the first level is reached.
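The synthesis step can be sketched in one dimension with Haar filters (an assumption for illustration; the 2D case applies the same step column wise and then row wise). The function names are hypothetical.

```python
import math


def haar_analysis(signal):
    """Forward step: split into approximation and detail, each half length."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Inverse step: rebuild two samples from each (approximation, detail) pair."""
    s = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * s)
        signal.append((a - d) * s)
    return signal


def haar_idwt(final_approx, details):
    """Start from the coarsest level and work back to the first level."""
    approx = list(final_approx)
    for detail in reversed(details):
        approx = haar_synthesis(approx, detail)
    return approx


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = list(signal), []
for _ in range(3):
    approx, detail = haar_analysis(approx)
    details.append(detail)
restored = haar_idwt(approx, details)
```

When no coefficient has been altered, the reconstruction matches the original signal up to floating-point rounding.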
CHAPTER 3
3.1 GEOMETRIC WAVELETS
Data sets such as images, documents or gene expression data may be
modeled as point clouds in high-dimensional Euclidean space. In the case of
images, each pixel can be thought of as one coordinate in a vector with a length
equal to the number D of pixels in the image, and the intensity of each pixel
corresponds to the coordinate magnitude in that pixel’s direction. Real data
points often have structure which has dimension d much smaller than the
ambient space dimension D, for example under the well-studied case when they
lie near a low-dimensional manifold M. Discovering and characterizing this
lower-dimensional structure can dramatically affect the performance in tasks
such as data compression, interpretation, outlier detection, classification and
clustering.
If M is just a linear subspace, Principal Component Analysis (PCA) can
discover a dictionary of d vectors which describe the data well at low
computational cost. However, when M is nonlinear it is usually necessary to
use random dictionaries or black box optimization, which are much more costly
and in general do not yield interpretable features of the data.
Geometric wavelets are multi-scale dictionary elements which are
constructed directly from the data, adapt to arbitrary nonlinear manifolds, and
have guarantees on the computational cost, the number of elements in the
dictionary and the sparsity of the representation (as a function of an
approximation error parameter). In particular they provide feature
sets that may be particularly useful for data exploration, and tasks
such as anomaly detection and classification.
3.2 CONSTRUCTION OF GEOMETRIC WAVELETS:
• First, relationships between data points are computed with respect to a
given similarity function.
• At the coarsest scale all data points are considered one group and global
PCA is performed, yielding a d-dimensional plane fit to the data
with axes in the directions of maximum variance, which we think of as a
parallel of “scaling functions” in wavelet analysis.
• The projection of the data points onto this plane is the coarsest-scale
approximation of the data.
• Next, the graph is split into two groups.
• On each of these finer scale groups, PCA is again performed, and the
projection of the data points onto these two new planes will more
accurately approximate M.
• To form a compact representation for the data at this finer scale, only the
differences between the original coarse projections of the data and the
points projected onto the planes at the finer scale are encoded, as in
a wavelet decomposition.
Fig3.1 Geometric Wavelet Schematic.
• In order to do this an efficient scheme is derived based on the
construction of a minimal space spanning this set of differences.
• The axes of this difference space are called “geometric wavelets”, and
the projections of the finer-scale corrections to the data points onto the
plane spanned by these axes are called the “wavelet coefficients”.
• The process is continued, forming a binary tree of parents and children at
finer and finer scales until no further details are needed to approximate
the data up to a pre-specified precision.
• Geometric wavelets provide a dictionary or feature set of the data that
efficiently captures coarse-to-fine structure in the data, and the data may
be transformed back and forth between its original representation and a
geometric wavelet representation via fast algorithms.
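The first two scales of this construction can be sketched on a toy data set: points on a parabola in the plane (D = 2) approximated by lines (d = 1). This is a heavily simplified illustration under stated assumptions. The split is made by the sign of the coordinate along the global principal axis rather than by partitioning a similarity graph, and all names are hypothetical.

```python
import math


def principal_axis(points):
    """Mean and unit direction of maximum variance of 2-D points (PCA with d = 1)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), (math.cos(angle), math.sin(angle))


def coordinate(point, mean, axis):
    return (point[0] - mean[0]) * axis[0] + (point[1] - mean[1]) * axis[1]


def project(point, mean, axis):
    t = coordinate(point, mean, axis)
    return (mean[0] + t * axis[0], mean[1] + t * axis[1])


def squared_error(points, projections):
    return sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
               for p, q in zip(points, projections))


points = [(i / 10.0, (i / 10.0) ** 2) for i in range(-10, 11)]

# Coarsest scale: one group, global PCA, projection onto the fitted line.
mean, axis = principal_axis(points)
coarse = [project(p, mean, axis) for p in points]

# Finer scale: split into two groups and repeat PCA on each (simplified split).
groups = ([p for p in points if coordinate(p, mean, axis) < 0],
          [p for p in points if coordinate(p, mean, axis) >= 0])
fine = {}
for group in groups:
    g_mean, g_axis = principal_axis(group)
    for p in group:
        fine[p] = project(p, g_mean, g_axis)

# Only the differences between fine and coarse projections need to be encoded.
corrections = [(fine[p][0] - c[0], fine[p][1] - c[1]) for p, c in zip(points, coarse)]
coarse_error = squared_error(points, coarse)
fine_error = squared_error(points, [fine[p] for p in points])
```

On this curved data the two finer-scale lines approximate the points more closely than the single global line, and the `corrections` play the role of the finer-scale details.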
CHAPTER 4
PROPOSED IMAGE COMPRESSION ALGORITHM
The existing low bit rate image compression utilizes the BINARY
SPACE PARTITION SCHEME and GEOMETRIC WAVELET TREE
APPROXIMATION. This efficiently captures the curve singularities, thus
providing a sparse representation of the image. The optimization technique
used here is the RATE DISTORTION OPTIMIZATION algorithm.
The existing image compression is enhanced by exploiting the
PACKETIZATION technique and the PARTICLE SWARM OPTIMIZATION
technique.
4.1 PROPOSED ALGORITHM:
Step 1:
Read the monochrome/polychrome image and partition it
up to N=3 levels using the Binary Space Partition
scheme to get the decomposed image.
Step 2:
Represent the decomposed image coefficients in the sparse geometric
wavelet representation and arrange these coefficients in a tree
structure.
Step 3:
Packetize the geometric wavelet image as per the required
packet size in order to improve the peak signal to noise ratio value.
Step 4:
Encode each packet of the geometric wavelet image with Particle
Swarm Optimization to get the optimized image.
Step 5:
Decode the optimized image to obtain the bit-streamed
output.
Step 6:
Take the inverse transform of the bit-streamed output to obtain the
reconstructed image.
4.2 PROPOSED BLOCK DIAGRAM:
Fig4.1 Block Diagram of Image Compression Model: Input Image → Construction of geometric wavelet tree → Extraction of sparse geometric wavelet coefficients → Packetization → Encoding using PSO → Decoding → Inverse Transform → Reconstructed Image
BLOCK DIAGRAM EXPLANATION:
4.3 CONSTRUCTION OF BSP TREE :
The desired image is partitioned using the BINARY SPACE
PARTITION technique. The image is partitioned recursively by arbitrarily
oriented lines in a hierarchical manner. This recursive partition process
generates a binary tree, which is called the BSP TREE REPRESENTATION
of the desired image. The most critical task is choosing the criterion used to select the
partitioning lines of the BSP tree representation of the image. The given
image signal is thus divided into different geometric regions. The regions are
described using two pieces of information:
• The geometry of the region boundaries
• Attributes of the image signal within the region
To achieve high compression ratio and image quality, the given image
must be partitioned into a minimum number of regions in such a way that the
geometric description of the region's boundaries is simple and the image
signal within each region is continuous, i.e. smooth. Therefore a balance must be
achieved between these two conflicting requirements. The conventional
algorithm focuses on only one of the requirements, at the cost of the other.
Our proposed algorithm pays heed to both requirements. The balance is
achieved by a simple and flexible description of the image regions. Due to this
flexible geometric description, which is based on arbitrarily oriented lines, the
given image is represented by a minimum number of regions while maintaining
the smoothness of the signals in these regions.
CONSTRUCTION PROCEDURE:
The BSP approach partitions the given image recursively by
straight lines in a hierarchical manner. First, a line is selected based on an
appropriate criterion, and the given image is partitioned into two sub-images.
Using the same criterion, two lines are selected to split the
two sub-images obtained as a result of the first partition. The process is repeated
until the terminating criterion is met. A set of convex regions is obtained as a
result of partitioning, and the binary space partitioning tree is obtained.
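The recursive procedure can be sketched as below. This is a simplified illustration under stated assumptions, not the report's implementation: each region is stored as a list of pixel coordinates, each candidate line is given in normal form (theta, rho), the selection criterion is the squared error of a constant (mean) fit in each child rather than the linear polynomial fit used later in the report, and recursion stops at a depth limit or when a region is already fitted exactly. All names are hypothetical.

```python
import math


def sse(image, pixels):
    """Squared error of approximating the region by its mean intensity."""
    values = [image[y][x] for x, y in pixels]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split(pixels, line):
    """Divide a region by the line x*cos(theta) + y*sin(theta) = rho."""
    theta, rho = line
    c, s = math.cos(theta), math.sin(theta)
    left = [(x, y) for x, y in pixels if x * c + y * s < rho]
    right = [(x, y) for x, y in pixels if x * c + y * s >= rho]
    return left, right


def build_bsp(image, pixels, lines, depth, tol=1e-9):
    """Recursively partition a region; leaves keep line=None."""
    node = {"pixels": pixels, "line": None, "children": None}
    if depth == 0 or sse(image, pixels) <= tol:
        return node                       # terminating criterion met
    best = None
    for line in lines:
        left, right = split(pixels, line)
        if not left or not right:
            continue                      # the line misses this region
        cost = sse(image, left) + sse(image, right)
        if best is None or cost < best[0]:
            best = (cost, line, left, right)
    if best is not None:
        _, line, left, right = best
        node["line"] = line
        node["children"] = (build_bsp(image, left, lines, depth - 1, tol),
                            build_bsp(image, right, lines, depth - 1, tol))
    return node


def count_leaves(node):
    if node["children"] is None:
        return 1
    return sum(count_leaves(child) for child in node["children"])


# 4x4 test image with a vertical edge between columns 1 and 2.
image = [[10 if x < 2 else 200 for x in range(4)] for y in range(4)]
pixels = [(x, y) for y in range(4) for x in range(4)]
lines = [(theta, k + 0.5)
         for theta in (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4)
         for k in range(-4, 7)]
tree = build_bsp(image, pixels, lines, depth=3)
```

For this image a single vertical line already separates the two flat regions, so the tree has one partitioning node and two leaves.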
Fig 4.2 BSP
4.4 BSP TREE :
The non-leaf nodes of the tree are associated with partitioning
lines and the leaves represent the un-partitioned regions of the image.
Fig4.3 BSP Tree
4.5 SELECTION OF PARTITION LINES:
It is the most critical task of the BSP tree construction. The
selection of the partition lines involves two steps:
• The space of all possible lines partitioning the image has to be
quantized. This line quantizing process generates a finite set of lines.
The lines are represented using two parameters; therefore, in the
quantization process, two parameters are quantized.
• The partitioning line is then selected from the finite set of
quantized lines, using the line selection criteria.
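The quantization step can be sketched as follows, assuming the two line parameters are the normal angle theta and the signed distance rho from the origin. This normal-form parameterization, the step sizes and the function name are assumptions made for illustration; the report does not state which two parameters it uses.

```python
import math


def quantized_lines(width, height, n_angles, rho_step):
    """Finite set of candidate lines x*cos(theta) + y*sin(theta) = rho."""
    diagonal = math.hypot(width, height)          # no line can lie farther than this
    n_rho = int(2.0 * diagonal / rho_step) + 1
    lines = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles            # angles quantized over [0, pi)
        for j in range(n_rho):
            rho = -diagonal + j * rho_step        # offsets quantized over [-d, d]
            lines.append((theta, rho))
    return lines


candidates = quantized_lines(width=4, height=4, n_angles=4, rho_step=1.0)
```

A finer angle or offset step gives more candidate lines, which allows a closer fit to edges at the price of a longer search and more bits per encoded line.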
4.6 GEOMETRIC WAVELET COEFFICIENTS:
To approximate the image f in a given region Ωi, a bivariate linear
polynomial is used, which is defined by:
$$Q_{\Omega_i}(x, y) = A_i x + B_i y + C_i \qquad (1)$$
The functional used to find the best subdivision for a given region is the
following:
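A form of this functional consistent with the definitions that follow, assuming the usual least-squares criterion for geometric wavelets, is the sum of the squared approximation errors over the two children; the partitioning line is then the one that minimizes it. The symbol F and the norm notation are assumptions introduced here for illustration.

```latex
F_{\Omega}(\Omega_0, \Omega_1)
  = \left\| f - Q_{\Omega_0} \right\|_{L_2(\Omega_0)}^{2}
  + \left\| f - Q_{\Omega_1} \right\|_{L_2(\Omega_1)}^{2}
```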
Here Ω0 and Ω1 represent the subsets resulting from the subdivision of
Ω, and should be considered as children of the parent Ω.
First a line L divides the region Ω into two regions Ω0 and Ω1. The two
regions Ω0 and Ω1 are further divided into Ω00, Ω01 and Ω10, Ω11
respectively. These four regions are further divided into eight segments, and this
is done recursively. The result is represented in a tree structure.
The values of the coefficients A, B and C are found by minimizing the function
given below:
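Assuming the standard least-squares error over the pixels of the region, the function to be minimized can be written as follows. The symbol E is introduced here for illustration only.

```latex
E(A, B, C) = \sum_{(x, y) \in \Omega} \bigl( f(x, y) - (A x + B y + C) \bigr)^{2}
```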
By taking the partial derivatives with respect to A, B and C and solving the three
resulting equations, the coefficients of the polynomial are obtained.
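Assuming the function being minimized is the sum of squared differences between f and Ax + By + C over the pixels of the region, setting the three partial derivatives to zero gives a 3×3 linear system (the normal equations). The sketch below solves it with Cramer's rule; the function names and the sample region are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares A, B, C for f ~ A*x + B*y + C from (x, y, f) samples."""
    n = len(samples)
    sx = sum(x for x, _, _ in samples)
    sy = sum(y for _, y, _ in samples)
    sxx = sum(x * x for x, _, _ in samples)
    syy = sum(y * y for _, y, _ in samples)
    sxy = sum(x * y for x, y, _ in samples)
    sxf = sum(x * f for x, _, f in samples)
    syf = sum(y * f for _, y, f in samples)
    sf = sum(f for _, _, f in samples)
    # Normal equations, one row per partial derivative (A, B, C).
    matrix = [[sxx, sxy, sx],
              [sxy, syy, sy],
              [sx, sy, n]]
    rhs = [sxf, syf, sf]
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("degenerate region: pixels are collinear or too few")
    coeffs = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]     # Cramer's rule: swap in the right-hand side
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)


# Region of 3x3 pixels whose intensities lie exactly on f = 2x + 3y + 5.
region = [(x, y, 2 * x + 3 * y + 5) for y in range(3) for x in range(3)]
A, B, C = fit_plane(region)
```

For data that lies exactly on a plane, the fit recovers the plane's coefficients; for real image regions it returns the best linear approximation in the least-squares sense.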
The local difference is used to define the geometric wavelets. The local
difference is the difference between the current partition and the
previous one, giving an idea of the degree of change: if the difference is large, then
the new partition is capturing new details, and if the difference is small, the new
partition does not add new information. The GW is defined as follows:
$$\psi_{\Omega_0}(f) \triangleq \mathbf{1}_{\Omega_0}\,(Q_{\Omega_0} - Q_{\Omega}) \qquad (4)$$
where 1Ω0 is the indicator function that equals 1 in Ω0 and 0 elsewhere, and Ω0
denotes one of the children of Ω. It is possible to reconstruct the function f using the GW
due to the term cancellations.
Using the BSP tree, the norm of each ψΩi (f ) is computed, which is a
measure of the degree of change, then these numbers are sorted and the
function is approximated by the n-term geometric wavelet sum defined as
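Under the definitions above, and assuming the wavelets are indexed in order of decreasing norm (the index notation k_i is introduced here only for illustration), the sorting and the n-term sum can be written as:

$$\| \psi_{\Omega_{k_1}}(f) \|_2 \ge \| \psi_{\Omega_{k_2}}(f) \|_2 \ge \| \psi_{\Omega_{k_3}}(f) \|_2 \ge \cdots$$

$$f \approx \sum_{i=1}^{n} \psi_{\Omega_{k_i}}(f)$$

Keeping only the n wavelets with the largest norms discards the partitions that add the least new information.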
4.7 EXTRACTION OF SPARSE GEOMETRIC WAVELET COEFFICIENTS:
The BSP tree constructed in the above process consists of a large number of
nodes. A geometric wavelet is created for each node, and the wavelets are then
sorted according to their contribution. A sparse geometric representation is
extracted using a greedy approximation methodology, in which n wavelets are
selected from the joint list of geometric wavelets over all tiles.
Fig4.4 Example of a greedy selection
For efficient encoding of the extracted BSP forest, it is necessary that if a
child is present in the sparse representation then its parent is also present,
i.e., each BSP tree should be connected. Instead of encoding an n-term tree
approximation, an n+k geometric wavelet tree is generated by adding k more
nodes. The penalty for imposing the connected tree structure is not large,
since there is a high probability that if a child is significant, all its
ancestors are also significant. Encoding the geometry of the extracted
connected tree structure saves bits, as only the optimal cut has to be
encoded.
Fig4.5 The final GW tree with the additional nodes.
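The greedy selection and the addition of the k extra nodes can be sketched as follows. The dictionary-based tree representation and the function name are assumptions made for the example, and the norms in the usage note are made-up numbers.

```python
def select_connected(norms, parent, n):
    """Pick the n nodes with the largest wavelet norms, then add missing ancestors.

    norms  : dict mapping node id -> norm of that node's geometric wavelet
    parent : dict mapping node id -> parent node id (None for a root)
    Returns (selected, added), where added holds the k extra ancestor nodes.
    """
    ranked = sorted(norms, key=norms.get, reverse=True)
    selected = set(ranked[:n])
    added = set()
    for node in list(selected):
        ancestor = parent[node]
        # walk up until the root or an already selected node is reached
        while ancestor is not None and ancestor not in selected:
            selected.add(ancestor)
            added.add(ancestor)
            ancestor = parent[ancestor]
    return selected, added
```

For a tree with root 0, children 1 and 2, grandchildren 3 and 4 under node 1, and node 5 under node 3, choosing n = 3 from the norms {0: 10, 1: 1, 2: 5, 3: 0.5, 4: 0.2, 5: 8} selects nodes 0, 5 and 2. Nodes 3 and 1 are then added so that node 5 stays connected to the root, giving k = 2.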
4.8 PACKETIZATION:
Various basis selection methods have been proposed to select the best
basis among a library of available wavelet packet bases. The use of different
cost functions may result in different best bases which, in turn, may produce
different coding results using the same quantization method. It is, therefore,
important to take the quantization strategy into account at the time of basis
selection to ensure that the basis chosen by a given criterion will
actually result in better performance in terms of coding gain.
The geometric wavelet transform is actually a subset of a far more
versatile transform, the wavelet packet transform. Wavelet packets are
particular linear combinations of wavelets. They form bases which retain many
of the orthogonality, smoothness and localization properties of their parent
wavelets. The coefficients in the linear combinations are computed by a
recursive algorithm, making each newly computed wavelet packet coefficient
sequence the root of its own analysis tree. Wavelet packet decomposition is a
wavelet transform in which the signal is passed through more filters than in
the discrete wavelet transform.
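The difference between the two transforms can be illustrated with a small one-dimensional sketch. It assumes the Haar filter pair, chosen here only for brevity, and is not the geometric packetization used in this project. In the discrete wavelet transform only the approximation branch is filtered again, whereas in the wavelet packet transform both branches are.

```python
import numpy as np

def haar_step(signal):
    """One Haar analysis step: returns (approximation, detail) at half length."""
    signal = np.asarray(signal, dtype=float)
    even, odd = signal[0::2], signal[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def dwt_leaves(signal, levels):
    """Discrete wavelet transform: only the approximation is split again."""
    leaves = []
    approx = signal
    for _ in range(levels):
        approx, detail = haar_step(approx)
        leaves.append(detail)
    leaves.append(approx)
    return leaves

def packet_leaves(signal, levels):
    """Wavelet packet transform: approximation and detail are both split again."""
    if levels == 0:
        return [np.asarray(signal, dtype=float)]
    approx, detail = haar_step(signal)
    return packet_leaves(approx, levels - 1) + packet_leaves(detail, levels - 1)
```

For an 8-sample signal and two levels, dwt_leaves returns three coefficient sequences and packet_leaves returns four, each being the root of a possible further analysis tree. A basis selection method then decides which of these sequences to keep.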
ENCODING THE SPARSE GEOMETRIC WAVELET
REPRESENTATION :
HEADER ADDITION :
A header is added to the compressed file so that the decoder can
reconstruct the image. The header is added before the encoding process. It
contains the minimum and maximum values of the coefficients of the wavelets
QΩ participating in the sparse representation. These values are used by the
decoder to decode the coefficients. In addition, the header contains the
minimum and maximum values of the gray levels in the image.
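One way the decoder can use the minimum and maximum values from the header is uniform quantization over that range. The sketch below assumes this scheme. The bit depth, the dictionary layout of the header and the function names are assumptions made for the example, since the report does not specify them.

```python
import numpy as np

def quantize(coeffs, bits=8):
    """Map coefficients to integer codes in [0, 2**bits - 1]; return codes and header."""
    coeffs = np.asarray(coeffs, dtype=float)
    lo, hi = float(coeffs.min()), float(coeffs.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    codes = np.round((coeffs - lo) / scale * levels).astype(int)
    return codes, {"min": lo, "max": hi, "bits": bits}

def dequantize(codes, header):
    """Decoder side: rebuild approximate coefficients from the codes and the header."""
    levels = 2 ** header["bits"] - 1
    scale = (header["max"] - header["min"]) or 1.0
    return header["min"] + np.asarray(codes, dtype=float) / levels * scale
```

With this scheme the reconstruction error of each coefficient is at most half a quantization step, that is, (max - min) / (2 * (2**bits - 1)).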
4.9 ENCODING:
The types of information of the geometric wavelet tree to be
encoded are:
• Tree structure information.
• The wavelet coefficients QΩ.
• The bisecting line information of each Ω that has a child.
• Header information.
ENCODING THE ROOT :
Because the contribution of the root “wavelets” to the
approximation is generally high, all of them are encoded. This is analogous to
the JPEG algorithm, where the DC component of the DCT plays the
same role and is always encoded.
The encoding procedure is then applied recursively to each GW tree
root in each image tile. The following steps are performed for each node Ω that
is visited in the recursive algorithm.
• The coefficients of QΩ are encoded using an orthonormal
basis representation.
• The number of children of Ω that participate in the sparse
representation (0, 1, or 2) is encoded.
• If only one child belongs to the sparse representation,
one bit is used to encode which of the two it is.
• If at least one of the children belongs to the sparse
representation, the BSP line that bisects Ω is encoded.
Once the information associated with Ω is encoded, the recursion is
applied only to the child nodes that belong to the sparse representation.
4.10 ENCODING THE TREE STRUCTURE:
Two types of tree structure information need to be encoded:
• The number of children of each node.
• The information needed to distinguish between child nodes.
In our wavelet-based method, not all the children are significant. As in
sparse isotropic wavelet representations, a significant node has, with high
probability, no significant child nodes. The three values are therefore
encoded using a static Huffman code: the Zero-Children symbol is encoded
by “1,” the One-child symbol by “00” and the Two-Children symbol by “01.”
If the One-child symbol is encoded, an additional bit is also encoded to
indicate to the decoder which of the two children is the significant one.
To avoid a synchronization problem at the decoder, a consistent
order is imposed on the children, in the following way:
The first-child of a polygon is defined as follows.
• If the first point of the parent’s polygon exists in only one child, then this
child is the first-child.
• Otherwise, the first point of the parent’s polygon exists in both children’s
polygons. In this case, the second point of the polygon is examined. The
child whose polygon contains the second point is the first-child.
Fig4.6 demonstrates two cases of line-partitioning of a polygon.
The left figure shows that the first point of the parent’s polygon exists
in only one of the children’s polygons, whereas in the right figure the first
point exists in both of the children’s polygons.
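The static Huffman code for the tree structure can be sketched as a pair of recursive functions. The example encodes only the structure bits. In the full coder, the coefficients and the line information of each node would be interleaved with them. The tuple-based node representation and the convention that the extra bit is “0” for the first-child are assumptions made for the example.

```python
def encode_tree(node):
    """Encode which children are significant, as a string of bits.

    node is a pair (first_child, second_child); a missing child is None.
    """
    first, second = node
    present = [child for child in (first, second) if child is not None]
    if len(present) == 0:
        return "1"  # Zero-Children symbol
    if len(present) == 2:
        return "01" + encode_tree(first) + encode_tree(second)  # Two-Children symbol
    which = "0" if first is not None else "1"  # extra bit: which child is significant
    return "00" + which + encode_tree(present[0])  # One-child symbol

def decode_tree(bits, pos=0):
    """Inverse of encode_tree; returns (node, next_position)."""
    if bits[pos] == "1":
        return (None, None), pos + 1
    symbol = bits[pos:pos + 2]
    pos += 2
    if symbol == "01":
        first, pos = decode_tree(bits, pos)
        second, pos = decode_tree(bits, pos)
        return (first, second), pos
    which = bits[pos]
    child, pos = decode_tree(bits, pos + 1)
    return ((child, None) if which == "0" else (None, child)), pos
```

Because the leaf symbol is the shortest code word, a tree in which most significant nodes have no significant children is stored in close to one bit per node. The consistent first-child ordering is what lets the decoder attach each decoded subtree to the correct polygon.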
4.11 PARTICLE SWARM OPTIMIZATION:
INTRODUCTION:
PSO is a robust stochastic optimization technique based on the
movement and intelligence of swarms. PSO applies the concept of social
interaction to problem solving. It was developed in 1995 by James Kennedy
(social-psychologist) and Russell Eberhart (electrical engineer). It uses a
number of agents (particles) that constitute a swarm moving around in the
search space looking for the best solution. Each particle is treated as a point in an
N-dimensional space which adjusts its “flying” according to its own flying
experience as well as the flying experience of other particles.
PSO is a computational method that optimizes a problem by iteratively
trying to improve a candidate solution with regard to a given measure of
quality. PSO optimizes a problem by having a population of candidate
solutions, here dubbed particles, and moving these particles around in the
search-space according to simple mathematical formulae over the particle’s
position and velocity. Each particle’s movement is influenced by its local best
known position and is also guided toward the best known positions in the
search-space, which are updated as better positions are found by other particles.
This is expected to move the swarm toward the best solutions.
PSO shares many similarities with the evolutionary computation
techniques such as Genetic Algorithms (GA). The system is initialized with a
population of random solutions and searches for optima by updating
generations. However, unlike GA, PSO has no evolution operations such as
crossover & mutation. In PSO, the potential solutions, called particles, fly
through the problem space by following the current optimum particles.
DESCRIPTION OF PSO:
• Each particle keeps track of the coordinates in the solution space
that are associated with the best solution (fitness) it has achieved
so far. This value is called the personal best, pbest.
• Another best value tracked by PSO is the best value obtained so far
by any particle in the neighborhood of that particle. When the
neighborhood is the whole swarm, this value is the global best, gbest.
• The basic concept of PSO lies in accelerating each particle toward its
pbest and gbest locations, with a random weighted acceleration at
each time step, as shown in Fig4.7.
Fig4.7 Concept of modification of a searching point by PSO
o sk : current searching point.
o sk+1: modified searching point.
o vk: current velocity.
o vk+1: modified velocity.
o vpbest : velocity based on pbest.
o vgbest : velocity based on gbest.
Start
Initialize particles with random position and velocity vectors
For each particle’s position (p), evaluate fitness
If fitness(p) is better than fitness(pbest), then pbest = p
Set best of pbest as gbest
Update particles’ position and velocity
Stop: giving gbest, the optimal solution
Fig4.8 Flow chart depicting the General PSO Algorithm
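The steps of the flow chart can be sketched in a few lines. This is a generic global-best PSO that minimizes a test function, not the implementation used in this project. The inertia weight w, the acceleration constants c1 and c2, the bounds, the swarm size and the fitness function are all assumptions made for the example.

```python
import numpy as np

def pso(fitness, dim, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimise `fitness` with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    # initialise particles with random positions and velocity vectors
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = rng.uniform(-(hi - lo), hi - lo, size=(n_particles, dim)) * 0.1
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # new velocity: inertia + pull toward pbest + pull toward gbest
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        better = vals < pbest_val          # fitness(p) better than fitness(pbest)
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        g = int(np.argmin(pbest_val))      # set best of pbest as gbest
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])
    return gbest, gbest_val
```

For example, pso(lambda p: float(np.sum(p ** 2)), dim=2) searches for the minimum of a two-dimensional sphere function, which lies at the origin. In a compression setting, the fitness would instead measure the quality of a candidate set of coding parameters.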
4.12 DECODING:
In this step the compressed bit stream is read to find whether the
participating node is a leaf node or has one child or two children. If one child
is participating, the bit stream indicates whether it is the left or the right
child. If at least one of the children belongs to the sparse representation,
the coefficients of the bisecting line are calculated. Using this optimal
cut, the domain is then partitioned into two sub-domains and, depending on the
situation, the vertex set of only one child or of both children is found. An
orthogonal basis was used during the encoding of the coefficients of the
geometric wavelets. Thus, before the decoded geometric wavelets are used in the
n-term sum, their representation in the standard basis is found. This process
is repeated until the entire bit stream is read.
4.13 PERFORMANCE METRICS:
Image quality can be evaluated objectively and subjectively.
Objective methods are based on computable distortion measures. The standard
objective measures of image quality are the Mean Square Error (MSE) and the
Peak Signal-to-Noise Ratio (PSNR), which are defined as
$$\mathrm{MSE} = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ I(x,y) - I'(x,y) \right]^2$$
$$\mathrm{PSNR} = 20 \log_{10} \left( \frac{I_{\max}}{\sqrt{\mathrm{MSE}}} \right)$$
where I(x,y) is the original image, I'(x,y) is the approximated version (which is
actually the decompressed image), M and N are the dimensions of the images, and
I_max is the peak gray-level value (the formula in Chapter 5 takes it as 256).
A lower value of MSE means less error and, because of the inverse
relation between MSE and PSNR, a higher value of PSNR.
A higher value of PSNR is better because it means that the ratio of
signal to noise is higher. Here, the 'signal' is the original image and the
'noise' is the error in reconstruction. So a compression scheme with a lower
MSE (and a higher PSNR) is recognized as a better one.
CHAPTER 5
EXPERIMENTAL RESULTS:
Experimental results are obtained for different types of images, namely
monochrome and polychrome images. The image sizes are in the range of
256x256.
To calculate the PSNR and CR values, the following formulas are used:
1) PSNR = 20 * log10(256 / sqrt(MSE))
2) Compression Ratio = (No. of bits in the compressed image / No. of bits in the original image) x 100
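The two formulas above can be computed as in the sketch below. The peak value is left as a parameter with the value 256 used in formula 1 as its default; 255 is the more common choice for 8-bit images. The function names are assumptions made for the example.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean square error between two images of the same size."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    return float(np.mean((a - b) ** 2))

def psnr(original, reconstructed, peak=256.0):
    """PSNR in dB, 20*log10(peak / sqrt(MSE)); infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0.0:
        return float("inf")
    return 20.0 * np.log10(peak / np.sqrt(error))

def compression_ratio_percent(compressed_bits, original_bits):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed_bits / original_bits
```

With this definition of the compression ratio, a smaller percentage means stronger compression. For example, a 256x256 8-bit image holds 524288 bits, so a compressed stream of 131072 bits gives a ratio of 25.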
The simulation results for our proposed algorithm are as follows:
FOR MONOCHROME IMAGE
ORIGINAL IMAGE
RECONSTRUCTED IMAGE
FOR POLYCHROME IMAGES
READINGS:
MONOCHROME IMAGES: CR = 20
IMAGE          PSNR (dB)   MSE      CR (%)
Camera man     37.77       0.301    37.164
Fruit          37.9377     0.184    95.611
Lena           35.708      0.4024   95.607
Child          36.59       0.296    36.035
Elephant       37.57       0.226    35.578
Flower         37.21       0.325    37.164
POLYCHROME IMAGES:
IMAGE          PSNR (dB)   MSE      CR (%)
Child          38.89       0.150    34.665
Indian Flag    38.39       0.228    45.242
Rose           40.101      0.095    43.75
Earth          40.3744     0.107    67.62
GRAPH:
[Graph: Monochrome images vs CR. Horizontal axis: camera man, fruit, lena, child, elephant, flower. Vertical axis: CR, 0 to 100.]
[Graph: Monochrome images vs PSNR. Horizontal axis: camera man, fruit, lena, child, elephant, flower. Vertical axis: PSNR, 35.5 to 38.]
[Graph: Polychrome images vs CR. Horizontal axis: rose, child, flag, earth. Vertical axis: CR, 0 to 50.]
[Graph: Polychrome images vs PSNR. Horizontal axis: rose, child, flag, earth. Vertical axis: PSNR, 37 to 40.5.]
CONCLUSION:
The proposed compression algorithm was tested on monochrome and
polychrome images. The quality of the image reconstructed using the
geometric wavelet based PSO algorithm was found to be higher than that of
the image obtained by the wavelet-based method. Therefore, the proposed
algorithm, geometric wavelets with PSO, gives better image quality than
geometric wavelets without PSO.
FUTURE ENHANCEMENT:
Our proposed algorithm can be enhanced by applying Particle Swarm
Optimization to more recent representations such as curvelets and wedgelets.