Entropy Based Information Measures in the Joint
Time-Frequency Plane
Nicoletta Saulig, dipl. ing. el.
University of Rijeka - Faculty of Engineering,
Vukovarska 58, HR-51000 Rijeka, Croatia
Email: [email protected]
Abstract—Entropy measures applied to time-frequency signal representations have been found to be reliable indicators of various signal features. This paper gives an introduction to time-frequency signal analysis, emphasizing the properties of time-frequency distributions that allow the application of entropy measures to quantify signal complexity. The paper includes an overview of the key properties of time-frequency entropy measures, as well as possible applications of these measures as estimators of the signal information content or the number of signal components. Entropy measures have also been used for quantification of the performance of different time-frequency distributions, leading to kernel design based on local or global entropy minimization.
Index Terms—Nonstationary signals, time-frequency distributions, information measure, Rényi entropy, performance, kernel
design, optimization
I. INTRODUCTION
One of the fundamental pieces of information when analyzing and processing nonstationary signals in various fields of engineering (telecommunications, acoustics, biomedical engineering) is the quantification of signal complexity. The concept of signal complexity relies on the assumption that signals of high complexity (and therefore high information content) must be constructed from large numbers of elementary components [1]. Following this criterion, nonstationary signals can be characterized as highly complex signals requiring several pieces of information for their characterization. From independent signal representations, in time and in frequency, information about the signal duration and frequency content can be obtained.
Unfortunately, the time-bandwidth product, obtained from
time and frequency signal representations, quantifies neither
the signal complexity nor the information content [1]. Joint
time and frequency representations overcome many of the
limitations of the classical signal representations, showing how
the frequency content of a signal changes in time [2]–[4]. From
such representations different components that are present
in a signal can be identified, as well as their instantaneous
frequencies [2], [5], [6]. Some of the characteristics of time-frequency distributions allow the use of entropy measures from information theory to quantify the signal complexity [7].
This paper provides a review of the information about the
signal that can be obtained when entropy measures are applied
to its time-frequency distribution.
II. TIME-FREQUENCY REPRESENTATION
Time-frequency representations/distributions (TFRs, TFDs) are two-variable functions, C_s(t, f), defined over the two-dimensional (t, f) domain [2], [8], [9]. Such a joint time-frequency representation shows how the frequency content of a signal changes in time [10].
One of the most popular TFDs was introduced by Wigner and afterwards extended by Ville to analytic signals [2], [11],
[12]. The intuitive idea of the Wigner-Ville distribution (WVD)
was to obtain a kind of instantaneous signal spectrum by
performing the Fourier transform of a function related to the
signal, called the kernel function Ks (t, τ ) [13]. The WVD of a
signal s(t), denoted as Ws (t, f ), represents a monocomponent
frequency modulated (FM) signal as a knife-edge ridge in the
(t, f ) plane, whose crest is the instantaneous frequency (IF)
of the signal [2].
Let s(t) be an analytic FM signal of the form [2], [11]
$$s(t) = a(t)e^{j\phi(t)}, \quad (1)$$
where a(t) is the instantaneous signal amplitude, and the signal IF is defined as the time derivative of its instantaneous phase φ(t) [2]:
$$f_i(t) = \frac{\phi'(t)}{2\pi}. \quad (2)$$
The requirement for the WVD to be a knife-edge ridge can mathematically be interpreted as a series of delta functions tracking the signal IF [14]:
$$W_s(t, f) = \delta(f - f_i(t)), \quad (3)$$
which leads to the signal kernel definition:
$$K_s(t, \tau) = F^{-1}_{\tau \leftarrow f}\{\delta(f - f_i(t))\} = e^{j2\pi f_i(t)\tau} = e^{j\phi'(t)\tau}. \quad (4)$$
Since φ'(t) is not directly available, it can be replaced by the central finite-difference approximation [2], [15]:
$$\phi'(t) \approx \frac{1}{\tau}\left[\phi\left(t + \frac{\tau}{2}\right) - \phi\left(t - \frac{\tau}{2}\right)\right]. \quad (5)$$
By substituting (5) into (4), the kernel function and the WVD are defined as [2]
$$K_s(t, \tau) = e^{j\phi(t + \frac{\tau}{2})} e^{-j\phi(t - \frac{\tau}{2})} = s\left(t + \frac{\tau}{2}\right) s^*\left(t - \frac{\tau}{2}\right), \quad (6)$$
$$W_s(t, f) = F_{\tau \to f}\left\{ s\left(t + \frac{\tau}{2}\right) s^*\left(t - \frac{\tau}{2}\right) \right\} = \int_{-\infty}^{\infty} s\left(t + \frac{\tau}{2}\right) s^*\left(t - \frac{\tau}{2}\right) e^{-j2\pi f \tau}\, d\tau. \quad (7)$$
Hence, the WVD can be understood as the Fourier transform, from lag τ to frequency f, of the signal kernel K_s(t, τ), also known as the instantaneous autocorrelation function (IAF) of s(t).
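Since Eqs. (6) and (7) translate directly into a discrete algorithm, a minimal numpy sketch may be helpful. It uses the common discretization K_s[n, k] = s[n+k]s*[n−k] and an FFT over the lag axis; the function name and the zero-padding of unavailable lags are illustrative choices, not prescribed by the paper.

```python
import numpy as np

def wvd(s):
    """Discrete Wigner-Ville distribution: FFT over lag of the
    instantaneous autocorrelation K_s[n, k] = s[n+k] s*[n-k], Eqs. (6)-(7)."""
    N = len(s)
    K = np.zeros((N, N), dtype=complex)      # rows: time n, columns: lag k
    for n in range(N):
        kmax = min(n, N - 1 - n)             # lags for which both samples exist
        k = np.arange(-kmax, kmax + 1)
        K[n, k % N] = s[n + k] * np.conj(s[n - k])
    return np.real(np.fft.fft(K, axis=1))    # lag -> frequency; real by symmetry
```

Because the lag index advances in steps of two time samples, the frequency axis covers half the usual range; applying the sketch to an analytic signal avoids the resulting aliasing.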
From (7) we can notice that using the IAF as the kernel function introduces nonlinearity into the WVD. The effects of this nonlinearity are most evident in the case of multicomponent
signals, as explained below. Note that, in general, a component
in the (t, f ) domain is a ridge of energy concentration whose
peaks follow the component IF law [14].
Let us consider an analytic signal of the form
$$s(t) = s_1(t) + s_2(t). \quad (8)$$
Its IAF is [2]:
$$K_s(t, \tau) = K_{s_1}(t, \tau) + K_{s_2}(t, \tau) + K_{s_1 s_2}(t, \tau) + K_{s_2 s_1}(t, \tau), \quad (9)$$
where
$$K_{s_1 s_2}(t, \tau) = s_1\left(t + \frac{\tau}{2}\right) s_2^*\left(t - \frac{\tau}{2}\right), \quad (10)$$
$$K_{s_2 s_1}(t, \tau) = s_2\left(t + \frac{\tau}{2}\right) s_1^*\left(t - \frac{\tau}{2}\right), \quad (11)$$
are the instantaneous cross-correlation functions that will add
the third term to the WVD of the two-component signal [2],
[14], [16]:
$$W_s(t, f) = W_{s_1}(t, f) + W_{s_2}(t, f) + 2\mathrm{Re}\{W_{s_1 s_2}(t, f)\}. \quad (12)$$
The third term in the summation in (12) is called the cross-term, defined as
$$W_{s_1 s_2}(t, f) = F_{\tau \to f}\left\{ s_1\left(t + \frac{\tau}{2}\right) s_2^*\left(t - \frac{\tau}{2}\right) \right\}. \quad (13)$$
It appears between the signal components, often degrading the quality of the signal representation in the (t, f) plane [17].
The rule of interference construction in the WVD can be summarized as follows. Two points belonging to the signal will interfere to create a third point located at their geometrical midpoint [16]. The amplitude of the
interference will be proportional to the double product of
the amplitudes of the interfering points. In addition, the
interferences oscillate perpendicularly to the line joining the
two signal points, assuming both positive and negative values,
with the frequency of oscillation being proportional to the
distance between these two points [2], [18].
It can be deduced from the general interference rule that interferences will also be present in the case of monocomponent signals with nonlinear FMs; these are called inner cross-terms in [16].
The Quadratic class of time-frequency distributions

A generalization of the WVD is given by a class of time-frequency distributions known as the Quadratic class [2]. Distributions from this class are defined as
$$C_s(t, f) = \gamma(t, f) *_t *_f W_s(t, f), \quad (14)$$
where γ(t, f) is the time-frequency kernel filter, and the double asterisk denotes a double convolution in t and f. Each TFD belonging to the Quadratic class can thus be defined by the double convolution of the WVD and the time-frequency kernel. Eq. (14) can be rewritten in terms of the IAF and the Doppler-lag kernel g(ν, τ) [4]:
$$C_s(t, f) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(\nu, \tau)\, s\left(u + \frac{\tau}{2}\right) s^*\left(u - \frac{\tau}{2}\right) e^{j2\pi(\nu t - \nu u - f\tau)}\, du\, d\nu\, d\tau. \quad (15)$$
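As a concrete instance of Eq. (14), the following sketch smooths a precomputed WVD with a separable time-frequency kernel, which is the mechanism behind the smoothed pseudo-WVD (SPWVD) used in the examples below; the separable Hamming kernel is an assumption made for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

def quadratic_tfd(W, g_t, G_f):
    """Quadratic-class TFD, Eq. (14): C_s(t,f) = gamma(t,f) **_{t,f} W_s(t,f),
    here with a separable smoothing kernel gamma(t,f) = g_t(t) G_f(f)."""
    gamma = np.outer(g_t, G_f)
    gamma /= gamma.sum()                      # preserve total energy, cf. Eq. (16)
    return convolve2d(W, gamma, mode='same')  # double convolution in t and f

# e.g. an SPWVD-like smoothing of the WVD from the previous sketch:
# C = quadratic_tfd(wvd(s), np.hamming(25), np.hamming(65))
```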
The advantages of time-frequency signal representations over classical analysis approaches will be illustrated with an example of a two-component signal whose components are linearly and sinusoidally frequency modulated. Fig. 1 shows that the time and frequency representations are not adequate for the analysis of this multicomponent nonstationary signal; from the time representation (Fig. 1(a)) no information about the signal IF can be obtained, while the frequency representation (Fig. 1(b)) does not provide any information about the arrival times of individual frequencies [14]. Neither of these representations indicates the presence of multiple components. Fig. 1(c) shows the spectrogram (the simplest TFD, corresponding to the squared magnitude of the short-time Fourier transform [2], [4], [8]) of the signal, computed with a Hamming window of duration 65 s. The WVD of the same two-component signal is shown in Fig. 1(d); this representation is highly corrupted by interference terms. The interference is successfully reduced by the time-frequency smoothing performed by the SPWVD [14], [16] (Fig. 1(e)), with time and lag Hamming windows of durations 25 s and 65 s, respectively. The SPWVD also achieves high energy concentration around the signal components' IFs when compared to the spectrogram.
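A script along the following lines reproduces the described test setup; the exact modulation laws and amplitudes are assumptions chosen only to match the stated component types (one linear FM and one sinusoidal FM component, 256 samples, a 65-sample Hamming analysis window).

```python
import numpy as np
from scipy.signal import hilbert, spectrogram

t = np.arange(256.0)
lfm = np.cos(2*np.pi*(0.05*t + 0.15*t**2/(2*256)))       # linear FM: IF 0.05 -> 0.20
sfm = np.cos(2*np.pi*0.35*t + 10*np.sin(2*np.pi*t/256))  # sinusoidal FM around 0.35
s = hilbert(lfm + sfm)                                   # analytic two-component signal

# Spectrogram = squared magnitude of the STFT, 65-sample Hamming window
f, tt, Sxx = spectrogram(s, window=np.hamming(65), nperseg=65,
                         noverlap=60, return_onesided=False)
```

In the same way, the WVD and SPWVD panels of Fig. 1 can be approximated with the wvd and quadratic_tfd sketches above.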
III. ENTROPY MEASURES OF TIME-FREQUENCY DISTRIBUTIONS
The idea of using entropy measures to estimate signal complexity and information content relies on the assumption that signals of high complexity (and therefore high information content) must be constructed from a large number of elementary components. Moment-based measures, such as the time-bandwidth product and its generalizations to second-order time-frequency moments [1], [2], have found wide application, but unfortunately they measure neither the signal complexity nor the information content of the signal [19]–[21]. To illustrate this, consider a signal comprised of two components of compact support, and note that while the time-bandwidth
product increases without bound with separation, signal complexity clearly does not increase once the components become disjoint, as shown in Fig. 2 by the dotted line.

Fig. 1. (a) Time representation, (b) Magnitude spectrum, (c) Spectrogram, (d) WVD, and (e) SPWVD of a two-component nonstationary signal.

Fig. 2. Time-bandwidth product (dotted line) and third-order Rényi entropy (solid line) of a two-component signal, plotted versus the time displacement between the two components.

Some useful properties of the TFDs belonging to the Quadratic class refer to the preservation of signal energy in the (t, f) plane and the marginal conditions. The integration of the TFD over the entire (t, f) plane results in the signal energy [2]:
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_s(t, f)\, dt\, df = E_s, \quad (16)$$
while the integration over frequency and over time, respectively, leads to the instantaneous power and the energy spectrum:
$$\int_{-\infty}^{\infty} C_s(t, f)\, df = |s(t)|^2, \quad (17)$$
$$\int_{-\infty}^{\infty} C_s(t, f)\, dt = |S(f)|^2. \quad (18)$$
It is well known from information theory that the information content and complexity of a probability density function can be measured by the entropy function [1], [7], [20]. The TFDs from the Quadratic class satisfy the marginal conditions given by Eqs. (17) and (18), so the instantaneous power and the energy spectrum can be understood as one-dimensional densities of signal energy in time and in frequency. After the normalization of C_s(t, f),
$$C_{s_n}(t, f) = \frac{C_s(t, f)}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_s(u, v)\, du\, dv}, \quad (19)$$
TFDs may thus be interpreted as 2-D probability density functions [7].
Hence, we would expect the classical Shannon entropy [1], [22],
$$H(C_{s_n}) := -\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_{s_n}(t, f) \log_2 C_{s_n}(t, f)\, dt\, df, \quad (20)$$
to be an acceptable tool for measuring the complexity and
information content of a nonstationary signal in the (t, f ) domain. Since Csn (t, f ) in (20) represents a probability density
function, it is natural to expect that a multicomponent signal
will have larger entropy when compared to a single pulse in
the (t, f ) plane.
As explained in Section II, nonpositivity (due to the presence of interfering terms) is one of the characteristics of the Quadratic class of TFDs. Since a TFD can be negative in some regions of the (t, f) plane, the Shannon entropy cannot be used in practice as a signal complexity measure, due to the logarithm in (20).
A solution to this limitation was proposed in [1], introducing
the generalized entropies of Rényi [23]:
$$H_\alpha(C_{s_n}) := \frac{1}{1-\alpha} \log_2 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_{s_n}^{\alpha}(t, f)\, dt\, df, \quad (21)$$
or in the discrete form:
$$H_\alpha(C_{s_n}) := \frac{1}{1-\alpha} \log_2 \sum_{l=-L}^{L} \sum_{k=-K}^{K} C_{s_n}^{\alpha}(l, k), \quad (22)$$
where α is the order of the Rényi entropy. As shown in [1], when the parameter α (for the Shannon entropy, α → 1) is an odd integer value, the oscillatory structure of eventual interferences between the components is annulled under the integration.

Fig. 3. A Gaussian atom in time (top), frequency (left), and joint time-frequency domain using the SPWVD (center).
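Together with the normalization (19), Eq. (22) amounts to a few lines of numpy; the sketch below assumes energy normalization and a real-valued TFD array.

```python
import numpy as np

def renyi_entropy(C, alpha=3):
    """Discrete Renyi entropy of a TFD, Eq. (22), after the energy
    normalization of Eq. (19). For odd integer alpha the oscillatory
    cross-terms largely cancel in the sum, as discussed above."""
    Cn = C / C.sum()                          # Eq. (19)
    return np.log2(np.sum(Cn**alpha)) / (1 - alpha)
```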
IV. ENTROPY BASED ESTIMATION OF THE SIGNAL COMPLEXITY
In this section it will be shown that time-frequency entropies are valuable tools for evaluating signal complexity, since it is intuitive to expect that signals composed of a large number of non-overlapping elementary components, or of components that can be obtained as their combinations, achieve larger entropy values than a single elementary component. In the case of "quasi-ideal" TFDs (where each signal component contributes separately to the overall TFD, with no interference terms between the components), the analogy between TFDs and probability density functions predicts a counting behavior of the Rényi entropy.
Fig. 4. Two Gaussian atoms in time, frequency, and time-frequency (SPWVD).
A. The entropy counting property
The counting property of the generalized Rényi entropy can be illustrated as follows: consider a compactly supported signal (a Gaussian atom), s_1(t) = s(t), and form a two-component signal by adding to s_1(t) a second component obtained by shifting the first component in time, s_2(t) = s(t − Δt), where Δt denotes the time translation. Assuming that the two signals have separated time supports, the resulting distribution is given by (12):
$$W_{s_1+s_2}(t, f) = W_{s_1}(t, f) + W_{s_2}(t, f) + X_{s_1 s_2}(t, f), \quad (23)$$
where X_{s_1 s_2}(t, f) is the cross-WVD defined by (13).
By taking into account that due to their oscillatory structure the
cross-terms are annulled under the integration for odd powers
of α [1], the Rényi entropy of the WVD of the two-component
signal is:
$$H_\alpha(W_{s_1+s_2}(t,f)) = \frac{1}{1-\alpha} \log_2 \frac{\iint \left( W_{s_1+s_2}(t,f) \right)^{\alpha} dt\, df}{\left( \iint W_{s_1+s_2}(t,f)\, dt\, df \right)^{\alpha}}$$
$$= \frac{1}{1-\alpha} \log_2 \frac{\iint \left( W_{s_1}(t,f) + W_{s_2}(t,f) + X_{s_1 s_2}(t,f) \right)^{\alpha} dt\, df}{\left( \iint \left( W_{s_1}(t,f) + W_{s_2}(t,f) + X_{s_1 s_2}(t,f) \right) dt\, df \right)^{\alpha}}$$
$$= \frac{1}{1-\alpha} \log_2 \frac{\iint W_{s_1}^{\alpha}(t,f)\, dt\, df + \iint W_{s_2}^{\alpha}(t,f)\, dt\, df}{\left( \iint W_{s_1}(t,f)\, dt\, df + \iint W_{s_2}(t,f)\, dt\, df \right)^{\alpha}}$$
$$= \frac{1}{1-\alpha} \log_2 \frac{2 \iint W_{s_1}^{\alpha}(t,f)\, dt\, df}{2^{\alpha} \left( \iint W_{s_1}(t,f)\, dt\, df \right)^{\alpha}}$$
$$= \frac{1}{1-\alpha} \log_2 2^{1-\alpha} + \frac{1}{1-\alpha} \log_2 \frac{\iint W_{s_1}^{\alpha}(t,f)\, dt\, df}{\left( \iint W_{s_1}(t,f)\, dt\, df \right)^{\alpha}}$$
$$= H_\alpha(W_{s_1}(t,f)) + 1. \quad (24)$$
(The third step uses the annulment of the cross-terms for odd α, and the fourth uses the fact that s_2(t) is a time-shifted replica of s_1(t), so the corresponding integrals are equal.)
The Rényi entropy of the two-component signal TFD, H_α(W_{s_1+s_2}(t, f)), carries exactly one more bit of information than the Rényi entropy of the one-component signal TFD, H_α(W_{s_1}(t, f)) (as long as the time shift Δt is larger than the time support of the Gaussian atom). Thus, the number of components can be determined as
$$n = 2^{H_\alpha(W_{s_1+s_2}(t,f)) - H_\alpha(W_{s_1}(t,f))}. \quad (25)$$
When the third-order Rényi entropy (α = 3) is computed for the Gaussian atoms in Figs. 3 and 4, the following results are obtained:
$$H_3(SPW_{s_1+s_2}(t, f)) = 1.3913, \qquad H_3(SPW_{s_1}(t, f)) = 0.3913.$$
From (25) it follows that
$$n = 2^{1.3913 - 0.3913} = 2^1 = 2.$$
This example confirms the accuracy of the Rényi entropy
counting property when the entropy of one of the components
is known in advance.
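The one-bit increment derived in (24) is easy to verify numerically. The sketch below uses the spectrogram as a stand-in for the SPWVD (for identical, time-disjoint atoms the spectrogram satisfies the same additivity), and the atom parameters are arbitrary illustrative choices.

```python
import numpy as np
from scipy.signal import spectrogram

def renyi(C, alpha=3):
    Cn = C / C.sum()                          # energy normalization, Eq. (19)
    return np.log2(np.sum(Cn**alpha)) / (1 - alpha)

def tfd(x):                                   # nonnegative TFD stand-in
    _, _, S = spectrogram(x, window=np.hamming(65), nperseg=65,
                          noverlap=60, return_onesided=False)
    return S

t = np.arange(256.0)
atom = np.exp(-((t - 64)/12)**2) * np.exp(2j*np.pi*0.25*t)           # Gaussian atom s1
pair = atom + np.exp(-((t - 192)/12)**2) * np.exp(2j*np.pi*0.25*t)   # s1 + shifted copy

n = 2**(renyi(tfd(pair)) - renyi(tfd(atom)))  # Eq. (25)
print(n)                                      # close to 2
```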
Let us now consider a signal composed of two components with different time durations. The signal, consisting of two Gaussian atoms, s_1(t) and s_2(t), with durations of 96 s and 32 s, respectively, and with the same frequency support (note that the Rényi entropy is invariant to time and frequency shifts of the signal [1]), is shown in Fig. 5.

Fig. 5. Two Gaussian atoms with different durations, shown in time, frequency, and time-frequency (SPWVD).
The third-order Rényi entropies are:
$$H_3(SPW_{s_1+s_2}(t, f)) = 1.4469, \qquad H_3(SPW_{s_1}(t, f)) = 0.8812, \qquad H_3(SPW_{s_2}(t, f)) = 0.2169.$$
Clearly,
$$2^{H_3(SPW_{s_1+s_2}(t,f)) - H_3(SPW_{s_1}(t,f))} \neq 2^{H_3(SPW_{s_1+s_2}(t,f)) - H_3(SPW_{s_2}(t,f))},$$
$$1.4801 \neq 2.3457 \neq 2.$$
As expected, the signal component s_1(t), which occupies a larger region of the (t, f) plane, exhibits a considerably larger value of the Rényi entropy than the component with the shorter time support, s_2(t). Consequently, the estimation of the number of components based on the difference of the Rényi entropies of the entire signal and one of its components fails regardless of which of the two components is chosen as the reference signal.
Clearly, Eq. (25) will also not hold when applying the Rényi entropy to a signal whose components have different amplitudes, since smaller components are dominated by larger ones and therefore carry less information. In this case the results obtained by the counting property will be significantly conditioned by the choice of the parameter α, since larger entropy orders emphasize the amplitude differences between the components [1].
The counting property likewise does not hold in general when the signal cross-components overlap with the auto-components or with other cross-components. The time-frequency displacement operator
$$(Ds)(t) := e^{j2\pi\Delta f t}\, s(t - \Delta t) \quad (26)$$
translates the signal in the (t, f) plane by the distance
$$D = \sqrt{(\Delta t)^2 + (\Delta f)^2}. \quad (27)$$
It has been shown in [1] that, in the case of non-compactly supported components [3], auto-terms and cross-terms will in general overlap to some degree, and so the counting property can be rewritten as:
$$\lim_{|D| \to \infty} H_\alpha(C_{s+Ds}) = H_\alpha(C_s) + 1. \quad (28)$$
B. Rényi dimension
From the counting property of the Rényi entropy, the Rényi dimension D_α(C_s) of a signal s(t), in terms of its TFD C_s(t, f) and a basic building-block function b (a Gaussian atom), can be defined as [21]
$$D_\alpha(C_s) := 2^{H_\alpha(C_s) - H_\alpha(C_b)}. \quad (29)$$
This dimension can be used as an indicator of the number of basic building blocks required to "cover" the TFD of s(t). For
the simplest signals, composed of disjoint, equal-amplitude
copies of one basic function, the Rényi dimension simply
counts the number of components. As the relative amplitudes
of these components change, the dimension estimate will
also change, since some components become dominant in the signal.
V. ENTROPY BASED PERFORMANCE MEASURES OF TIME-FREQUENCY DISTRIBUTIONS
As explained in Section II, an ideal TFD should represent a signal component as a series of delta functions following its IF, with no spreading of the spectral content and without producing unwanted cross-terms between the various components [2]. TFD concentration measures can provide quantitative criteria to evaluate the performance of different TFDs, and hence they can be helpful during the optimization of a TFD. An extensive review of the existing TF concentration measures is given in [24], [25].
A. Global entropy concentration measures
When considering the representation quality of a TFD over the entire (t, f) plane, one of the widely accepted measures of concentration and "peakedness", introduced by Jones and Parks in [2], [26], is
$$M = \frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_s^4(t, f)\, dt\, df}{\left( \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_s^2(t, f)\, dt\, df \right)^2}, \quad (30)$$
which is the fourth power of the ratio of the L4 -norm and the
L2 -norm of a TFD. Maximizing M concentrates the signal
energy in the (t, f ) plane since the fourth power in the
numerator favors a peaky distribution.
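A direct discrete implementation of Eq. (30), with sums replacing the integrals, is essentially a one-liner; the sketch assumes a real-valued TFD array.

```python
import numpy as np

def ratio_of_norms(C):
    """Jones-Parks concentration measure, Eq. (30): fourth power of the
    ratio of the L4- and L2-norms; larger M means a peakier TFD."""
    return np.sum(C**4) / np.sum(C**2)**2
```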
The Rényi entropies make excellent measures of the information extraction performance of TFDs. By analogy with probability density functions, minimizing the complexity or information in a particular TFD is equivalent to maximizing its concentration, peakiness, and, therefore, resolution. In fact, maximizing M is equivalent to minimizing the differential entropy 3H_4(C_s) − 2H_2(C_s). The Rényi entropy has been proposed as a measure of representation quality of TFDs in [27]. Two normalization schemes have been discussed, which allow the user to give more or less weight to the cross-terms [28].
• Energy normalization:
$$H_\alpha(C_s) = \frac{1}{1-\alpha} \log_2 \sum_{l=-L}^{L} \sum_{k=-K}^{K} \left( \frac{C_s(l, k)}{\sum_{l=-L}^{L} \sum_{k=-K}^{K} C_s(l, k)} \right)^{\alpha}. \quad (31)$$
In this case the cross-terms do not contribute to the denominator term, while they contribute to the numerator term only in the case of even powers of α.
• Volume normalization:
$$H_\alpha(C_s) = \frac{1}{1-\alpha} \log_2 \sum_{l=-L}^{L} \sum_{k=-K}^{K} \left( \frac{C_s(l, k)}{\sum_{l=-L}^{L} \sum_{k=-K}^{K} |C_s(l, k)|} \right)^{\alpha}. \quad (32)$$
A simplified explanation for the effects of volume normalization is that the total volume is affected by the cross-terms; hence, a TFD with smaller cross-terms (measured by volume) will have a smaller uncertainty measure than TFDs with larger cross-terms and the same resolution.

Kernel design for global entropy minimization

After determining the desired normalization model, an adaptive kernel design has been proposed in [27], based on the scheme introduced in [29]. First, the kernel function is derived from a real-valued primitive function h(t) as:
$$\phi(\Theta\tau) = H(\Theta\tau) = \int_{-\infty}^{\infty} h(t)\, e^{-j\Theta\tau t}\, dt, \quad (33)$$
where H(Θτ) is the Fourier transform of h(t). The discrete version of the kernel function is:
$$\phi(m, n) = \sum_{u=-U}^{U} h(u)\, e^{-ju\left(\frac{\pi}{LK}\right)nm}, \quad (34)$$
where h(u) are the samples of h(t), and m = −L, ..., L and n = −K, ..., K. Next, a steepest-gradient method is applied to optimize the primitive function h(u):
$$h(u)^{k+1} = h(u)^{k} - \beta \frac{\partial H_\alpha}{\partial h(u)^{k}}, \quad (35)$$
where β is the step size of each iteration. It has been shown that the Rényi entropy asymptotically decreases as the iterations proceed [27].
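The iteration of Eq. (35) can be sketched generically as follows. The callable entropy_of_primitive, which must evaluate the pipeline of Eqs. (33)-(34) (primitive samples → kernel → TFD → Rényi entropy), is a hypothetical placeholder, and the gradient is approximated here by central finite differences instead of the analytic derivative used in [27].

```python
import numpy as np

def minimize_entropy(h0, entropy_of_primitive, beta=0.05, iters=50, eps=1e-4):
    """Steepest-descent update of the kernel primitive, Eq. (35):
    h(u)^{k+1} = h(u)^k - beta * dH_alpha / dh(u)^k."""
    h = np.asarray(h0, dtype=float).copy()
    for _ in range(iters):
        grad = np.empty_like(h)
        for u in range(h.size):               # finite-difference gradient estimate
            hp, hm = h.copy(), h.copy()
            hp[u] += eps
            hm[u] -= eps
            grad[u] = (entropy_of_primitive(hp) - entropy_of_primitive(hm)) / (2*eps)
        h -= beta * grad                      # one gradient step, Eq. (35)
    return h
```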
B. Local entropy concentration measures

The quality of representation of a TFD can be evaluated not only over the entire (t, f) plane, but also locally, taking into consideration the instantaneous (or local) features of the structure of one or multiple components. In [30] the instantaneous concentration performance of a TFD of a monocomponent signal has been quantified as:
$$p(t) = \frac{A_s(t)}{A_m(t)} \frac{B(t)}{f(t)}. \quad (36)$$
A well-performing TFD achieves small values of p(t) by minimizing the sidelobe amplitude A_s(t) relative to the mainlobe amplitude A_m(t), and the mainlobe bandwidth B(t) relative to the central frequency f(t). For multicomponent FM signals, the good performance of a TFD should be characterized by the minimization of the instantaneous bandwidth of each component, the ratio of the side-lobe to the main-lobe amplitude, and the ratio of the cross-terms to the main-lobe amplitudes. Additionally, the components' separation measure, defined as [30]
$$S(t) = \left( f_2(t) - \frac{B_2(t)}{2} \right) - \left( f_1(t) + \frac{B_1(t)}{2} \right), \quad (37)$$
should be maximized. In [30] a measure that takes into account all of these characteristics has been proposed as:
$$P(t) = 1 - \frac{1}{3} \left( \frac{A_s(t)}{A_m(t)} + \frac{1}{2} \frac{A_x(t)}{A_m(t)} + 1 - \frac{S(t)}{D(t)} \right), \quad (38)$$
where, for a pair of components, A_m(t) and A_s(t) are respectively the average amplitudes of the components' mainlobes and sidelobes, A_x(t) is the cross-term amplitude, and D(t) = f_2(t) − f_1(t) is the difference between the components' IFs. The measure P(t) is close to 1 for well-performing TFDs and 0 for poorly performing ones. The practical estimation of the parameters in (38) is described in [31].
The Rényi entropy, when applied to a short time interval of a TFD, is sensitive to all the parameters taken into consideration in (38), so that, by computing the Rényi entropy on each short time interval of the TFD, the obtained results are consistent with the measure P(t).

Kernel design for local entropy minimization

In [32], a simple scheme for adaptive kernel design has been proposed. The method, initially based on a local ratio-of-norms, uses the concentration measure
$$M(t, p) = \frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |C_{s,p}(t, f)\, w(\tau - t)|^{4}\, dt\, df}{\left( \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |C_{s,p}(t, f)\, w(\tau - t)|^{2}\, dt\, df \right)^{2}}, \quad (39)$$
where w(τ) is a one-dimensional window [26]. Its modification is based on the local entropy concentration measure, computed in the form of a short-term Shannon entropy:
$$H(t, p) := -\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} C_{s,p}(t, f)\, w(\tau - t) \log_2 \left[ C_{s,p}(t, f)\, w(\tau - t) \right] dt\, df. \quad (40)$$
The method provides a procedure for the time adaptation of a single parameter p of a TFD C_{s,p}(t, f). The optimal time-varying parameter of the TFD is thus determined as
$$p^*(t) = \arg\max_p M(t, p), \quad (41)$$
when the ratio of norms is used as the concentration criterion, or as
$$p^*(t) = \arg\min_p H(t, p), \quad (42)$$
when the entropy measure is used. The optimal parameter p^*(t) of the TFD at each time instant t is selected by computing M(t, p) or H(t, p) for different values of p, and the locally optimally concentrated TFD is constructed by varying the value of p at each time instant.
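For a spectrogram whose single parameter p is the analysis window length, the selection rule of Eq. (42) reduces to the sketch below. The candidate lengths, the column-wise short-term entropy, and the crude cropping used to align the time axes of spectrograms with different window lengths are all illustrative simplifications.

```python
import numpy as np
from scipy.signal import spectrogram

def optimal_window_lengths(s, candidates=(17, 33, 65, 129)):
    """Time-varying parameter selection, Eq. (42): at each time instant,
    choose the window length p minimizing a local entropy H(t, p)."""
    H = []
    for p in candidates:
        _, _, S = spectrogram(s, window=np.hamming(p), nperseg=p,
                              noverlap=p - 1, return_onesided=False)
        Sn = S / S.sum(axis=0, keepdims=True)               # normalize each time slice
        H.append(-(Sn * np.log2(Sn + 1e-12)).sum(axis=0))   # local entropy H(t, p)
    T = min(len(h) for h in H)                              # crude time-axis alignment
    Hmat = np.vstack([h[:T] for h in H])
    return np.asarray(candidates)[np.argmin(Hmat, axis=0)]  # p*(t), Eq. (42)
```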
VI. CONCLUSION
The entropy measure is a well known tool for measuring
the information content of a probability distribution. In [20]
the entropy measure has been adapted to the time-frequency
plane, exploiting some of the key properties of TFDs [33], in
order to quantify the information content and complexity of
individual signals, as well as their separation in the TF plane.
The generalized Rényi entropy [23], when applied to a TFD, exhibits desirable properties such as invariance under time and frequency shifts of the signal, and cross-component invariance (for odd entropy orders). Entropy measures have also been shown to be an effective indicator of the number of components in a given signal, under several limiting assumptions [1]: one of the signal components must be known a priori, all the components must be equally supported in the TF plane, and they need to exhibit similar spectral amplitudes. Since in real-life applications these conditions are rarely satisfied, the signal information content may be evaluated w.r.t. a reference signal, obtaining the information on how many reference signals are required to form the analyzed signal [1], [34].
Apart from being a measure of signal information content,
the generalized Rényi entropy can provide information on the
performance of different TFDs [27]. In fact, the results obtained by applying entropy measures of TFDs are comparable
to those obtained by classical TF performance measures [30].
As a result of this, different kernel filters can be optimized by
following criteria of entropy minimization. Iterative methods
for designing optimal kernels based on the Rényi entropy
have been proposed. They either minimize the representation
entropy over the entire TF plane [27], or are based on data-dependent algorithms, considering separately small portions of
the TF plane [32].
REFERENCES
[1] Baraniuk, R.G., Flandrin, P., Janssen, A.J.E.M. and Michel O.J.J.: ‘Measuring time-frequency information content using the Renyi entropies’,
IEEE Transactions on Information Theory, 2001, 47, (4), pp. 1391-1409.
[2] Boashash, B.: ‘Time frequency signal analysis and processing: A comprehensive reference’, Elsevier, Oxford, UK, 2003.
[3] Flandrin, P., ‘Time-Frequency and Time-Scale Analysis’, San Diego, CA:
Academic, 1999.
[4] Sejdic, E., Djurovic, I., and Jin J.:‘Time-frequency feature representation
using energy concentration: An overview of recent advances’, Elsevier
Digital Signal Processing 19 (2009) pp. 153-183.
[5] Stankovic, Lj., Stankovic, S., Djurovic, I., and Dakovic, M.: ‘Time-Frequency Signal Analysis’, Podgorica, 2011.
[6] Hussain, Z., and Boashash, B.: ‘Adaptive instantaneous frequency estimation of multicomponent FM signals using quadratic time-frequency
distributions,’ IEEE Transactions on Signal Processing, 50, 2002, pp.
2127-2135.
[7] Aviyente, S., and Williams, W. J.: ‘Minimum Entropy Time-Frequency Distributions,’ IEEE Signal Processing Letters, 12, (1), 2005, pp. 37-40.
[8] Cohen, L.: ‘Time-Frequency Analysis,’ Prentice Hall, New Jersey, 1995.
[9] Cohen, L. : ‘Distributions concentrated along the instantaneous frequency,’ SPIE: Advanced Signal-Processing Algorithms, Architectures,
and Implementations, 1348, 1990, pp. 149-157.
[10] Boashash, B.: ‘Estimating and interpreting the instantaneous frequency
of a signal-Part 1: Fundamentals; Part 2: Algorithms and applications,’
Proceedings IEEE, 80, 1992, pp. 519-568.
[11] Hahn, S. L.: ‘Hilbert transforms in signal processing,’ Artech House, Inc., Boston, 1996.
[12] Ville, J.: ‘Theorie et applications de la notion de signal analytique,’Cables et Transmissions, 2A, (1). In French. English translation:
I. Selin, Theory and applications of the notion of complex signal, Rand
Corporation Report T-92, 1948, pp. 61-74.
[13] Martin, W., and Flandrin, P.:‘Wigner-Ville Spectral Analysis of Nonstationary Processes’, IEEE Transactions on Acoustics, Speech, and Signal
Processing, 33, (6), 1985, pp. 1461-1470.
[14] Hlawatsch, F., and Boudreaux-Bartels, G. F.: ‘Linear and quadratic time-frequency signal representations,’ IEEE Signal Processing Magazine, 9, (2), 1992, pp. 21-67.
[15] Boashash, B., and Escudie, B.: ‘Wigner-Ville analysis of asymptotic signals and applications,’ Signal Processing, 8, 1985, pp. 315-327.
[16] Hlawatsch, F., and Flandrin, P.: ‘The interference structure of the Wigner
distribution and related time-frequency signal representations,’ in The
Wigner Distribution-Theory and Applications in Signal Processing, W.
Mecklenbrauker and F. Hlawatsch, Eds. Amsterdam, The Netherlands:
Elsevier, 1997.
[17] Boashash, B. and O’Shea P.: ‘Use of the cross Wigner-Ville distribution
for estimation of instantaneous frequency,’ IEEE Transactions Signal
Processing, 41, 1993, pp. 1439-1445.
[18] Saulig N., Sucic V., Boashash B.: ‘An automatic time-frequency procedure for interference suppression by exploiting their geometrical features,’Proceedings of Systems, Signal Processing and their Applications
(WOSSPA), 2011, pp. 311-314.
[19] Baraniuk, R.G., Flandrin, P., and Michel O.J.J.:‘Time-frequency complexity and information’, in Proc. IEEE Int. Conf. Acoustics, Speech, and
Signal Processing-ICASSP ’94, pp. 329-332.
[20] Williams W. J. , Brown M., and Hero A.: ‘Uncertainty, information
and time-frequency distributions’, SPIE Advanced Signal Processing
Algorithms, 1556, 1991, pp. 144-156.
[21] Baraniuk, R.G., Flandrin, P., and Michel, O.J.J.: ‘Information and Complexity on the Time-Frequency Plane,’ Quatorzième Colloque GRETSI, Juan-Les-Pins, 1993, pp. 359-362.
[22] Shannon, C. E.: ‘A mathematical theory of communication-Part I,’ Bell
Syst. Tech J., 27, 1948, pp. 379-423.
[23] Rényi, A.: ‘On measures of entropy and information,’ in Proc. 4th
Berkeley Symp. Mathematics of Statistics and Probability, 1, 1961, pp.
547-561.
[24] Stankovic, LJ.:‘A measure of some time-frequency distributions concentration,’ Signal Processing, 81, (3), 2001, pp. 621-631.
[25] Sejdic, E., Djurovic, I., and Jiang, J.: ‘Time-frequency feature representation using energy concentration: an overview of recent advances,’ Digital Signal Processing, 19, (1), 2009, pp. 154-188.
[26] Jones, D. L., and Parks, T. W.: ‘A high-resolution data-adaptive timefrequency representation,’ IEEE Transactions on Acoustics, Speech, and
Signal Processing, (38), 1990, pp. 2127-2135.
[27] Williams, W. J., and Sang, T. H.: ‘Adaptive RID kernels which minimize time-frequency uncertainty,’ in Proceedings of the IEEE Int. Symp. Time-Frequency and Time-Scale Analysis, 2005, pp. 96-99.
[28] Boutana, D., Benidir, M., Marir, F., and Barkat, B.: ‘A comparative study of some time-frequency distributions using the Renyi criterion,’ in Proceedings of the 13th European Signal Processing Conference, 2005.
[29] Jeong, J., and Williams, W. J.: ‘Kernel design for reduced interference
distributions’, IEEE Transactions on Signal Processing, 40, (2), 1992, pp.
402-412.
[30] Boashash, B., and Sucic, V.: ‘Resolution measure criteria for the objective assessment of the performance of quadratic time-frequency distributions’, IEEE Transactions on Signal Processing, 51, 2003, pp. 1253-1263.
[31] Boashash, B., and Sucic, V.: ‘Parameter selection for optimising time-frequency distributions and measurements of time-frequency characteristics of non-stationary signals,’ Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’01), (6), 2001, pp. 3557-3560.
[32] Jones, D. L., and Baraniuk, R.G.: ‘A simple scheme for adapting time-frequency representations’, IEEE Transactions on Signal Processing, 42, (12), 1994, pp. 3530-3535.
[33] Stankovic, LJ.: ‘An analysis of some time-frequency and time-scale distributions,’ Annales des Télécommunications, 49, (9/10), 1994.
[34] Korac, D., Saulig, N., Sucic, V., Sersic, D., and Stankovic, S.: ‘Detecting
the number of components in a nonstationary signal using the Renyi
entropy of its time-frequency distributions,’ Engineering Review, 32, (1),
2012, pp. 23-31.