
ESCUELA TÉCNICA SUPERIOR DE INGENIEROS INFORMÁTICOS
UNIVERSIDAD POLITÉCNICA DE MADRID
MASTER’S THESIS IN ARTIFICIAL INTELLIGENCE
CONTRIBUTIONS TO THE TRUNCATED VON MISES DISTRIBUTION FOR THE UNIVARIATE AND BIVARIATE CASE
AUTHOR: Pablo Fernández González
SUPERVISORS: Concha Bielza Lozoya, Pedro Larrañaga Mugica
June 2014
Acknowledgments
To my master’s thesis supervisors, Concha Bielza and Pedro Larrañaga, for their advice
and guidance in bringing this work to completion.
Resumen
In this thesis we describe and characterize the truncated von Mises distribution in its univariate and bivariate forms, and we provide additional developments that together specify the definition and applications of this probability distribution. We bring this distribution to a level of development sufficient for it to be applied to modeling and simulation problems that may arise in any field of knowledge being explored by humankind. We apply the results of this work to model and study the distribution of dendritic angles in layer III pyramidal neurons of the mouse cerebral cortex, as an example of what can be achieved using the developed methodology.
Abstract
In this thesis we describe and characterize the truncated von Mises distribution in
its univariate and bivariate form and we provide different additional developments
that in conjunction will specify the definition and applications of this probability
distribution. We set this distribution to a sufficient level of development for it to be
applied to modeling and simulation problems that may arise in any area of knowledge
under human exploration. We apply the findings of this work to model and study the
distribution of dendritic angles in cerebral cortex layer III mice pyramidal neurons
as an example of what can be achieved by analyzing the data with the developed
methodology.
List of Figures
1.1 In radians, the incorrect distance of 7(2π)/9 that the classical mean computed (red) compared to the correct solution of 2(2π)/9 (blue).
1.2 The incorrectly calculated mean of 0°, 30° and 330° using standard statistics (red) compared to the correct solution (blue).
1.3 Both circular Cartesian and complex number coordinates approaches to reference the angle θ = 3π/4 in the circle once initial direction (counterclockwise) and reference angle (0 degrees) have been chosen.
1.4 For angles 0°, 30°, 55°, 78°, 145° and 330°, the correctly calculated mean and the mean resultant length. The calculated values were: θ̄ = 54° 26′ 49.2″ and R = 0.5828.
2.1 Example of different von Mises density functions with varying µ, κ parameters.
2.2 The von Mises distribution functions of the previously shown von Mises density distributions.
3.1 Several truncated von Mises distributions with varying parameters that include all cases. Symmetrical truncation w.r.t. the mean (red), strictly increasing function (blue), strictly decreasing function (green), symmetrical antimode truncation (black), maximum and minimum included truncation (yellow).
3.2 The distribution functions of all the truncated distributions described in Figure 3.1. Notice how the functions do not increase outside the truncation limits.
3.3 I0(x) evaluated in the interval [0, 2π].
3.4 Example of the bi-dimensional von Mises distribution with parameters λ = 1, µ1 = 2, µ2 = 4, κ1 = 3, κ2 = 2, a1 = 0, b1 = 3.8, a2 = 2, b2 = 5.
3.5 Same distribution as in Figure 3.4 although projected, appreciating the truncation parameters from two axis perspectives.
3.6 Several truncated marginals showing unimodality (red) with parameters λ = 5, µ1 = π, µ2 = 0, κ1 = 1, κ2 = 4, a1 = 0, b1 = 2π, a2 = π − 0.2, b2 = 2π; two equal maxima (blue) with parameters λ = 5, µ1 = π, µ2 = 0, κ1 = 1, κ2 = 4, a1 = 0, b1 = 2π, a2 = 0, b2 = 2π; truncated unimodality (green) with parameters λ = 1, µ1 = 4, µ2 = 2, κ1 = 3, κ2 = 4, a1 = 0, b1 = 5, a2 = 2, b2 = 2π; and 2 distinct maxima (black) with parameters λ = 10, µ1 = 6, µ2 = 1, κ1 = 0.3, κ2 = 6, a1 = 0, b1 = 2π, a2 = 0, b2 = 5, respectively.
3.7 Marginal truncated von Mises with parameters λ = 5, µ1 = π, µ2 = 4, κ1 = 2, κ2 = 4 and b2 = 5. The difference between each of them is given by variation of the a2 truncation parameter. For a2 = 2 (black), we have cos(b2 − µ2) > cos(a2 − µ2) and therefore a maximum (the global maximum) is found in the interval [π/2, π]. For a2 = 3 (blue), cos(b2 − µ2) = cos(a2 − µ2), where the distribution presents two global maxima. For a2 = 3.2, cos(b2 − µ2) < cos(a2 − µ2) and f_umtvM(θ1′) presents two critical points in the interval [π/2, π]. For a2 = 3.3565 (approximated value), cos(b2 − µ2) < cos(a2 − µ2) and f_umtvM(θ1′) presents exactly one critical point in [π/2, π]. For a2 = 3.5, cos(b2 − µ2) < cos(a2 − µ2) and f_umtvM(θ1′) presents no critical point in the interval [π/2, π] and therefore the distribution is unimodal. Lastly, for a2 = 4 we fall into the most restrictive case of cos(b2 − µ2) <
Rµ
Rb
cos(a2 − µ2 ) where − a22 f0v20 (θ2 ; π2 )dθ2 ≤ µ22 f0v20 (θ2 ; π2 )dθ2 (the previous cos(b2 − µ2 ) < cos(a2 − µ2 ) cases fell under the complementary
case, where the integral comparison did not verify the inequality) and more specifically the case where a2, b2 ∈ [µ2, µ2 + π], which forces the distribution to present a unimodal behavior regardless of the other parameter values in the interval [π/2, π]. The progression followed by the distribution when modifying the a2 parameter can be seen, under appearances, as an “area shifting” process where moving a truncation parameter towards µ2 also displaces the area of the distribution in that direction, leaving the global maxima always in the π/2-interval including µ1 associated with the truncation parameter whose circular distance to µ2 is higher. The “displacement” of a2 in this case seems to increase the value of the maxima in [π, 3π/2] and decrease the value of the maxima in [π/2, π] in the bi-maximal case until the distribution becomes unimodal, and then continue by decreasing the area under the monotonic curve.
4.1 Graphical visualization of the organization of the dataset.
4.2 Estimated truncated von Mises distribution for the entire dataset. This distribution corresponds to the parameter values of the 9th row (named “All”) in Table 4.1.
4.3 Estimated bivariate truncated von Mises distribution for the joint data of the bifurcation levels 1 and 2. The parameter values of this distribution are those in the second column of Table 4.4 (named “Bif1-2”).
4.4 Marginal distribution of the first component (Bifurcation 1) in the bivariate case for Bifurcation levels 1 and 2 shown in Figure 4.3.
4.5 Marginal distribution of the second component (Bifurcation 2) in the bivariate case for Bifurcation levels 1 and 2 shown in Figure 4.3.
List of Tables
4.1 Parameter values of truncated von Mises distributions of each group according to the brain area, and the whole dataset.
4.2 Estimated truncated von Mises distributions for the entire dataset separated in 6 bifurcation levels. We can notice the emergence of a pattern when examining the values of the µ parameter, which seem to decrease when increasing the level we look at.
4.3 Estimated truncated von Mises distributions for the different brain areas and for the different bifurcation levels. We can notice how the decreasing µ pattern is highly consistent, appearing in every subgroup except for PrL and M1 in the fewer samples estimator (levels 4 and 5, respectively).
4.4 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to five in the whole dataset. We can notice that the estimation seems to show a tendency to independence through a decreasing tendency in the λ parameter. Also, there exists a decreasing tendency shown by both means µ1, µ2.
4.5 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to five in the M1 region. Here the decreasing tendency in the λ parameter is not followed by either Bif1-2 or by Bif2-3.
4.6 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the M2 region.
4.7 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the PrL region.
4.8 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to five in the S1 region.
4.9 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the S2 region.
4.10 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the V1 region.
4.11 Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the V2 region.
Contents
List of Figures
List of Tables
1 Introduction
1.1 Scope, motivation and objectives of the present work
1.2 Directional statistics
1.2.1 Coordinate systems and the limitations of classical statistics
2 The von Mises distribution
2.1 Definition
2.2 Properties
2.3 Maximum likelihood estimation
2.4 Characteristic function
2.5 Moments
3 Truncated von Mises distribution
3.1 Truncated distribution
3.2 Definition
3.3 Properties
3.4 Bessel functions
3.4.1 Some results on the modified Bessel functions of the first kind
3.4.2 Calculating the indefinite integral of the unnormalized von Mises function by means of its power series expansion
3.5 Maximum likelihood estimation of the parameters
3.6 Characteristic function
3.7 Moments
3.8 The bi-dimensional truncated von Mises distribution
3.8.1 Definition
3.8.2 Parameter estimation
3.8.3 Conditional and marginal truncated von Mises distributions
4 Application in Neuroscience
4.1 Data organization
4.2 Unidimensional von Mises distribution fitting
4.3 Bidimensional von Mises distribution fitting
4.4 Conclusions and further studies
5 Conclusions and future work
Bibliography
Chapter 1
Introduction
1.1 Scope, motivation and objectives of the present work
When analyzing and developing a probability distribution, a set of standard calculations and descriptors has to be obtained for the particular case under study. These include the moments, the characteristic function, the maximum likelihood estimators, general properties and, for a multivariate distribution, the expressions of the conditional and marginal distributions. In this work we cover all of these results for the truncated von Mises distribution on the circle. We also aim to establish the distribution as a valid option for modeling and simulation problems that require statistical analysis, on an equal footing with other alternatives. Additional motivation comes from noticing that the truncated case of the von Mises distribution has barely received attention. To the best of our knowledge, only one paper, by Bistrian and Iakob (2008), shows developments in this specific direction, although work on the concept of truncation and on the non-truncated von Mises distribution can be easily found in the statistical and mathematical literature. Therefore, additional developments are needed.
The thesis is organized as follows:
In the current chapter we introduce the reader to the field of directional statistics, on which the rest of the presented work is based. It provides the framework that we need to properly address the main objectives of the master’s thesis, and the following chapters refer back to this basic knowledge constantly.
Chapter 2 reviews the von Mises distribution in its non-truncated version, as it is a necessary prerequisite to the main work developed in Chapter 3.
Chapter 3 is the main development of this work, where all the previously stated
objectives are attained.
Chapter 4 contains the application of the distribution to original data in the
field of neuroscience. More specifically, the selected dataset contains measurements
of layer III pyramidal neurons of the mouse cerebral cortex (Ballesteros-Yáñez et al. (2010)).
Chapter 5 is devoted to the conclusions, where we further refer to the specific
achievements and magnitude of the present work.
1.2 Directional statistics
Directional statistics is the branch of statistical theory and methodology in which the observations admit a vectorial representation of fixed length (1 by convention). It was first developed as such by Kanti V. Mardia (Jupp and Mardia (1989)) to properly handle circular and/or spherical observations, whose properties are not correctly addressed by conventional statistics. Kanti V. Mardia and Peter E. Jupp can be considered the essential authors and the main specialists in the field, gathering a number of additional contributions such as Mardia and Jupp (2000).
All possible vectors of a fixed length in an n-dimensional space form a sphere of that fixed radius. Distributions can be defined over the different configurations in which the observations may be found, and many other statistics can be applied to describe them. Directional statistics is also referred to as circular statistics, since the unidimensional case forms a circular space in which a circular observation can be regarded as a point on the perimeter of the circle.
Circular distributions arising in this reformulation of classical statistics appear naturally as models for a variety of phenomena in the application domain. Classical examples include measurements of wind directions from a stationary point, time measurements where we are interested in the positions of the clock’s hands rather than the absolute time, compass measurements, the angles that javelin throwers produce with respect to the ground line, and many others.
Circular statistics can be considered a transformation of classical statistics in which the observations lie on the perimeter of a circle rather than on the infinite line of the classical approach. We define the set of points on the perimeter of a circle of radius 1 (referred to from now on simply as points in the circle, unless stated otherwise) as the set O, which we can express in a bi-dimensional Cartesian coordinate space as O = {(x, y) ∈ R² such that x² + y² = 1}, and we use the classical real set R for the line.
When analyzing the points in the circle, a fundamental difference between both spaces (R and O) becomes clear: the circle has a closed perimeter, as it can be viewed as a line whose two extrema are connected or, put differently, its perimeter encloses a closed shape. This fundamental difference allows the representation of periodic functions in a natural way, and it also implies that classical statistics are insufficient to compute correctly with circular data and to summarize and describe the observations properly.
1.2.1 Coordinate systems and the limitations of classical statistics
Points in the circle need to be represented and referred to properly in O. Suppose we address the problem with unidimensional Cartesian coordinates and attempt to account for the fundamental difference by
\[ x_w = x \bmod 2\pi \]
(where x_w denotes a wrapped variable), restricting our values to an interval of length 2π through the periodicity of the modulus. We may then find that the linear statistics used to summarize and describe our data fail to produce the expected solution. As an example, problems may arise when trying to obtain a point that is at distance d from another. In the circle, the shortest path between two points runs along the circumference, with no distinction between the point we consider the reference and any other. Thus, if we compute the distance between 2π/9 and 8(2π)/9 (in radians), our linear distance expression would calculate
\[ \frac{8(2\pi)}{9} - \frac{2\pi}{9} = \frac{7(2\pi)}{9}, \]
yielding an incorrect solution, since we were expecting to obtain 2(2π)/9 (see Figure 1.1).
Figure 1.1: In radians, the incorrect distance of 7(2π)/9 that the classical mean computed (red) compared to the correct solution of 2(2π)/9 (blue).
This problem arises from the special role of the value 0, which is considered to be “the beginning” of the circle. This example not only suggests that the notion of distance has to be rewritten but also shows that classical Cartesian coordinates are not directly compatible with the notion of a circle.
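The distance example can be reproduced numerically. The following is a minimal Python sketch, not part of the original work: the helper names `linear_distance` and `shortest_arc` are illustrative, and the shortest-arc rule `min(gap, 2π − gap)` is an assumption standing in for the circular distance that the text develops later.

```python
import math

TWO_PI = 2.0 * math.pi

def linear_distance(a, b):
    """Distance on the real line, which ignores that 0 and 2*pi coincide."""
    return abs(a - b)

def shortest_arc(a, b):
    """Length of the shorter of the two arcs joining angles a and b (radians)."""
    gap = abs(a - b) % TWO_PI
    return min(gap, TWO_PI - gap)

a = TWO_PI / 9.0          # the point 2*pi/9
b = 8.0 * TWO_PI / 9.0    # the point 8(2*pi)/9

naive = linear_distance(a, b)   # 7(2*pi)/9, the value the text calls incorrect
correct = shortest_arc(a, b)    # 2(2*pi)/9, the value the text expects

print(f"linear: {naive:.4f} rad, shortest arc: {correct:.4f} rad")
```

The two results differ because the linear expression never crosses the value 0, which is exactly the cut that the wrapped coordinate introduces.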
Further extending the drawbacks of the classical approach, another example arises when computing the sample mean of a set of observations. Let us consider a set of 3 observations θ1 = 30°, θ2 = 0°, θ3 = 330° ∈ O (in degrees) and use the classical sample mean µ̂,
\[ \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} \theta_i. \]
Here we obtain (30° + 0° + 330°)/3 = 120° (see Figure 1.2). The result given by the classical mean again does not acknowledge the closed nature of the circle. In the circle 0° = 0° + 360°k, k ∈ Z, so it is possible to say with care (specifying the periodic value k in both expressions) that 0° ≥ 330°. Put differently, 330° lies at a difference of 30° + 360°k from 0°, which the classical mean does not acknowledge, thus yielding an incorrect result (it treats the circle as if it were cut at 0°).
Figure 1.2: The incorrectly calculated mean of 0°, 30° and 330° using standard
statistics (red) compared to the correct solution (blue).
We need therefore a coordinate system that will naturally address the properties
of O over which we can define the statistics to properly describe and summarize our
data.
The solution was found to be to consider the points in the circle as vectors of modulus one in R² and to refer to them by the angle they form w.r.t. a preferred reference angle and orientation, that is, using polar coordinates. Unless otherwise stated, points in circular statistics and in the O set are to be regarded as angular values.
Equipped with those considerations we can finally redefine the Cartesian coordinates to their circular analogue by means of:
x = (sin(θ), cos(θ))
where θ is the angle formed with respect to the initial direction and a reference angle that needs to be specified. Note that although the representation uses a 2-dimensional coordinate system, the interdependence of the coordinates created by the use of a single argument (θ) prevents it from addressing every point in the plane: through the trigonometric representation, the set of addressed points is exactly the perimeter set O. We can see this by increasing the value of θ and observing how the points specified under the coordinate system “draw” O and only O. Note also that periodicity is now handled naturally (as expected by definition) and that ∀θ1, θ2 ∈ O, θ1 + θ2 ∈ O; that is, operations are closed w.r.t. the O set and satisfy all the well-known properties of operations between angles.
More formally, if we consider the new coordinate system as a function C, we have that C : R → O, that is, C “shrinks” the line R (as we are referring to 1-D quantities) into the set of points that belong to the circle, O ⊂ R².
Another proposal is to regard the points on the circle’s perimeter as complex
numbers of the form z = e^{iθ} = cos(θ) + i sin(θ) (see Figure 1.3). Both notations
are commonly used and will appear in the developments of this work.
Figure 1.3: Both circular Cartesian and complex number coordinates approaches to
reference the angle θ = 43 π in the circle once initial direction (counterclockwise) and
reference angle (0 degrees) have been chosen.
Solving the problem of the coordinates is not enough, as the distance example
showed. New statistics need to be defined in order to study data on the circle
effectively.
The redefinition of the mean goes through the definition of two statistics. Let Θ = {θ1, θ2, · · · , θn} be a set of angular observations (note that if we were given unit vectors as observations, their angles with respect to our reference system would be calculated and used as the data). We define the mean components of the circular Cartesian coordinates as:
\[ S = \frac{1}{n}\sum_{i=1}^{n}\sin(\theta_i), \qquad C = \frac{1}{n}\sum_{i=1}^{n}\cos(\theta_i) \]
Then the mean angle θ̄ is calculated as:
\[ \bar{\theta} = \begin{cases} \arctan\left(\frac{S}{C}\right) & \text{if } C \ge 0 \\ \arctan\left(\frac{S}{C}\right) + \pi & \text{if } C < 0 \end{cases} \tag{1.1} \]
For observations in [0°, 180°] (with a counterclockwise direction and a reference point of 0°), where the closing of the line on itself plays no apparent role, this expression agrees with the classical linear sample mean when the observations are placed symmetrically about their mean within an arc shorter than 180°; for other configurations the two values can differ.
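The mean angle of Equation (1.1) can be compared with the classical sample mean on the three observations used earlier. This is a minimal Python sketch with illustrative function names; it assumes that `math.atan2(S, C)` may replace the two branches of Equation (1.1), with which it agrees up to a multiple of 360°.

```python
import math

def arithmetic_mean_deg(angles_deg):
    """Classical sample mean, which treats the circle as cut at 0 degrees."""
    return sum(angles_deg) / len(angles_deg)

def circular_mean_deg(angles_deg):
    """Mean angle from the mean components S and C, returned in (-180, 180]."""
    n = len(angles_deg)
    S = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    C = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    # atan2 selects the quadrant from the signs of S and C, which is what
    # the "+ pi when C < 0" branch of Equation (1.1) does.
    return math.degrees(math.atan2(S, C))

observations = [30.0, 0.0, 330.0]
naive = arithmetic_mean_deg(observations)
circular = circular_mean_deg(observations)

print(f"classical mean: {naive:.1f} deg, circular mean: {circular:.6f} deg")
```

The circular result lies at 0° up to rounding error, whereas the classical mean reports 120°.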
It can be noted that if we represent the point (S, C) in the plane it may not lie on the circle, as it may correspond to a vector that is not of unit length. The length of this vector is called the mean resultant length. It can be calculated as
\[ R = \sqrt{S^2 + C^2} \tag{1.2} \]
or
\[ R = \frac{1}{n}\sum_{i=1}^{n}\cos(\theta_i - \bar{\theta}) \tag{1.3} \]
and it is additionally related to C and S by
\[ C = R\cos(\bar{\theta}) \tag{1.4} \]
\[ S = R\sin(\bar{\theta}) \tag{1.5} \]
where θ̄ is the mean angle (see Figure 1.4).
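Equations (1.2) to (1.5) can be checked against each other numerically. The sketch below is illustrative only: the sample angles are arbitrary values chosen for the check (they are not data from this work), and `atan2` is assumed as a stand-in for Equation (1.1).

```python
import math

def mean_components(thetas):
    """Mean sine and cosine components S and C of a list of angles (radians)."""
    n = len(thetas)
    S = sum(math.sin(t) for t in thetas) / n
    C = sum(math.cos(t) for t in thetas) / n
    return S, C

# Arbitrary illustrative angles, given in degrees and converted to radians.
thetas = [math.radians(a) for a in (10.0, 40.0, 95.0, 200.0)]

S, C = mean_components(thetas)
mean_angle = math.atan2(S, C)

R_from_norm = math.hypot(S, C)                                    # Equation (1.2)
R_from_cosines = sum(math.cos(t - mean_angle) for t in thetas) / len(thetas)  # (1.3)

print(f"R via (1.2): {R_from_norm:.6f}, R via (1.3): {R_from_cosines:.6f}")
print(f"C - R cos(mean): {C - R_from_norm * math.cos(mean_angle):.2e}")  # (1.4)
print(f"S - R sin(mean): {S - R_from_norm * math.sin(mean_angle):.2e}")  # (1.5)
```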
The R value is meaningful in the description of a set of observations, as it is a measure of concentration, the counterpart to the concept of variance in classical statistics. If we were to place some observations on the circle and compute their mean resultant length, then to maximize it we must place all of them at the same point. We can get more detailed insight into these results by noticing the following.
Lemma 1.2.1. R ∈ [0, 1].
Proof. The proof of R ≥ 0 is trivial, as R is the square root of a necessarily non-negative quantity: a sum of squared terms. The proof of R ≤ 1 follows from Inequality (1.6) in the proof of Lemma 1.2.3 below.
Lemma 1.2.2. If Θ can be expressed as Θ = {θ1, · · · , θn, θ1 + π, · · · , θn + π} then R = 0.
Proof. In that case, for every θi (i = 1, · · · , n) there exists θj ∈ Θ such that θj = θi + π, and therefore cos(θi) = − cos(θj) and sin(θi) = − sin(θj). That is, opposite angles cancel each other’s coordinates in the C, S computations.
Lemma 1.2.3. R = 1 only when θ1 = θ2 = θ3 = · · · = θn−1 = θn ∈ Θ (All angles
are equal).
Proof. The case R = 1 occurs only when (S, C) satisfies the fundamental trigonometric identity S² + C² = 1, thus corresponding to a point in the circle.
We need to prove that if ∃θi, θj ∈ Θ such that θi ≠ θj then R < 1, or equivalently,
\[ \sqrt{\left(\frac{\sin(\theta_1) + \sin(\theta_2) + \cdots + \sin(\theta_n)}{n}\right)^2 + \left(\frac{\cos(\theta_1) + \cos(\theta_2) + \cdots + \cos(\theta_n)}{n}\right)^2} < 1 \]
\[ (\sin(\theta_1) + \sin(\theta_2) + \cdots + \sin(\theta_n))^2 + (\cos(\theta_1) + \cos(\theta_2) + \cdots + \cos(\theta_n))^2 < n^2 \]
We can develop the squared terms as:
\[ (\sin(\theta_1) + \sin(\theta_2) + \cdots + \sin(\theta_n))^2 = \overbrace{\sin^2(\theta_1) + \cdots + \sin^2(\theta_n)}^{n} + 2\Big(\overbrace{\sin(\theta_1)\sin(\theta_2) + \cdots + \sin(\theta_{n-1})\sin(\theta_n)}^{\frac{n^2-n}{2}}\Big) \]
\[ (\cos(\theta_1) + \cos(\theta_2) + \cdots + \cos(\theta_n))^2 = \overbrace{\cos^2(\theta_1) + \cdots + \cos^2(\theta_n)}^{n} + 2\Big(\overbrace{\cos(\theta_1)\cos(\theta_2) + \cdots + \cos(\theta_{n-1})\cos(\theta_n)}^{\frac{n^2-n}{2}}\Big) \]
Now, applying cos²(θ) + sin²(θ) = 1 to the squared terms, we obtain:
\[ n + 2\big(\sin(\theta_1)\sin(\theta_2) + \cos(\theta_1)\cos(\theta_2) + \cdots + \sin(\theta_{n-1})\sin(\theta_n) + \cos(\theta_{n-1})\cos(\theta_n)\big) < n^2 \]
\[ \sin(\theta_1)\sin(\theta_2) + \cos(\theta_1)\cos(\theta_2) + \cdots + \sin(\theta_{n-1})\sin(\theta_n) + \cos(\theta_{n-1})\cos(\theta_n) < \frac{n^2-n}{2} \]
Grouping the terms by means of the equality cos(φ − θ) = cos(φ) cos(θ) + sin(φ) sin(θ) we obtain:
\[ \overbrace{\cos(\theta_2 - \theta_1) + \cos(\theta_3 - \theta_1) + \cdots + \cos(\theta_n - \theta_{n-1})}^{\frac{n^2-n}{2}} < \frac{n^2-n}{2} \tag{1.6} \]
At this point we can see, given cos(x) ∈ [−1, 1], that the only configuration that contradicts the inequality is the one where all the terms reduce to cos(0) = 1, which requires θi = θj ∀i, j. Since we exclude this possibility for at least one pair, there exists a term cos(θi − θj) < 1 in the previous expression, so the inequality is satisfied for all permitted values. Inequality (1.6) also shows that any configuration other than all angles equal will necessarily produce R < 1, thus proving Lemma 1.2.1.
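The expansion used in the proof amounts to the identity n²R² = n + 2 Σ_{i<j} cos(θj − θi), which can be verified numerically. The following Python sketch uses arbitrary illustrative angles and helper names that are not part of the original work; it also evaluates the configurations of Lemmas 1.2.2 and 1.2.3.

```python
import math
from itertools import combinations

def resultant_length(thetas):
    """Mean resultant length R = sqrt(S^2 + C^2), as in Equation (1.2)."""
    n = len(thetas)
    S = sum(math.sin(t) for t in thetas) / n
    C = sum(math.cos(t) for t in thetas) / n
    return math.hypot(S, C)

def pairwise_cosine_sum(thetas):
    """Left-hand side of (1.6): sum of cos(theta_j - theta_i) over all pairs i < j."""
    return sum(math.cos(b - a) for a, b in combinations(thetas, 2))

thetas = [0.3, 1.1, 2.5, 4.0, 5.9]   # arbitrary distinct angles in radians
n = len(thetas)
R = resultant_length(thetas)

lhs = (n * R) ** 2
rhs = n + 2.0 * pairwise_cosine_sum(thetas)
bound = (n * n - n) / 2.0            # number of pairs, the right-hand side of (1.6)

print(f"n^2 R^2 = {lhs:.6f}, n + 2 * pair sum = {rhs:.6f}")
print(f"pair sum = {pairwise_cosine_sum(thetas):.6f} < {bound}")

R_equal = resultant_length([0.7] * 4)                                # all angles equal
R_opposed = resultant_length([0.2, 1.0, 0.2 + math.pi, 1.0 + math.pi])  # opposite pairs
print(f"all equal: R = {R_equal:.6f}, opposite pairs: R = {R_opposed:.2e}")
```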
With this information, we define another statistic that was conceptually introduced before: the distance between two angles φ and θ,
\[ d(\phi, \theta) = 1 - \cos(\phi - \theta). \]
We are now in a position to interpret R as the mean of the “1 − distance to the mean” of our observations. Thus, R only uses the information obtained by averaging the distances to the mean, which can be considered the source of its ability to diagnose concentration.
Formally,
\[ \frac{1}{n}\sum_{i=1}^{n} d(\theta_i, \bar{\theta}) = \bar{d} = \frac{1}{n}\sum_{i=1}^{n} (1 - \cos(\theta_i - \bar{\theta})) \]
then, by using Equation (1.3),
\[ \bar{d} = \frac{1}{n}\sum_{i=1}^{n} 1 - \frac{1}{n}\sum_{i=1}^{n} \cos(\theta_i - \bar{\theta}) = \frac{1}{n}\sum_{i=1}^{n} 1 - R \tag{1.7} \]
We obtain
\[ \frac{1}{n}\sum_{i=1}^{n} 1 - \frac{1}{n}\sum_{i=1}^{n} d(\theta_i, \bar{\theta}) = R \]
\[ \frac{1}{n}\sum_{i=1}^{n} (1 - d(\theta_i, \bar{\theta})) = R \]
as stated above.
Figure 1.4: For angles 0°, 30°, 55°, 78°, 145° and 330°, the correctly calculated mean
and the mean resultant length. The calculated values were: θ̄ = 54° 26′ 49.2″ and
R = 0.5828.
It is now straightforward to introduce, as a generalization of the restriction to the mean imposed in Equation (1.7), the statistic for computing the dispersion of a set of angles Θ about a given angle θ as:
\[ D(\Theta, \theta) = \frac{1}{n}\sum_{i=1}^{n} (1 - \cos(\theta_i - \theta)). \]
This notion of distance takes into consideration the periodicity of the circle, but its results do not express distances along the perimeter. Accounting for the perimeter scaling, another notion of distance was found in this work to be:
\[ d_2(\theta_1, \theta_2) = \arccos(\cos(\theta_1 - \theta_2)) \]
which can be considered the circular analogue to that on the line,
\[ d(x_1, x_2) = |x_1 - x_2|. \]
Lastly, the statistic
\[ V = 1 - R \in [0, 1] \]
has been proposed as the circular analogue to the linear variance, although other proposals also exist.
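The statistics of this section can be put side by side in a few lines. This is a minimal Python sketch under stated assumptions: the function names are illustrative, the sample angles are arbitrary, and `atan2` is used for the mean angle. It checks that the dispersion about the mean angle equals V = 1 − R and that d2 recovers the arc length of the earlier distance example.

```python
import math

def cosine_distance(phi, theta):
    """d(phi, theta) = 1 - cos(phi - theta)."""
    return 1.0 - math.cos(phi - theta)

def arc_distance(theta1, theta2):
    """d2(theta1, theta2) = arccos(cos(theta1 - theta2)), a length along the perimeter."""
    return math.acos(math.cos(theta1 - theta2))

def dispersion(thetas, theta):
    """D(Theta, theta): mean cosine distance of the angles to a given angle."""
    return sum(cosine_distance(t, theta) for t in thetas) / len(thetas)

def mean_angle_and_length(thetas):
    """Mean angle and mean resultant length R of a list of angles (radians)."""
    n = len(thetas)
    S = sum(math.sin(t) for t in thetas) / n
    C = sum(math.cos(t) for t in thetas) / n
    return math.atan2(S, C), math.hypot(S, C)

thetas = [0.2, 0.9, 1.4, 5.8]            # arbitrary illustrative angles
mean_angle, R = mean_angle_and_length(thetas)
V = 1.0 - R                              # circular variance

print(f"D about the mean: {dispersion(thetas, mean_angle):.6f}, V = {V:.6f}")
print(f"d2(2pi/9, 16pi/9) = {arc_distance(2 * math.pi / 9, 16 * math.pi / 9):.6f}")
print(f"4pi/9             = {4 * math.pi / 9:.6f}")
```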
Chapter 2
The von Mises distribution
In this chapter we review the von Mises distribution in full, as knowledge of it overlaps heavily with that of the truncated von Mises distribution of the next chapter. As on the line, probability distributions followed by a circular random variable (a random variable that produces angular values or unit vectors) can also be studied and defined. Distributions on the circle are angular l-periodic distributions (where l ∈ R and ∃n ∈ N such that nl = 2π), that is, periodic distributions of whose period 2π is a multiple. They can be obtained mainly by two related procedures: natively defining them on O or wrapping them from distributions on the line.
A random variable wrapped on the circle is obtained from a random variable on the line by introducing the fundamental difference between both sets into its definition. In this case a circular random variable X_w is defined w.r.t. the line random variable X as:
\[ X_w = X \bmod 2\pi. \]
Using the complex numbers notation, it is defined as:
\[ X_w = e^{iX}, \]
and the density function of the probability distribution associated with that variable can also be written in terms of the line density function as:
\[ f_w(\theta) = \sum_{k=-\infty}^{\infty} f(\theta + 2\pi k). \]
The most significant example is the wrapped normal distribution,
\[ f_{WN}(\theta; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} e^{\frac{-(\theta - \mu + 2\pi k)^2}{2\sigma^2}} \tag{2.1} \]
which, as we will see, shares some relationships with the von Mises distribution.
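The wrapping sum can be approximated by truncating it to a finite number of terms. The following Python sketch is an illustration under assumptions that are not part of the original work: the truncation bound `k_max`, the parameter values and the midpoint integration rule are all choices made for this example.

```python
import math

TWO_PI = 2.0 * math.pi

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution on the line."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(TWO_PI))

def wrapped_pdf(theta, line_pdf, k_max=20):
    """Truncated wrapping sum: f_w(theta) ~ sum of f(theta + 2*pi*k), |k| <= k_max."""
    return sum(line_pdf(theta + TWO_PI * k) for k in range(-k_max, k_max + 1))

def wrapped_normal_pdf(theta, mu, sigma, k_max=20):
    """Wrapped normal density of Equation (2.1), with the sum truncated at k_max."""
    return wrapped_pdf(theta, lambda x: normal_pdf(x, mu, sigma), k_max)

def integrate_over_circle(func, start=0.0, grid=2000):
    """Midpoint-rule integral of func over [start, start + 2*pi]."""
    step = TWO_PI / grid
    return sum(func(start + (j + 0.5) * step) for j in range(grid)) * step

mu, sigma = 1.0, 1.5                      # illustrative parameter values
total = integrate_over_circle(lambda t: wrapped_normal_pdf(t, mu, sigma))
at_point = wrapped_normal_pdf(0.7, mu, sigma)
one_turn_later = wrapped_normal_pdf(0.7 + TWO_PI, mu, sigma)

print(f"integral over one turn: {total:.9f}")
print(f"f_w(0.7) = {at_point:.9f}, f_w(0.7 + 2*pi) = {one_turn_later:.9f}")
```

For the moderate σ used here the neglected terms of the sum are far below floating-point precision; a much larger σ would need a larger `k_max`.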
Native circular distributions are directly defined in the O domain, although one can establish a mapping between the line and the circle’s perimeter and therefore find or hypothesize the existence of their linear counterpart, and vice versa.
Let θ be a continuous random variable that follows a circular distribution with density f(θ). Then f(θ) satisfies:
1. ∫_a^{2π+a} f(θ) dθ = 1, where a ∈ R
2. f(θ + 2πk) = f(θ), ∀k ∈ Z
That is, the properties that most differentiate the two scenarios (linear and circular) are the redefinition of the integration limits to those of the circle (1.) and the periodicity of the density function (2.).
2.1 Definition
The von Mises probability distribution is natively defined as
\[ f_{vM}(\theta; \mu, \kappa) = \frac{e^{\kappa\cos(\theta - \mu)}}{2\pi I_0(\kappa)} \tag{2.2} \]
where
1. µ ∈ [i, i + 2π], i ∈ R, is the location parameter, as it defines where the mode of the distribution is placed: the maximum value of the cos(.) function is reached at θ = µ, thus relating µ directly to the mode. The value i enables the selection of the 2π-length interval over which the distribution is observed. The most common values in the literature are i = 0 or i = −π; in this work, unless otherwise stated, the considered interval is [0, 2π). Additionally, µ is commonly called the mean parameter, since in this case, as in other well-known cases such as the normal distribution, the mode and the mean coincide (these distributions are called “mean centered distributions”, as the density tends to concentrate around the mean).
2. κ ∈ (0, ∞) is the scale or concentration parameter, the counterpart of the σ parameter of the normal distribution. It determines the concentration of the distribution around its highest values (in this case around the mean). The higher κ is, the more concentrated around the mean the distribution becomes. In the special case κ = 0 the distribution reduces to the uniform circular distribution: f_vM(θ; µ, 0) = u(θ) = 1/(2π).
3. I0(κ) = Σ_{m=0}^{∞} κ^{2m} / (2^{2m} (m!)²) is the modified Bessel function of the first kind of order 0. It will be addressed properly in Section 3.4.
Figure 2.1: Example of different von Mises density functions with varying µ, κ parameters.
By manipulating the µ, κ parameters, the resulting von Mises function may differ
in location and concentration from other von Mises distributions (see Figure 2.1),
as suggested by the parameters definition.
2.2 Properties
The von Mises distribution is built on the periodic function
$$f_{uvM}(\theta; \mu, \kappa) = e^{\kappa \cos(\theta-\mu)} \tag{2.3}$$
which will be referred to as the unnormalized von Mises distribution; its integral over any 2π-length interval [i, i + 2π] is
$$\int_{i}^{i+2\pi} e^{\kappa \cos(\theta-\mu)}\, d\theta = 2\pi I_0(\kappa).$$
Therefore, analyzing Equation (2.3) allows us to observe and report many of the
properties of the distribution. $f_{uvM}$ can be decomposed into a continuous strictly
increasing function $e^{(\cdot)}$, a positive constant κ and a function cos(·) ∈ [−1, 1].
With this we can conclude
$$f_{uvM}(\theta; \mu, \kappa) \in [e^{-\kappa}, e^{\kappa}]$$
Realizing now that $I_0(\kappa)$ is a positive strictly increasing function for κ > 0 allows
us to say that
$$f_{vM}(\theta; \mu, \kappa) > 0 \quad \forall \theta, \mu, \kappa$$
which implies that its distribution function $F_{vM}(x) = \int_0^x f_{vM}(\theta; \mu, \kappa)\, d\theta$, for $f_{vM}$
defined in [0, 2π] and x ∈ [0, 2π], is a strictly increasing function in [0, 2π]. In general,
$F_{vM}(x) = \int_i^{x} f_{vM}(\theta; \mu, \kappa)\, d\theta$ is strictly increasing for x ∈ [i, i + 2π] (see Figure 2.2).
Figure 2.2: The von Mises distribution functions of the previously shown von Mises
density distributions.
The distribution is symmetrical w.r.t. the location parameter as:
fvM ((µ + θ) − µ) = fvM ((µ − θ) − µ)
fvM (θ) = fvM (−θ)
This behavior is obtained from the known even property of the cos(.) function where
cos(−x) = cos(x), as it takes the independent variable (θ) as input.
An interesting result involving both the wrapped normal distribution and the von
Mises distribution is that they approximate each other increasingly well as κ grows:
the von Mises distribution tends to converge to a corresponding wrapped
normal distribution for large κ. More formally, the result reported in
Mardia and Jupp (2000) is:
$$\lim_{\kappa \to \infty} f_{vM}(\theta; \mu, \kappa) = f_{WN}\left(\theta; \mu, \sqrt{\frac{1}{\kappa}}\right)$$
where $f_{WN}$ was defined in Equation (2.1).
The existence of this progressive approximation as κ
grows is acknowledged in the literature and allows the use of $f_{WN}$ instead of the von
Mises distribution in problems where the latter could be applied.
2.3 Maximum likelihood estimation
Inside the statistical inference scenario, we are interested in approximating the underlying probability distribution that a random variable follows by the information
provided solely by the samples collected from it. In this section, we will develop for
contextual purposes the maximum likelihood estimator of the von Mises distribution
parameters. It can be found also in Mardia and Jupp (2000).
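Given the data Θ = {θ_1, θ_2, ..., θ_n}, the log-likelihood function
$$\ln L(\mu, \kappa; \theta_1, \theta_2, \cdots, \theta_n) = \sum_{i=1}^{n} \ln f(\mu, \kappa; \theta_i)$$
is, for the von Mises distribution,
$$\ln L(\mu, \kappa; \theta_1, \theta_2, \cdots, \theta_n) = \sum_{i=1}^{n} \kappa \cos(\theta_i - \mu) - n \ln(2\pi I_0(\kappa))$$
We seek to solve the system of log-likelihood equations created by:
$$\frac{\partial \ln L}{\partial \mu} = 0, \qquad \frac{\partial \ln L}{\partial \kappa} = 0$$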
These are two equations with two unknown variables. For the partial derivative
with respect to µ we obtain:
$$\frac{\partial \ln L}{\partial \mu} = \sum_{i=1}^{n} \kappa \sin(\theta_i - \mu) = 0$$
or
$$\frac{1}{n} \sum_{i=1}^{n} \kappa \sin(\theta_i - \mu) = 0$$
We know by definition that κ > 0. Thus, if a solution exists,
it is independent of the value of κ. Therefore
$$\frac{1}{n} \sum_{i=1}^{n} \sin(\theta_i - \mu) = 0$$
$$\frac{1}{n} \sum_{i=1}^{n} \left(\sin(\theta_i)\cos(\mu) - \sin(\mu)\cos(\theta_i)\right) = 0$$
$$\frac{\cos(\mu)\, \frac{1}{n}\sum_{i=1}^{n} \sin(\theta_i)}{\sin(\mu)\, \frac{1}{n}\sum_{i=1}^{n} \cos(\theta_i)} = 1$$
$$\tan(\mu) = \frac{S}{C}$$
$$\hat{\mu} = \arctan\frac{S}{C}$$
That is, the µ parameter reaches a critical point at the definition of the sample mean
(1.1).
Now we proceed with the partial derivative with respect to κ:
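$$\frac{\partial \ln L}{\partial \kappa} = \sum_{i=1}^{n} \cos(\theta_i - \mu) - n \frac{I_1(\kappa)}{I_0(\kappa)} = 0$$
or
$$\frac{1}{n} \sum_{i=1}^{n} \cos(\theta_i - \mu) = \frac{I_1(\kappa)}{I_0(\kappa)}$$
We have used Equation (3.6) for the Bessel function derivative, stated as
$$\frac{\partial I_n(x)}{\partial x} = \frac{n}{x} I_n(x) + I_{n+1}(x)$$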
although equations (3.5), (3.7) and (3.8) could have also been used considering
$I_{-1}(\kappa) = I_1(\kappa)$ (for a more detailed treatment of the Bessel functions in this work,
see Section 3.4).
At this point we can observe that we are dealing with the definition of R in
Equation (1.3), as we have
$$\hat{R} = \frac{I_1(\kappa)}{I_0(\kappa)} \tag{2.4}$$
Equation (2.4) is commonly referred to in the literature (for example in Mardia
and Jupp (2000)) as the maximum likelihood estimator of R.
If we now consider the system of log-likelihood equations
$$\begin{cases} \hat{\mu} = \arctan(S/C) \\ \frac{1}{n} \sum_{i=1}^{n} \cos(\theta_i - \mu) = \frac{I_1(\kappa)}{I_0(\kappa)} \end{cases}$$
We can take as found the estimator
$$MLE(\mu) = \hat{\mu} = \arctan(S/C)$$
as its expression is independent of the remaining parameter (κ) in the system and
depends solely on the sample data.
The estimator of κ, also independent, introduces the nontrivial problem of
obtaining the inverse of the function
$$A(\kappa) = \frac{I_1(\kappa)}{I_0(\kappa)}. \tag{2.5}$$
However, in this case we can calculate R by equations (1.2) and
(1.3) and numerically approximate the κ whose A(κ) matches it by evaluating A(κ) for different
κ values.
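The estimation procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, `math.atan2` is used in place of arctan(S/C) so that the quadrant of µ̂ is resolved, and A(κ) is inverted by bisection on an assumed search interval [0, 200], which is one simple way of "assessing A(κ) for different κ values" and not necessarily the procedure used in this work.

```python
import math
import random


def bessel_i(n, x, tol=1e-16, max_terms=1000):
    """I_n(x) for integer n >= 0 via its power series."""
    half = x / 2.0
    term = half ** n / math.factorial(n)  # m = 0 term
    total = term
    for m in range(max_terms):
        term *= half * half / ((m + 1) * (m + n + 1))
        total += term
        if term <= tol * total:
            break
    return total


def a_ratio(kappa):
    """A(kappa) = I_1(kappa) / I_0(kappa), Equation (2.5)."""
    return bessel_i(1, kappa) / bessel_i(0, kappa)


def fit_von_mises(thetas, kappa_max=200.0, iterations=100):
    """Return (mu_hat, kappa_hat) for a sample of angles in radians."""
    n = len(thetas)
    s = sum(math.sin(t) for t in thetas) / n
    c = sum(math.cos(t) for t in thetas) / n
    mu_hat = math.atan2(s, c) % (2.0 * math.pi)  # quadrant-aware arctan(S/C)
    r_bar = math.hypot(s, c)                     # mean resultant length
    lo, hi = 0.0, kappa_max                      # assumed bracket for kappa
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if a_ratio(mid) < r_bar:                 # A is increasing in kappa
            lo = mid
        else:
            hi = mid
    return mu_hat, 0.5 * (lo + hi)


if __name__ == "__main__":
    random.seed(0)
    sample = [random.vonmisesvariate(1.0, 4.0) for _ in range(5000)]
    print(fit_von_mises(sample))  # estimates near mu = 1.0 and kappa = 4.0
```

Bisection is chosen here only because A(κ) is monotonically increasing; a table lookup or a Newton iteration would serve the same purpose.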
2.4 Characteristic function
The characteristic function of a random variable is widely used in the literature as a tool
to handle the underlying probability distribution followed by that variable. Among
its interesting properties, it exists for any probability distribution, and a probability distribution is uniquely determined by its characteristic function, which can therefore be used to refer unambiguously to that
distribution when studying it.
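The general expression of the characteristic function of a circular random variable
X is defined as the sequence of complex numbers given by the expression:
$$\Phi_X(t) = E[e^{itX}]$$
where t ∈ Z.
For the von Mises density function in [0, 2π] we have:
$$\begin{aligned}
\Phi_{X_{vM}}(t) = E[e^{itX}] &= \frac{1}{2\pi I_0(\kappa)} \int_0^{2\pi} e^{itx} e^{\kappa \cos(x-\mu)}\, dx \\
&= \frac{1}{2\pi I_0(\kappa)} \int_0^{2\pi} (\cos(tx) + i \sin(tx))\, e^{\kappa \cos(x-\mu)}\, dx \\
&= \frac{\int_0^{2\pi} \cos(tx)\, e^{\kappa \cos(x-\mu)}\, dx}{\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx} + \frac{i \int_0^{2\pi} \sin(tx)\, e^{\kappa \cos(x-\mu)}\, dx}{\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx}
\end{aligned}$$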
The second addend is 0, ∀t ∈ Z, when the distribution is symmetrical w.r.t. the
mean. As it is always the case and considering Equation (3.4), we can simplify the
former expression to
$$\Phi_{X_{vM}}(t) = e^{it\mu} \frac{I_t(\kappa)}{I_0(\kappa)} \tag{2.6}$$
where $I_t(\kappa)$ is the modified Bessel function of the first kind and order t. Note that
$\Phi_{X_{vM}}(-t) = \overline{\Phi_{X_{vM}}(t)}$, the complex conjugate of $\Phi_{X_{vM}}(t)$.
2.5 Moments
The moments of a probability distribution are descriptors associated to power values
of its population and can be derived from the characteristic function associated to
that distribution. More precisely, the t-th trigonometric moment (with t ∈ Z) $m_t$ in
the circle is calculated as the expectation
$$m_t = E\left[\left(e^{iX}\right)^t\right] = E[e^{itX}].$$
It can be immediately noticed that the sequence of all possible moments for t is
equivalent to the characteristic function of that random variable.
Unlike distributions in the line, an important result acknowledged in Mardia and
Jupp (2000) reveals that any circular distribution is completely determined by its
characteristic function, implying that any circular distribution has well defined moments for every value of t. This result appears to arise from a practical fundamental
difference of the closed space of the circle w.r.t. the line and that is the lack of the
infinite extension in the domain of any distribution function, which frees us from
needing it in the circular expectation operators and calculation definitions.
We can derive the moments of the von Mises distribution about a direction a
by:
$$m_{t_{vM}} = E[e^{it(X-a)}]$$
Without considering $m_0 = 1$, the first moment about the 0 direction for the von
Mises distribution is
$$m_{1_{vM}} = \frac{\int_0^{2\pi} \cos(x)\, e^{\kappa \cos(x-\mu)}\, dx}{\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx}$$
Or equivalently:
$$\begin{aligned}
m_{1_{vM}} &= E[e^{iX}] \\
&= E[\cos X + i \sin X] \\
&= E[\cos X] + i E[\sin X]
\end{aligned}$$
Now applying the population versions of equations (1.4) and (1.5) we can follow
with:
$$\begin{aligned}
m_{1_{vM}} &= R \cos(\mu) + i R \sin(\mu) \\
&= R e^{i\mu} \\
&= \frac{I_1(\kappa)}{I_0(\kappa)} e^{i\mu}
\end{aligned}$$
which constitutes the final expression for the first moment. For the second moment
we have
$$m_{2_{vM}} = \frac{\int_0^{2\pi} \cos(2x)\, e^{\kappa \cos(x-\mu)}\, dx}{\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx}$$
$$m_{2_{vM}} = \frac{I_2(\kappa)}{I_0(\kappa)} e^{i2\mu}$$
where $I_2(\kappa)$ is the modified Bessel function of the first kind and order 2.
Since our distribution location is controlled by µ parameter, for location independent descriptions it is interesting to consider the moments about the real µ direction
as:
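$$m'_{1_{vM}} = \frac{\int_0^{2\pi} \cos(x - \mu)\, e^{\kappa \cos(x-\mu)}\, dx}{\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx}$$
which results in:
$$m'_{1_{vM}} = \frac{I_1(\kappa)}{I_0(\kappa)}$$
And
$$m'_{2_{vM}} = \frac{\int_0^{2\pi} \cos(2(x - \mu))\, e^{\kappa \cos(x-\mu)}\, dx}{\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx}$$
which results in:
$$m'_{2_{vM}} = \frac{I_2(\kappa)}{I_0(\kappa)}.$$
We can generalize the notion of moments about the 0 direction for the von Mises
distribution as
$$m_{t_{vM}} = \frac{I_{|t|}(\kappa)}{I_0(\kappa)} e^{it\mu}$$
where |·| is the absolute value operator.
And for the moments about the µ direction we have:
$$m'_{t_{vM}} = \frac{I_{|t|}(\kappa)}{I_0(\kappa)}.$$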
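The closed form for the moments about the 0 direction can be checked numerically. The sketch below, with hypothetical function names and arbitrarily chosen parameter values, compares the closed form with a direct midpoint-rule evaluation of E[e^{itX}] over [0, 2π]; it is a sanity check under these assumptions, not a proof.

```python
import cmath
import math


def bessel_i(n, x, tol=1e-16, max_terms=1000):
    """I_n(x) for integer n >= 0 via its power series."""
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = term
    for m in range(max_terms):
        term *= half * half / ((m + 1) * (m + n + 1))
        total += term
        if term <= tol * total:
            break
    return total


def vm_moment_closed_form(t, mu, kappa):
    """I_|t|(kappa) / I_0(kappa) * exp(i t mu)."""
    return bessel_i(abs(t), kappa) / bessel_i(0, kappa) * cmath.exp(1j * t * mu)


def vm_moment_numeric(t, mu, kappa, steps=4000):
    """E[exp(i t X)] by the midpoint rule, normalizing with the same quadrature."""
    h = 2.0 * math.pi / steps
    num = 0j
    den = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        w = math.exp(kappa * math.cos(x - mu))
        num += cmath.exp(1j * t * x) * w
        den += w
    return num / den


if __name__ == "__main__":
    for t in (1, 2, -1):
        print(t, vm_moment_closed_form(t, 0.7, 2.5), vm_moment_numeric(t, 0.7, 2.5))
```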
Chapter 3
Truncated von Mises distribution
In this chapter the truncated von Mises distribution is presented and developed.
Given the lack of documentation regarding the truncated case of the von Mises
distribution, the work conducted here builds only on Bistrian and
Iakob (2008) and Mardia and Jupp (2000) and is otherwise original in all proposed and attained
goals. It constitutes the main chapter and the main development motivation of the
present work.
3.1 Truncated distribution
A random variable X defined in R is distributed according to a truncated probability
distribution when its distribution, which belongs to a family of probability
distributions, carries an additional specification that restricts its positive support
to a subinterval defined by parameters a, b. Truncated distributions are conditional
distributions on that specification and can be written as:
$$f_{a,b}(x) = f^1(x \mid a < X \le b) = \begin{cases} \dfrac{f^1(x)}{F^1(b) - F^1(a)} & \text{if } a < x \le b \\ 0 & \text{otherwise} \end{cases}$$
where $f^1(x)$ is the non-truncated (commonly called parent) density and $F^1(x)$
its distribution function.
Truncation can also occur in only one of the a, b parameters; this is called single truncation, as opposed to the previous double truncation. In this case the
positive support section of the previous definition changes to $f(x \mid X > a) = \frac{f^1(x)}{1 - F^1(a)}$
for x > a, or $f(x \mid X \le b) = \frac{f^1(x)}{F^1(b)}$ for x ≤ b.
One of the most significant differences of truncated distributions is precisely its
definition by means of a parent distribution. The parameters of a truncated distribution are the same as those in the parent’s distribution (besides the truncation
parameters) but we lack the information of how the truncated distribution behaves
outside the restricted support (as our sample and density is contained in the [a, b]
interval). This may add difficulty to our calculations and cause problems to appear
that were simplified in the non-truncated case. Resulting distributions may not
be symmetrical nor have some maxima or minima that the distribution presented
outside the defined support (among many other possibilities). The modifications
over the support of the distribution affect distribution descriptors such as the expectation (integration is now between the truncation parameters a, b, as in F(x)),
the calculation of moments, and parameter estimation (where samples come only from inside the truncation interval). In the latter case, parameter estimation techniques
risk producing either biased estimators or insufficiently good results,
except for simplified cases of truncation. For example, a truncation that preserves
symmetry in a mean centered distribution that is symmetrical w.r.t. the mean will still yield an unbiased estimator of that parameter.
Truncated distributions are especially interesting for considering, modeling or simulating
problems where we have reliable knowledge about existing boundaries of the random
variable, boundaries which can themselves be of interest to estimate. It needs to be noted that leaving
values outside the truncation interval implies a strong commitment: under
our model those possibilities “cannot exist” or “cannot occur”. Therefore, situations
that expose us to this risk (for example, deciding whether or not to
reestimate the truncation parameters when new data arrive after an initial estimation)
must be handled with care. We will see that in a sample dependent parameter
estimation scenario with a sufficient number of samples, our worries about this may
notably decrease (in direct relation to the number of samples available), as the
truncated case can be considered a generalization of the non-truncated case: if
we use it for estimation in a non-truncated scenario, our estimated interval will tend
to occupy the whole circle. This also suggests that, when choosing between
the truncated and the non-truncated case for a given problem, and leaving aside contextual
considerations that may influence this decision, there is a
trade-off between mathematical tractability, which also depends on
the particular distribution, and generality, as choosing the truncated case covers
both truncated and non-truncated scenarios.
3.2 Definition
The truncated von Mises distribution in a 2π-length interval is defined as:
$$f_{tvM}(\theta; \mu, \kappa, a, b) = \begin{cases} \dfrac{e^{\kappa \cos(\theta-\mu)}}{N\,T} & \text{if } a \le \theta < b \\ 0 & \text{otherwise} \end{cases} \tag{3.1}$$
where:
1. $N = \int_0^{2\pi} e^{\kappa \cos(\theta-\mu)}\, d\theta = 2\pi I_0(\kappa)$ is the normalization term found in $f_{vM}$, in
Equation (2.2).
2. $T = \int_a^b \frac{e^{\kappa \cos(\theta-\mu)}}{2\pi I_0(\kappa)}\, d\theta$ is the redefinition term. It transforms the previous normalization term, which accounts for the function’s positive support over the whole
interval, into one for the restricted interval [a, b].
3. a, b ∈ [i, i + 2π] such that a ≤ b and i ∈ R are the truncation parameters that
define the positive support section of the function and regulate the output in
its definition. This externalization of the influence of the truncation parameters will need further addressing when computing the maximum likelihood
estimation, as when we vary some of the truncation parameters, the resulting
function changes in both shape and positive support.
4. µ, κ are the same as in the non-truncated case.
After computing both N, T terms we can observe that the resulting expression
on its positive support is
$$f_{tvM}(\theta; \mu, \kappa, a, b) = \frac{e^{\kappa \cos(\theta-\mu)}}{\int_a^b e^{\kappa \cos(\theta-\mu)}\, d\theta} \quad \text{if } a \le \theta \le b \tag{3.2}$$
This redefinition of the normalization constant shows more clearly the situation
of the probability distribution and is a consequence of satisfying the properties of a
probability density function, as we now have
$$\int_a^b f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta = 1$$
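A minimal sketch of Equation (3.2) in Python follows. The normalization integral is approximated with a composite Simpson rule, which is an assumption made for illustration (no closed form is used here); the names `unnormalized_vm`, `simpson` and `make_truncated_vm_pdf` are hypothetical.

```python
import math


def unnormalized_vm(theta, mu, kappa):
    """Unnormalized von Mises function, Equation (2.3)."""
    return math.exp(kappa * math.cos(theta - mu))


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps is rounded up to an even number."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def make_truncated_vm_pdf(mu, kappa, a, b):
    """Return the density of Equation (3.2) as a function of theta."""
    norm = simpson(lambda t: unnormalized_vm(t, mu, kappa), a, b)

    def pdf(theta):
        if a <= theta <= b:
            return unnormalized_vm(theta, mu, kappa) / norm
        return 0.0  # outside the truncation interval

    return pdf


if __name__ == "__main__":
    pdf = make_truncated_vm_pdf(mu=1.0, kappa=2.0, a=0.5, b=2.5)
    print(simpson(pdf, 0.5, 2.5))  # close to 1
    print(pdf(3.0))                # 0.0, outside [a, b]
```

Precomputing the normalization constant once, in a closure, avoids repeating the quadrature at every density evaluation.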
3.3 Properties
Within its positive support, we can notice some variations between the original von
Mises and the truncated von Mises distributions:
1. ∃a, b, µ ∈ [i, i + 2π] such that $f_{tvM}(\theta; \mu, \kappa, a, b)$ is, on its positive support, a strictly decreasing function,
or strictly increasing, or increases
and decreases reaching a single maximum, or increases and decreases reaching
a single minimum, or increases and decreases with both a single maximum and
a single minimum (see Figure 3.1).
Here we put under observation the different shapes we can create just by
manipulating the truncation parameters under a fixed µ and a previously defined support interval (if κ > 0, κ does not affect these considerations).
(a) If $f_{tvM}(\theta; \mu, \kappa, a, b)$ is strictly decreasing, then the truncation parameters a, b
satisfy a, b ∈ [µ, µ + π] or a, b ∈ [µ − 2π, µ − π]. It is trivial to notice
that in the interval [µ, µ + π] of a von Mises distribution, it decreases
from the maximum to the minimum value of the distribution. However,
it is also possible to define a support interval that includes values from
the parent distribution in the interval [µ − 2π, µ − π], where the same
decreasing behavior of the periodic function takes place. If the truncation
parameters belong entirely to one of those intervals, the resulting truncated von
Mises distribution presents a monotonic decreasing behavior.
(b) Analogously, if $f_{tvM}(\theta; \mu, \kappa, a, b)$ is strictly increasing, the truncation parameters a, b satisfy a, b ∈ [µ − π, µ] or a, b ∈ [µ + π, µ + 2π].
(c) If $f_{tvM}(\theta; \mu, \kappa, a, b)$ increases and decreases reaching a single maximum, then
the truncation parameters a, b satisfy µ ∈ (a, b) and µ + π, µ − π ∉ [a, b].
(d) If $f_{tvM}(\theta; \mu, \kappa, a, b)$ increases and decreases reaching a single minimum, then
the truncation parameters a, b satisfy µ + π ∈ (a, b) and µ, µ + 2π ∉ [a, b].
Symmetrically, we can also consider µ − π ∈ (a, b) and µ, µ − 2π ∉ [a, b].
(e) If $f_{tvM}(\theta; \mu, \kappa, a, b)$ increases and decreases with both a single maximum and
a single minimum, the truncation parameters a, b satisfy either µ, µ + π ∈ [a, b]
or µ, µ − π ∈ [a, b].
Figure 3.1: Several truncated von Mises distributions with varying parameters that include all cases. Symmetrical truncation w.r.t. the mean (red),
strictly increasing function (blue), strictly decreasing function (green), symmetrical antimode truncation (black), maximum and minimum included truncation (yellow).
2. Given fvM (θ; µ, κ) and ftvM (θ; µ, κ, a, b) such that b − a < 2π then fvM (θ) <
ftvM (θ), ∀θ ∈ [a, b].
This result can be seen intuitively: the density that is cut off by the
truncation is “absorbed” by the remaining density inside the truncation limits by means of the normalization factor, which is now lower in value (that
of the truncated distribution). It can be restated in the [0, 2π] interval as:
$$\int_0^{2\pi} e^{\kappa \cos(\theta-\mu)}\, d\theta > \int_a^b e^{\kappa \cos(\theta-\mu)}\, d\theta \quad \text{when } b - a < 2\pi$$
3. Given c, d such that c ≤ d, c, d ∈ [i, i + 2π] and [a, b] ⊆ [c, d], then
$\int_c^d f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta = \int_a^b f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta$ for a, b ∈ [i, i + 2π] (see Figure 3.2).
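$$\begin{aligned}
\int_c^d f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta &= \int_c^a f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta + \int_a^b f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta + \int_b^d f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta \\
&= \int_a^b f_{tvM}(\theta; \mu, \kappa, a, b)\, d\theta
\end{aligned}$$
as the truncated von Mises in Equation (3.1) behaves, outside the interval
selected by the truncation parameters, as the constant zero function.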
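Properties 2 and 3 can be illustrated numerically. In the sketch below (hypothetical names, Simpson quadrature as an assumed integration method), `density_ratio` returns the constant factor by which the truncated density exceeds the parent density inside [a, b], and `truncated_vm_cdf` shows that no mass accumulates outside the truncation limits.

```python
import math


def unnormalized_vm(theta, mu, kappa):
    """Unnormalized von Mises function, Equation (2.3)."""
    return math.exp(kappa * math.cos(theta - mu))


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def density_ratio(mu, kappa, a, b):
    """f_tvM / f_vM on [a, b]: the full-circle integral over the truncated one."""
    f = lambda t: unnormalized_vm(t, mu, kappa)
    return simpson(f, 0.0, 2.0 * math.pi) / simpson(f, a, b)


def truncated_vm_cdf(x, mu, kappa, a, b):
    """Distribution function of the truncated von Mises; flat outside [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    f = lambda t: unnormalized_vm(t, mu, kappa)
    return simpson(f, a, x) / simpson(f, a, b)


if __name__ == "__main__":
    print(density_ratio(1.0, 2.0, 0.5, 2.5))          # greater than 1 (property 2)
    print(truncated_vm_cdf(3.0, 1.0, 2.0, 0.5, 2.5))  # 1.0, no growth past b (property 3)
```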
Figure 3.2: The distribution functions of all the truncated distributions described in
Figure 3.1. Notice how the functions do not increase outside the truncation limits.
3.4 Bessel functions
Bessel functions provide important results in many fields and have appeared historically since the middle of the 18th century in physics problems and as solutions in the domain of differential equations. They are named after Friedrich Bessel
(1784-1846), who showed them to be the canonical solutions of Bessel’s differential
equation, but their discovery is originally attributed to Daniel Bernoulli (1700-1782).
A couple of examples of famous appearances of the Bessel functions are Leonhard
Euler’s (1707-1783) use of Bessel functions of integer order in 1764 in
the analysis of a stretched membrane problem and, around 1817, Bessel’s study of the problem of
determining the motion of three bodies moving under mutual gravitational attraction.
Bessel functions arise as solutions of the following second order differential equation:
$$x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - v^2)\, y = 0$$
(with v ∈ R), which is known as Bessel’s equation. The solutions are of the form
$$y = A J_v(x) + B Y_v(x),$$
where A, B are unspecified constants, $J_v(x)$ is the Bessel function of the first kind
and order v and $Y_v(x)$ denotes the Bessel function of the second kind and order v.
If subsequently in Bessel’s equation we replace x by ix, we obtain as solutions:
$$y = C I_v(x) + D K_v(x), \qquad x > 0$$
with again C, D unspecified constants and Iv (x), Kv (x) the modified Bessel functions of order v and first and second kind respectively. Bessel functions have been
subject to intense attention and an extensive collection of results is available through
different publications (Rosenheinrich (2013), Abramowitz and Stegun (1964), Gradshteyn and Ryzhik (2007)).
In this work we will only operate with the modified Bessel functions of integer
order n ∈ Z and first kind, denoted In (x). The modified Bessel function of the first
kind and order 0 (see Figure 3.3), that appears in the definition of the von Mises
density function is defined as:
$$I_0(x) = \sum_{m=0}^{\infty} \frac{x^{2m}}{2^{2m} (m!)^2} \tag{3.3}$$
Figure 3.3: I0 (x) evaluated in the interval [0, 2π].
3.4.1 Some results on the modified Bessel functions of the first kind
We look at and account for some results involving the modified Bessel functions of the
first kind and integer order (which we will refer to as MBFFK) that were found to
be especially relevant to the development of the subsequent work. Every
result presented, unless otherwise stated, can be found in Rosenheinrich (2013), Abramowitz
and Stegun (1964) and Gradshteyn and Ryzhik (2007).
These functions are defined as
$$I_{|n|}(x) = \frac{1}{2\pi} \int_0^{2\pi} e^{x \cos\theta} \cos(n\theta)\, d\theta. \tag{3.4}$$
Here we can observe a more general relationship of modified Bessel functions of
first kind with a type of integrals that comprises that of the von Mises distribution.
When n = 0 it particularizes for the exact von Mises function normalization factor.
The general expression for MBFFK is given by:
$$I_n(x) = \sum_{m=0}^{\infty} \frac{x^{2m+n}}{2^{2m+n}\, m!\, (m+n)!}.$$
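This result allows us to see all definitions of MBFFK of different integer orders
as well as the relationships that exist between their expressions:
$$\frac{\partial I_0(x)}{\partial x} = I_1(x) = I_{-1}(x) \tag{3.5}$$
And in general:
$$\frac{\partial I_n(x)}{\partial x} = \frac{n}{x} I_n(x) + I_{n+1}(x) \tag{3.6}$$
$$\frac{\partial I_n(x)}{\partial x} = I_{n-1}(x) - \frac{n}{x} I_n(x) \tag{3.7}$$
$$\frac{\partial I_n(x)}{\partial x} = \frac{1}{2}\left[I_{n-1}(x) + I_{n+1}(x)\right] \tag{3.8}$$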
This allows us to explain the results of the differentiation operation in terms of
combinations of MBFFK of different orders.
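The series definition, the integral representation (3.4) and the derivative relations (3.6) to (3.8) can be cross-checked numerically. The following Python sketch uses hypothetical helper names, a midpoint rule for the integral and a central finite difference for the derivative; all of these are illustrative choices rather than part of the cited results.

```python
import math


def bessel_i(n, x, tol=1e-16, max_terms=1000):
    """I_n(x) for integer n via the power series; uses I_{-n} = I_n."""
    n = abs(n)
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = term
    for m in range(max_terms):
        term *= half * half / ((m + 1) * (m + n + 1))
        total += term
        if term <= tol * total:
            break
    return total


def bessel_integral(n, x, steps=2000):
    """Right-hand side of Equation (3.4), evaluated with the midpoint rule."""
    h = 2.0 * math.pi / steps
    acc = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        acc += math.exp(x * math.cos(theta)) * math.cos(n * theta)
    return acc * h / (2.0 * math.pi)


def bessel_derivative_fd(n, x, h=1e-5):
    """Central finite-difference approximation of dI_n/dx."""
    return (bessel_i(n, x + h) - bessel_i(n, x - h)) / (2.0 * h)


if __name__ == "__main__":
    n, x = 2, 1.7
    print(bessel_i(n, x), bessel_integral(n, x))
    print(bessel_derivative_fd(n, x))
    print(n / x * bessel_i(n, x) + bessel_i(n + 1, x))   # (3.6)
    print(bessel_i(n - 1, x) - n / x * bessel_i(n, x))   # (3.7)
    print(0.5 * (bessel_i(n - 1, x) + bessel_i(n + 1, x)))  # (3.8)
```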
An original result on Bessel functions was obtained (to the best of our knowledge) when conducting this work, by considering their expression as an infinite series.
We can observe that
$$I_n(x) = \sum_{m=0}^{\infty} \frac{x^m}{m!}\, \frac{1}{2^m}\, \frac{x^{m+n}}{(m+n)!}\, \frac{1}{2^{m+n}}.$$
The expression of the Bessel function is composed of a product of terms that
individually (if the $\sum$ operator were applied separately to each term) have
finite expressions if n in $I_n(x)$ is finite. Concretely:
$$\sum_{m=0}^{\infty} \frac{x^m}{m!} = e^x \tag{3.9}$$
$$\sum_{m=0}^{\infty} \frac{x^{m+n}}{(m+n)!} = e^x - \sum_{i=0}^{n-1} \frac{x^i}{i!} \tag{3.10}$$
$$\sum_{m=0}^{\infty} \frac{1}{2^m} = 2 \tag{3.11}$$
$$\sum_{m=0}^{\infty} \frac{1}{2^{m+n}} = \frac{1}{2^{n-1}} \tag{3.12}$$
Expression (3.9) is the power series expansion of $e^x$. (3.10) can be expressed using
(3.9) and a remaining finite sum that depends on the order of the MBFFK. (3.11)
is the sum of the geometric progression of ratio $\frac{1}{2}$, and (3.12) is the same sum scaled by $\frac{1}{2^n}$.
We can state
Theorem 3.4.1. ∃ $S_n(x)$ such that $S_n(x) > I_n(x)$ ∀x ∈ R⁺ and ∀n ∈ Z as:
$$S_n(x) = e^x\, \frac{1}{2^{n-2}} \left( e^x - \sum_{i=0}^{n-1} \frac{x^i}{i!} \right)$$
Proof. If we now consider $A = (1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \cdots)$ as the sequence of all terms that appear in the infinite sum of (3.11), $B = (1, x, \frac{x^2}{2}, \frac{x^3}{6}, \cdots)$ as a similar sequence for
(3.9) and $B_n = \left(\frac{x^n}{n!}, \frac{x^{n+1}}{(n+1)!}, \frac{x^{n+2}}{(n+2)!}, \frac{x^{n+3}}{(n+3)!}, \cdots\right)$ as a similar sequence for (3.10), we can
realize that the difference between the functions $S_n(x)$ and $I_n(x)$ lies in the arrangement of those sequences to yield a final expression composed by product operators.
We can then see that $S_n(x)$ contains the elements of the previous sequences under
product operators and can be rewritten as:
$$S_n(x) = \left( \sum_{i=1}^{\infty} a_i \right) \left( \frac{1}{2^n} \sum_{i=1}^{\infty} a_i \right) \left( \sum_{i=1}^{\infty} b_i \right) \left( \sum_{i=1}^{\infty} b_{n_i} \right)$$
where $a_i$, $b_i$, $b_{n_i}$ are the general elements belonging to the sequences A, B, $B_n$ respectively. Similarly we can rewrite the expression of $I_n(x)$ as
$$I_n(x) = \sum_{i=1}^{\infty} \left( a_i\, \frac{1}{2^n}\, a_i\, b_i\, b_{n_i} \right)$$
and alternatively as
$$I_n(x) = Pwp\left(A, \frac{1}{2^n} A, B, B_n\right)$$
where Pwp(·, ·, ·, ·) would be a point-wise quaternary product operator applied to
the previous sequences (notice that an operator similar to Pwp(·, ·, ·, ·) but in binary
form is commonly known as the matrix product operator). Thus it suffices to prove
that, in a scenario composed solely of positive sequences, the arrangement of
the sequences for $S_n(x)$ produces results with greater value than the arrangement
of the sequences for $I_n(x)$.
Therefore for general sequences X = {x_1, x_2, ...}, Y = {y_1, y_2, ...}, Z = {z_1, z_2, ...}
and K = {k_1, k_2, ...} such that ∀ x_i, y_i, z_i, k_i > 0 we have to prove that
$$\sum_{i=1}^{\infty} x_i y_i z_i k_i < \left( \sum_{i=1}^{\infty} x_i \right) \left( \sum_{i=1}^{\infty} y_i \right) \left( \sum_{i=1}^{\infty} z_i \right) \left( \sum_{i=1}^{\infty} k_i \right)$$
Now if we define D = {(i, j, l, r) such that i = j = l = r, i, j, l, r ∈ N} and $D^C$ its
complement over i, j, l, r ∈ N we have
$$\begin{aligned}
x_1 y_1 z_1 k_1 + \cdots &< x_1 y_1 z_1 k_1 + x_1 y_1 z_1 k_2 + \cdots + x_1 y_1 z_2 k_1 + \cdots + x_1 y_2 z_1 k_1 + \cdots \\
\sum_{D} x_i y_j z_l k_r &< \sum_{D} x_i y_j z_l k_r + \sum_{D^C} x_i y_j z_l k_r \\
0 &< \sum_{D^C} x_i y_j z_l k_r
\end{aligned}$$
And given that all x_i, y_i, z_i, k_i > 0 and $D^C \neq \emptyset$, the last inequality holds.
If we now particularize the order of $S_n(x)$ we get the following results:
$$I_0(x) < 4e^{2x} \quad \forall x \in \mathbb{R}^+$$
$$I_1(x) < 2e^x (e^x - 1) \quad \forall x \in \mathbb{R}^+$$
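The bound of Theorem 3.4.1 can be spot-checked numerically. The sketch below evaluates $S_n(x)$ and the series for $I_n(x)$ for a few non-negative orders and positive arguments; restricting to n ≥ 0 and to the sampled x values is an assumption of this illustration, and the function names are hypothetical. A finite set of checks does not replace the proof.

```python
import math


def bessel_i(n, x, tol=1e-16, max_terms=1000):
    """I_n(x) for integer n >= 0 via its power series."""
    half = x / 2.0
    term = half ** n / math.factorial(n)
    total = term
    for m in range(max_terms):
        term *= half * half / ((m + 1) * (m + n + 1))
        total += term
        if term <= tol * total:
            break
    return total


def s_bound(n, x):
    """S_n(x) = e^x * (e^x - sum_{i<n} x^i / i!) / 2^(n-2), for n >= 0."""
    partial = sum(x ** i / math.factorial(i) for i in range(n))
    return math.exp(x) * (math.exp(x) - partial) / 2.0 ** (n - 2)


if __name__ == "__main__":
    for n in range(3):
        for x in (0.1, 1.0, 5.0):
            print(n, x, bessel_i(n, x), s_bound(n, x))
```

For n = 0 and n = 1 the function `s_bound` reduces to the two particular bounds stated above, $4e^{2x}$ and $2e^x(e^x - 1)$.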
3.4.2 Calculating the indefinite integral of the unnormalized von Mises function by means of its power series expansion
The indefinite integral of the unnormalized von Mises function is expressed as:
$$I(x; \mu, \kappa) = \int e^{\kappa \cos(x-\mu)}\, dx$$
and in this work an expression for µ = 0, taking $w = \lfloor \frac{n}{2} \rfloor + (n \bmod 2) - 1$, was
calculated as:
$$\int e^{\kappa \cos(x)}\, dx = \sum_{n=0}^{\infty} \frac{\kappa^n}{n!} \left( \sin(x) \left( \sum_{i=0}^{w} \cos^{n-2i-1}(x) \prod_{j=0}^{2i} (n-j)^{-(-1)^j} \right) + \frac{((-1)^n + 1)}{2} \prod_{i=0}^{w} (n-i)^{-(-1)^i}\, x \right) = I(x; 0, \kappa)$$
which can be trivially modified to include a non specified µ as:
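Lemma 3.4.1.
$$I(x; \mu, \kappa) = x + \sum_{n=1}^{\infty} \frac{\kappa^n}{n!} \left( \sin(x-\mu) \left( \sum_{i=0}^{w} \cos^{n-2i-1}(x-\mu) \prod_{j=0}^{2i} (n-j)^{-(-1)^j} \right) + \frac{((-1)^n + 1)}{2} \prod_{i=0}^{w} (n-i)^{-(-1)^i}\, (x-\mu) \right) \tag{3.13}$$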
Proof. It has been discussed that the function $e^x$ has the power series expansion:
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
This series can be used to re-express the indefinite integral of the unnormalized von Mises distribution in Equation (2.3) as:
$$I(x; \mu, \kappa) = \int f_{uvM}(x; \mu, \kappa)\, dx = \int e^{\kappa \cos(x-\mu)}\, dx = \int \sum_{n=0}^{\infty} \frac{(\kappa \cos(x-\mu))^n}{n!}\, dx.$$
In order to further operate with this expression, we seek to classify the function $f(n, x) = \frac{(\kappa \cos(x-\mu))^n}{n!}$ as a candidate for the application of the Fubini-Tonelli theorem
(given that summation can be considered as a particular discrete case of integration). The Fubini-Tonelli theorem gathers conditions under which a double integral (or
in our case, an integral and a summation operator) can be resolved iteratively and
commutatively w.r.t. the integrals, allowing us to pick the convenient order used to
determine the solution.
Sufficient conditions for applying the Fubini-Tonelli theorem in our case are:
1. f(n, x) > 0 ∀n, x ∈ R
2. $\int \sum |f(n, x)|\, dx < \infty$ or $\sum \int |f(n, x)|\, dx < \infty$
f (n, x) does not satisfy condition 1 as for a proper µ, x such that cos(x − µ) < 0
and an n such that n = 2m + 1, m ∈ N we have f (n, x) < 0.
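In the case of the second condition we have:
$$\int \sum_{n=0}^{\infty} |f(n, x)|\, dx = \int \sum_{n=0}^{\infty} \left| \frac{(\kappa \cos(x-\mu))^n}{n!} \right| dx = \int \sum_{n=0}^{\infty} \frac{|(\kappa \cos(x-\mu))^n|}{n!}\, dx$$
Noticing that ∀x such that cos(x − µ) ≥ 0 we have |f(n, x)| = f(n, x), so that the integrand sums to $e^{\kappa \cos(x-\mu)}$,
and that ∀x such that cos(x − µ) < 0 we can write
$$\sum_{n=0}^{\infty} \frac{|(\kappa \cos(x-\mu))^n|}{n!} = \sum_{n=0}^{\infty} \frac{(-\kappa \cos(x-\mu))^n}{n!} = e^{-\kappa \cos(x-\mu)}$$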
allows us to properly account for the absolute value modification in the power series
expansion as:
$$\int \sum_{n=0}^{\infty} \frac{|(\kappa \cos(x-\mu))^n|}{n!}\, dx = \int e^{|\kappa \cos(x-\mu)|}\, dx$$
where $f(x) = e^{|\kappa \cos(x-\mu)|} \in [1, e^{\kappa}]$ if κ is finite.
Now if we assume finite integration limits a, b it suffices to prove that
$$\int_a^b e^{|\kappa \cos(x-\mu)|}\, dx < \infty$$
which follows easily from the fact that the integrand is a bounded, strictly positive periodic
function. This finally allows us to say that the function satisfies the second condition
and is suitable for the application of the Fubini-Tonelli theorem.
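Subsequently we follow with the procedure for the indefinite integral as:
$$\begin{aligned}
I(x; \mu, \kappa) &= \int \sum_{n=0}^{\infty} \frac{(\kappa \cos(x-\mu))^n}{n!}\, dx \\
&= \sum_{n=0}^{\infty} \int \frac{(\kappa \cos(x-\mu))^n}{n!}\, dx \\
&= \sum_{n=0}^{\infty} \frac{\kappa^n}{n!} \int (\cos(x-\mu))^n\, dx
\end{aligned}$$
The integral presented above is defined in a recursive way as
$$\int \cos^n(x)\, dx = \frac{\sin(x) \cos^{n-1}(x)}{n} + \frac{n-1}{n} \int \cos^{n-2}(x)\, dx$$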
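The recursive reduction formula can be written directly as a short Python function. This is a minimal sketch with hypothetical names: the base cases n = 0 and n = 1 (antiderivatives x and sin(x), with the integration constant taken as zero) are standard, and the Simpson quadrature is used only as an independent check.

```python
import math


def cos_power_antiderivative(n, x):
    """Antiderivative of cos^n(x), integration constant 0, via the reduction formula."""
    if n == 0:
        return x
    if n == 1:
        return math.sin(x)
    boundary = math.sin(x) * math.cos(x) ** (n - 1) / n
    return boundary + (n - 1) / n * cos_power_antiderivative(n - 2, x)


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    a, b = 0.3, 2.1
    for n in range(6):
        exact = cos_power_antiderivative(n, b) - cos_power_antiderivative(n, a)
        approx = simpson(lambda t: math.cos(t) ** n, a, b)
        print(n, exact, approx)
```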
It can be calculated by integration by parts. In this work,
however, a non-recursive expression was obtained as:
$$\int \cos^n(x)\, dx = \sin(x) \left( \sum_{i=0}^{\lfloor \frac{n}{2} \rfloor + (n \bmod 2) - 1} \cos^{n-2i-1}(x)\, \frac{\prod_{j=0}^{2i} (n-j)}{\prod_{j=0}^{i} (n-2j)^2} \right)$$
∀n such that n = 2m + 1
with m ∈ N. If we observe the numerical regularities that appear when “unfolding” the recursive expression:
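$$\begin{aligned}
\int \cos^n(x)\, dx &= \frac{\sin(x) \cos^{n-1}(x)}{n} + \frac{n-1}{n} \int \cos^{n-2}(x)\, dx \\
&= \frac{\sin(x) \cos^{n-1}(x)}{n} + \frac{n-1}{n} \left( \frac{\sin(x) \cos^{n-3}(x)}{n-2} + \frac{n-3}{n-2} \int \cos^{n-4}(x)\, dx \right) \\
&= \frac{1}{n} \sin(x) \cos^{n-1}(x) + \frac{n-1}{n(n-2)} \sin(x) \cos^{n-3}(x) \\
&\quad + \frac{(n-1)(n-3)}{n(n-2)(n-4)} \sin(x) \cos^{n-5}(x) + \frac{(n-1)(n-3)(n-5)}{n(n-2)(n-4)} \int \cos^{n-6}(x)\, dx
\end{aligned}$$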
We can account for them, coupled with the odd n restriction, with:
$$\int \cos^n(x)\, dx = \sin(x) \left( \sum_{i=0}^{\lfloor \frac{n}{2} \rfloor + (n \bmod 2) - 1} \cos^{n-2i-1}(x)\, \frac{\prod_{j=0}^{2i} (n-j)}{\prod_{j=0}^{i} (n-2j)^2} \right)$$
However, while this first expression does suffice for odd n, an extra term appears
for the even case, as we reach a point where the term $\int \cos^0(x)\, dx$ is computed. This
can properly be reflected by adding an addend that takes into account the parity of
n. In our case, it has the form:
$$g(n, x) = \frac{(-1)^n h(x) + h(x)}{2} = \frac{((-1)^n + 1)\, h(x)}{2}$$
where ∀n ∈ Z such that n = 2m with m ∈ Z, g(2m, x) = h(x), and g(n, x) = 0 otherwise.
In a shorter notation and adding the parity term, the considered expression
becomes:
$$\int \cos^n(x)\, dx = \sin(x) \left( \sum_{i=0}^{\lfloor \frac{n}{2} \rfloor + (n \bmod 2) - 1} \cos^{n-2i-1}(x) \prod_{j=0}^{2i} (n-j)^{-(-1)^j} \right) + \frac{((-1)^n + 1)}{2} \prod_{i=0}^{\lfloor \frac{n}{2} \rfloor + (n \bmod 2) - 1} (n-i)^{-(-1)^i}\, x$$
Thus, merging all the factors we obtain the final expression for $\int e^{\kappa \cos(x)}\, dx$, which
trivially leads to the final expression for $\int e^{\kappa \cos(x-\mu)}\, dx$.
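The term-by-term integration justified above can be sketched numerically. The code below does not implement the closed-form expression (3.13); as a stated simplification it applies the recursive reduction formula to the definite integrals of cos^n and sums the series truncated at an assumed `n_max`. All function names are hypothetical.

```python
import math


def bessel_i0(kappa, tol=1e-16, max_terms=500):
    """I_0(kappa) via its power series, used here only for comparison."""
    term, total = 1.0, 1.0
    for m in range(1, max_terms):
        term *= (kappa / 2.0) ** 2 / (m * m)
        total += term
        if term <= tol * total:
            break
    return total


def cos_power_integrals(n_max, lo, hi):
    """Definite integrals of cos^n over [lo, hi], n = 0..n_max, by the reduction formula."""
    vals = [hi - lo, math.sin(hi) - math.sin(lo)]
    for n in range(2, n_max + 1):
        boundary = (math.sin(hi) * math.cos(hi) ** (n - 1)
                    - math.sin(lo) * math.cos(lo) ** (n - 1)) / n
        vals.append(boundary + (n - 1) / n * vals[n - 2])
    return vals[: n_max + 1]


def vm_integral_series(a, b, mu, kappa, n_max=80):
    """Integral of exp(kappa cos(x - mu)) over [a, b] as sum_n kappa^n/n! * int cos^n."""
    c = cos_power_integrals(n_max, a - mu, b - mu)
    total = 0.0
    coeff = 1.0  # kappa^n / n!, starting at n = 0
    for n in range(n_max + 1):
        total += coeff * c[n]
        coeff *= kappa / (n + 1)
    return total


if __name__ == "__main__":
    kappa = 2.0
    print(vm_integral_series(0.0, math.pi, 0.0, kappa), math.pi * bessel_i0(kappa))
    print(vm_integral_series(0.4, 0.4 + 2.0 * math.pi, 1.3, kappa),
          2.0 * math.pi * bessel_i0(kappa))
```

For moderate κ the truncated series agrees with πI₀(κ) on [0, π] with µ = 0 and with 2πI₀(κ) on any full period, which is the behavior the following lemmas establish analytically.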
In order to observe and show the correctness of the reached expression, a simple
procedure could be particularizing (3.13) when the limits of the integral are 0, π,
expecting that the expression corresponds or is equivalent to the definition of the
modified Bessel function of order 0 when µ = 0, as it needs to be.
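Lemma 3.4.2. $\int_0^{\pi} e^{\kappa \cos(x)}\, dx = [I(x; 0, \kappa)]_0^{\pi} = \pi I_0(\kappa)$
Proof.
$$\begin{aligned}
\big[I(x; 0, \kappa)\big]_0^{\pi} = {} & \pi + \kappa \sin(\pi) + \frac{\kappa^2}{4} \pi + \frac{\kappa^2}{4} \cos(\pi) \sin(\pi) + \frac{\kappa^3}{18} (\cos^2(\pi) \sin(\pi)) + \frac{\kappa^4}{64} \pi + \cdots \\
& - 0 - \kappa \sin(0) - \frac{\kappa^2}{4} \cdot 0 - \frac{\kappa^2}{4} \cos(0) \sin(0) - \cdots
\end{aligned}$$
All terms that contain a sin(·) vanish, as does all of the second half of the
expression under the minus sign. Regrouping the terms results in:
$$\big[I(x; 0, \kappa)\big]_0^{\pi} = \pi I_0(\kappa) = \int_0^{\pi} e^{\kappa \cos(x)}\, dx.$$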
If we were to consider the whole 2π interval without any restriction on the mean we
would have
Lemma 3.4.3. $\int_0^{2\pi} e^{\kappa \cos(x-\mu)}\, dx = [I(x; \mu, \kappa)]_0^{2\pi} = 2\pi I_0(\kappa)$.
Proof. All terms involving the trigonometric functions cancel in the difference
between the integration limits, since sin(2π − µ) = sin(−µ) and cos(2π − µ) =
cos(−µ), and evaluating at the integration limits we obtain:
$$\big[I(x; \mu, \kappa)\big]_0^{2\pi} = 2\pi \sum_{n=0}^{\infty} \left( \frac{\kappa^n}{2^n\, n!} \right)^2 + 0$$
$$\big[I(x; \mu, \kappa)\big]_0^{2\pi} = 2\pi I_0(\kappa).$$
This result is extended, by using the properties of the periodic values in the
trigonometric functions, to any values a, b as integration coefficients such that |b −
a| = 2π.
Lemma 3.4.4. If we define $[G(x; \mu, \kappa)]_a^b$ as a function that comprises all trigonometric terms in the indefinite integral expression, we can write the general case
of the definite integral as
$$[I(x; \mu, \kappa)]_a^b = (b - a) I_0(\kappa) + [G(x; \mu, \kappa)]_a^b.$$
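Proof. In the general case we obtain:
$$\begin{aligned}
\big[I(x; \mu, \kappa)\big]_a^b = {} & b + \kappa \sin(b-\mu) + \frac{\kappa^2}{4} \cos(b-\mu) \sin(b-\mu) + \frac{\kappa^2}{4} b - \frac{\kappa^2}{4} \mu \\
& + \frac{\kappa^3}{18} \left(\cos^2(b-\mu) \sin(b-\mu) + 2 \sin(b-\mu)\right) + \frac{\kappa^4}{96} \cdots \\
& - a - \kappa \sin(a-\mu) - \frac{\kappa^2}{4} \cos(a-\mu) \sin(a-\mu) - \frac{\kappa^2}{4} a + \frac{\kappa^2}{4} \mu \\
& - \frac{\kappa^3}{18} \left(\cos^2(a-\mu) \sin(a-\mu) + 2 \sin(a-\mu)\right) - \frac{\kappa^4}{96} \cdots
\end{aligned}$$
It can be observed how the non trigonometric terms in the sum, resulting from
the even power resolutions of the cosine integral, can be regrouped in terms of the
modified Bessel function of order 0.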
This expression accounts for the appearance of the modified Bessel function of order 0 in the general case. For integration limits that allow direct simplifications
between the trigonometric terms involved, simpler forms of both the indefinite
integral and the general definite integral can be obtained directly.
The former expression can be treated further; another way to
organize the terms of the indefinite integral was found to be:
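\begin{align*}
I(x; \mu, \kappa) ={}& \mu + (x-\mu) I_0(\kappa) + (I_0(\kappa) - 1)\cos(x-\mu)\sin(x-\mu) + \sin(x-\mu)\sum_{r=0}^{\infty}\frac{\kappa^{2r+1}}{((2r+1)!!)^2} \\
&+ \sum_{i=2}^{\infty}\left( \frac{2i-3}{2i-2}\sum_{j=i}^{\infty}\frac{\kappa^{2j-1}}{((2j+1)!!)^2}\cos^{2j-2}(x-\mu)\sin(x-\mu)
 + \frac{2i-2}{2i-1}\sum_{j=i}^{\infty}\frac{\kappa^{2j}}{(n!)^2\, 2^{n}}\cos^{2j-1}(x-\mu)\sin(x-\mu) \right)
\end{align*}
which in particular regularizes the progression followed by the trigonometric terms.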
The former expression involves the double factorial operator, defined as
\[
a!! = \prod_{i=0}^{\lfloor a/2 \rfloor + (a \bmod 2) - 1} (a - 2i),
\]
that is, a factorial-like operator that, for an even input, returns the product of
all positive even numbers less than or equal to it and, for an odd input, the
product of all positive odd numbers less than or equal to it.
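As an illustration, the product formula for the double factorial can be transcribed directly. This is a small sketch; the function name `double_factorial` is a hypothetical choice, and the convention 0!! = 1 follows from the empty product.

```python
def double_factorial(a):
    """a!! as the product of (a - 2i) for i = 0 .. floor(a/2) + (a mod 2) - 1."""
    if a < 0:
        raise ValueError("defined here for non-negative integers only")
    upper = a // 2 + a % 2 - 1  # upper index of the product
    result = 1
    for i in range(upper + 1):
        result *= a - 2 * i
    return result

print([double_factorial(a) for a in range(8)])
```

For a = 5 the product runs over 5, 3, 1 and for a = 6 over 6, 4, 2, which matches the verbal description of the operator.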
3.5 Maximum likelihood estimation of the parameters
In this section we seek to determine the maximum likelihood estimator (MLE) of
each of the parameters of the truncated von Mises distribution.
For the truncated von Mises distribution (3.2) we obtain:
\begin{align*}
\ln L(\mu, \kappa, a, b; \theta_1, \theta_2, \cdots, \theta_n)
&= \sum_{i=1}^{n} \ln\left( \frac{e^{\kappa\cos(\theta_i-\mu)}}{\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} \right) \\
&= \sum_{i=1}^{n} \kappa\cos(\theta_i-\mu) - n \ln\left( \int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta \right) \tag{3.14}
\end{align*}
where µ ∈ [0, 2π], κ > 0, a < b and θi ∈ [a, b] for all i = 1, · · · , n.
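The log-likelihood in Equation (3.14) can be evaluated numerically as sketched below. This is only an illustration under stated assumptions: the normalizing integral is approximated with a composite Simpson rule, and the function names and the sample values are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def tvm_log_likelihood(thetas, mu, kappa, a, b):
    """Equation (3.14): sum_i kappa*cos(theta_i - mu) - n*ln(normalizing integral)."""
    if any(t < a or t > b for t in thetas):
        return float("-inf")  # an observation outside [a, b] has density 0
    norm = simpson(lambda t: math.exp(kappa * math.cos(t - mu)), a, b)
    return kappa * sum(math.cos(t - mu) for t in thetas) - len(thetas) * math.log(norm)

sample = [0.5, 1.0, 1.4, 2.2]  # hypothetical angles in radians
ll = tvm_log_likelihood(sample, mu=1.2, kappa=2.0, a=0.3, b=2.5)
print(ll)
```

Returning minus infinity for samples outside [a, b] mirrors the restriction θi ∈ [a, b] stated with the equation.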
We now seek to solve the system of four log-likelihood equations created by the
four parameters of the distribution. For the parameters µ and κ we have:
\[
\frac{\partial \ln L}{\partial \mu} = 0, \qquad \frac{\partial \ln L}{\partial \kappa} = 0 .
\]
For the truncation parameters a and b, the restrictions on them provide proper
bounds for the values of the truncation limits:
\[
b \in [\max(\{\theta_1, \cdots, \theta_n\}),\ \pi + \mu], \qquad
a \in [\mu - \pi,\ \min(\{\theta_1, \cdots, \theta_n\})] .
\]
If b were below the sample maximum, we would not be able to explain observations that
lie above it, as their density would be 0. Analogously, a needs to leave every
observation inside the positive-support region of the function. The above
intervals cover every admissible configuration of values (notice
also that the 2π-length interval on which we observe the support of the function has
not been fixed in the estimation scenario).
Considering these insights together with the log-likelihood expression to be maximized in Equation (3.14), we observe that the influence of the parameters a and b
is restricted to the term $-n \ln\left(\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta\right)$. It therefore suffices to observe
how the values of a and b, for fixed µ and κ, contribute to the maximum.
Since ln(·) is a strictly increasing function and n ∈ N is a positive constant,
maximizing this term is equivalent to maximizing the expression inside it. It suffices then to study
\[
q(\mu, \kappa, a, b) = -\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta .
\]
As stated in the properties of the von Mises distribution shown above, the function f_uvM(θ) (2.3) is strictly positive everywhere, as its values lie in [e^{−κ}, e^{κ}].
Consider now a continuous integrable function g(x) that is strictly positive
(g(x) > 0 for all x ∈ R) and two intervals I1 = [i11, i12], I2 = [i21, i22] ⊂ R such that i11 ≤ i21 and
i12 ≥ i22, that is, I2 ⊆ I1. We can state that:
\[
\int_{i_{11}}^{i_{12}} g(x)\,dx \ \geq\ \int_{i_{21}}^{i_{22}} g(x)\,dx .
\]
Intuitively, the integral of such a function is the area under its curve over the selected
interval. Restricting the integral to a subinterval therefore amounts to picking a
subsection of that area. We can use this observation to determine the maximum value over the
intervals previously permitted for the parameters a and b.
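In our case, a = µ − π, b = µ + π maximizes the integral expression
$n \ln\left(\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta\right)$. The minus sign at
the beginning of $-n \ln\left(\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta\right)$ turns the maximization problem into a
minimization problem: we look for the admissible interval that subtracts the least from the value of the
log-likelihood, which is the smallest interval that still contains every observation. The solution is therefore
\[
\hat{a} = \min(\theta_1, \cdots, \theta_n), \qquad \hat{b} = \max(\theta_1, \cdots, \theta_n) .
\]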
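The argument can be illustrated numerically: for fixed µ and κ, widening the interval beyond the sample extremes increases the normalizing integral and so lowers the term −n ln(·). The sketch below assumes a hypothetical sample and parameter values, and uses a Simpson rule for the integral.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def norm_const(mu, kappa, a, b):
    """Normalizing integral of the truncated von Mises density on [a, b]."""
    return simpson(lambda t: math.exp(kappa * math.cos(t - mu)), a, b)

sample = [0.9, 1.4, 1.1, 2.3, 1.8]  # hypothetical angles in radians
mu, kappa = 1.5, 2.0                # held fixed, as in the argument above
a_hat, b_hat = min(sample), max(sample)

# Contribution of (a, b) to the log-likelihood: -n * ln(integral over [a, b]).
term_tight = -len(sample) * math.log(norm_const(mu, kappa, a_hat, b_hat))
term_wide = -len(sample) * math.log(norm_const(mu, kappa, a_hat - 0.3, b_hat + 0.3))
print(a_hat, b_hat, term_tight, term_wide)
```

Any interval narrower than [min, max] would assign zero density to some observation, so the sample extremes are the tightest admissible choice.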
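We now examine how the remaining parameters contribute to the maximum.
For the mean parameter µ we have:
\[
\frac{\partial}{\partial\mu} \ln L(\mu, \kappa, a, b; \theta_1, \theta_2, \cdots, \theta_n)
= \sum_{i=1}^{n} \kappa\sin(\theta_i-\mu) - n\left( \frac{e^{\kappa\cos(a-\mu)} - e^{\kappa\cos(b-\mu)}}{\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} \right) = 0
\]
Or equivalently:
\[
\frac{1}{n}\sum_{i=1}^{n} \sin(\theta_i-\mu) - \frac{e^{\kappa\cos(a-\mu)} - e^{\kappa\cos(b-\mu)}}{\kappa \int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} = 0
\]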
Here we can observe how, in the non-truncated von Mises case, [a, b] = [i, i + 2π]
produces cos(a − µ) = cos(b − µ), so that the second term of the equation vanishes.
More generally, the second term cancels whenever cos(a − µ) = cos(b − µ), as happens
when the truncation parameters are symmetric with respect to the mean.
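For the parameter κ we have:
\[
\frac{\partial}{\partial\kappa} \ln L(\mu, \kappa, a, b; \theta_1, \theta_2, \cdots, \theta_n)
= \sum_{i=1}^{n} \cos(\theta_i-\mu) - n\, \frac{\int_a^b \cos(\theta-\mu)\, e^{\kappa\cos(\theta-\mu)}\,d\theta}{\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} = 0
\]
Or equivalently:
\[
\frac{1}{n}\sum_{i=1}^{n} \cos(\theta_i-\mu) - \frac{\int_a^b \cos(\theta-\mu)\, e^{\kappa\cos(\theta-\mu)}\,d\theta}{\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} = 0
\]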
Here we can observe that this equation is similar to that of the regular
von Mises distribution, except for the integration limits given by the truncation
parameters. If we consider:
\[
\hat{R} = \frac{1}{n}\sum_{i=1}^{n} \cos(\theta_i-\mu)
\]
Then
\[
\hat{R} = \frac{\int_a^b e^{\kappa\cos(\theta-\mu)} \cos(\theta-\mu)\,d\theta}{\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} .
\]
This can be considered an estimator of R for the truncated case: the mean
resultant length of the data with respect to the mean µ of the parent distribution (which, unlike the existing sample mean, does not necessarily lie inside the truncation limits a, b). As explained in Equation (1.7), it can be
considered related to the average distance of the data to the mean.
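Finally, we have the system of log-likelihood equations:
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n} \sin(\theta_i-\mu) - \frac{e^{\kappa\cos(a-\mu)} - e^{\kappa\cos(b-\mu)}}{\kappa \int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} &= 0 \\
\frac{1}{n}\sum_{i=1}^{n} \cos(\theta_i-\mu) - \frac{\int_a^b \cos(\theta-\mu)\, e^{\kappa\cos(\theta-\mu)}\,d\theta}{\int_a^b e^{\kappa\cos(\theta-\mu)}\,d\theta} &= 0 \\
\min(\theta_1, \cdots, \theta_n) &= a \\
\max(\theta_1, \cdots, \theta_n) &= b
\end{align*}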
As two of our parameters already appear in the form of isolated estimators, we
can conclude
\[
\mathrm{MLE}(a) = \hat{a} = \min(\theta_1, \cdots, \theta_n), \qquad
\mathrm{MLE}(b) = \hat{b} = \max(\theta_1, \cdots, \theta_n) .
\]
At this point, no simple way of isolating any of the remaining parameters was observed, other than the
R of the parent's µ (and to calculate it, we need to estimate the
parent's mean first). In our study and applications we therefore use optimization methods
for these two parameters (µ, κ). Concretely, we regard the optimization of the
log-likelihood expression in Equation (3.14) as a non-linear programming problem
that we solve in the form of a system of Karush-Kuhn-Tucker conditions.
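To make the estimation procedure concrete, the sketch below fixes the truncation limits at the sample extremes and then maximizes Equation (3.14) over (µ, κ). It deliberately swaps in a plain grid search as a stand-in for the Karush-Kuhn-Tucker formulation used in the thesis; the grid, the function names and the sample are assumptions made only for illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def log_likelihood(thetas, mu, kappa, a, b):
    """Equation (3.14) with the normalizing integral approximated numerically."""
    norm = simpson(lambda t: math.exp(kappa * math.cos(t - mu)), a, b)
    return kappa * sum(math.cos(t - mu) for t in thetas) - len(thetas) * math.log(norm)

def fit_tvm_grid(thetas, n_mu=60, kappas=None):
    """Grid-search stand-in (not the KKT solver): a, b from the sample, (mu, kappa) from a grid."""
    a_hat, b_hat = min(thetas), max(thetas)
    kappas = kappas or [0.25 * k for k in range(1, 41)]    # kappa in (0, 10]
    mus = [2 * math.pi * k / n_mu for k in range(n_mu)]    # mu in [0, 2*pi)
    best = max((log_likelihood(thetas, m, k, a_hat, b_hat), m, k)
               for m in mus for k in kappas)
    return {"mu": best[1], "kappa": best[2], "a": a_hat, "b": b_hat, "loglik": best[0]}

sample = [0.6, 0.9, 1.1, 1.2, 1.4, 1.5, 1.7, 1.9, 2.4, 2.8]  # hypothetical data
fit = fit_tvm_grid(sample)
print(fit)
```

A grid is coarse and slow compared with a proper constrained optimizer, but it makes the structure of the problem visible: two parameters are read off the data and only (µ, κ) need a numerical search.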
3.6 Characteristic function
Recalling the definition of the characteristic function
of a random variable X:
\[
\Phi(t) = E[e^{itX}]
\]
and particularizing now for the truncated von Mises density function we obtain:
\begin{align*}
\Phi_{tvM}(t) = E[e^{itX}]
&= \int_a^b e^{itx}\, \frac{e^{\kappa\cos(x-\mu)}}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}\,dx \\
&= \frac{1}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx} \int_a^b e^{itx}\, e^{\kappa\cos(x-\mu)}\,dx \\
&= \frac{\int_a^b \cos(tx)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
 + i\, \frac{\int_a^b \sin(tx)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
\end{align*}
The latter term is 0, ∀t ∈ Z, when the distribution is symmetrical w.r.t. the
mean. Since the truncated case is not restricted to symmetry, and the mean
parameter does not correspond to the sample mean, the latter term does not necessarily cancel. This can be considered the main computational difference between
the truncated and the non-truncated case. If we set the truncation limits to 0, 2π, or to any a, b such that b − a = 2π, the previous equation reduces to (2.6).
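The characteristic function above has no simple closed form in the truncated case, but it is straightforward to approximate. The sketch below is illustrative only: the function name `tvm_char_fn` and the parameter values are assumptions, and the imaginary part is inspected after rotating by e^{−iµ}, that is, relative to the direction µ.

```python
import cmath
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; also works for complex-valued integrands."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def tvm_char_fn(t, mu, kappa, a, b):
    """E[exp(i t X)] for X truncated von Mises on [a, b], by numerical quadrature."""
    weight = lambda x: math.exp(kappa * math.cos(x - mu))
    norm = simpson(weight, a, b)
    return simpson(lambda x: cmath.exp(1j * t * x) * weight(x), a, b) / norm

mu, kappa = 1.0, 2.0  # hypothetical parameters
full = tvm_char_fn(1, mu, kappa, 0.0, 2 * math.pi)  # b - a = 2*pi: non-truncated case
trunc = tvm_char_fn(1, mu, kappa, 0.5, 2.0)         # asymmetric truncation around mu

print((full * cmath.exp(-1j * mu)).imag)   # sine part relative to mu: numerically zero
print((trunc * cmath.exp(-1j * mu)).imag)  # sine part relative to mu: not zero
```

With a full 2π interval the sine part measured relative to µ vanishes, while an asymmetric truncation leaves a clearly non-zero contribution.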
3.7 Moments
The moments of the truncated von Mises distribution about the direction d are
expressed as:
\[
m_{t,tvM} = E[e^{it(X-d)}]
\]
Given the particularities of the truncated case, the statistics that summarize the
data, such as the population mean or the mean resultant length, do not directly describe
the parent distribution, since the parent is not reflected straightforwardly in the population (the shape of the distribution is that of the truncated case).
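The first three moments about the 0 direction are:
\begin{align*}
m_{0,tvM} &= 1 \\
m_{1,tvM} &= \frac{\int_a^b \cos(x)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
 + i\, \frac{\int_a^b \sin(x)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx} \\
m_{2,tvM} &= \frac{\int_a^b \cos(2x)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
 + i\, \frac{\int_a^b \sin(2x)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
\end{align*}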
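In our case it is interesting to consider the moments about the real µ direction:
\begin{align*}
m'_{1,tvM} &= \frac{\int_a^b \cos(x-\mu)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
 + i\, \frac{\int_a^b \sin(x-\mu)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx} \\
&= \frac{\int_a^b \cos(x-\mu)\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
 + \frac{i\,\left(e^{\kappa\cos(a-\mu)} - e^{\kappa\cos(b-\mu)}\right)}{\kappa \int_a^b e^{\kappa\cos(x-\mu)}\,dx} \\
m'_{2,tvM} &= \frac{\int_a^b \cos(2(x-\mu))\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
 + i\, \frac{\int_a^b \sin(2(x-\mu))\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
\end{align*}
These are not simplified under symmetry, since the real µ direction does not have
to be equal to the sample mean (as seen before).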
Since moments are descriptors of the distribution's population, it is worth noticing that applying Equations (1.4) and (1.5) to $m_{1,tvM} = E[\cos(x)] + iE[\sin(x)]$
gives rise to different statistics, µ′ and R′: µ′ is the population mean and R′ the population
mean resultant length. This occurs because basic descriptive statistics and operations such as the expectation are only concerned with the population of the
distribution. We can therefore obtain
\begin{align*}
m_{1,tvM} &= E[\cos(x)] + iE[\sin(x)] \\
&= R' \cos(\mu') + i R' \sin(\mu') \\
m_{1,tvM} &= R' e^{i\mu'} .
\end{align*}
If we then compute the first moment about the µ′ direction we obtain:
\[
m''_{1,tvM} = E[\cos(x-\mu')] + iE[\sin(x-\mu')]
\]
Here the value of the second term in the sum is 0 by definition, resulting in:
\begin{align*}
m''_{1,tvM} &= E[\cos(x-\mu')] \\
m''_{1,tvM} &= R' \\
E[\cos(x-\mu')] &= \frac{\int_a^b \cos(x-\mu')\, e^{\kappa\cos(x-\mu)}\,dx}{\int_a^b e^{\kappa\cos(x-\mu)}\,dx}
\end{align*}
These three moments, about 0, µ and µ′ respectively, are related, and we can
express those relationships as follows:
\[
m_{1,tvM} = R' e^{i\mu'}, \quad\text{that is,}\quad m_{1,tvM} = m''_{1,tvM}\, e^{i\mu'} \tag{3.15}
\]
as previously stated, and
\[
m_{1,tvM} = m'_{1,tvM}\, e^{i\mu} \tag{3.16}
\]
for the parent's mean case.
More interestingly, merging (3.15) and (3.16) we can state:
\[
e^{i(\mu'-\mu)} R' = m'_{1,tvM} \tag{3.17}
\]
which is a valuable expression, as it involves both the parent and the sample
mean. Notice that when the truncated distribution is symmetrical around
µ we have µ′ = µ, and Equation (3.17) reduces to $m''_{1,tvM} = m'_{1,tvM} = R' = R$.
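Relations (3.15) to (3.17) can be verified numerically for any particular parameter choice. The sketch below is a minimal check under assumed parameter values; `tvm_moment` is a hypothetical helper that evaluates E[e^{it(X−d)}] by quadrature.

```python
import cmath
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; also works for complex-valued integrands."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def tvm_moment(t, d, mu, kappa, a, b):
    """t-th trigonometric moment about direction d: E[exp(i t (X - d))]."""
    weight = lambda x: math.exp(kappa * math.cos(x - mu))
    norm = simpson(weight, a, b)
    return simpson(lambda x: cmath.exp(1j * t * (x - d)) * weight(x), a, b) / norm

mu, kappa, a, b = 1.0, 2.0, 0.5, 2.6        # hypothetical, asymmetric around mu
m1 = tvm_moment(1, 0.0, mu, kappa, a, b)    # moment about the 0 direction
R_pop, mu_pop = abs(m1), cmath.phase(m1)    # R' and mu' of the truncated population
m1_parent = tvm_moment(1, mu, mu, kappa, a, b)     # moment about the parent mean mu
m1_pop = tvm_moment(1, mu_pop, mu, kappa, a, b)    # moment about mu'

print(cmath.exp(1j * (mu_pop - mu)) * R_pop, m1_parent)  # both sides of (3.17)
print(m1_pop, R_pop)                                     # m'' is real and equals R'
```

In this example µ′ differs from µ because the truncation interval is not symmetric around the parent mean.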
3.8 The bi-dimensional truncated von Mises distribution
In this section the bivariate truncated von Mises distribution is introduced and developed. Bivariate distributions deal with events defined by a pair of values (x1 , x2 )
that may or may not depend on each other, a fact that is captured by
an additional parameter. The 2-D truncated von Mises density function that is used
in this master thesis is defined on the surface of a torus
\[
f_{btvM} : \bigcirc \times \bigcirc \subset \mathbb{R}^3 \to \mathbb{R}
\]
where the two coordinate angles identify a specific point
on its surface. The parent bivariate von Mises distribution (that is, the
non-truncated case of this distribution) was first proposed by Singh (2002) and
is obtained by replacing the quadratic and linear terms of the bivariate normal
distribution with their circular analogues. It is known as the “sine variant bivariate
von Mises distribution” and has also been extended and developed in Mardia et al.
(2008) and Mardia and Voss (2011). It is expressed as:
\[
f(\theta_1, \theta_2) = C\, e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}
\]
where κ1 , κ2 ≥ 0, −∞ < λ < ∞, µ1 , µ2 ∈ [i, i + 2π] and C is the normalization
constant.
This section comprises the definition of the distribution, maximum likelihood
estimations and a detailed study on the marginal and conditional distributions that
can be obtained given a specified truncated bivariate von Mises distribution.
3.8.1 Definition
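The joint probability distribution of two random variables θ1 , θ2 that is regarded as
truncated von Mises is expressed as:
\[
f_{btvM}(\theta_1, \theta_2) =
\begin{cases}
\dfrac{e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}}{N_2 T_2} & \text{if } \theta_1 \in [a_1, b_1],\ \theta_2 \in [a_2, b_2] \\[2ex]
0 & \text{otherwise}
\end{cases}
\tag{3.18}
\]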
where N2 stands for the required normalization factor and can be explicitly expressed
as (Singh (2002))
\[
N_2 = 4\pi^2 \sum_{m=0}^{\infty} \binom{2m}{m} \left(\frac{\lambda}{2}\right)^{2m} \kappa_1^{-m} I_m(\kappa_1)\, \kappa_2^{-m} I_m(\kappa_2) .
\]
In the truncated distribution function, the terms N2 and T2
simplify with each other, since T2 is a transformation of the normalizing
factor that preserves the density inside the bounds set by
the truncation parameters. The final expression of the normalizing factor for the
bivariate case is therefore:
\[
N_2 T_2 = \int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1
\]
Therefore, the expression that accounts for its positive support appears as:
\[
f_{btvM}(\theta_1, \theta_2; P_{btvM}) =
\frac{e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}}
{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1}
\tag{3.19}
\]
where function (3.19) is subject to the truncation boundary conditions stated
in function (3.18) and PbtvM = {λ, µ1 , µ2 , κ1 , κ2 , a1 , b1 , a2 , b2 } is the set of nine parameters, where:
1. µ1 , µ2 correspond to the mean values of the individual θ1 , θ2 components, respectively.
2. κ1 , κ2 correspond to the concentration parameters of the individual θ1 , θ2 components.
3. a1 , b1 and a2 , b2 correspond to the univariate truncation parameters of the individual θ1 , θ2 components, respectively.
4. λ ∈ (−∞, ∞) is the correlation parameter that measures and accounts for the
degree of interdependence between the variables that compose the bivariate
case. Its value is proportional to the “strength” of the dependency.
Figure 3.4: Example of the bi-dimensional von Mises distribution with parameters
λ = 1, µ1 = 2, µ2 = 4, κ1 = 3, κ2 = 2, a1 = 0, b1 = 3.8, a2 = 2, b2 = 5.
Figure 3.5: Same distribution as in Figure 3.4 although projected, appreciating the
truncation parameters from two axis perspectives.
As suggested, the bivariate case can be seen as the dependent product of
two univariate truncated von Mises distributions (see Figure 3.4 and Figure 3.5). In
the case where λ = 0 we have from Equation (3.19) that:
\begin{align*}
f_{btvM}(\theta_1, \theta_2; P_{btvM})
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2)}}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1} \\
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1)}}{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)}\,d\theta_1} \cdot
   \frac{e^{\kappa_2\cos(\theta_2-\mu_2)}}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2}
\end{align*}
Or otherwise written,
\[
f_{btvM}(\theta_1, \theta_2; 0, \mu_1, \mu_2, \kappa_1, \kappa_2, a_1, b_1, a_2, b_2) = f_{tvM}(\theta_1; \mu_1, \kappa_1, a_1, b_1)\, f_{tvM}(\theta_2; \mu_2, \kappa_2, a_2, b_2).
\]
That is, the bivariate distribution turns into the independent product of its two
univariate distribution components. Additionally, if we first consider the product of
two independent truncated von Mises distributions we observe:
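\begin{align*}
f_{tvM}(\theta_1; \mu_1, \kappa_1, a_1, b_1)\, f_{tvM}(\theta_2; \mu_2, \kappa_2, a_2, b_2)
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1)}}{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)}\,d\theta_1} \cdot
   \frac{e^{\kappa_2\cos(\theta_2-\mu_2)}}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2} \\
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2)}}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1}
\end{align*}
which results in the bivariate expression for parameter λ = 0.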
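The factorization at λ = 0 can also be observed numerically. The sketch below evaluates density (3.19) with a nested Simpson rule; the function names are hypothetical, and the parameter values are those quoted in the caption of Figure 3.4 except for λ, which is set to 0 here.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def tvm_pdf(theta, mu, kappa, a, b):
    """Univariate truncated von Mises density on [a, b]."""
    if not a <= theta <= b:
        return 0.0
    norm = simpson(lambda t: math.exp(kappa * math.cos(t - mu)), a, b)
    return math.exp(kappa * math.cos(theta - mu)) / norm

def btvm_pdf(t1, t2, lam, mu1, mu2, k1, k2, a1, b1, a2, b2):
    """Bivariate truncated von Mises density, Equation (3.19), by nested quadrature."""
    if not (a1 <= t1 <= b1 and a2 <= t2 <= b2):
        return 0.0
    def unnorm(x, y):
        return math.exp(k1 * math.cos(x - mu1) + k2 * math.cos(y - mu2)
                        + lam * math.sin(x - mu1) * math.sin(y - mu2))
    norm = simpson(lambda x: simpson(lambda y: unnorm(x, y), a2, b2), a1, b1)
    return unnorm(t1, t2) / norm

# mu1, mu2, kappa1, kappa2, a1, b1, a2, b2 as in Figure 3.4
params = dict(mu1=2.0, mu2=4.0, k1=3.0, k2=2.0, a1=0.0, b1=3.8, a2=2.0, b2=5.0)
joint = btvm_pdf(1.0, 3.0, lam=0.0, **params)
product = tvm_pdf(1.0, 2.0, 3.0, 0.0, 3.8) * tvm_pdf(3.0, 4.0, 2.0, 2.0, 5.0)
print(joint, product)
```

With a non-zero λ the same call to `btvm_pdf` no longer factorizes, which is the dependence that the λ parameter is meant to capture.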
3.8.2 Parameter estimation
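We will construct the MLE for the bivariate case, where our samples are
of the form {(θ1i , θ2i )}, i = 1, · · · , n.
The log-likelihood expression for function (3.19) is:
\begin{align*}
\ln L(P_{btvM}; (\theta_{11}, \theta_{21}), \cdots, (\theta_{1n}, \theta_{2n}))
&= \sum_{i=1}^{n} \ln\left( \frac{e^{\kappa_1\cos(\theta_{1i}-\mu_1) + \kappa_2\cos(\theta_{2i}-\mu_2) + \lambda\sin(\theta_{1i}-\mu_1)\sin(\theta_{2i}-\mu_2)}}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1} \right) \\
&= \sum_{i=1}^{n} \left( \kappa_1\cos(\theta_{1i}-\mu_1) + \kappa_2\cos(\theta_{2i}-\mu_2) + \lambda\sin(\theta_{1i}-\mu_1)\sin(\theta_{2i}-\mu_2) \right) \\
&\quad - n \ln\left( \int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1 \right)
\end{align*}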
We now obtain each of the nine equations that make up our system of log-likelihood equations with nine unknowns. Considering the unnormalized function, analogously to (2.3),
\[
f_{ubvM}(\theta_1, \theta_2) = e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)},
\]
we have:
\[
\frac{\partial}{\partial\mu_1} \ln L(P_{btvM}; (\theta_{11}, \theta_{21}), \cdots, (\theta_{1n}, \theta_{2n})) = 0
\]
That is,
\[
\sum_{i=1}^{n} \left( \kappa_1\sin(\theta_{1i}-\mu_1) - \lambda\cos(\theta_{1i}-\mu_1)\sin(\theta_{2i}-\mu_2) \right)
- n\, \frac{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} \left( \kappa_1\sin(\theta_1-\mu_1) - \lambda\cos(\theta_1-\mu_1)\sin(\theta_2-\mu_2) \right) f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1} = 0
\tag{3.20}
\]
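Similarly, the partial derivative w.r.t. µ2 gives
\[
\sum_{i=1}^{n} \left( \kappa_2\sin(\theta_{2i}-\mu_2) - \lambda\cos(\theta_{2i}-\mu_2)\sin(\theta_{1i}-\mu_1) \right)
- n\, \frac{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} \left( \kappa_2\sin(\theta_2-\mu_2) - \lambda\cos(\theta_2-\mu_2)\sin(\theta_1-\mu_1) \right) f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1} = 0
\tag{3.21}
\]
For κ1 we have,
\[
\frac{\partial}{\partial\kappa_1} \ln L(P_{btvM}; (\theta_{11}, \theta_{21}), \cdots, (\theta_{1n}, \theta_{2n})) = 0
\]
That is,
\[
\frac{1}{n}\sum_{i=1}^{n} \cos(\theta_{1i}-\mu_1)
- \frac{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} \cos(\theta_1-\mu_1)\, f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1} = 0
\tag{3.22}
\]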
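Similarly, the partial derivative w.r.t. κ2 gives
\[
\frac{1}{n}\sum_{i=1}^{n} \cos(\theta_{2i}-\mu_2)
- \frac{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} \cos(\theta_2-\mu_2)\, f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1} = 0
\tag{3.23}
\]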
At this point, we can see that both Equations (3.22) and (3.23), involving the parameters κ1 and κ2 respectively, preserve their analogy with the univariate case and correspond to
estimators of E[cos(θ1 − µ1 )] and E[cos(θ2 − µ2 )], respectively.
For λ we have:
\[
\frac{\partial}{\partial\lambda} \ln L(P_{btvM}; (\theta_{11}, \theta_{21}), \cdots, (\theta_{1n}, \theta_{2n})) = 0
\]
That is,
\[
\frac{1}{n}\sum_{i=1}^{n} \sin(\theta_{1i}-\mu_1)\sin(\theta_{2i}-\mu_2)
- \frac{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} \sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)\, f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} f_{ubvM}(\theta_1, \theta_2)\,d\theta_2\,d\theta_1} = 0
\tag{3.24}
\]
which also corresponds to the estimation of E[sin(θ1 − µ1 ) sin(θ2 − µ2 )].
By reasoning analogous to the univariate case, we obtain the MLEs of the
truncation parameters as:
\begin{align*}
\hat{a}_1 &= \min(\{\theta_{11}, \cdots, \theta_{1n}\}) \tag{3.25} \\
\hat{b}_1 &= \max(\{\theta_{11}, \cdots, \theta_{1n}\}) \tag{3.26} \\
\hat{a}_2 &= \min(\{\theta_{21}, \cdots, \theta_{2n}\}) \tag{3.27} \\
\hat{b}_2 &= \max(\{\theta_{21}, \cdots, \theta_{2n}\}) \tag{3.28}
\end{align*}
which are obtained independently of the rest of the parameters.
The system of log-likelihood equations is composed of Equations (3.20), (3.21),
(3.22), (3.23), (3.24), (3.25), (3.26), (3.27) and (3.28). For the parameters κ1 , κ2 , µ1 ,
µ2 , λ no further simplification of the estimators was observed, due to the interdependence in their expressions. We use optimization techniques to obtain the MLE
values of these parameters for each particular problem, again considering a
Karush-Kuhn-Tucker system of equations for the likelihood expression, in a similar
fashion to the univariate case.
3.8.3 Conditional and marginal truncated von Mises distributions
In this subsection we are interested in learning about the form of marginal and conditional distributions that arise in a 2-dimensional truncated von Mises distribution.
Definitions
Given fbtvM (θ1 , θ2 ; λ, µ1 , µ2 , κ1 , κ2 , a1 , b1 , a2 , b2 ), the marginalization of one of its variables (in this case θ1 ) is calculated as
\[
f_{mtvM}(\theta_1) = \int_{a_2}^{b_2} f_{btvM}(\theta_1, \theta_2)\,d\theta_2
\]
Considering additionally the truncation criteria this leads us to
\[
f_{mtvM}(\theta_1) =
\begin{cases}
\dfrac{\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1} & \text{if } \theta_1 \in [a_1, b_1] \\[2ex]
0 & \text{otherwise}
\end{cases}
\]
with the same parameters as the bivariate distribution. The conditional distribution is then constructed as:
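\[
f(\theta_2 | \theta_1) = \frac{f(\theta_1, \theta_2)}{f(\theta_1)}
\]
For our particular case it results in:
\[
f_{ctvM}(\theta_2 | \theta_1) =
\frac{\dfrac{e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1}}
{\dfrac{\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1}}
\]
And considering additionally the truncation criteria:
\[
f_{ctvM}(\theta_2 | \theta_1) =
\begin{cases}
\dfrac{e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2} & \text{if } \theta_2 \in [a_2, b_2] \\[2ex]
0 & \text{otherwise}
\end{cases}
\]
with the same parameters as the bivariate and marginal truncated cases, except
for the truncation parameters and the concentration parameter of the conditioning
variable (in this case a1 , b1 and κ1 ), which are not included.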
Study and conclusions about the conditional and marginal distributions
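We consider the conditional distribution of the bivariate truncated von Mises in its
positive support, that is, for θ2 ∈ [a2 , b2 ]:
\[
f_{ctvM}(\theta_2 | \theta_1) =
\frac{e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}}
{\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2}
\tag{3.29}
\]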
Lemma 3.8.1. All conditional distributions of a bivariate truncated von Mises distribution are univariate truncated von Mises distributions.
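Proof. If we take c1 = κ1 cos(θ1 − µ1 ), c2 = λ sin(θ1 − µ1 ), we can write Equation
(3.29) as:
\begin{align*}
f_{ctvM}(\theta_2 | \theta_1)
&= \frac{e^{c_1 + \kappa_2\cos(\theta_2-\mu_2) + c_2\sin(\theta_2-\mu_2)}}{\int_{a_2}^{b_2} e^{c_1 + \kappa_2\cos(\theta_2-\mu_2) + c_2\sin(\theta_2-\mu_2)}\,d\theta_2} \\
&= \frac{e^{\kappa_2\cos(\theta_2-\mu_2) + c_2\sin(\theta_2-\mu_2)}}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + c_2\sin(\theta_2-\mu_2)}\,d\theta_2} \tag{3.30}
\end{align*}
At this point, if we examine the independence case where λ = 0, then c2 = 0 and,
as expected,
\[
f_{ctvM}(\theta_2 | \theta_1) = \frac{e^{\kappa_2\cos(\theta_2-\mu_2)}}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2} = f_{tvM}(\theta_2; \mu_2, \kappa_2, a_2, b_2)
\]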
As in the well-known result for the Gaussian distribution, this allows us to conclude
that the conditional distributions under independence are univariate truncated von
Mises distributions. More concretely, they are the univariate truncated von Mises
distribution followed by the unconditioned individual component.
If λ ≠ 0 we can still regard the former expression as a univariate truncated von
Mises distribution, given that the exponent in function (3.30)
can be expressed in the form κ′ cos(x − µ′ ) for some
κ′ , µ′ permitted by the definition of function (3.1). We observe this by means of the
following trigonometric identity:
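\[
\kappa_2\cos(x) + c_2\sin(x)
= \left( \kappa_2\cos\left(\arctan\frac{c_2}{\kappa_2}\right) + c_2\sin\left(\arctan\frac{c_2}{\kappa_2}\right) \right) \cos\left( x - \arctan\frac{c_2}{\kappa_2} \right)
\tag{3.31}
\]
Now if we consider that:
\begin{align*}
\kappa_2\cos\left(\arctan\frac{c_2}{\kappa_2}\right) + c_2\sin\left(\arctan\frac{c_2}{\kappa_2}\right)
&= \frac{\kappa_2 + \frac{c_2^2}{\kappa_2}}{\sqrt{1 + \left(\frac{c_2}{\kappa_2}\right)^2}} \\
&= \sqrt{\kappa_2^2 + c_2^2}
\end{align*}
then (3.31) turns into
\[
\sqrt{\kappa_2^2 + c_2^2}\, \cos\left( x - \arctan\frac{c_2}{\kappa_2} \right) = \kappa_2\cos(x) + c_2\sin(x)
\tag{3.32}
\]
Now we can adapt Equation (3.30) to the exponent of the univariate truncated von Mises by selecting:
\begin{align*}
\kappa' &= \sqrt{\kappa_2^2 + c_2^2} \\
\mu' &= \mu_2 + \arctan\frac{c_2}{\kappa_2}
\end{align*}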
Therefore, for a given conditional distribution expressed as in Equation (3.30), a truncated von Mises distribution with transformed parameters can be found
that equals it, by means of the equivalence:
\[
f_{ctvM}(\theta_2 | \theta_1; \lambda, \mu_1, \mu_2, \kappa_2, a_2, b_2) =
f_{tvM}\left( \theta_2;\ \mu_2 + \arctan\frac{\lambda\sin(\theta_1-\mu_1)}{\kappa_2},\ \sqrt{\kappa_2^2 + (\lambda\sin(\theta_1-\mu_1))^2},\ a_2,\ b_2 \right)
\]
Since this transformation holds for general parameter values,
function (3.30) can be regarded as another way of writing a certain
univariate truncated von Mises expression (by means of Equation (3.32)), which
extends to the general definition of the conditional truncated distribution. This result
completely characterizes the conditional distribution.
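The equivalence between the conditional density and a univariate truncated von Mises with transformed parameters can be checked numerically. The following sketch assumes κ2 > 0 (so that the arctangent form applies), uses hypothetical function names, and borrows the parameter values of Figure 3.4 for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def tvm_pdf(theta, mu, kappa, a, b):
    """Univariate truncated von Mises density on [a, b]."""
    norm = simpson(lambda t: math.exp(kappa * math.cos(t - mu)), a, b)
    return math.exp(kappa * math.cos(theta - mu)) / norm

def conditional_pdf(theta2, theta1, lam, mu1, mu2, k2, a2, b2):
    """Conditional density written directly as in Equation (3.30)."""
    c2 = lam * math.sin(theta1 - mu1)
    unnorm = lambda t: math.exp(k2 * math.cos(t - mu2) + c2 * math.sin(t - mu2))
    return unnorm(theta2) / simpson(unnorm, a2, b2)

def conditional_as_tvm(theta2, theta1, lam, mu1, mu2, k2, a2, b2):
    """Same density via the transformed parameters kappa' and mu' (assumes k2 > 0)."""
    c2 = lam * math.sin(theta1 - mu1)
    return tvm_pdf(theta2, mu2 + math.atan(c2 / k2), math.hypot(k2, c2), a2, b2)

args = dict(lam=1.0, mu1=2.0, mu2=4.0, k2=2.0, a2=2.0, b2=5.0)
direct = conditional_pdf(3.0, 1.0, **args)
via_tvm = conditional_as_tvm(3.0, 1.0, **args)
print(direct, via_tvm)
```

Both routes give the same value because their exponents coincide pointwise by Equation (3.32), so the normalizing integrals agree as well.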
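The marginal distribution of the bivariate truncated von Mises is, for θ1 ∈ [a1 , b1 ],
\begin{align*}
f_{mtvM}(\theta_1)
&= \frac{\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_1}^{b_1}\!\int_{a_2}^{b_2} e^{\kappa_1\cos(\theta_1-\mu_1) + \kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1} \\
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1)} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2\,d\theta_1} \tag{3.33}
\end{align*}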
Lemma 3.8.2. The marginal distributions of the truncated bivariate von Mises
distribution, under independence on their variables, are truncated von Mises distributions as well.
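Proof. Considering again the case where both variables are independent (i.e., λ = 0),
\begin{align*}
f_{mtvM}(\theta_1)
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1)} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)} \left[ \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2 \right] d\theta_1} \\
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1)} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)}\,d\theta_1 \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\,d\theta_2} \\
&= \frac{e^{\kappa_1\cos(\theta_1-\mu_1)}}{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)}\,d\theta_1}
\end{align*}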
For the complementary case (i.e., λ ≠ 0), the interdependence between the integrals in the marginal distribution generalizes the previous result to
a non-von Mises distribution. We continue with the study of this distribution and its
properties until the end of the chapter.
This study is organized as follows:
1. A theoretical introduction to the distribution under study, with focus on the
non-truncated case.
2. The analysis of the truncated case by means of Lemma 3.8.3, whose proof is
subdivided as follows:
(a) Obtaining the expression of the derivative function.
(b) Analysis of the sub-term v2 .
(c) Analysis of the expression of the derivative function and the marginal function.
(d) Determining the cases of the lemma.
We start with the introduction and focus on the non-truncated case (1.).
The dependent marginal distribution of Equation (3.33) comprises the product
of a varying area of an unnormalized von Mises distribution (stated in Equation (2.3)) and another unnormalized von Mises distribution. If we apply
transformation (3.32) to the expression within the integral, we observe that
the independent variable θ1 ultimately modifies the value of the “κ” parameter of
the univariate von Mises distribution whose area is computed. This variation in
the integral area causes the marginal distribution to present properties such as bi-maximality or unimodality under certain sets of parameter values.
For clarity, we can rewrite Equation (3.33) for the truncation limits
a2 = 0, b2 = 2π (an example of the non-truncated case) as:
\[
f_{mtvM}(\theta_1) =
\frac{e^{\kappa_1\cos(\theta_1-\mu_1)}\, 2\pi I_0\left( \sqrt{\kappa_2^2 + (\lambda\sin(\theta_1-\mu_1))^2} \right)}
{\int_{a_1}^{b_1} e^{\kappa_1\cos(\theta_1-\mu_1)}\, 2\pi I_0\left( \sqrt{\kappa_2^2 + (\lambda\sin(\theta_1-\mu_1))^2} \right) d\theta_1}
\]
where the previous insights can be observed more easily.
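The step from Equation (3.33) to this closed form rests on the inner integral over a full 2π interval being equal to 2π I0 of the transformed concentration. The sketch below checks that identity numerically; the helper names and the parameter values are illustrative assumptions.

```python
import math

def bessel_i0(x, terms=60):
    """Power series of the modified Bessel function of order 0."""
    return sum((x / 2) ** (2 * n) / math.factorial(n) ** 2 for n in range(terms))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def inner_area(shift1, lam, mu2, k2):
    """Integral over theta2 in [0, 2*pi] of exp(k2*cos(.) + lam*sin(shift1)*sin(.))."""
    c2 = lam * math.sin(shift1)  # shift1 stands for theta1 - mu1
    return simpson(lambda t: math.exp(k2 * math.cos(t - mu2) + c2 * math.sin(t - mu2)),
                   0.0, 2 * math.pi)

def closed_form(shift1, lam, k2):
    """2*pi*I0(sqrt(k2^2 + (lam*sin(shift1))^2))."""
    return 2 * math.pi * bessel_i0(math.hypot(k2, lam * math.sin(shift1)))

lam, mu2, k2 = 5.0, 0.0, 4.0  # hypothetical values
for shift1 in (-2.0, -0.5, 0.0, 1.0, 2.5):
    print(shift1, inner_area(shift1, lam, mu2, k2), closed_form(shift1, lam, k2))
```

The printed columns agree, showing how θ1 enters the marginal only through the effective concentration of the integrated von Mises term.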
The precise conditions under which this distribution presents unimodality or bi-maximality
are a key focus of studies that attempt to describe it. A previous study of the unimodality and bi-maximality of the marginal distribution in Equation (3.33) for the non-truncated case was reported
by Singh (2002). The boundary of bi-maximality was specified there (for µ1 = 0)
by the equation:
\[
A(\kappa_2) = \frac{\kappa_1 \kappa_2}{\lambda^2}
\]
where A(κ2 ) corresponds to Equation (2.5). When the left-hand side
of the equation is smaller than the right-hand side, the marginal distribution is
unimodal; when it is larger, the marginal distribution is bi-maximal with two equal
maxima. The modes were also found to lie symmetrically at the values of θ1
that solve the equation (for µ1 = 0):
\[
\frac{A\left( \sqrt{\kappa_2^2 + \lambda^2\sin^2(\theta_1)} \right)}{\sqrt{\kappa_2^2 + \lambda^2\sin^2(\theta_1)}}\, \cos(\theta_1) = \frac{\kappa_1}{\lambda^2}
\]
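The bi-maximality boundary can be compared with a direct count of local maxima of the non-truncated marginal. The sketch below assumes that A(κ) in Equation (2.5) is the usual ratio I1(κ)/I0(κ), which is not shown in this section; the function names are hypothetical, and the two parameter sets reuse λ, κ1, κ2 values quoted in the caption of Figure 3.6 with µ1 = 0.

```python
import math

def bessel_i(order, x, terms=60):
    """Power series of the modified Bessel function of integer order."""
    return sum((x / 2) ** (2 * n + order) / (math.factorial(n) * math.factorial(n + order))
               for n in range(terms))

def predicts_two_maxima(k1, k2, lam):
    """Boundary condition with A = I1/I0 (assumed form of Equation (2.5))."""
    return bessel_i(1, k2) / bessel_i(0, k2) > k1 * k2 / lam ** 2

def count_maxima(k1, k2, lam, n=720):
    """Count local maxima of exp(k1*cos t) * I0(sqrt(k2^2 + lam^2 sin^2 t)) on a periodic grid."""
    grid = [-math.pi + 2 * math.pi * k / n for k in range(n)]
    g = [math.exp(k1 * math.cos(t)) * bessel_i(0, math.hypot(k2, lam * math.sin(t)))
         for t in grid]
    return sum(1 for i in range(n) if g[i] > g[i - 1] and g[i] >= g[(i + 1) % n])

for k1, k2, lam in [(1.0, 4.0, 5.0), (3.0, 4.0, 1.0)]:
    print(k1, k2, lam, predicts_two_maxima(k1, k2, lam), count_maxima(k1, k2, lam))
```

The grid count is only a visual-style check of the shape, not a proof, but it agrees with the boundary condition for both parameter sets.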
In our case, however, additional considerations appear when the effect of
the truncation parameters is taken into account. Contrary to the non-truncated case, a truncated marginal
that shows two maxima may have only one global maximum, and a truncated marginal that shows a single maximum
is not necessarily centered around the mean (see Figure 3.6). It is therefore of interest to see how generalizing the truncation parameters to cover the
truncated case affects the behavior of this distribution, how much of the previous
analysis holds, and how much needs to be generalized.
Figure 3.6: Several truncated marginals showing unimodality (red) with parameters
λ = 5, µ1 = π, µ2 = 0, κ1 = 1, κ2 = 4, a1 = 0, b1 = 2π, a2 = π − 0.2, b2 = 2π,
two equal maxima (blue) with parameters λ = 5, µ1 = π, µ2 = 0, κ1 = 1, κ2 =
4, a1 = 0, b1 = 2π, a2 = 0, b2 = 2π, truncated unimodality (green) with parameters
λ = 1, µ1 = 4, µ2 = 2, κ1 = 3, κ2 = 4, a1 = 0, b1 = 5, a2 = 2, b2 = 2π and 2 distinct
maxima (black) with parameters λ = 10, µ1 = 6, µ2 = 1, κ1 = 0.3, κ2 = 6, a1 =
0, b1 = 2π, a2 = 0, b2 = 5 respectively.
We now proceed with the analysis of the truncated case (2.):
Lemma 3.8.3. In the marginal truncated case, fmtvM (θ1 ) can be unimodal with center at µ1 , bimodal with two equal maxima, bimodal with two different maxima, or
unimodal with the mode not at µ1 , solely by manipulating the parameters λ, κ1 , κ2 , µ1 , µ2 , a2
and b2 .
Proof. We prove this lemma by identifying the ranges of parameter configurations that yield all the distinctive shapes of the marginal distribution, thus proving a
more general case than that of the lemma.
In order to identify the changes in the shape and growth of the marginal distribution we study the unnormalized marginal truncated von Mises distribution. Taking
θ1′ = θ1 − µ1 (from now on) we have:
\[
f_{umtvM}(\theta_1') = e^{\kappa_1\cos(\theta_1')} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2
\tag{3.34}
\]
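(2. (a)) Differentiating fumtvM (θ1′ ) w.r.t. θ1′ we obtain:
\begin{align*}
f'_{umtvM}(\theta_1')
={}& -\kappa_1\sin(\theta_1')\, e^{\kappa_1\cos(\theta_1')} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2 \\
&+ \lambda\cos(\theta_1')\, e^{\kappa_1\cos(\theta_1')} \int_{a_2}^{b_2} \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2 \\
={}& e^{\kappa_1\cos(\theta_1')} \Bigg( -\kappa_1\sin(\theta_1') \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2 \\
&+ \lambda\cos(\theta_1') \int_{a_2}^{b_2} \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2 \Bigg) \tag{3.35}
\end{align*}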
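Taking the function
\[
v_2(\theta_1') = \lambda\, \frac{\int_{a_2}^{b_2} \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2}
\tag{3.36}
\]
and since we are interested in the values for which the derivative is
zero, we treat the equation $f'_{umtvM}(\theta_1') = 0$, dividing by the strictly positive factor that multiplies the bracket in Equation (3.35) and by the integral in the denominator of Equation (3.36), to yield:
\[
-\kappa_1\sin(\theta_1') + v_2(\theta_1')\cos(\theta_1') = 0
\tag{3.37}
\]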
Our intention now is to assess how Equation (3.35) and Equation (3.37) behave
for different values of θ1′ in the [−π, π] interval. However, more knowledge about
the sub-term v2 is needed in order to reach useful results, so we address it first.
(2. (b)) Since any function of the type $f(x) = \int_a^b e^{(\cdot)}\,dx$ with a ≤ b
satisfies f (x) ≥ 0, we can restrict our attention to the sub-expression contained
in the numerator of Equation (3.36):
\[
v_2'(\theta_1') = \int_{a_2}^{b_2} \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}\,d\theta_2
\]
and the expression inside the integral:
\[
f'_{v_2'}(\theta_2) = \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2) + \lambda\sin(\theta_1')\sin(\theta_2-\mu_2)}
\]
Notice that in $f'_{v_2'}(\theta_2)$ the argument is θ2 , since it generates the area that is
computed in $v_2'(\theta_1')$; θ1′ can be considered here a modifying parameter.
We can then say:
1. If a2 , b2 ∈ [µ2 , µ2 + π] then $v_2'(\theta_1') > 0$ ∀θ1′ , λ, κ2 .
Intuitively, if the truncation parameters select only a
positive region of the function, the result of the integration will also be positive. Note that $f'_{v_2'}(\theta_2)$ is the product of a strictly
positive function of the type e^{(·)} and a sin(·) function. Therefore, $f'_{v_2'}(\theta_2)$ is
negative or positive according to the sign of the sin(·) function.
2. Analogously, if a2 , b2 ∈ [µ2 − π, µ2 ] then $v_2'(\theta_1') < 0$ ∀θ1′ , λ, κ2 .
3. If µ2 ∈ (a2 , b2 ) then $v_2'(\theta_1')$ can be split into
$\int_{a_2}^{\mu_2} f'_{v_2'}(\theta_2; \theta_1')\,d\theta_2 + \int_{\mu_2}^{b_2} f'_{v_2'}(\theta_2; \theta_1')\,d\theta_2$, where
$\int_{a_2}^{\mu_2} f'_{v_2'}(\theta_2; \theta_1')\,d\theta_2 \leq 0$ and $\int_{\mu_2}^{b_2} f'_{v_2'}(\theta_2; \theta_1')\,d\theta_2 \geq 0$.
4. Therefore, if $\int_{a_2}^{\mu_2} f'_{v_2'}(\theta_2)\,d\theta_2 = -\int_{\mu_2}^{b_2} f'_{v_2'}(\theta_2)\,d\theta_2$ then $v_2'(\theta_1') = 0$.
If we were restricted to truncation parameters like those of the non-truncated
case (b2 − a2 = 2π) then $v_2'(\theta_1') = 0$ only for certain parameter values, since
it can only occur if both curves enclose the same area. However, in our case, it
can be said that ∀µ2 ∈ (0, 2π), κ2 , λ, θ1′ , ∃a2 , b2 such that µ2 ∈ [a2 , b2 ] and
$\int_{a_2}^{\mu_2} f'_{v_2'}(\theta_2)\,d\theta_2 = -\int_{\mu_2}^{b_2} f'_{v_2'}(\theta_2)\,d\theta_2$, which shows part of the additional complexity in determining the precise conditions where the marginal distribution
shows distinctive behaviors.
Now, accounting for the influence of $\theta_1'$ as a parameter of the expression to be computed in $v_2'(\theta_1')$, we can say:
5. If $\theta_1' \in (-\pi, 0)$ and $\lambda > 0$ then $v_2'(\theta_1') < 0$, unless the truncation parameters
$a_2, b_2$ are such that $b_2 > \mu_2$ and $a_2 > c$, where $c$ satisfies $\int_{c}^{\mu_2} f_{0v_2'}(\theta_2;\theta_1')\,d\theta_2 = -\int_{\mu_2}^{b_2} f_{0v_2'}(\theta_2;\theta_1')\,d\theta_2$.
If we look at the exponential sub-term in $f_{0v_2'}(\theta_2)$ as
$$e^{\sqrt{\kappa_2^2+(\lambda\sin(\theta_1'))^2}\,\cos\left(\theta_2-\mu_2-\arctan\left(\frac{\lambda\sin(\theta_1')}{\kappa_2}\right)\right)}$$
we can interpret its contribution to
$f_{0v_2'}(\theta_2)$ as a “modifier” of the shapes of the negative and positive curves produced by the $\sin(\cdot)$ sub-term also present in the expression. If the value of
$\theta_1'$ centers the sub-term somewhere in $(\mu_2, \mu_2+\pi)$, the area
under the positive curve in the integral computation is higher than the area under the negative one, and the opposite holds when it is centered somewhere in $(\mu_2-\pi, \mu_2)$ (in all
cases, provided the truncation parameters allow it). We can also conclude that, given
that the $\lambda$ parameter also takes part in determining where the center of the
sub-term is placed, if $\lambda < 0$ then this case holds for $\theta_1' \in (0, \pi)$.
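6. Analogously, if $\theta_1' \in (0, \pi)$ and $\lambda > 0$ then $v_2'(\theta_1') > 0$, unless the truncation
parameters $a_2, b_2$ are such that $a_2 < \mu_2$ and $b_2 < c$, where $c$ satisfies $\int_{a_2}^{\mu_2} f_{0v_2'}(\theta_2;\theta_1')\,d\theta_2 = -\int_{\mu_2}^{c} f_{0v_2'}(\theta_2;\theta_1')\,d\theta_2$.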
7. As a particular case, if $\lambda = 0$ or $\theta_1' = 0$ or $\theta_1' = \pi$, and $\cos(a_2-\mu_2) = \cos(b_2-\mu_2)$, then $v_2'(\theta_1') = 0$.
This result appears as the particular case where the exponential sub-term
in $f_{0v_2'}(\theta_2)$ has its maximum value at $\mu_2$ and therefore contributes equally to
the area under both curves of the $\sin(\cdot)$ sub-term. We observe that in order
to center the exponential sub-term at $\mu_2$ we need
$$\arctan\left(\frac{\lambda\sin(\theta_1')}{\kappa_2}\right) = 0 \;\Longrightarrow\; \frac{\lambda\sin(\theta_1')}{\kappa_2} = 0 \;\Longrightarrow\; \lambda\sin(\theta_1') = 0,$$
which gives us the conditions on $\theta_1', \lambda$ stated above. Notice that only with
symmetrical truncation parameters does this configuration yield $v_2'(\theta_1') = 0$.
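The sign statements in items 1, 2 and 7, and the amplitude-phase form of the exponential sub-term used in item 5, can be checked numerically. The following is only a minimal sketch: the function names, the Simpson quadrature and the parameter values in the usage note are our own illustrative choices, not part of the derivation, and it assumes $\kappa_2 > 0$.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def v2_prime(theta1p, lam, mu2, kappa2, a2, b2):
    """Numerical value of the integral v_2'(theta_1') over [a2, b2]."""
    s = lam * math.sin(theta1p)
    def integrand(theta2):
        phi = theta2 - mu2
        return math.sin(phi) * math.exp(kappa2 * math.cos(phi) + s * math.sin(phi))
    return simpson(integrand, a2, b2)

def exponent_two_forms(theta2, theta1p, lam, mu2, kappa2):
    """Exponent of the sub-term written directly and in amplitude-phase form."""
    s = lam * math.sin(theta1p)
    phi = theta2 - mu2
    direct = kappa2 * math.cos(phi) + s * math.sin(phi)
    # Assumes kappa2 > 0, so that arctan gives the correct phase shift.
    shifted = math.hypot(kappa2, s) * math.cos(phi - math.atan(s / kappa2))
    return direct, shifted
```

For instance, with the hypothetical values $\lambda = 5$, $\mu_2 = 4$, $\kappa_2 = 4$, calling `v2_prime(-1.0, 5.0, 4.0, 4.0, 4.2, 6.0)` integrates over a window inside $[\mu_2, \mu_2+\pi]$ and should be positive regardless of $\theta_1'$, in line with item 1.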
We have analyzed $v_2'(\theta_1')$ and, by extension, $v_2(\theta_1')$; we are therefore now in
a position to assess Equations (3.35) and (3.37) for different values of $\theta_1' \in [-\pi, \pi]$.
(2. (c)) In our results, we focus our attention on the truncation parameters, as the
non-truncated case was already studied in Singh (2002).
If we consider $\theta_1' \in [-\pi, -\frac{\pi}{2}]$:
1. For this case $\sin(\theta_1')$ is negative, $\kappa_1$ positive and $\cos(\theta_1')$ negative.
2. If $a_2, b_2$ satisfy either $a_2, b_2 \in [\mu_2, \mu_2+\pi]$, or $\mu_2 \in (a_2, b_2)$ such that
$-\int_{a_2}^{\mu_2} f_{0v_2'}(\theta_2; -\frac{\pi}{2})\,d\theta_2 \le \int_{\mu_2}^{b_2} f_{0v_2'}(\theta_2; -\frac{\pi}{2})\,d\theta_2$, and $\lambda > 0$, then $v_2(\theta_1') > 0$. In
this case, a minimum can be found in the examined interval, as shown by:
$$f'_{umtvM}(-\pi) = -\lambda e^{-\kappa_1} \int_{a_2}^{b_2} \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2)}\, d\theta_2 < 0$$
$$f'_{umtvM}\left(-\frac{\pi}{2}\right) = \kappa_1 \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)-\lambda\sin(\theta_2-\mu_2)}\, d\theta_2 > 0 \qquad (3.38)$$
Notice that if $a_2, b_2 \in [\mu_2, \mu_2+\pi]$ the minimum is forced regardless of the effect
of the other parameters. Also, if $\lambda < 0$ then the interval $[\frac{\pi}{2}, \pi]$ would have the
critical point instead.
3. If not in the previous case, $v_2(\theta_1') < 0$, resulting in a monotonic increasing
behavior.
If $\theta_1' \in [\frac{\pi}{2}, \pi]$:
1. For this case, $\sin(\theta_1')$ is positive, $\kappa_1$ positive and $\cos(\theta_1')$ negative.
2. Analogously, if $a_2, b_2$ satisfy either $a_2, b_2 \in [\mu_2-\pi, \mu_2]$, or $\mu_2 \in (a_2, b_2)$ such
that $\int_{\mu_2}^{b_2} f_{0v_2'}(\theta_2; \frac{\pi}{2})\,d\theta_2 \le -\int_{a_2}^{\mu_2} f_{0v_2'}(\theta_2; \frac{\pi}{2})\,d\theta_2$, and $\lambda > 0$, then $v_2(\theta_1') < 0$. An
analogous minimum can be found in the examined interval.
3. If not in case 2, $v_2(\theta_1') > 0$, resulting in a monotonic decreasing behavior.
Now if we consider $\theta_1' \in [-\frac{\pi}{2}, 0]$:
1. For this case $\sin(\theta_1')$ is negative and $\cos(\theta_1')$ positive. If we assume $\lambda > 0$ we have:
2. If the truncation parameters satisfy restrictions similar to those stated in 2. for
the $[-\pi, -\frac{\pi}{2}]$ case, then $f_{umtvM}(\theta_1')$ is monotonically increasing.
3. Otherwise, $f_{umtvM}(\theta_1')$ could present either zero, one or two critical points. If
we examine $f'_{umtvM}(\theta_1')$ in the interval:
$$f'_{umtvM}(0) = \lambda e^{\kappa_1} \int_{a_2}^{b_2} \sin(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2)}\, d\theta_2 = \frac{\lambda e^{\kappa_1}}{\kappa_2}\left(e^{\kappa_2\cos(a_2-\mu_2)} - e^{\kappa_2\cos(b_2-\mu_2)}\right) \qquad (3.39)$$
(a) We know that if $f'_{umtvM}(0) < 0$ then a single critical point, a maximum,
exists in the interval (considering for this the already calculated Equation
(3.38)). Thus, in this case we can isolate another configuration of the truncation
parameters by selecting $a_2, b_2$ such that $\cos(b_2-\mu_2) > \cos(a_2-\mu_2)$, that is, making the parameter $b_2$ “closer” in circular
distance to the mean than parameter $a_2$. Notice that this configuration is
exclusive w.r.t. the configuration stated in 2. for the $[-\pi, -\frac{\pi}{2}]$ case, as the latter
implied $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$. Under these considerations we can also
identify the exclusive cases where $\cos(b_2-\mu_2) = \cos(a_2-\mu_2)$ and, additionally,
the complementary subset of cases where $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$ but
$-\int_{a_2}^{\mu_2} f_{0v_2'}(\theta_2; -\frac{\pi}{2})\,d\theta_2 \ge \int_{\mu_2}^{b_2} f_{0v_2'}(\theta_2; -\frac{\pi}{2})\,d\theta_2$.
(b) If $\cos(b_2-\mu_2) = \cos(a_2-\mu_2)$ then by Equation (3.39) a critical point
of $f_{umtvM}$ exists at $\theta_1' = 0$ that is either a minimum (two equal maxima) or a
maximum (unimodal), depending on the sign of
$$T(\lambda, \mu_2, \kappa_1, \kappa_2, a_2, b_2) = -\frac{\kappa_1}{\lambda^2} + \frac{\int_{a_2}^{b_2} \sin^2(\theta_2-\mu_2)\, e^{\kappa_2\cos(\theta_2-\mu_2)}\, d\theta_2}{\int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)}\, d\theta_2}$$
If $T(\lambda, \mu_2, \kappa_1, \kappa_2, a_2, b_2) > 0$ then $f_{umtvM}(\theta_1')$ presents a minimum critical point and the distribution presents two equal maxima; respectively,
if $T(\lambda, \mu_2, \kappa_1, \kappa_2, a_2, b_2) < 0$ then $f_{umtvM}(\theta_1')$ presents a maximum critical point and the distribution is unimodal. This result generalizes the one
obtained in Singh (2002) for the non-truncated case to symmetrical truncation parameters other than $a_2, b_2$ such that $b_2 - a_2 = 2\pi$. Notice also that
truncation parameters sufficiently close to each other could turn an otherwise bi-maximal distribution into a unimodal one.
(c) If $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$ but $-\int_{a_2}^{\mu_2} f_{0v_2'}(\theta_2; -\frac{\pi}{2})\,d\theta_2 \ge \int_{\mu_2}^{b_2} f_{0v_2'}(\theta_2; -\frac{\pi}{2})\,d\theta_2$, then $f_{umtvM}(\theta_1')$ can present zero, one or two critical
points according to the solutions of Equation (3.37). The case with zero
critical points corresponds to a unimodal distribution with maximum in
$[0, \frac{\pi}{2}]$, the case of one critical point corresponds to the “border” between
the unimodal and the bi-maximal case, and the case with two critical
points corresponds to the bi-maximal case.
(d) Thus, we sort the cases described above intuitively as “how the distribution behaves when varying one truncation parameter from being the
one (of the two truncation parameters) that presents the highest circular
distance w.r.t. $\mu_2$, which necessarily has the maximum on the interval,
to the one that presents the lowest distance in the most restrictive case,
which is shown to be necessarily strictly increasing” (see Figure 3.7). This
covers all possible shapes.
4. If $\lambda < 0$ the behavior of $f'_{umtvM}(\theta_1')$ corresponds to that of the interval $[0, \frac{\pi}{2}]$
w.r.t. $\lambda > 0$.
Lastly, the interval $\theta_1' \in [0, \frac{\pi}{2}]$ is described in an analogous way to the $[-\frac{\pi}{2}, 0]$
interval, and exactly the behaviors and truncation criteria accounted for above
hold, with only the trivial modifications needed to address this interval instead of $[-\frac{\pi}{2}, 0]$.
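The closed form in Equation (3.39) and the sign test based on $T$ in case (b) can be evaluated with a few lines of code. This is only an illustrative sketch: the function names and the Simpson quadrature are our own choices, and the parameter values in the usage note are those quoted in the caption of Figure 3.7.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def fprime_at_zero_closed(lam, mu2, kappa1, kappa2, a2, b2):
    """Right-hand side of Equation (3.39)."""
    return (lam * math.exp(kappa1) / kappa2) * (
        math.exp(kappa2 * math.cos(a2 - mu2)) - math.exp(kappa2 * math.cos(b2 - mu2))
    )

def fprime_at_zero_numeric(lam, mu2, kappa1, kappa2, a2, b2):
    """Integral form of Equation (3.39), evaluated by quadrature."""
    integral = simpson(
        lambda t: math.sin(t - mu2) * math.exp(kappa2 * math.cos(t - mu2)), a2, b2
    )
    return lam * math.exp(kappa1) * integral

def T(lam, mu2, kappa1, kappa2, a2, b2):
    """Sign statistic of case (b): T > 0 means a minimum at theta_1' = 0."""
    num = simpson(
        lambda t: math.sin(t - mu2) ** 2 * math.exp(kappa2 * math.cos(t - mu2)), a2, b2
    )
    den = simpson(lambda t: math.exp(kappa2 * math.cos(t - mu2)), a2, b2)
    return -kappa1 / lam ** 2 + num / den
```

With $\lambda = 5$, $\mu_2 = 4$, $\kappa_1 = 2$, $\kappa_2 = 4$ and the symmetric truncation $a_2 = 3$, $b_2 = 5$, `T(5.0, 4.0, 2.0, 4.0, 3.0, 5.0)` should be positive, which corresponds to the bi-maximal case with two equal maxima.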
Figure 3.7: Marginal truncated von Mises with parameters $\lambda = 5$, $\mu_1 = \pi$, $\mu_2 = 4$,
$\kappa_1 = 2$, $\kappa_2 = 4$ and $b_2 = 5$. The difference between the curves is given by the
variation of the $a_2$ truncation parameter. For $a_2 = 2$ (black), we have $\cos(b_2-\mu_2) > \cos(a_2-\mu_2)$ and therefore a maximum (the global maximum) is found in the interval
$[\frac{\pi}{2}, \pi]$. For $a_2 = 3$ (blue), $\cos(b_2-\mu_2) = \cos(a_2-\mu_2)$ and the distribution presents
two global maxima. For $a_2 = 3.2$, $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$ and $f_{umtvM}(\theta_1')$
presents two critical points in the interval $[\frac{\pi}{2}, \pi]$. For $a_2 = 3.3565$ (approximated
value), $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$ and $f_{umtvM}(\theta_1')$ presents exactly one critical
point in $[\frac{\pi}{2}, \pi]$. For $a_2 = 3.5$, $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$ and $f_{umtvM}(\theta_1')$ presents no
critical point in the interval $[\frac{\pi}{2}, \pi]$, and therefore the distribution is unimodal. Lastly,
for $a_2 = 4$ we fall into the most restrictive case of $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$, where
$-\int_{a_2}^{\mu_2} f_{0v_2'}(\theta_2; \frac{\pi}{2})\,d\theta_2 \le \int_{\mu_2}^{b_2} f_{0v_2'}(\theta_2; \frac{\pi}{2})\,d\theta_2$ (the previous $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$
cases fell under the complementary case, where the integral comparison did not
satisfy the inequality), and more specifically into the case where $a_2, b_2 \in [\mu_2, \mu_2+\pi]$,
which forces the distribution to be unimodal in the interval $[\frac{\pi}{2}, \pi]$ regardless of the other
parameter values. The progression followed by the distribution
as the $a_2$ parameter is modified can be seen, in appearance, as an “area
shifting” process: moving a truncation parameter towards $\mu_2$ carries
with it a displacement of the area of the distribution in that direction,
leaving the global maximum always in the interval of length $\frac{\pi}{2}$ that includes $\mu_1$ and is associated with the
truncation parameter whose circular distance to $\mu_2$ is higher. The “displacement” of
$a_2$ in this case seems to increase the value of the maximum in $[\pi, \frac{3}{2}\pi]$ and decrease the
value of the maximum in $[\frac{\pi}{2}, \pi]$ in the bi-maximal case until the distribution becomes
unimodal, and then to continue by decreasing the area under the monotonic curve.
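The qualitative behavior described in Figure 3.7 can be explored by evaluating the marginal numerically and counting its local maxima on a grid. The sketch below assumes the unnormalized marginal $f(\theta_1) \propto e^{\kappa_1\cos(\theta_1-\mu_1)} \int_{a_2}^{b_2} e^{\kappa_2\cos(\theta_2-\mu_2)+\lambda\sin(\theta_1-\mu_1)\sin(\theta_2-\mu_2)}\,d\theta_2$, which is the form consistent with Equations (3.38) and (3.39); the function names, grid size and quadrature are our own illustrative choices, and truncation in $\theta_1$ is ignored.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def log_marginal(theta1, lam, mu1, mu2, kappa1, kappa2, a2, b2):
    """Logarithm of the unnormalized marginal in theta1 (assumed form, see lead-in)."""
    s = lam * math.sin(theta1 - mu1)
    inner = simpson(
        lambda t: math.exp(kappa2 * math.cos(t - mu2) + s * math.sin(t - mu2)), a2, b2
    )
    return kappa1 * math.cos(theta1 - mu1) + math.log(inner)

def count_local_maxima(a2, lam=5.0, mu1=math.pi, mu2=4.0,
                       kappa1=2.0, kappa2=4.0, b2=5.0, n_grid=720):
    """Count local maxima of the marginal on a circular grid over [0, 2*pi)."""
    vals = [
        log_marginal(2 * math.pi * k / n_grid, lam, mu1, mu2, kappa1, kappa2, a2, b2)
        for k in range(n_grid)
    ]
    count = 0
    for k in range(n_grid):
        prev_val = vals[k - 1]              # index -1 wraps around the circle
        next_val = vals[(k + 1) % n_grid]
        if vals[k] > prev_val and vals[k] >= next_val:
            count += 1
    return count
```

Under these assumptions, `count_local_maxima(3.0)` corresponds to the symmetric (blue) curve and `count_local_maxima(4.0)` to the most restrictive case; a grid count of this kind is only indicative, since very shallow extrema can be missed at a coarse resolution.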
(2. (d)) Now we proceed to determine the cases of the lemma, according
to our analysis, as:
1. $f_{mtvM}(\theta_1')$ is unimodal with center (maximum) in $\mu_1$ only when
$T(\lambda, \mu_2, \kappa_1, \kappa_2, a_2, b_2) < 0$ and $\cos(b_2-\mu_2) = \cos(a_2-\mu_2)$.
2. $f_{mtvM}(\theta_1')$ is bi-maximal with equal maxima only when $T(\lambda, \mu_2, \kappa_1, \kappa_2, a_2, b_2) > 0$
and $\cos(b_2-\mu_2) = \cos(a_2-\mu_2)$; in this case, a minimum can also be found
at $\theta_1' = 0$.
3. $f_{mtvM}(\theta_1')$ presents two differentiated maxima only if one of the two following
cases applies:
(a) $\cos(b_2-\mu_2) < \cos(a_2-\mu_2)$, $T(\lambda, \mu_2, \kappa_1, \kappa_2, 2\mu_2-b_2, b_2) > 0$ and $a_2 \in (2\mu_2-b_2, a^*)$, where $a^*$ is such that $f'_{umtvM}(\theta_1'; \lambda, \mu_1, \mu_2, \kappa_1, \kappa_2, \mu_2, a^*, b_2)$ has
exactly one zero in $[-\frac{\pi}{2}, 0]$.
(b) $\cos(b_2-\mu_2) > \cos(a_2-\mu_2)$, $T(\lambda, \mu_2, \kappa_1, \kappa_2, a_2, 2\mu_2-a_2) > 0$ and
$b_2 \in (b^*, 2\mu_2-a_2)$, where $b^*$ is such that $f'_{umtvM}(\theta_1'; \lambda, \mu_1, \mu_2, \kappa_1, \kappa_2, \mu_2, a_2, b^*)$
has exactly one zero in $[0, \frac{\pi}{2}]$.
That is, in a “bi-maximal by parameters” distribution with the other truncation
parameter and $\mu_2$ fixed, the truncation parameter must be neither so distant
from $\mu_2$ as to reach or surpass the distance where the symmetry is
attained, nor so close as to cause $f_{umtvM}(\theta_1')$ to present one or zero
critical points in the interval $[-\frac{\pi}{2}, 0]$ if speaking about $a_2$, or $[0, \frac{\pi}{2}]$ if speaking
about $b_2$. If we look at Figure 3.7, we can think of the distribution satisfying
case (a) for the $a_2$ values in the interval $(3, 3.3565)$.
4. $f_{mtvM}(\theta_1')$ is unimodal with mode not at $\mu_1$ if the parameters do not fall in any
of the previous cases.
The truncation parameters $a_1, b_1$ behave similarly to those of the univariate truncated von Mises
case, and it is possible to select them so as to exclude one of the maxima
of the distribution, thus obtaining a unimodal distribution when the parameters
would otherwise produce a bi-maximal one, or to show any of the other behaviors that can
be created by manipulating the truncation parameters.
Chapter 4
Application in Neuroscience
In this chapter we apply all the previously developed analysis tools to the study
of dendritic angles in cerebral cortex layer III mice pyramidal neurons. We use
an angular data set obtained from the study conducted in Ballesteros-Yáñez et al.
(2010), where neurons' dendritic trees were traced and their angles obtained in order
to observe changes in neuronal density and arborization after the deletion of the
β2-subunit in nAChR proteins present in the mice's brains. We will subdivide and
summarize this data into different categories and then conduct separate and joint
distribution studies in our attempt to learn more about the relationship between
the data and the postulated underlying truncated von Mises distributions, as well
as the intrinsic behavior of the dendritic tree structures.
4.1 Data organization
The data set can be subdivided into several categories corresponding to angular
measurements on neurons in different parts of the brain, so several different studies,
taking into account some differences or not, can be conducted.
It is organized as follows:
Cortex region          M1, M2, PrL, S1, S2, V1, V2
Neuron                 Neuron identifier
Tree index             Tree identifier
Maximum tree order     Maximum level of bifurcation of the selected tree
Tree number of nodes   Number of nodes of the selected tree
Bifurcation order      The level of the measured angle of the selected tree
Angle                  The measured angle
The neurons correspond to different cortex regions or areas of the brain (M1 =
primary motor cortex, M2 = secondary motor cortex, PrL/Il = prelimbic/infralimbic
cortex, S1 = primary somatosensory cortex, S2 = secondary somatosensory cortex, V1
= primary visual cortex and V2 = secondary visual cortex) where the measurements
were taken. It should be interesting to study if significant changes in the distributions occur if we filter the data by their different cortical regions. Each neuron has
several dendritic trees that connect it with its surrounding neurons. Therefore, for
each neuron, measurements on the angles are referred to the tree they come from
by means of the tree index. The dataset further describes the tree of the angular
observation with the maximum order field, where the total amount of levels of the
tree are counted, and with the Tree number of nodes field, where the total amount
of angular observations is recorded. The level (bifurcation order) of a tree is similar
to the standard notion in the data structure in computer science: a node (or an
angular observation of a node) is at level one if it is the root, and at level two if
its parent angle is that of the root and so on. A consequence in the organization of
this data set is that per tree more than one observation per level can occur, in every
level but the root (see Figure 4.1). In total, the dataset is composed of angular
measurements on 650 neurons, making a total of 13432 angular measurements. They
can be subdivided into: 2370 measurements for region M1, 2597 measurements for
M2, 1228 measurements for PrL, 2460 measurements for S1, 1348 measurements for
S2, 1406 measurements for V1 and 2023 measurements for V2.
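As an illustration of this organization, the fields listed above can be represented as one record per angular observation. The following is only a sketch: the class name, field names and field types are our own assumptions (the original data format is not specified here), and only the per-region counts are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class AngleObservation:
    """One angular measurement; names and types are illustrative assumptions."""
    cortex_region: str       # one of M1, M2, PrL, S1, S2, V1, V2
    neuron: int              # neuron identifier
    tree_index: int          # tree identifier within the neuron
    max_tree_order: int      # maximum level of bifurcation of the tree
    tree_num_nodes: int      # number of nodes of the tree
    bifurcation_order: int   # level of the measured angle in the tree
    angle: float             # the measured angle

# Number of measurements per region, as reported in the text.
REGION_COUNTS = {
    "M1": 2370, "M2": 2597, "PrL": 1228, "S1": 2460,
    "S2": 1348, "V1": 1406, "V2": 2023,
}

def angles_by_level(observations):
    """Group measured angles by bifurcation order, ignoring neuron and tree."""
    groups = defaultdict(list)
    for obs in observations:
        groups[obs.bifurcation_order].append(obs.angle)
    return dict(groups)

total_measurements = sum(REGION_COUNTS.values())
```

The per-region counts add up to the reported total of 13432 measurements, and a grouping such as `angles_by_level` reflects the fact that a tree can contribute several observations to every level except the root.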
The studies that are going to be conducted are:
1. The data is separated into groups according to their brain area, and we fit
separately distributions for the angles in each of the bifurcation levels, not
taking into account if they come or not from the same neuron or how many
of the angles are in each of the levels of each tree (Section 4.2). Also we fit
bi-dimensional distributions that take into account two adjoining bifurcation
levels (Section 4.3).
2. The whole data is used to fit distributions for the angles in each of the bifurcation levels (Section 4.2) and for adjoining pairs of levels (Section 4.3), regardless
of the brain area.
3. Using the previous groups for brain area, fit a distribution with no further
discrimination (Section 4.2).
4. The whole data is used to fit a distribution, without further considerations
(Section 4.2).
The next two sections reflect the efforts in gathering the data and fitting the
distributions, followed by a final section (Section 4.4) that contains conclusions and
insights about the data and the estimated distributions.
Figure 4.1: Graphical visualization of the organization of the dataset.
4.2 Unidimensional von Mises distribution fitting
We will start by fitting distributions separated by brain areas with no further discrimination. We obtain the parameter values in Table 4.1 and a visualization of the
global distribution in Figure 4.2.
       µ        κ        a        b        NumSamples
M1     0.9586   6.7874   0.0435   2.7829   2370
M2     0.8716   6.7658   0.0241   2.7387   2597
PrL    1.0397   5.0904   0.0532   2.7622   1228
S1     0.8977   5.8081   0.0372   2.7947   2460
S2     1.0405   6.3850   0.0682   2.5094   1348
V1     1.0630   5.7608   0.0413   2.7281   1406
V2     0.9402   5.4111   0.0413   2.4604   2023
All    0.9553   5.9437   0.0241   2.7947   13432
Table 4.1: Parameter values of truncated von Mises distributions of each group
according to the brain area, and the whole dataset.
Figure 4.2: Estimated truncated von Mises distribution for the entire dataset. This
distribution corresponds to the parameter values of the 9th row (named “All”) in
Table 4.1.
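To make the fitted parameters concrete, the density for the “All” row can be evaluated numerically. The sketch below assumes the usual form of a truncated density, namely the von Mises kernel $e^{\kappa\cos(\theta-\mu)}$ renormalized on $[a, b]$; the function names and the Simpson quadrature are our own illustrative choices.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def make_truncated_von_mises_pdf(mu, kappa, a, b):
    """Return a density proportional to exp(kappa*cos(theta-mu)) on [a, b], zero elsewhere."""
    norm = simpson(lambda t: math.exp(kappa * math.cos(t - mu)), a, b)
    def pdf(theta):
        if not a <= theta <= b:
            return 0.0
        return math.exp(kappa * math.cos(theta - mu)) / norm
    return pdf

# Parameter values of the "All" row of Table 4.1.
ALL_PARAMS = {"mu": 0.9553, "kappa": 5.9437, "a": 0.0241, "b": 2.7947}
pdf_all = make_truncated_von_mises_pdf(**ALL_PARAMS)
```

Since $\mu$ lies inside $[a, b]$ for these values, the resulting density peaks at $\mu$ and vanishes outside the truncation interval.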
Subsequently, we consider the bifurcation levels independently of the brain area.
The estimated distribution parameters are shown in Table 4.2.
                      µ        κ        a        b        NumSamples
Bifurcation level 1   1.1152   6.1557   0.0849   2.7947   3160
Bifurcation level 2   0.9881   6.1180   0.0214   2.7387   4382
Bifurcation level 3   0.8722   6.4098   0.0331   2.6796   3656
Bifurcation level 4   0.8176   6.7795   0.0284   2.7829   1704
Bifurcation level 5   0.7749   6.7829   0.0968   1.8229   439
Bifurcation level 6   0.7494   8.3659   0.1297   1.8358   78
Table 4.2: Estimated truncated von Mises distributions for the entire dataset separated in 6 bifurcation levels. We can notice the emergence of a pattern when
examining the values of the µ parameter, that seem to decrease when increasing the
level we look at.
In the data set, bifurcations of level 7 and 8 were also present but their number
is too low (12 and 1 respectively) to obtain valuable information of the underlying
distribution.
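The sample counts per bifurcation level and the decreasing pattern of the $\mu$ estimates in Table 4.2 can be verified with a short script. This is only a bookkeeping sketch; the variable names are our own, and the values are copied from Table 4.2 and from the counts for levels 7 and 8 given above.

```python
# Number of angular measurements per bifurcation level (Table 4.2, plus levels 7 and 8).
LEVEL_COUNTS = {1: 3160, 2: 4382, 3: 3656, 4: 1704, 5: 439, 6: 78, 7: 12, 8: 1}

# Estimated mu for levels 1 to 6 (Table 4.2).
MU_BY_LEVEL = [1.1152, 0.9881, 0.8722, 0.8176, 0.7749, 0.7494]

total_by_level = sum(LEVEL_COUNTS.values())

# True when every estimate is strictly smaller than the one of the previous level.
mu_strictly_decreasing = all(
    later < earlier for earlier, later in zip(MU_BY_LEVEL, MU_BY_LEVEL[1:])
)
```

The level counts add up to the 13432 measurements of the whole dataset, and the $\mu$ estimates decrease strictly from level 1 to level 6.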
Now we proceed with the most restrictive univariate case, where the bifurcation levels are
fitted separately within each brain area group. The resulting parameters are shown in
Table 4.3.
M1                    µ        κ        a         b        NumSamples
Bifurcation level 1   1.0950   6.4282   0.0849    2.3636   503
Bifurcation level 2   1.0025   6.7029   0.08460   2.7302   766
Bifurcation level 3   0.8962   7.8967   0.0435    2.1037   696
Bifurcation level 4   0.8274   7.3457   0.0607    2.7829   306
Bifurcation level 5   0.832    8.8018   0.0968    1.8229   85

M2                    µ        κ        a         b        NumSamples
Bifurcation level 1   1.0591   6.4921   0.01245   2.4070   539
Bifurcation level 2   0.9120   7.2924   0.0214    2.7387   810
Bifurcation level 3   0.8      7.32     0.0331    1.9482   749
Bifurcation level 4   0.73     8.4814   0.0284    1.9996   385
Bifurcation level 5   0.6833   7.0187   0.1276    1.7037   92

PrL                   µ        κ        a         b        NumSamples
Bifurcation level 1   1.1205   5.1347   0.1015    2.7622   434
Bifurcation level 2   1.03     5.7942   0.0532    2.3905   424
Bifurcation level 3   0.9379   4.0840   0.0901    2.4392   243
Bifurcation level 4   0.9677   3.5071   0.0751    2.3062   95

S1                    µ        κ        a         b        NumSamples
Bifurcation level 1   1.0814   6.6099   0.972     2.7947   540
Bifurcation level 2   0.9401   6.0594   0.0602    2.5562   772
Bifurcation level 3   0.8109   5.7411   0.0372    2.6796   683
Bifurcation level 4   0.7425   6.6456   0.0665    2.1260   340
Bifurcation level 5   0.7018   4.6349   0.01402   1.6897   102

S2                    µ        κ        a         b        NumSamples
Bifurcation level 1   1.1944   7.3074   0.2317    2.5094   304
Bifurcation level 2   1.0689   5.7819   0.18      2.4550   437
Bifurcation level 3   0.9305   6.5320   0.0682    2.0386   379
Bifurcation level 4   0.9219   6.7647   0.01734   2.2480   173

V1                    µ        κ        a         b        NumSamples
Bifurcation level 1   1.1579   5.423    0.1008    2.7281   379
Bifurcation level 2   1.0764   5.6469   0.0594    2.7017   506
Bifurcation level 3   0.9896   6.1741   0.0413    2.0317   350
Bifurcation level 4   0.9564   5.4722   0.02502   2.0169   146

V2                    µ        κ        a         b        NumSamples
Bifurcation level 1   1.1469   5.6495   0.01145   2.2754   461
Bifurcation level 2   0.9438   5.0379   0.0857    2.4604   667
Bifurcation level 3   0.8681   5.9403   0.0413    2.2450   556
Bifurcation level 4   0.8036   6.6848   0.0658    1.7391   259
Table 4.3: Estimated truncated von Mises distributions for the different brain areas
and for the different bifurcation levels. We can notice how the decreasing µ pattern
is highly consistent appearing in every subgroup except for PrL and M1 in the fewer
samples estimator (levels 4 and 5, respectively).
In all cases only the levels with enough information to conduct studies under reasonable reliability are shown. Not enough information was found about the remaining
unanalyzed levels to create a descriptive distribution with meaningful parameters.
Similarly, some of the bifurcation levels 4 and 5 contain few observations, and therefore the estimated distributions should be used with care. All these fitted distributions
share the apparent property that the mean lies between the
truncation parameters a, b.
4.3 Bidimensional von Mises distribution fitting
We proceed now with the study of the data separated by bifurcation levels, where
two adjoining levels are used to fit a bivariate von Mises distribution, and marginal
distributions are subsequently obtained. See Table 4.4 for the parameter values
and Figure 4.3 for the visualization of the bivariate distribution estimated from
bifurcations 1 and 2.
             Bif1-2   Bif2-3   Bif3-4   Bif4-5
λ            0.1321   0.0069   0.0016   0
µ1           1.1150   0.9924   0.8579   0.8192
µ2           0.9793   0.8722   0.8175   0.7749
κ1           6.1575   6.3501   6.8053   7.4951
κ2           6.4512   6.4098   6.7795   6.7828
a1           0.0849   0.0214   0.0331   0.0607
b1           2.7947   2.7387   2.4392   2.7829
a2           0.0214   0.0331   0.0284   0.0968
b2           2.7387   2.6796   2.7829   1.8229
NumSamples   3160     3656     1704     439
Table 4.4: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to five in the whole dataset. We can notice that the estimation
seems to show tendency to independence by a decreasing tendency in the λ parameter. Also, there exists a decreasing tendency shown by both means µ1 , µ2
Figure 4.3: Estimated bivariate truncated von Mises distribution for the joint data
of the bifurcation levels 1 and 2. The parameter values of this distribution are those
in the second column of Table 4.4 (named “Bif1-2”).
The marginal distributions of the variables are shown in Figure 4.4 and Figure 4.5.
Figure 4.4: Marginal distribution of the first component (Bifurcation 1) in the bivariate case for Bifurcation levels 1 and 2 shown in Figure 4.3.
Figure 4.5: Marginal distribution of the second component (Bifurcation 2) in the
bivariate case for Bifurcation levels 1 and 2 shown in Figure 4.3.
Now we begin with the study for the angles grouped by brain areas; see Tables
4.5 to 4.11.
M1           Bif1-2        Bif2-3   Bif3-4        Bif4-5
λ            1.182×10^-4   0.2258   4.257×10^-6   1.639×10^-6
µ1           1.0950        1.0044   0.8753        0.7845
µ2           0.9851        0.8959   0.8274        0.8032
κ1           6.4282        6.6223   7.4440        8.9957
κ2           6.6112        7.9025   7.3457        8.8018
a1           0.0849        0.0846   0.1366        0.1037
b1           2.3636        2.7302   1.9245        1.5883
a2           0.0846        0.0435   0.0607        0.0968
b2           2.7302        2.1037   2.7829        1.8229
NumSamples   503           696      306           85
Table 4.5: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to five in the M1 region. Here the decreasing tendency in the
λ parameter is not followed by either Bif1-2 or by Bif2-3.
M2           Bif1-2        Bif2-3   Bif3-4
λ            5.626×10^-5   0.0332   0.1304
µ1           1.0591        0.9050   0.7810
µ2           0.8990        0.7999   0.7297
κ1           6.4921        7.3429   7.2363
κ2           7.6665        7.3200   8.4832
a1           0.1245        0.0214   0.0484
b1           2.4070        2.1460   1.9482
a2           0.0494        0.0331   0.0284
b2           2.1460        1.9482   1.9996
NumSamples   539           749      385
Table 4.6: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the M2 region.
PrL          Bif1-2        Bif2-3   Bif3-4
λ            3.031×10^-5   0.4010   1.5938×10^-4
µ1           1.1181        1.0418   0.9795
µ2           1.0300        0.9369   0.9676
κ1           5.0894        5.0658   3.2597
κ2           5.7942        4.1053   3.5052
a1           0.1015        0.0532   0.0901
b1           2.7622        2.3905   2.1333
a2           0.0532        0.0901   0.0751
b2           2.3905        2.4392   2.3067
NumSamples   424           243      95
Table 4.7: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the PrL region.
S1           Bif1-2   Bif2-3         Bif3-4         Bif4-5
λ            0.0388   2.6816×10^-6   1.4244×10^-5   0.7758
µ1           1.0813   0.9418         0.7927         0.7224
µ2           0.9462   0.8109         0.7425         0.6954
κ1           6.6073   6.0741         5.6382         6.4157
κ2           6.2157   5.7432         6.6491         4.7152
a1           0.0972   0.0602         0.0414         0.1093
b1           2.7947   2.5562         2.6796         2.1260
a2           0.0602   0.0372         0.0665         0.1402
b2           2.5562   2.6796         2.1260         1.6897
NumSamples   540      683            340            102
Table 4.8: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to five in the S1 region.
S2           Bif1-2         Bif2-3   Bif3-4
λ            3.3963×10^-4   0.3654   0.8097
µ1           1.1944         1.0740   0.9155
µ2           1.0492         0.9294   0.9210
κ1           7.3069         5.9562   7.5549
κ2           5.9968         6.5507   6.8241
a1           0.2317         0.1800   0.0682
b1           2.5094         2.4550   1.9599
a2           0.2185         0.0682   0.1734
b2           2.4450         2.0386   2.2480
NumSamples   304            379      173
Table 4.9: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the S2 region.
V1           Bif1-2         Bif2-3   Bif3-4
λ            2.2086×10^-5   0.5036   3.6420×10^-5
µ1           1.1579         1.0779   0.9821
µ2           1.0840         0.9885   0.9565
κ1           5.4260         6.0449   5.6997
κ2           5.8202         6.2072   5.4751
a1           0.1008         0.1283   0.0428
b1           2.7281         2.5066   1.9700
a2           0.1283         0.0413   0.2502
b2           2.5066         2.0317   2.0169
NumSamples   379            350      146
Table 4.10: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the V1 region.
V2           Bif1-2   Bif2-3         Bif3-4
λ            0.5981   1.5503×10^-6   3.5550×10^-5
µ1           1.1430   0.9412         0.8655
µ2           0.9530   0.8681         0.8036
κ1           5.6930   4.9758         5.5421
κ2           4.8082   5.9418         6.6849
a1           0.1145   0.0881         0.0413
b1           2.2754   2.4604         2.2450
a2           0.1262   0.0413         0.0658
b2           2.3372   2.2450         1.7391
NumSamples   461      556            259
Table 4.11: Estimated truncated bivariate von Mises distributions for pairs of bifurcation levels from one to four in the V2 region.
As before, distributions obtained with a sample size < 200 should be interpreted with care.
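The $\lambda$ estimates of Tables 4.4 to 4.11 can be compared across bifurcation pairs with a small script. This is only a bookkeeping sketch: the dictionary layout and function names are our own, and the values are copied from the tables.

```python
# Estimated lambda per table and bifurcation pair (Tables 4.4 to 4.11).
LAMBDA = {
    "All": {"Bif1-2": 0.1321, "Bif2-3": 0.0069, "Bif3-4": 0.0016, "Bif4-5": 0.0},
    "M1": {"Bif1-2": 1.182e-4, "Bif2-3": 0.2258, "Bif3-4": 4.257e-6, "Bif4-5": 1.639e-6},
    "M2": {"Bif1-2": 5.626e-5, "Bif2-3": 0.0332, "Bif3-4": 0.1304},
    "PrL": {"Bif1-2": 3.031e-5, "Bif2-3": 0.4010, "Bif3-4": 1.5938e-4},
    "S1": {"Bif1-2": 0.0388, "Bif2-3": 2.6816e-6, "Bif3-4": 1.4244e-5, "Bif4-5": 0.7758},
    "S2": {"Bif1-2": 3.3963e-4, "Bif2-3": 0.3654, "Bif3-4": 0.8097},
    "V1": {"Bif1-2": 2.2086e-5, "Bif2-3": 0.5036, "Bif3-4": 3.6420e-5},
    "V2": {"Bif1-2": 0.5981, "Bif2-3": 1.5503e-6, "Bif3-4": 3.5550e-5},
}

def pair_with_largest_lambda(estimates):
    """Return the bifurcation pair whose lambda estimate is largest."""
    return max(estimates, key=estimates.get)

def tables_where_pair_dominates(pair):
    """List the tables in which the given pair has the largest lambda."""
    return [name for name, est in LAMBDA.items() if pair_with_largest_lambda(est) == pair]

bif23_tables = tables_where_pair_dominates("Bif2-3")
global_lambdas = list(LAMBDA["All"].values())
global_decreasing = all(y < x for x, y in zip(global_lambdas, global_lambdas[1:]))
```

With these values, Bif2-3 holds the largest $\lambda$ in the M1, PrL and V1 tables, and the global estimates of Table 4.4 decrease with the bifurcation level.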
4.4 Conclusions and further studies
1. All considered univariate distributions satisfy $\mu \in [a, b]$ and all bivariate
distributions satisfy $(\mu_1, \mu_2) \in [a_1, b_1] \times [a_2, b_2]$ (or at least, $\mu_1 \in [a_1, b_1]$
and $\mu_2 \in [a_2, b_2]$), which together with the particular values of the truncation
parameters can yield information about the characteristic behavior of dendritic
arborizations or dendritic trees. Also, as briefly introduced before, there is a
remarkable tendency in the mean parameter, which seems to decrease as the bifurcation level increases. This could initially be considered
an indicator of an overall decrease in the angles as the bifurcation level increases; however, the truncation and concentration parameters do not directly
support this hypothesis, as concentration does not seem to increase or decrease steadily
and the truncation parameters do not appear to be significantly different
in any of the levels for which we have enough data.
2. Truncation parameters varied from the minimum a parameter, a = 0.0241, to
the maximum b, b = 2.7947, which correspond to approximately 1.38° and 160.12° respectively.
It can be noted that dendritic angles do not surpass 180°, which could be considered an angle where the new level “does not gain distance from the origin”.
This functional observation could suggest that in the design and development
of dendritic trees there is an interest in growing away from the neuron's soma
with new bifurcations, which could be directly related to the need to establish new connections with other neurons. Angles above that amount can
be considered “to go backwards” with respect to the origin point and do not seem very
beneficial when trying to establish new connections with distant neurons. The
minimal angle could also be viewed as the angle that carries subsequent
bifurcations furthest in space. It can be hypothesized solely from these observations that a tendency towards higher angles in the first bifurcation levels,
reflected in the mean and concentration of the data, would be present,
and that the angle separation would decrease as the bifurcation level grows.
However, this is not confirmed at an experimental level, and it could also be due to an additional interest in growing in length coupled with the hypothesized
interest in growing in width. These notions further suggest that,
in a 3-dimensional space, the construction of dendritic trees shows interest in width,
length and depth expansion, as it tries to maximize their communication capabilities with other neurons, with no preference for any of them identified in
the data.
3. The comparisons between the univariate distributions obtained when separating by
brain area or by bifurcation level revealed few overall differences;
all of them behave remarkably similarly despite the criterion applied to the dataset. However, when dividing the data by bifurcation level
together with brain area, the resulting sub-datasets may not contain enough samples to conduct a reliable analysis, especially when dealing
with bifurcation levels beyond three. All distributions showed proximity to
symmetry and relatively high concentration around their mean.
4. Bivariate distributions of bifurcation levels showed an overall remarkable proximity to independence, with many of them showing values below $\lambda = 10^{-4}$ with
the optimization techniques that were applied for parameter estimation. In the
global bifurcations study, a consistent decreasing tendency in $\lambda$ was observed,
which could suggest that the influence of the bifurcations at one level on the
immediately higher level may decrease as the level increases. However, the results also
suggest that in any of the cases such influence would be only a small contributing factor, as the highest $\lambda$ value in this study was found
to be $\lambda = 0.1321$. When the study was conducted region-wise, higher values
of the $\lambda$ parameter were found, but they were still considered small and insufficient
for explanations with clearly identified elements. The higher value
of the $\lambda$ parameter in Bif2-3 w.r.t. the other bifurcation pairs in 3 of the 8
tables may suggest a localized phenomenon worth studying.
It is still not clear that dendritic ramifications do not respond at a biological
level to the immediate connection needs of their context, or that the bifurcation
levels are not locally related to the angles of their parent bifurcations.
5. Some estimations that produced more distinct results are considered and concluded to be highly contaminated by the lack of a proper number of samples.
Distributions with configurations similar to those obtained in this study may suffer
from poor-quality parameter estimates, especially regarding the truncation parameters, where the least likely individuals (in this case, by the expected shape
of the distribution) are the truncation limits of the distribution.
Further studies could include observing the behavior of n-dimensional truncated
von Mises distributions when analyzing the dendritic trees and all their possible subdivisions, and obtaining new discrimination criteria that allow us to select appropriate
subgroups of the dataset showing relationships not observed with
the subgroup selection criteria used here. For example, it could be useful to examine the bifurcation levels locally w.r.t. their parent bifurcations to see if further
patterns arise, or if the obtained decreasing mean pattern is additionally supported
by findings that allow for better explanations of the observed angular behaviors.
Chapter 5
Conclusions and future work
In this work we have developed the theoretical framework of the truncated von Mises
distribution. This objective was achieved by:
1. The successful determination of the expressions of maximum likelihood estimators. For both univariate and bivariate cases, maximum likelihood estimators
of the truncation parameters were found in isolated and solely sample dependent form, while the other parameters showed interdependency also in both
cases. A system of Karush-Kuhn-Tucker equations was used to model and
solve the remaining parameters in a numerical estimation approach.
2. Obtaining the moments of the univariate case and existing relationships between them.
3. The properties of both the bivariate and univariate cases, especially the results concerning the additional manipulability and shapes that the distribution can
present when the truncation parameters are modified. They allow us to see the
most characteristic particularities of the truncated case, where distributions that are similar in the remaining parameters can show remarkably distinct behaviors,
such as being a strictly increasing or strictly decreasing function, presenting
symmetry or not, or concentrating its positive support in a sub-interval as
short as we let it be. In this work, parameter conditions for those cases have
been gathered for both the univariate and bivariate cases.
4. The bivariate case and studies of the shape and behavior of the resulting marginal and
conditional probability distributions. We determined that every conditional distribution of a truncated bivariate von Mises distribution is a truncated univariate von Mises distribution, independently of the value of the λ
parameter. For the marginal distribution, we concluded that only
for λ = 0, that is, under independence between the variables of the bivariate distribution, does the marginal behave like a truncated univariate von Mises distribution.
When the variables show some degree of dependency, the resulting marginal distribution is not von Mises, but a potentially bimaximal distribution (otherwise
unimodal). We identified 4 different cases based on the configuration of the
truncation parameters that allowed us to isolate the parameter ranges and
configurations where the truncated marginal von Mises shows all its different
behaviors. More concretely, we were able to identify the parametric circumstances for the marginal distribution to be bimaximal, to present either a single global maximum (one of
the two) or two global maxima, and how the minimum value is not at the mean value
µ1 in the first case but is necessarily there in the second. Also, the parametric circumstances for the distribution to be unimodal and for it to present its
maximum at µ1 (or not, and how) were identified, leaving no behavior or shape
of the marginal distribution unclassified and undocumented by our analysis.
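The estimation scheme summarized in item 1 can be sketched for the univariate case as follows. This is an illustration under stated assumptions, not the procedure developed in this work: the truncation limits are taken to be the sample extremes on a non-wrapping interval inside [0, 2π), and the Karush-Kuhn-Tucker system is replaced by a plain coarse-to-fine grid search over (µ, κ). The names norm_const, log_likelihood, fit_tvm and sample_tvm, the bound kappa_max and the grid sizes are hypothetical.

```python
import math
import random


def norm_const(mu, kappa, a, b, n=200):
    """Composite Simpson approximation of the integral of
    exp(kappa * cos(t - mu)) over [a, b]; n must be even."""
    h = (b - a) / n
    total = math.exp(kappa * math.cos(a - mu)) + math.exp(kappa * math.cos(b - mu))
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * math.exp(kappa * math.cos(a + i * h - mu))
    return total * h / 3.0


def log_likelihood(c, s, n_obs, mu, kappa, a, b):
    """Truncated von Mises log-likelihood, using
    sum_i cos(x_i - mu) = c * cos(mu) + s * sin(mu),
    where c and s are the sums of cos(x_i) and sin(x_i)."""
    fit_term = kappa * (c * math.cos(mu) + s * math.sin(mu))
    return fit_term - n_obs * math.log(norm_const(mu, kappa, a, b))


def fit_tvm(xs, kappa_max=20.0, grid=40, rounds=3):
    """Return (a_hat, b_hat, mu_hat, kappa_hat) for angles xs in [0, 2*pi)."""
    # Assumption: the truncation limits are estimated by the sample extremes.
    a_hat, b_hat = min(xs), max(xs)
    c = sum(math.cos(x) for x in xs)
    s = sum(math.sin(x) for x in xs)
    mu_lo, mu_hi = 0.0, 2.0 * math.pi
    k_lo, k_hi = 0.0, kappa_max
    best = None
    for _ in range(rounds):
        best = None
        for i in range(grid + 1):
            mu = mu_lo + (mu_hi - mu_lo) * i / grid
            for j in range(grid + 1):
                kappa = k_lo + (k_hi - k_lo) * j / grid
                ll = log_likelihood(c, s, len(xs), mu, kappa, a_hat, b_hat)
                if best is None or ll > best[0]:
                    best = (ll, mu, kappa)
        # Shrink the search box to one grid cell around the current best point.
        d_mu = (mu_hi - mu_lo) / grid
        d_k = (k_hi - k_lo) / grid
        mu_lo, mu_hi = best[1] - d_mu, best[1] + d_mu
        k_lo, k_hi = max(0.0, best[2] - d_k), best[2] + d_k
    return a_hat, b_hat, best[1], best[2]


def sample_tvm(rng, mu, kappa, a, b, n):
    """Rejection sampler used only to build a synthetic example."""
    draws = []
    while len(draws) < n:
        x = rng.vonmisesvariate(mu, kappa)
        if a <= x <= b:
            draws.append(x)
    return draws


if __name__ == "__main__":
    data = sample_tvm(random.Random(1), math.pi, 2.0, 1.0, 5.5, 2000)
    print(fit_tvm(data))
```

The grid search is chosen here only because it needs nothing beyond the standard library. It makes the split visible: the truncation estimates are read directly from the sample, while µ and κ have to be searched jointly because they interact through the normalizing integral.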
The theoretical extent of this work also covers appropriate introductions and simplifications regarding the use of Bessel functions and the indefinite integral of (2.3), which
must be considered if further work is to be conducted on the subject
matter.
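As a reminder of why Bessel functions enter the picture, the normalizing integral of the truncated density can be written out explicitly. The block below is a sketch under the assumption that (2.3) refers to the kernel exp(κ cos(θ − µ)) restricted to a non-wrapping interval [a, b]. The symbol C(µ, κ, a, b) and the summation index j are introduced here only for illustration.

```latex
% Truncated density and its normalizing integral (notation assumed for this sketch)
f(\theta;\mu,\kappa,a,b) = \frac{e^{\kappa\cos(\theta-\mu)}}{C(\mu,\kappa,a,b)},
\quad a \le \theta \le b,
\qquad
C(\mu,\kappa,a,b) = \int_a^b e^{\kappa\cos(t-\mu)}\,dt .

% Standard expansion in modified Bessel functions of the first kind
e^{\kappa\cos t} = I_0(\kappa) + 2\sum_{j=1}^{\infty} I_j(\kappa)\cos(jt) .

% Term-by-term integration over [a, b]
C(\mu,\kappa,a,b) = (b-a)\,I_0(\kappa)
  + 2\sum_{j=1}^{\infty} \frac{I_j(\kappa)}{j}
    \bigl[\sin\bigl(j(b-\mu)\bigr) - \sin\bigl(j(a-\mu)\bigr)\bigr] .

% Full-circle case: b - a = 2\pi makes every sine difference vanish
C(\mu,\kappa,a,a+2\pi) = 2\pi\,I_0(\kappa) .
```

When the interval covers the whole circle, the series collapses to the usual von Mises constant 2π I_0(κ). For any shorter interval the infinite series remains, which is why results on integrals of Bessel functions are needed.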
Future lines of research can be oriented toward further simplifying the expressions that
describe the different calculations conducted in this work, finding equivalent expressions that clearly expose properties that are present but currently unknown, or that are easier to
analyze and allow the existing results to be derived more accurately. This could benefit the
expressions involving the indefinite integrals of the moments, the maximum likelihood estimators and the expectations shown in this master's thesis.
Relatedly, research on integral calculus that pushes further the limits of mathematical tractability and works with infinite series may have a direct effect on the final
expressions reported here. Works and results regarding Bessel functions,
especially generalizations that cover the case where the integration limits are not restricted to an interval of length 2π (or any multiple of π), can be of immediate
application.
Another line of research regarding estimators could be developments that take the
truncation limits of the data into account more efficiently and that could, through
the mathematical properties of the distribution, create better approximations, thus
addressing in a more elaborate way the interdependence between the parameters that the currently applied estimation technique (maximum likelihood estimation) shows.
This work can also be continued with calculations that further describe
the truncated von Mises distribution. Some examples include the distributions of the sample
mean and of the sample mean resultant length, among others.
Bibliography
Abramowitz, M. and Stegun, I. (1964). Handbook of Mathematical Functions: With
Formulas, Graphs, and Mathematical Tables. Applied Mathematics Series. Dover
Publications.
Ballesteros-Yáñez, I., Benavides-Piccione, R., Bourgeois, J.-P., Changeux, J.-P.,
and DeFelipe, J. (2010). Alterations of cortical pyramidal neurons in mice lacking
high-affinity nicotinic receptors. Proceedings of the National Academy of Sciences,
107(25):11567–11572.
Bistrian, D. A. and Iakob, M. (2008). One-dimensional truncated von Mises distribution in data modeling. Annals of Faculty of Engineering Hunedoara.
Gradshteyn, I. S. and Ryzhik, I. M. (2007). Table of Integrals, Series, and Products.
Jupp, P. E. and Mardia, K. V. (1989). A unified view of the theory of directional
statistics, 1975-1988. International Statistical Review, 57(3):261–294.
Mardia, K. and Jupp, P. (2000). Directional Statistics. Wiley Series in Probability
and Statistics.
Mardia, K. V., Hughes, G., Taylor, C. C., and Singh, H. (2008). A multivariate
von Mises distribution with applications to bioinformatics. Canadian Journal of
Statistics, 36(1):99–109.
Mardia, K. V. and Voss, J. (2011). Some fundamental properties of a multivariate
von Mises distribution. ArXiv e-prints.
Rosenheinrich, W. (2013). Tables of some indefinite integrals of Bessel functions.
University of Applied Sciences Jena.
Singh, H. (2002). Probabilistic model for two dependent circular variables.
Biometrika, 89(3):719–723.