The total probability theorem for belief functions:
reasoning with time series under partial data
Dr Fabio Cuzzolin
Department of Computing and Communication Technologies, Oxford Brookes University
http://cms.brookes.ac.uk/staff/FabioCuzzolin/
Abstract
Providing sensible predictions in many scenarios such as climate change, rare events contingency planning, or disaster
risk analysis is difficult, since the available data, normally in
the form of a time series, is either scarce, incomplete, or even
missing. In such cases, “cautious” approaches to uncertainty modeling have an edge over classical methods, as they
are designed to come up with robust (albeit imprecise) predictions when data are lacking or partial. However, “imprecise” estimation and decision making require mathematical tools that are as yet only partially developed, owing to the more complex mathematical behavior of the objects involved.
In particular, the generalization of the classical total probability theorem plays a crucial role in the formulation of such a complete framework. We propose here to bring to full development the theory of total probability for random sets or “belief functions”,
as arguably one of the most powerful imprecise-probabilistic
theories, in order to make such methodologies viable tools
for practitioners of all fields of science, with potentially vast
repercussions in all the outlined partial-data scenarios.
In machine learning and computer vision, more specifically,
learning and estimation are normally based on manually collected and labelled training sets that are typically very small
compared to the true extent of the problem. We propose to apply the total belief framework to the example-based pose estimation problem, an extremely active vision topic with growing
commercial and societal repercussions.
1 Previous research track record
The Proposer: Dr Fabio Cuzzolin
Dr Cuzzolin graduated in 1997 from the University of Padua
(Universitas Studii Paduani, founded in 1222, the seventh-oldest university in the world) with a laurea magna cum
laude in Computer Engineering and a Master’s thesis on “Automatic gesture recognition”. He received a Ph.D. degree from
the same institution in 2001, for a thesis entitled “Visions of a
generalized probability theory”. He was first Visiting Scholar
at Washington University in St. Louis (currently ranked 12th among US universities), and was later appointed a fixed-term
Assistant Professor with Politecnico di Milano, Italy (consistently recognized as the best Italian university). Subsequently,
he moved as a Postdoctoral Fellow to the University of California at Los Angeles, and received a Marie Curie Fellowship
in partnership with INRIA Rhone-Alpes, France. In addition,
Dr Cuzzolin ranked second in the 2007 national Senior Researcher recruitment at INRIA, and has had interviews with, or offers from, Oxford University, EPFL, Universitat Pompeu Fabra,
UCSD, GeorgiaTech, U. Houston, Honeywell Labs, Riya. He
joined the internationally recognized Computer Vision group
at Oxford Brookes University in September 2008; he was of-
fered a Senior Lectureship by the same institution in July 2011
and has been a Reader there since September 2011. Dr Cuzzolin is an
integral part of the joint Oxford University - Oxford Brookes
research group led by Professors Phil Torr and Andrew Zisserman, well known as one of world’s best and has won an
incredible number of awards in the recent past, last ones the
best paper awards at the British Machine Vision Conference
2010 and the European Conference on Computer Vision 2010.
He took on the role of Head of the Machine Learning research group in September 2012.
Publication Record. Dr Cuzzolin’s research interests span
both machine learning applications to computer vision, including gesture and action recognition and identity recognition
from gait, and uncertainty modeling via imprecise probabilities, to which he has contributed by developing an original
geometric approach to belief functions and other uncertainty
measures [6]. His scientific productivity is extremely high, as
the forty papers he has published in the last five years attest.
Dr Cuzzolin is currently the author of 73 peer-reviewed scientific
publications (published or under review), most of them as first
or single author, including a monograph, two book chapters,
19 journal papers, and 9 chapters in collections.
Dr Cuzzolin’s top papers have all been rated “four star” at the latest mock REF meeting by a panel of both external and internal reviewers.
Awards. His work has won him a Best Paper Award at the Pacific Rim Conference on AI (PRICAI’08), a Best Poster Award at the ISIPTA’11 symposium on Imprecise Probabilities, and a best poster prize at the INRIA 2012 Summer School on Machine Learning
and Visual Recognition. He was also short-listed for prizes
at the ECSQARU’11 and British Machine Vision (BMVC’12)
conferences, where he was given the Outstanding Reviewer
Award.
Proposer’s Track Record in Uncertainty Theory. Dr
Cuzzolin is recognized as one of the most prominent experts in
the field of non-additive probabilities, and of belief functions
in particular. He has been recently re-elected to the Board of
Directors of the “Belief Functions and Applications Society”,
and is a member of the “Society for Imprecise Probabilities
and Their Applications”.
His most important contribution in the field of uncertainty theory and imprecise probabilities is an all-purpose geometric approach to uncertainty measures, in which probabilities, possibilities and belief functions can all be represented as points
of a Cartesian space and there analyzed [6, 5, 7, 9]. Evidence aggregation operators (the analogues of Bayes’ rule in
the Bayesian formalism) can also be seen as geometric operators [12]. The issues of how to approximate a belief function
with an additive probability or a possibility measure, or what
probability transformation is appropriate for decision making
can be all solved by geometric means [7]. He has developed
an approach to conditioning based on minimizing appropriate
distances between belief measures [11], to which all existing
conditioning methods could potentially be reduced. In his recent award-winning paper [10], Dr Cuzzolin has investigated
alternative combinatorial foundations for the theory of belief
functions, and their algebraic properties.
His monograph “The geometry of uncertainty” collecting all
his contributions to the mathematics of uncertainty is under
review by Springer-Verlag.
Proposer’s Track Record in Computer Vision. The proposer is also very active in computer vision-based human motion analysis. He has recently explored the use of bilinear and
multi-linear models for identity recognition from gait [4, 8], a
relatively new but promising branch of behavioral biometrics,
and is exploring manifold learning techniques for dynamical
models representing (human) motions, in order to learn the
optimal metric of the space they live in and maximize classification performance [13]. Together with his student Michael
Sapienza and Professor Philip Torr, he has achieved extremely promising
first results on the selection of discriminative action parts for
recognition and localization [43], which won them a Best Poster
Prize at the last INRIA Summer School. He has published
several papers on spectral motion capture techniques [35, 16],
focusing in particular on the crucial issue of how to select and
map eigenspaces generated by two different shapes in order
to track 3D points on their surfaces or consistently segment
body parts along sequences of voxel sets [17]. In direct relation
to the topic of this proposal, he is working on purely example-based, bottom-up pose estimation approaches in which multiple image features are integrated in the framework of belief
calculus [14].
Editorial Boards and TPC Memberships. Dr Cuzzolin
is currently an Associate Editor of the “IEEE Transactions on
Systems, Man, and Cybernetics - Part C”, and has been a Guest
Editor for “Information Fusion”. He collaborates with several
other international journals in both computer vision and probability, such as: Artificial Intelligence, the IEEE Tr. on Systems, Man, and Cybernetics B and C, the Int. J. on Approximate Reasoning, Computer Vision and Image Understanding,
the IEEE Trans. on Fuzzy Systems, Information Sciences, the
Journal of Risk and Reliability, the International Journal on
Uncertainty, Fuzziness, and Knowledge-Based Systems, Image and Vision Computing. He has served in the program
committee of more than 30 international conferences in both
imprecise probabilities (e.g. ISIPTA, ECSQARU, BELIEF)
and computer vision (e.g. VISAPP). He is a reviewer for top
international vision conferences such as BMVC, ICCV and
ECCV.
Dr Cuzzolin will be the chair and local organizer of the upcoming 3rd International Conference on the Theory of Belief
Functions (BELIEF 2014), to be held in Oxford.
Supervision of Ph.D. students and post-docs. Dr Cuzzolin has supervised several MSc students, is currently supervising two Ph.D. students, and is in the process of hiring a postdoctoral researcher as a result of a recent successful proposal. His group will receive two visiting Ph.D. students in 2013. He plans to develop a fully fledged Artificial Intelligence research group of some ten collaborators within the next two years, in the context of the larger
Machine Learning group.
Professional Links. Dr Cuzzolin has acquired considerable international experience by working in the past for some
of the most prominent research laboratories in both the US
and Europe. He enjoys personal links with several world class
companies (many of them with research divisions in the UK)
such as Microsoft Research (A. Fitzgibbon and A. Blake),
Honeywell Labs (I. Cohen), Boston’s MERL (M. Brand, S.
Ramalingam), GE (G. Doretto), Google (A. Bissacco, M. Andreetto, M. Marszalek), Riya, EADS (G. Powell), Lockeed
Martin (M. Chan).
Grants & external funding. In 2011 Dr Cuzzolin was awarded a £122K EPSRC First Grant, for a project
which has received exceptional reviews. Started in August
2011, the project involves hiring a postdoc for the second year.
He has also applied for a Project Grant on “The total probability theorem for finite random sets” with the Leverhulme Trust.
The outline proposal passed the first stage, and the Trust invited the submission of a full proposal on the topic.
At the European level he has just submitted to the latest FP7
Call 9 a 3 million euro STREP on “Dynamical Generative and Discriminative Models for Action and Activity Localization and Recognition” as the coordinator, with IDSIA
(Switzerland), Universiteit Gent (Belgium), SUPELEC and
Dynamixyz (France) as partners. He is also finalizing (again as
the Coordinator) a Short Proposal for a Future and Emerging
Technology (FET-Open) FP7 EU grant on “Large scale hybrid manifold learning”, with INRIA (France), Pompeu Fabra
(Spain) and Technion (Israel) as partners.
At the UK level, Dr Cuzzolin has just submitted a joint EPSRC
proposal on “Locating and recognizing complex activities via
dynamic generative/discriminative modeling” with Professor
Philip Torr, and a Leverhulme proposal on “Guessing plots for
video googling” with Professor T. Lukasiewicz of Oxford University. He is setting up an EPSRC Network on Uncertainty
Theory (NUTS) with Professors J. Hall and T. Lukasiewicz
(Oxford University), J. Lawry (Bristol), F. Coolen (Durham),
J-B. Yang (Manchester Metropolitan), W. Liu (U. Belfast), A.
Hunter (UCL) and others. Finally, he is involved as a partner
in a different EPSRC network proposal on Compressed Sensing, led by Professor Mark Plumbley.
Host organization: Oxford Brookes University
The Department of Computing and Communication Technologies currently comprises some 30 academic staff, including
Professor David Duce (co-chair of the Eurographics conferences), Professor Rachel Harrison, Editor in Chief of Software
Quality Journal, and Professor Philip H.S. Torr, world leader
in Computer Vision and Machine Learning. Projections for
the next REF indicate a score of at least 3.1.
Brookes Machine Learning and Vision Groups. Dr Cuzzolin has recently taken on the role of Head of the Machine
Learning research group. The group includes the Head of Department Dr Nigel Crook and Dr Tjeerd Olde Scheper, plus 6
or 7 Ph.D. students and postdocs.
He also belongs to the Oxford Brookes vision group founded by Professor Philip Torr (cms.brookes.ac.uk/research/visiongroup/), which comprises some fifteen staff, students and post-docs who will
add value to this project. Professor Torr was awarded the Marr
Prize, the most prestigious prize in computer vision, in 1998.
Members of the group have recently received awards in 8 other
conferences, including best paper at CVPR’08, ECCV’10
and BMVC’10 and an honourable mention at NIPS’08, the top
machine learning conference. The group enjoys ongoing
collaborations with companies such as 2d3, Vicon Life, Yotta,
Microsoft Research Europe, Sharp Laboratories Europe, Sony
Entertainments Europe. The group’s work with the Oxford
Metrics Group in a Knowledge Transfer Partnership 2005-9
won the National Best Knowledge Transfer Partnership of
the year at the 2009 awards, sponsored by the Technology
Strategy Board, out of several hundred projects.
2 Proposed research and its context
2.1 Background
Topic of research: uncertainty theory. Decision making
and estimation are central problems in most applied sciences,
as both people and machines need to make inferences about
the state of the external world, and take appropriate actions.
Traditionally, the (uncertain) state of the world is assumed to
be described by a probability distribution over a set of alternative, disjoint hypotheses. Making appropriate decisions or
assessing quantities of interest therefore requires estimating
such a distribution from the available data. Uncertainty is normally handled in the literature within the Bayesian framework,
which is indeed quite intuitive and easy to use, and capable of
providing a number of “off the shelf” tools to make inferences
or compute estimates from time series. Sometimes, however,
such as in the case of extremely rare events (e.g., a volcanic
eruption), few statistics are available to drive the estimation
[26]. Part of the data can be missing [21]. Furthermore, under
the law of large numbers, probability distributions are the outcome of an infinite process of evidence accumulation, drawn
from an infinite series of samples, while in all practical cases
the available evidence only provides some sort of constraint
on the unknown probabilities governing the process [46]. All
these issues have led to the recognition of the need for a coherent mathematical theory of uncertainty.
State of the art. Different kinds of constraints are associated with different generalizations of probabilities, formulated
to model uncertainty at the level of probability distributions.
The simplest way to constrain a probability distribution is to
put lower l(x) and upper u(x) bounds on its values at each element x of the domain: the result is an interval probability [18]. A more sophisticated constraint requires the (unknown) probability to fall inside a convex set of probability
distributions or “credal set” [34] (Figure 1-left). Convexity (as
a mathematical requirement) is a natural consequence, in these
theories, of rationality axioms such as coherence [56].
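As a minimal (hypothetical) illustration of the simplest such constraint, the Python sketch below checks whether a candidate distribution on a toy domain {x, y, z} belongs to the credal set induced by elementwise lower and upper bounds; all names and numbers are illustrative, not taken from the cited literature.

```python
# Hypothetical interval probability on the domain {x, y, z}:
# the unknown distribution p must satisfy l(w) <= p(w) <= u(w)
# for every element w, plus the normalization constraint.
lower = {"x": 0.1, "y": 0.2, "z": 0.3}
upper = {"x": 0.5, "y": 0.4, "z": 0.6}

def in_credal_set(p, lower, upper, tol=1e-9):
    """Check whether distribution p lies in the credal set induced by
    the elementwise bounds (a box intersected with the simplex)."""
    if abs(sum(p.values()) - 1.0) > tol:
        return False
    return all(lower[w] - tol <= p[w] <= upper[w] + tol for w in p)

print(in_credal_set({"x": 0.3, "y": 0.3, "z": 0.4}, lower, upper))  # True
print(in_credal_set({"x": 0.7, "y": 0.1, "z": 0.2}, lower, upper))  # False
```

Convexity is visible here: any mixture of two admissible distributions again satisfies the bounds.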
A battery of different uncertainty theories has indeed been
developed in the last century or so, starting from De Finetti’s
pioneering work [28]. The most powerful and successful
frameworks are, arguably, the theory of belief functions or
“theory of evidence” ([22, 44]), possibility-fuzzy set theory
([57, 24]), the theory of random sets [38] and that of imprecise
probabilities [56]. Other approaches to uncertainty modeling
have also been proposed, including monotone capacities, Choquet integrals, rough sets, hints [31], game theory [48]. Also
referred to collectively as imprecise probabilities (as most of them include classical probabilities as a special case), they in fact form an entire hierarchy of encapsulated formalisms.
Rationales for a mathematics of uncertainty. Most imprecise probability theories share the same rationale, the need
to cope with some serious issues about the way uncertainty is
traditionally handled in probability theory. In particular:
1. Infinite accumulation of evidence (or lack thereof). The
neglected assumption of an infinite process of evidence accumulation is particularly problematic for statistically rare events
[26] (think volcanic eruptions or critical accidents in a nuclear power plant). Another counterexample is provided by
the training sets used in computer vision and machine learning to learn action or object categories, in which only a few
thousand examples of a few dozen classes are normally contemplated [27]. In example-based pose estimation [1, 14], in
particular, the pose of the object of interest is estimated from
a very limited number of examples.
2. Representation of ignorance. Ignorance is represented in
classical probability theory as a uniform prior. This is in fact
Figure 1: Two imprecise probability theories. Left: credal
sets are convex sets of probability measures. Right: belief
functions are probability distributions on a power set, and can
be interpreted as a special class of credal sets.
a rather precise model of what is going on, especially considering that, due to the nature of Bayes’ rule, an inadequate
choice of the prior can be overturned by the available evidence
only very slowly and with great effort. In addition, it can be
shown that when dealing with different but compatible decision spaces or “frames”, “uninformative” uniform priors on
different domains are not compatible [44].
3. Choice of a model. In the Bayesian framework epistemic
uncertainty is typically addressed by imposing additional assumptions in order to pick a specific a-priori distribution: in
other words, people choose a specific “model”. The subsequent series of mathematical inferences is rigorous and self-contained but, as should be obvious, it only relates to the
chosen “model” and has possibly little to do with the actual
phenomenon at hand.
4. Missing data. Most often, decisions have to be made under incomplete or missing data. It can be shown [37, 21] that,
when part of the data used to estimate the desired probability
distribution is missing, the resulting constraint is a credal set
[34] of the type associated with a belief function.
Indeed, G. Shafer’s theory of belief functions or BFs [44]
allows us to express partial belief by providing lower and upper bounds to probability values.
The notion of belief function originally derives from a series
of Dempster’s works on upper and lower probabilities induced
by multi-valued mappings. Briefly, given a probability distribution p : Ω → [0, 1] on a given domain Ω, and a one-to-many map x ↦ Γ(x) ⊆ Θ to another domain Θ, the original probability induces a probability distribution m : 2^Θ → [0, 1] on the power set of the second collection [22], i.e., a “random set”
[38]. The term “belief function” (BF) was coined when Shafer
[44] adopted these mathematical objects to represent evidence
in the framework of subjective probability, and gave an axiomatic definition of them as non-additive probability measures. Belief functions can also be seen as “sum functions” b(A) = Σ_{B⊆A} m(B), A ⊆ Θ, on the power set. More controversially, they mathematically correspond to a special case
of credal set: as they determine a lower and an upper bound to
the probability of each event A, they are naturally associated
with the convex set of probabilities which dominate them.
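The mass/belief machinery just described can be sketched in a few lines of Python; the frame and the weights below are hypothetical, chosen only to show how b(A) sums the mass of the subsets of A while the plausibility (the matching upper bound) collects the mass of the subsets intersecting A.

```python
# Toy belief function on a hypothetical frame Theta = {a, b, c}.
theta = frozenset({"a", "b", "c"})
m = {
    frozenset({"a"}): 0.3,
    frozenset({"a", "b"}): 0.4,
    theta: 0.3,  # mass on the whole frame encodes ignorance
}

def belief(A, m):
    # b(A) = sum of m(B) over all focal elements B contained in A
    return sum(w for B, w in m.items() if B <= A)

def plausibility(A, m):
    # upper bound: mass of all focal elements intersecting A
    return sum(w for B, w in m.items() if B & A)

A = frozenset({"a", "b"})
print(belief(A, m), plausibility(A, m))  # lower and upper bounds on P(A)
```

Here b({a, b}) = 0.3 + 0.4 = 0.7 while the plausibility is 1, since every focal element meets {a, b}: the interval [0.7, 1] brackets the unknown probability of the event.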
The theory of belief functions is appealing because it addresses all the above-mentioned issues with the handling of
uncertainty: it does not assume an infinite amount of evidence
to model imprecision, but uses all the available partial evidence; it represents ignorance in a natural way, by means of
the mass assigned to the whole decision space or “frame”, and
deals with the problem of having to represent uncertainty on
different but compatible domains; it does not need to resort to
any specific “model” to enable us to make deductions on the
observed phenomenon, but preserves the appropriate level of
uncertainty at all stages of the calculations; it copes with missing data in the most natural of ways. Furthermore: its rationale
is rather neat and simple; it is a straightforward generalization
of probability theory, and it does not require abandoning the notion of event (as Walley’s imprecise probability theory [56] does);
it contains as special cases both fuzzy set and possibility theory. The widespread influence of uncertainty at different levels
explains why belief functions have been increasingly applied
to fields as diverse as robotics, fault analysis, machine vision
[41, 40, 15], and many more.
Time series, conditional belief functions, and total belief. In many of these applications, observations come from
time series of observations or measurements. Sometimes, the
problem is so complex that a large number of variables is
necessary to model it: conditional independence assumptions
need then to be applied to simplify the structure of a joint distribution. This is true, for instance, in object recognition and
image segmentation [32], where conditional independence is
crucial to make optimization problems tractable. Different
definitions of conditional belief functions have been proposed
in the past by Jaffray [30], Denneberg [23], Smets [49], Kyburg [33], and others. As explored by the proposer, conditional
belief functions can also be defined by minimizing appropriate geometric distances [11]. The most common approach, though, consists in using conditional BFs induced by Dempster’s original evidence combination rule, a generalization of
Bayes’ rule to belief functions.
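To make the last point concrete, here is a minimal sketch of Dempster's rule of conditioning on a discrete frame, with masses stored as a {frozenset: weight} dictionary; the focal elements and weights are hypothetical.

```python
# A minimal sketch (hypothetical masses) of Dempster's rule of
# conditioning: each focal element is intersected with the
# conditioning event A, mass in conflict with A is discarded, and
# the result is renormalized by the plausibility of A.
def dempster_condition(m, A):
    cond = {}
    for B, w in m.items():
        inter = B & A
        if inter:  # keep only the mass compatible with A
            cond[inter] = cond.get(inter, 0.0) + w
    pl_A = sum(cond.values())  # plausibility of A
    if pl_A == 0:
        raise ValueError("conditioning event has zero plausibility")
    return {B: w / pl_A for B, w in cond.items()}

m = {frozenset({"a"}): 0.2,
     frozenset({"b"}): 0.3,
     frozenset({"a", "b", "c"}): 0.5}
print(dempster_condition(m, frozenset({"a", "c"})))
# the mass 0.3 on {b} conflicts with {a, c} and is redistributed
```

Note how the renormalization step generalizes the division by P(A) in Bayes' rule.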
Object tracking [1] is an example of a problem in which uncertainty affects estimation from time series of observations. If
conditional constraints on some targets’ future positions given
their past locations are available, and if such constraints are
described by belief functions, then predicting current target locations requires combining conditional BFs in a total function
[3], in a process which generalizes the classical total probability theorem for standard probabilities.
2.1.1 Industrial and societal context
Indeed, uncertainty is widespread in all fields of applied and
natural sciences, and involves scenarios of the utmost importance to both the public and the government.
Natural disasters are a paradigmatic example of rare events
for which extremely few statistics are available: nevertheless,
the potential cost of a large-scale devastating event materializing is just too high to ignore, forcing government at all
levels to come up with sometimes costly contingency plans
for the management of an emergency. Cautious, imprecise-probabilistic techniques are designed to cope with just such
scenarios, and can greatly contribute to the public debate and
government policy on the matter [19].
In the aftermath of the Fukushima tragedy the nuclear debate is still heated; the issue is whether nuclear power plants
will continue to play a role in the UK’s future energy strategy.
Crucial is the ability to assess the hazard associated with nuclear energy on the basis of extremely scarce data, as incidents
at nuclear power plants are (fortunately) rare and no experiments other than simulations can be run to collect evidence.
This policy issue could clearly benefit from the help of a fully
fledged theory of imprecise prediction on time series.
Climate change has risen in recent years to a top priority for policymakers around the world, by virtue of the incontrovertible evidence of rising global temperatures and their correlation with human emissions. Nevertheless, uncertainties
affect hugely the time series data on which estimates of future
temperature rise, for instance, are based: producing scenarios
as accurate as possible is a fundamental factor that will shape
our society and economy for decades to come. Policymakers
depend on experts to provide them with usable information
prior to pushing for a decision on this matter. Completing belief calculus with a total belief core will allow CC experts to
make full use of these methodologies [2, 42].
Figure 2: Scenarios involving making prediction under heavy
uncertainty: disaster prevention, nuclear incidents, climate
change, financial trading, large-scale learning, missing data.
In finance and banking, time series prediction is bread and
butter for traders, who have to come up with buy/sell/hold
decisions based on these predictions. An estimated 10% of
the UK’s economic output is attributed to the financial sector alone, while the public’s sensitivity to these themes has spiked in recent times. Imprecise-probabilistic tools based
on a fully fledged belief calculus, integrating proper treatment
of total probability, could be of invaluable help to professionals active in the area of risk assessment [47].
More relevant to this proposal, in machine learning and vision the availability of a wealth of internet data has given rise to
new applications such as video retrieval and image googling
which are changing the end users’ experience. However, the
relevant techniques are strongly hampered by training sets which are vanishingly small and therefore cause over-fitting, while
manually gathering and labeling data is extremely expensive.
This is true in particular for real-time pose estimation, which
lies at the basis of popular game consoles such as Microsoft’s
Kinect: the Oxford Brookes Vision Group enjoys continuing
strong links with Microsoft Research. Imprecise-probabilistic
techniques can provide more cautious approaches to information retrieval and learning, softening the over-fitting issue
without the need to over-boost the size of the training sets, directly improving the commercial prospects of IT companies.
2.2 Research hypotheses and objectives
Research idea. As we pointed out above, decision making
and estimation on time series in the presence of missing or incomplete data is widespread in all fields of science and many
crucial components of modern society. The theory of belief
functions, with its relative simplicity, its manifold semantics
and its generalization potential, is arguably one of the most
developed theories of mathematical modeling of uncertainty,
and a most serious candidate for the job.
However, belief calculus is not yet entirely a viable alternative to traditional probability theory. When partial observations come from time series, classical probability provides a straightforward method to integrate conditional probabilities into a single, “total” probability measure through the formula: P(A) = Σ_{Bi∈Π} P(Bi)P(A|Bi), where Π = {Bi} is a dis-
Figure 3: Left: the total belief problem is the generalization of the classical total probability theorem to belief functions, as
finite random sets. A total belief function b on Θ is sought whose restriction to a partition Ω of Θ is equal to another b.f. b_0,
and its conditional version with respect to each element Θi of the partition Ω coincides with a given belief function bi . Middle:
the problem has in principle a plurality of solutions, associated with linear systems, which form a graph (an example is given
here) whose symmetries are related to the size of the problem. Right: belief functions can be conditioned, given an event A, by
projection onto the “conditioning simplex” BA associated with the conditioning event A, in the simplex of belief functions.
joint partition of the universe. In contrast, a full characterization and solution of this problem within the more general
framework of belief calculus, which amounts to a generalization of the total probability theorem, has not yet been provided.
This is the goal of the present proposal.
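For reference, the classical law that the total belief problem generalizes can be evaluated directly; the priors and conditionals below are purely hypothetical. The analogous computation for belief functions, in which each ingredient is a BF rather than a single number, is precisely what the project seeks to make possible.

```python
# Classical law of total probability on a hypothetical partition
# Pi = {B1, B2, B3}, with priors P(Bi) and conditionals P(A|Bi).
prior = [0.5, 0.3, 0.2]        # P(B1), P(B2), P(B3)
conditional = [0.1, 0.6, 0.9]  # P(A|B1), P(A|B2), P(A|B3)

# P(A) = sum_i P(Bi) P(A|Bi)
p_a = sum(p * c for p, c in zip(prior, conditional))
print(round(p_a, 2))  # 0.41
```

In the belief-function setting the scalar products and the sum above are replaced by a combination of conditional BFs, and the solution is in general no longer unique.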
The total probability problem for belief functions reads as
follows. Consider a set Θ and a disjoint partition Ω of Θ.
Suppose a belief function is defined on each element of the
partition, and an a-priori BF is given on Ω itself. We seek a
total belief function on the whole of Θ whose restriction to
Ω coincides with the a-priori, and whose conditional versions
(under Dempster’s conditioning, for instance, but not necessarily) coincide with the given ones for all the elements Θi of
the partition Ω of Θ (see Figure 3-left).
Novelty and Goals of the project. With this project we
aim at achieving the following breakthroughs:
1 - The development of a coherent general theory of total
probability for belief functions. Belief functions are complex
objects, especially in the case of continuous domains, as they
allow for a more general, cautious, data driven treatment of
time series data. The price to pay is a more complex and
interesting mathematical formulation of the total probability
problem. The main goal of the present proposal is the development of a comprehensive, definitive treatment of total belief
functions in both the discrete and the continuous cases under
Dempster’s conditioning rule.
This is a fundamental methodological contribution, which can
boost the application of belief calculus to the manifold fields
of applied science and a huge number of societal issues.
2 - A comprehensive study of conditioning for belief functions. The greater complexity of belief functions with respect
to probabilities is also reflected by the significant number of
alternative conditioning operators that have been formulated
for them. As in principle different conditioning rules imply
different total belief problems, we aim at both studying the
alternative total belief frameworks induced by the most successful competitors to Dempster’s conditioning, and at settling the
matter by coming up with a general conditioning framework.
3 - The development of an evidential tracking approach to
example-based pose estimation based on total belief. To keep
the proposed project sufficiently focussed, and to build on the
proposer’s expertise in computer vision and machine learning, we propose to test the developed methodologies in the
example-based pose estimation and tracking problem, central
to both the gaming and the surveillance industries, building on
preliminary results by the proposer [15, 14].
Milestones. The project’s objectives can be further decomposed into the following verifiable, measurable milestones:
M1 - the solution of the restricted case, in which the a-priori
function is a probability measure;
M2 - the study of the structure of the set of all the multiple
solutions to the restricted problem;
M3 - the solution of the general total belief theorem in the discrete case, including the multiplicity of its solutions;
M4 - the extension of our analysis to the continuous case, in
which belief functions are defined on arbitrary domains;
M5 - the extension of the analysis to the case of the other major conditioning operators, other than Dempster’s rule ..
M6 - .. and the attempt to formulate a general theory of conditioning;
M7 - the analysis of the links to similar problems in related
fields, such as marginal extension in imprecise probabilities;
M8 - the application of the total belief framework to examplebased pose estimation and tracking.
The above milestones will be measured in terms of high-impact publications in top conferences in Artificial Intelligence (such as Uncertainty in Artificial Intelligence (UAI),
AAAI and the International Joint Conference on AI (IJCAI)
2013-15) and top journals of the like of Artificial Intelligence,
the Journal of the Royal Statistical Society, IEEE PAMI and
possibly (given the fundamental nature of the proposed research) Science. Milestone 8 will be measured in terms of a
comprehensive toolbox collecting the routines implementing
the different stages of our evidential framework for example-based pose tracking made available on the internet, and state-of-the-art results in example-based pose estimation and tracking achieved on all public datasets.
The timeliness of such a fundamental methodological contribution is supported by the variety of current societal trends and issues discussed in the relevant section: the nuclear debate, the need for robust and reliable climate change predictions, financial trading, the management of natural disasters, but also the need to learn from time series under huge uncertainty due to insufficient data, as arises in IT applications such as motion capture in gaming and entertainment, and video retrieval from the internet. This will ensure that a positive scientific outcome of the project enjoys rapid diffusion and possibly, at a later stage, translates into new patents and products.
Feasibility. While admittedly ambitious, the proposed research (as detailed above) is realistically feasible given the
time span of 3 years, the manpower available to complete it
and, most of all, the expertise and background of the proposer.
Dr Cuzzolin is a recognized leader in the field of belief functions and uncertainty theory. His original views on the mathematics of uncertainty have attracted growing interest, especially in the last four years, questioning more traditional approaches to the problem. He is a pioneer in the study of the total probability problem, while at the same time he possesses significant expertise in machine learning and vision, in particular the pose estimation problem, and is part of a world-class vision research group with expertise in bottom-up pose estimation and links with companies active in the gaming and entertainment industry (Sony, Vicon).
The project itself is a rather focussed, well-defined effort, with clear goals and very clear ideas about the tools and techniques necessary to achieve them, the main issues that may arise, and the actions that can be taken in response.
As for the necessary manpower, the proposer enjoys connections with all major research groups at world level, in both the wider field of imprecise probabilities (e.g. Durham, Lugano, Toulouse, Kansas, Rutgers, Cornell) and in computer vision (Oxford's VGG, Caltech, UCLA, INRIA, etcetera), which will allow him to recruit without difficulty the skilled young researchers who are crucial to the success of the project.
Finally, the infrastructure available within the Oxford Brookes
vision group will enormously facilitate the tests that will constitute a healthy reality check for the techniques developed in
the course of the project.
2.3 Programme and methodology
2.3.1 Methodology
Recalling the general formulation of the total belief problem given in Figure 3-left, and our focus on the example-based
pose estimation application, the project will be articulated into
three clearly defined stages.
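For reference, the classical result being generalized reads, for a partition {B_i} of the sample space:

```latex
P(A) \;=\; \sum_i P(A \mid B_i)\, P(B_i) .
```

In the belief-function setting (a schematic restatement, in our notation, of the formulation of Figure 3-left), the problem becomes: given an a-priori belief function on the partition and one conditional belief function per partition element, find a "total" belief function on the whole frame whose conditionals, and whose restriction to the partition, reproduce the given ones.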
Stage 1: the restricted case.
Proof of the restricted case. The “restricted” case in which
the a-priori BF is in fact a probability measure has been empirically analyzed by the proposer [3]. Unlike the classical
case, the total belief theorem seems to possess several admissible solutions which form the nodes of a graph endowed with
a number of interesting symmetries (see Figure 3-middle).
The candidate solutions to such a problem correspond to linear systems with the same number of equations and unknowns. Their solution yields in general a sum function m : 2^Θ → ℝ whose mass values can be negative, rather than a valid belief function. Such linear systems are linked by linear transformations. The graph of all possible candidate solutions (whose edges correspond to linear transformations) can
be built (see an example in Figure 3-middle). As a consequence, starting from any arbitrary node of the graph, an admissible solution (corresponding to non-negative masses) can
be reached in a constructive way. A formal existence proof for
the restricted case will be our first milestone.
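The constructive search over the graph of candidates can be illustrated schematically. In the sketch below, each choice of as many columns as constraints yields one square linear system (one node of the graph), and admissible nodes are those whose solution is non-negative; the matrix `A` and vector `b` are illustrative placeholders, not the actual construction of [3].

```python
from itertools import combinations

def solve2(a, b):
    """Solve a 2x2 linear system a.x = b by Cramer's rule; None if singular."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        return None
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return (x0, x1)

def candidate_solutions(A, b):
    """Each choice of as many columns as there are constraints gives one
    square system, i.e. one node of the graph of candidate solutions;
    'admissible' nodes are those with non-negative masses."""
    m, n = len(A), len(A[0])
    candidates, admissible = [], []
    for cols in combinations(range(n), m):
        sub = [[row[j] for j in cols] for row in A]
        sol = solve2(sub, b)        # toy code: handles the 2-constraint case
        if sol is None:
            continue                # singular system: no candidate here
        x = [0.0] * n
        for j, v in zip(cols, sol):
            x[j] = v
        candidates.append(x)
        if all(v >= -1e-9 for v in x):
            admissible.append(x)
    return candidates, admissible

# Toy constraints: masses on 4 candidate focal sets must reproduce an
# a-priori (0.3, 0.7) on a two-block partition (one row per block).
A = [[1.0, 1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0, 1.0]]
b = [0.3, 0.7]
cand, adm = candidate_solutions(A, b)
print(len(cand), len(adm))   # 5 candidate nodes, 3 of them admissible
```

In this toy instance two candidate nodes carry a negative mass, mirroring the sum functions (rather than valid belief functions) mentioned above, while a non-negative node is always reachable.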
Study of the structure of the set of all solutions. Unlike
the classical total probability theorem, the total belief theorem
(even in the restricted case) has multiple admissible solutions.
The graph of all candidate solutions exhibits a number of interesting symmetries that we will explore (see Figure 3-middle
again). Via this graph, interesting links between total belief
and the theory of positive linear systems and other branches
of discrete mathematics like transversal matroids [39] seem to
emerge and will be investigated.
Stage 2: the general case.
Proof of the general total belief theorem. In Stage 2 we will
analyze the general case of an arbitrary a-priori BF. Preliminary (unpublished) results seem to suggest that the proof of
the general total belief theorem, at least in the finite case, will
follow the same lines as in the restricted case. Two main issues will be tackled at this stage: the case of continuous belief
functions, and the different versions of the total belief theorem
under different conditioning operators.
Treatment of the continuous case. The theory of belief
functions has originally been formulated on finite domains
(frames) [44], as a result of its focus on subjective probability. Several possible extensions of the theory to continuous
domains have subsequently been studied ([45, 38, 31]). In this
more high-risk part of the project we will seek to contribute to
the resolution of this issue, focussing in particular on the completion of the theory of continuous BFs (and in particular the
extension of our results on total belief to the continuous case)
on closed, Borel intervals [53].
Further study of conditioning. The question of how to update or revise the state of belief represented by a belief function when new evidence becomes available is also crucial in
the theory of evidence. In Bayesian reasoning, this role is performed by Bayes’ rule. In the theory of belief functions, after
an initial proposal by Arthur Dempster, several other aggregation operators (e.g. disjunctive/conjunctive rules of combination) have been proposed, leaving the matter still far from settled ([30, 25, 29, 50, 52, 55]), though important efforts aimed
at putting some order in this issue have been led by key researchers (e.g., Denoeux [20]). As the second constraint of
the problem is that conditioning a total belief function must
yield the given conditional BFs, different conditioning operators can yield different total belief problems altogether, even
though most do end up coinciding with the above formulation.
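A minimal sketch of Dempster's conditioning, the standard operator underlying the formulation above (not the geometric approach discussed next); the frame and the masses below are illustrative:

```python
def dempster_condition(m, B):
    """Dempster conditioning: transfer each focal mass m(C) to C & B,
    discard the mass falling on the empty set, renormalise the rest.
    m is a dict {frozenset: float}; B is a frozenset (the event)."""
    cond = {}
    for C, mass in m.items():
        inter = C & B
        if inter:
            cond[inter] = cond.get(inter, 0.0) + mass
    k = sum(cond.values())            # = 1 - conflict between m and B
    if k == 0.0:
        raise ValueError("the event B is incompatible with every focal set")
    return {A: mass / k for A, mass in cond.items()}

# Frame {a, b, c}: masses are illustrative placeholders.
m = {frozenset("ab"): 0.4, frozenset("bc"): 0.3, frozenset("c"): 0.3}
mB = dempster_condition(m, frozenset("ab"))
# The 0.3 mass on {c} is discarded; {a,b} and {b} keep the ratio 4 : 3.
print(mB)
```

A different conditioning operator would redistribute the discarded mass differently, which is precisely why each operator induces its own total belief problem.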
In this perspective, the proposer is currently studying the problem of conditioning a belief function b with respect to an event
A by geometrically projecting such a belief function onto the
“simplex” (in blue) associated with A in the space of all belief
functions [6] (see Figure 3-right). Such an approach seems to
produce elegant results with natural interpretations in terms of
degrees of belief ([11]).
Total belief theorem for different conditioning operators.
Now, we may wonder what classes of conditioning rules can
be generated by such a process. Do they span all known definitions of conditioning? In particular, is Dempster’s conditioning itself a special case of geometric conditioning?
The goal of this part of the project will therefore be to investigate the potential of geometric conditioning as a general
framework for conditioning, extend our analysis of the total
belief problem to other definitions of conditional belief functions not based on Dempster’s rule (such as Suppes’ geometric
conditioning [54] or Smets’ generalized Jeffrey’s rule [50]),
and run a comparative discussion of the results obtained.
Relationship with marginal extension of coherent lower
previsions. In the more general context of “coherent lower
previsions” [56] the generalization of the total probability theorem goes under the name of marginal extension [36]. In
Walley’s general theory of imprecise probabilities [56], belief
functions are indeed a special case of coherent lower previsions. However, the class of belief functions is not closed under marginal extension: the marginal extension of two belief functions is, in general, only a coherent lower prevision. A comparative study of the unique result of marginal
extension with the multiple admissible solutions of the total
belief problem is in order.
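The inclusion of belief functions among coherent lower previsions can be checked numerically on a toy example: Bel(A) coincides with the lower envelope of the probabilities in the credal set of the mass function, whose vertices are obtained by committing each focal mass to a single element of its focal set (masses below are illustrative):

```python
from itertools import product

def bel(m, A):
    """Belief of A: total mass of the focal sets contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def lower_envelope(m, A):
    """Minimum of P(A) over the vertices of the credal set of m; each
    vertex commits every focal mass to one element of its focal set."""
    focal = list(m.items())
    best = 1.0
    for choice in product(*[sorted(B) for B, _ in focal]):
        p = sum(v for (B, v), x in zip(focal, choice) if x in A)
        best = min(best, p)
    return best

m = {frozenset("ab"): 0.5, frozenset("bc"): 0.2, frozenset("abc"): 0.3}
A = frozenset("ab")
print(bel(m, A), lower_envelope(m, A))   # the two values coincide: 0.5
```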
Stage 3: total belief in example-based pose tracking. As
we clearly stated, our philosophy is based on grounding theoretical advances in field testing and application. The pose estimation problem, in which the configuration or "pose" of an evolving non-rigid object has to be estimated from a sequence of images, is a perfect candidate for the job: the necessarily limited size of the available training sets determines strong degrees of uncertainty which affect the accuracy and robustness of the estimates. In addition, when tracking is tackled, conditional constraints arise which require a solution of the total belief theorem. Last but not least, it is a high-impact issue, as
pose estimation is a fundamental ingredient of motion capture,
a technique widely used in the entertainment industry, medical
analysis, and human-computer interaction.
Example-based pose estimation. Unlike model-based approaches [51], in which some articulated or kinematic model
of the object is fit to the data to estimate its pose, discriminative or “example-based” methods [1] rely on a training set
of examples, to exploit the fact that the set of typical human
poses is far smaller than the set of kinematically possible ones.
They work by learning maps from image cues to poses: in a
training session the object performs a number of representative motions, while a number of “features” are extracted from
the available image(s), and an “oracle” provides us with the
corresponding ground truth poses [14]. Learning-based methods are appealing as a way to initialize model-based ones, and
because of their speed for real-time applications.
Figure 4: Evidential approach to example-based tracking.
In a training session an approximate pose space Q̃ is gathered
via motion capture equipment, and a set of maps from image
features to poses are learnt. During tracking, the pose estimate
is represented by a belief function on Q̃: when new image
evidence arrives at time t + 1, it is combined with the current
belief estimate to obtain an updated one.
Evidential modeling. From the training data one-to-many
mappings linking feature values to sets of training poses can
be easily constructed: a probability measure on each feature
space is then naturally mapped to a belief function (random
set) on the set of training poses. A set of feature measurements therefore translates into a set of belief functions over the training poses,
which can be combined to yield an entire family of probability distributions on the pose space. From this family either
a point-wise estimate, a finite set of extremal estimates, or a
confidence level can be computed to indicate how reliable the
estimate is ([15, 14]).
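A minimal sketch of this pipeline, assuming two features whose one-to-many maps have already produced mass functions on a toy set of training poses (pose names and masses are illustrative, not taken from [15, 14]):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions
    (dicts {frozenset: float}) on the same frame of training poses."""
    out, conflict = {}, 0.0
    for (B, mB), (C, mC) in product(m1.items(), m2.items()):
        inter = B & C
        if inter:
            out[inter] = out.get(inter, 0.0) + mB * mC
        else:
            conflict += mB * mC
    if conflict >= 1.0:
        raise ValueError("totally conflicting feature evidence")
    return {A: v / (1.0 - conflict) for A, v in out.items()}

def belief_plausibility(m, A):
    """Lower/upper probability bracket [Bel(A), Pl(A)] for event A."""
    return (sum(v for B, v in m.items() if B <= A),
            sum(v for B, v in m.items() if B & A))

# Toy frame of 4 training poses; each feature value maps (one-to-many)
# to the set of compatible poses, weighted by illustrative masses.
poses = frozenset({"q1", "q2", "q3", "q4"})
m_feat1 = {frozenset({"q1", "q2"}): 0.7, poses: 0.3}
m_feat2 = {frozenset({"q2", "q3"}): 0.6, poses: 0.4}
m = dempster_combine(m_feat1, m_feat2)
lower, upper = belief_plausibility(m, frozenset({"q2"}))
print(lower, upper)   # an imprecise estimate of pose q2: roughly [0.42, 1.0]
```

The width of the [Bel, Pl] bracket then serves as the confidence level mentioned above: the scarcer or more ambiguous the training examples, the wider the bracket.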
From evidential pose estimation to pose tracking. When
videos are available it is possible to exploit dynamics to make
the estimation both time-consistent and more robust. The issue
of combining conditional belief functions in a total function on
the pose space arises as soon as we try to exploit the constraint
that a human body is a collection of rigid bodies, subject to
rigid motion, since rigid-motion constraints are by their nature conditional on the previous pose.
2.3.2 Programme of work and milestones
The P.I. will be assisted by a Research Assistant for the entire duration of the project (three years). The R.A. will take
care of the bulk of the experimentations and the dissemination of the results, and participate in the theoretical analysis of
the problem. Project management is expected to be relatively
simple for a short term project with just two people involved,
and will be taken care of by the P.I. himself, who is already
supervising a number of PhD and MSc students. The theoretical and empirical pillars of the project will advance in parallel
throughout the project.
First year. The target of the first year of the project is to deliver a first comprehensive set of results on the simplified total
belief problem in which the a-priori is a probability measure,
to publish towards the end of 2013, and lay the foundations
for empirical testing on benchmark tracking databases and a
comparison with state of the art approaches. This involves:
1.1 - the design and test of different image feature representations in the example-based pose estimation problem, for
both conventional and range cameras, exploiting the available
equipment at Oxford Brookes;
1.2 - the solution of the restricted version of the total belief
theorem, at least in the discrete case, and the study of the multiplicity and structure of its solutions;
1.3 - a preliminary testing on the public pose estimation
datasets: the results will be fed back to the activities at 1.1
and 1.2 to adjust the course of the research if necessary.
Second year. In the course of the second year, we will further pursue the development of a full-fledged total probability framework for belief functions, and apply these tools to
example-based pose estimation. In particular we will deliver:
2.1 - a full solution to the general total belief theorem in the
discrete case, building on the outcome of the restricted case
analyzed in 1.2;
2.2 - the generalization of the obtained results to the case of
continuous belief functions: these two will be the main tasks
of the P.I. during the second year;
2.3 - in parallel, the R.A. will validate the novel theoretical
advances achieved in the second year.
Third year. In the third year of the project, we will explore
alternative frameworks based on other major conditioning operators, and test and compare Bayesian and non-Bayesian approaches to pose tracking in the scenarios illustrated above. In
particular, we will work on:
3.1 - a comprehensive study of the different conditioning operators in belief calculus, and the extension of the results on total
belief to the other major conditioning frameworks alternative
to Dempster’s conditioning;
3.2 - an analysis of the links with related approaches, such as
marginal extension in Walley’s imprecise probability;
3.3 - tests, mainly conducted by the R.A., on the effectiveness of the evidential tracking approach in all the major scenarios, such as gaming and the entertainment industry, together with a comparison against more traditional, "precise" Bayesian solutions, in order to understand in which situations uncertain modeling is preferable, as a function of the size of the training set.
The outcomes of the project will consist of: state of the art
results in dynamic pose estimation and tracking from sets of
examples; a wealth of dedicated code (in the form of a ready-to-use Matlab toolbox) available from a website; scientific
output in the form of submissions to top journals in artificial
intelligence, statistics and computer vision, such as Artificial
Intelligence, IEEE PAMI, JRSS part B. More details are given
in the attached document.
2.4 Relevance to academic beneficiaries
The proposed work concerns a fundamental methodological tool which can be used to handle uncertainty in time series in dozens of different application fields by thousands of
practitioners already, many of them in the fast growing Asian
scientific and technological community. The potential for an
even more widespread application will be greatly increased by
a success of this project, since the latter explicitly targets a crucial aspect of the formalism (the total belief theorem).
The academic impact of the project in pose estimation and
tracking could also be considerable, considering the manifold
applications of motion capture in games and entertainment. In
addition, a side effect of this project will be to introduce novel
statistical tools, and imprecise-probabilistic techniques in particular, in the community of computer vision, opening up a
potentially huge line of research.
The dissemination of the outcomes of the project will be greatly helped by the proposer's position at the crossroads between the communities of imprecise probabilities, belief functions, computer vision, and machine learning, possibly facilitating the introduction of the methodology to fast-growing communities such as vision and machine learning.
References
[1] A. Agarwal and B. Triggs, Recovering 3d human pose from monocular images, IEEE Trans. PAMI 28 (2006), no. 1, 44–58.
[2] M. Collins, R.E. Chandler, P.M. Cox, J.M. Huthnance, J. Rougier and D.B. Stephenson, Quantifying future climate change, Nature Climate Change 2 (2012).
[3] F. Cuzzolin, Visions of a generalized probability theory, PhD dissertation, Università di Padova, 2001.
[4] F. Cuzzolin, Using bilinear models for view-invariant action and identity recognition, Proceedings of CVPR, vol. 2, pp. 1701–1708, 2006.
[5] F. Cuzzolin, Two new Bayesian approximations of belief functions based on convex geometry, IEEE Tr. SMC-B 37 (2007), no. 4, 993–1008.
[6] F. Cuzzolin, A geometric approach to the theory of evidence, IEEE Tr. SMC-C 38 (2008), no. 4, 522–534.
[7] F. Cuzzolin, Credal semantics of Bayesian approximations in terms of probability intervals, IEEE Tr. SMC-B 40 (2010), no. 2, 421–432.
[8] F. Cuzzolin, Multilinear modeling for robust identity recognition from gait, Behavioral Biometrics for Human Identification, IGI, 2009.
[9] F. Cuzzolin, The geometry of consonant belief functions: simplicial complexes of necessity measures, FSS 161 (2010), no. 10, 1459–1479.
[10] F. Cuzzolin, Three alternative combinatorial formulations of the theory of evidence, Intelligent Data Analysis 14 (2010), no. 4, 439–464.
[11] F. Cuzzolin, Geometric conditional belief functions in the belief space, Proceedings of ISIPTA, Innsbruck, Austria, 2011.
[12] F. Cuzzolin, Geometry of Dempster's rule of combination, IEEE Tr. SMC-B 34 (2004), no. 2, 961–977.
[13] F. Cuzzolin, Learning pullback manifolds of generative dynamical models for action recognition, IEEE Trans. PAMI (2012, under review).
[14] F. Cuzzolin, A belief-theoretical approach to example-based pose estimation, IEEE Transactions on Fuzzy Systems (2012, under review).
[15] F. Cuzzolin and R. Frezza, Evidential modeling for pose estimation, Proceedings of ISIPTA, Pittsburgh, PA, 2005.
[16] F. Cuzzolin, D. Mateus and R. Horaud, Robust coherent Laplacian protrusion segmentation along 3D sequences, IJCV (2012, under review).
[17] F. Cuzzolin, D. Mateus, D. Knossow, E. Boyer, and R. Horaud, Coherent Laplacian protrusion segmentation, Proceedings of CVPR, pp. 1–8, 2008.
[18] L. de Campos, J. Huete, and S. Moral, Probability intervals: a tool for uncertain reasoning, IJUFKS 1 (1994), 167–196.
[19] S. Demotier, W. Schon, and T. Denoeux, Risk assessment based on weak information using belief functions: a case study in water treatment, IEEE Tr. SMC-C 36 (2006), no. 3, 382–396.
[20] T. Denoeux, Conjunctive and disjunctive combination of belief functions induced by non distinct bodies of evidence, Artificial Intelligence (2007).
[21] G. de Cooman and M. Zaffalon, Updating beliefs with incomplete observations, Artif. Intell. 159 (2004), no. 1-2, 75–125.
[22] A. P. Dempster, Upper and lower probability inferences based on a sample from a finite univariate population, Biometrika 54 (1967), 515–528.
[23] D. Denneberg, Conditioning (updating) non-additive probabilities, Ann. Operations Res. 52 (1994), 21–42.
[24] D. Dubois and H. Prade, Possibility theory, Plenum, New York, 1988.
[25] R. Fagin and J. Y. Halpern, A new approach to updating beliefs, Proceedings of UAI, pp. 347–374, 1991.
[26] M. Falk, J. Husler and R.-D. Reiss, Laws of small numbers: Extremes and rare events, 2004.
[27] P. Felzenszwalb and D. Huttenlocher, Pictorial structures for object recognition, Int. Journal of Computer Vision 61 (2005).
[28] B. De Finetti, Theory of probability, Wiley, London, 1974.
[29] I. Gilboa and D. Schmeidler, Updating ambiguous beliefs, Journal of Economic Theory 59 (1993), 33–49.
[30] J. Y. Jaffray, Bayesian updating and belief functions, IEEE Tr. SMC 22 (1992), 1144–1152.
[31] J. Kohlas and P.-A. Monney, A mathematical theory of hints – an approach to the Dempster-Shafer theory of evidence, Springer, 1995.
[32] P. Kohli and Ph. Torr, Efficiently solving dynamic Markov random fields using graph cuts, Proc. of ICCV, vol. 2, pp. 922–929, 2005.
[33] H. E. Kyburg, Bayesian and non-Bayesian evidential updating, Artificial Intelligence 31 (1987), 271–293.
[34] I. Levi, The enterprise of knowledge, MIT Press, 1980.
[35] D. Mateus, R. Horaud, D. Knossow, F. Cuzzolin, and E. Boyer, Articulated shape matching using Laplacian eigenfunctions and unsupervised point registration, Proceedings of CVPR, 2008.
[36] E. Miranda and G. de Cooman, Marginal extension in the theory of coherent lower previsions, IJAR 46 (2007), no. 1, 188–225.
[37] S. Moral and L. M. de Campos, Partially specified belief functions, Proceedings of UAI, pp. 492–499, Washington, DC, USA, 1993.
[38] H. T. Nguyen, On random sets and belief functions, J. Mathematical Analysis and Applications 65 (1978), 531–542.
[39] J. Oxley, Matroid theory, Oxford University Press, 1992.
[40] D. Pagac, E.M. Nebot, and H. Durrant-Whyte, An evidential approach to map-building for autonomous vehicles, IEEE Transactions on Robotics and Automation 14 (1998), no. 4, 623–629.
[41] A. Rakar, Đ. Juričić and P. Ballé, Transferable belief model in fault diagnosis, Engineering Applications of AI 12 (1999), 555–567.
[42] J. Rougier, Probabilistic inference for future climate using an ensemble of climate model evaluations, Climatic Change 81 (2007), 247–264.
[43] M. Sapienza, F. Cuzzolin and Ph. Torr, Learning discriminative space-time actions from weakly labelled videos, Proc. of BMVC, 2012.
[44] G. Shafer, A mathematical theory of evidence, Princeton University Press, 1976.
[45] G. Shafer, Allocations of probability, Annals of Probability 7 (1979), no. 5, 827–839.
[46] G. Shafer, Belief functions and parametric models, Journal of the Royal Statistical Society B 44 (1982), 322–352.
[47] G. Shafer and R. Srivastava, The Bayesian and belief-function formalism: A general perspective for auditing, Auditing: A Journal of Practice and Theory (1989).
[48] G. Shafer and V. Vovk, Probability and finance: It's only a game!, Wiley, New York, 2001.
[49] Ph. Smets, Belief functions: the disjunctive rule of combination and the generalized Bayesian theorem, IJAR 9 (1993), 1–35.
[50] Ph. Smets, Jeffrey's rule of conditioning generalized to belief functions, Proceedings of UAI, pp. 500–505, 1993.
[51] C. Sminchisescu and B. Triggs, Kinematic jump processes for monocular 3d human tracking, Proc. of CVPR, vol. 1, pp. 69–76, 2003.
[52] M. Spies, Conditional events, conditioning, and random sets, IEEE Tr. SMC 24 (1994), 1755–1763.
[53] T. M. Strat, Continuous belief functions for evidential reasoning, Proceedings of AAAI, pp. 308–313, 1984.
[54] P. Suppes and M. Zanotti, On using random relations to generate upper and lower probabilities, Synthese 36 (1977), 427–440.
[55] Y. Tang and J. Zheng, Dempster conditioning and conditional independence in evidence theory, Advance in Artificial Intelligence, vol. 3809/2005, Springer Berlin/Heidelberg, 2005, pp. 822–825.
[56] P. Walley, Statistical reasoning with imprecise probabilities, Chapman and Hall, London, 1991.
[57] L. A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1 (1978), 3–28.