Improved interactive acquisition of spatial proximity
relation using Dempster-Shafer theory for fuzzy sets
Sourjya Sarkar 1 and Debnath Mukherjee 2 and Anupam Basu 3
Abstract. This paper proposes a method for modeling a user’s
concept of spatial proximity through an interactive session using
Dempster-Shafer theory for fuzzy sets. A fuzzy set (‘near’) characterized by a set of points located at fixed distances from a reference is initialized using a negative exponential function such that
the membership values monotonically decrease with increasing distances. In each iteration of the interactive session, a selectively chosen subset of points labelled with a nearness measure is displayed
on screen. User feedback representing the user’s partial agreement with the model is used to derive belief and plausibility measures for the fuzzy set, which in turn are used to adapt a parameter of the membership function. The user model is iteratively updated using fuzzy union or intersection operations on the adapted fuzzy set. The session is terminated once a proposed convergence criterion is satisfied.
Experiments involving interactions from several users are conducted
to validate the proposed approach. A comparative study is performed
where ‘proximity modeling’ using only fuzzy sets is considered as a
baseline. Observations reveal that the proposed method outperforms
the baseline in terms of accuracy of the acquired user-specific ‘proximity models’. It is also demonstrated that the proposed approach converges in significantly fewer user-feedback iterations than the baseline.
1 INTRODUCTION
Qualitative spatial relations [3][7] are important for users who query
geographical information systems. While some spatial relations such
as IN (e.g., “Point A is IN Region B”) or NORTH OF (e.g.,“Point
A is NORTH OF point B”) can be specified crisply, others such
as “NEAR” and “FAR” suffer from vagueness due to the inherent fuzziness of these linguistic terms. Moreover, experiments such as [4][7] suggest that the perception of the term “NEAR” varies from user to user, which makes proximity calculation even more difficult.
Thus one of the problems in answering GIS queries involving “NEAR” is to understand the user’s perception of the relation, e.g., by forming a model of that perception. Related work on modeling the “NEAR” relation has so far been confined to fuzzy logic theory [5]. A fuzzy logic based proximity calculation is described in [4], where the program uses artificial intelligence (AI) techniques such as “parameter adjustment”, “generate-and-test” and “search heuristics” to interactively acquire the notion of “near” for a person. However, this work suffers from two major shortcomings, as discovered during experimentation: 1) a large number of iterations (often exceeding 12) is required for adapting the parameter; 2) even after adaptation, the acquired models are not as desired.
1 Indian Institute of Technology Kharagpur, email: [email protected]
2 TCS Innovations Lab Kolkata, email: [email protected]
3 Indian Institute of Technology Kharagpur, email: [email protected]
This paper proposes an approach for computing the proximity relation by augmenting the fuzzy set theory based approach with Dempster-Shafer (DS) theory [6] [1]. Dempster-Shafer theory allows the specification of probability intervals, unlike pure Bayesian probability theory (which assumes that crisp probabilities are known), and thus allows for imprecise evidence. It is shown that using this augmented scheme, the number of iterations required to converge to the user’s model of proximity is reduced considerably and the final result is also more desirable (compared to using only fuzzy logic).
The unique contribution of this paper is the application of fuzzy set
theory augmented by Dempster-Shafer theory to model the proximity
relation to give an enhanced user experience with fast convergence to
the user’s model of proximity. A wide set of empirical observations validates that the acquired proximity model closely resembles the user’s actual perception. Note that the scope of the present work is restricted to nearness that depends strictly on distance measures. Thus, other confounding aspects that may affect a user’s notion of proximity, such as the mode of transport [7] or physical and psychological nuances of individual perception, are deliberately excluded from the present work.
The rest of the paper is organized as follows. Section 2 gives a formal statement of the problem addressed in the present work. Section
3 briefly introduces the baseline system used for comparison in the
present work. Section 4 discusses the proposed method of acquiring
spatial proximity relation. The experiments conducted are discussed
in Section 5 followed by results and discussion in Section 6. A brief
conclusion of the work is given in Section 7.
2 PROBLEM STATEMENT
The problem is to derive fuzzy membership values for the proximity relation. As in [4], the universe of discourse is defined by a set X consisting of discrete elements $\{x_i\}_{i=1}^{n}$, where $x_i$ denotes the distance of the $i$th point from a fixed reference. Thus, for n discrete points, the set consists of the distances $x_1, x_2, \ldots, x_n$. The fuzzy set “near” is described by the set $A = \{\mu_a(x_1)/x_1, \mu_a(x_2)/x_2, \ldots, \mu_a(x_n)/x_n\}$. The problem is to determine the membership values $\mu_a$ of the elements of the set above, i.e., to determine the function $\mu_a$ for all possible values of the $x_i$. For a specific user, the goal is to find the $\mu_a$ values that correctly model the user.
3 RELATED WORK
The interactive acquisition of a spatial proximity model using fuzzy sets was introduced by Robinson in [4]. Figure 1 shows a summarized block diagram of the method. A user’s concept of near (C) is iteratively learnt through a question-answering (Q/A) session. A fuzzy set “near”, represented by A and defined on the computer’s universe of discourse X, is initialized. Typically, a negative exponential is chosen as the membership function of the fuzzy set, so that membership monotonically decreases with increasing distance of points from a fixed reference.
$$\mu_A(x_i) = \exp\{-\alpha x_i\}, \quad i = 1, 2, \ldots, n \tag{1}$$
Figure 1. Block diagram of the baseline approach. (Blocks: Initialization of Fuzzy Set (User Model); Selection of a single point to be displayed on screen; User’s binary response (yes/no); Adapting Membership Function Parameter and Updation of model; Convergence criteria satisfied? (Yes/No); Stop.)
Based on the user’s binary response (yes/no) to the question of whether a selected point is ‘near’ or ‘not near’, the current concept is modified by fuzzy operations (union/intersection) on the point’s membership value with the pre-existing concept.
$$C_k = \begin{cases} C_{k-1} \cup \mu_A(x), & \text{yes} \\ C_{k-1} \cap (1 - \mu_A(x)), & \text{no} \end{cases} \tag{2}$$
A parameter (α), which determines the spread of the membership function, is adaptively changed in each iteration ‘k’ based on the user’s response, to capture the boundary between the user’s concept of near and not near.
$$\alpha_k = \begin{cases} \log(0.85)/x_i, & \text{yes} \\ \log(0.50)/x_i, & \text{no} \end{cases} \tag{3}$$
where $x_i$ is a point selected by maximizing the expected Kaufmann’s index of fuzziness [2] (refer to Section 4 for more details). A number of distinct drawbacks can be observed in the above process.
1. Firstly, the interaction essentially requires a binary response from a user for any given question, thereby providing no scope for capturing any degree of confidence in the assertion. This implicitly affects the acquisition process since the adaptation of the α parameter is crisp.
2. Secondly, since only a single point appears on the screen each time, there is no scope for comparing relative distances.
3. Thirdly, certain integers are assigned to each point of the fuzzy set while displaying them during user interaction (refer to Figure 6 in Section 5 for more details). These numbers, which represent the degree of nearness of a point for a user, are derived based on the membership of the point in lambda cuts of the entire fuzzy set. Since a fixed set of thresholds (lambdas) is used for deriving the lambda cuts, it results in erroneous intervals.
4. Lastly, the convergence is very slow, thereby requiring a prolonged user interaction.
4 PROPOSED METHOD
The proposed method aims to efficiently model a user’s concept of “nearness”. Specific emphasis is laid on overcoming the drawbacks of the baseline method [4] mentioned in the previous section, using DS theory [1]. Figure 2 shows a block diagram of the proposed approach.
Figure 2. Block diagram of the proposed approach. (Blocks: Initialization of Fuzzy Set (User Model); Selection of points to be displayed on screen; User’s feedback in a scale of 0 to 10; Derivation of Belief/Plausibility function using Dempster-Shafer theory; Adapting Membership Function Parameter and Updation of model; Convergence criteria satisfied? (Yes/No); Stop.)
The major steps involved in the process can be outlined as
1. Initialization of the fuzzy set (user’s concept) i.e., membership
function and the tuning parameter.
2. Selection of points to be displayed on screen according to their
sampling weights and re-adjustment of the sampling weights according to a normalized fuzziness measure.
3. Interpretation of user feedback and derivation of belief functions
according to Dempster-Shafer theory.
4. Modification of the fuzzy membership function and updation of
the user model.
In the following sections, we briefly explain each of these.
4.1 INITIALIZATION OF THE FUZZY SET
This step is similar to that of the baseline as discussed earlier in Section 3. The universe of discourse X is defined by a pre-determined set of ‘n’ points whose distances from a fixed reference R are given by $x_1, x_2, \ldots, x_n$. A user model consists of a set $C_k$ which is updated in each iteration ‘k’ using a union/intersection operation on a fuzzy set “near”, represented by A and defined on X. Intuitively, the points closer to the reference should have a higher membership value than those farther away. A preferable choice of membership function satisfying this property is the negative exponential $\mu_A(x_i) = \exp\{-\alpha x_i\}$ as defined in Eq. 1, where α is a tuning parameter which determines the spread of the function. For a suitable comparison of the proposed method with the baseline [4], similar choices of the initialization parameters are used. The parameter is initialized as
$$\alpha = \log(0.2)/\max(x_i) \tag{4}$$
where $\max(x_i)$ is the maximum distance of a point from the reference. Adjustment of the tuning parameter plays a crucial role in the modeling process. As observed from the nature of the membership function, increasing the parameter results in higher resolution of points closer to R and vice-versa.
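The initialization can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation, and the names `init_alpha`, `membership` and `far_membership` are hypothetical. One assumption is made explicit: since log(0.2) is negative, the sketch uses the magnitude of the ratio in Eq. 4 so that exp{−αx} decays with distance and the farthest point receives a membership of 0.2.

```python
import math


def init_alpha(distances, far_membership=0.2):
    """Return a positive decay rate so the farthest point gets `far_membership`.

    Assumption: the magnitude of log(0.2) / max(x_i) from Eq. 4 is used, so
    that exp(-alpha * x) decreases with distance as the text requires.
    """
    return -math.log(far_membership) / max(distances)


def membership(x, alpha):
    """Negative exponential membership of Eq. 1 for a single distance x."""
    return math.exp(-alpha * x)


# Hypothetical distances of four points from the reference R.
distances = [1.0, 2.5, 4.0, 10.0]
alpha0 = init_alpha(distances)
near = [membership(x, alpha0) for x in distances]
```

Under this reading, a larger α makes the membership drop faster, which concentrates the distinguishable membership values on the points closest to R.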
4.2 ADJUSTING SAMPLING WEIGHTS FOR SELECTION OF POINTS TO BE DISPLAYED
The selection of points to be displayed for user interaction should be in accordance with a fuzziness measure. An index of fuzziness indicates how closely the user’s concept fits a crisp (non-fuzzy) concept. This is in conformity with the concept of defuzzification [5], which aims to discriminate between the user’s concept of nearness and not nearness.
Similar to the baseline [4], the present work uses Kaufmann’s (Kf) measure of fuzziness [2] for selecting points. However, in contrast to the baseline method, where only a single point (which maximizes the expected Kf measure) is selected for user interaction, in the present work the expected Kf values are used as sampling weights for selecting multiple points randomly. Displaying multiple points on screen allows a user to visually compare relative distances between them, thus providing scope for better elicitation of his/her notion of nearness. To ensure a valid probability distribution, all Kf values are normalized to the unit interval [0 1] by dividing individual values by the sum of all values. A fuzzy subset B (B ⊂ A) is defined by a fixed number of points (say ‘m’) selected from the original fuzzy set A by discrete random sampling in each iteration. The Kf index is defined as a distance measure between the fuzzy set A and an ordinary set S in n-dimensional space.
$$I = \frac{2}{\sqrt{n}} \Big\{ \sum_{x \in X} [\mu_A(x) - \mu_S(x)]^{2} \Big\}^{1/2} \tag{5}$$
where $\mu_S(x)$ is the membership function of the crisp set S
$$\mu_S(x) = \begin{cases} 0.0, & \mu_A(x) < 0.5 \\ 1.0, & \mu_A(x) \ge 0.5 \end{cases}$$
The expected Kf index is given by
$$E(I) = \gamma I_{yes} + \eta I_{no} \tag{6}$$
where $I_{yes}$ and $I_{no}$ are the indices of fuzziness for the fuzzy user concepts given by Eq. 2, $\gamma = \exp\{-2\alpha x_i\}$ and $\eta = 1 - \gamma$.
4.3 APPLICATION OF DEMPSTER-SHAFER THEORY IN THE PROPOSED FRAMEWORK
The Dempster-Shafer (DS) [6] [1] theory of evidence allows one to capture the vagueness (imprecision) of assigning an element to one or more crisp sets. This is achieved by means of plausibility and belief functions derived from a belief measure (mass). The mass expresses the degree of support (evidence) for a collection of elements defined by one or more crisp sets on the power set of the universe. It is formally represented as a mapping (also known as a basic probability assignment (bpa)) from the power set of the universe (X) to the unit interval.
$$m : 2^X \to [0, 1] \tag{7}$$
The belief (bel) and plausibility (pl) functions, i.e., the minimum and maximum evidence in support of a hypothesis (crisp set $A \in 2^X$), respectively, are given by
$$bel(A) = \sum_{B \subseteq A} m(B) ; \quad pl(A) = \sum_{B \cap A \neq \phi} m(B) \tag{8}$$
where ϕ is the null set. It was shown in [8] that DS theory can be generalized to fuzzy sets with a non-fuzzy bpa. The two major steps of the process comprise
1. Decomposition of a fuzzy set A (interchangeably called a focal element) into non-fuzzy subsets $A_\lambda \subseteq X$ using lambda cuts as follows
$$A_\lambda = \{x \mid \mu_A(x) \ge \lambda\} \tag{9}$$
where $\mu_A(x)$ is the membership function of A. Assuming a total of n elements in A, the masses of the decomposed subsets are given by
$$m(A_{\lambda_i}) = (\lambda_i - \lambda_{i-1}) \times m(A), \quad i = 1, 2, \ldots, n \tag{10}$$
where $\lambda_0 = 0$ and $\lambda_n = 1$.
2. Calculating the probability mass that the focal elements A induce on the belief and plausibility of a fuzzy subset (B)
$$bel(B) = \sum_{A} m(A) \sum_{i} |\lambda_i - \lambda_{i-1}| \times \inf_{x \in A_{\lambda_i}} \mu_B(x) \tag{11}$$
$$pl(B) = \sum_{A} m(A) \sum_{i} |\lambda_i - \lambda_{i-1}| \times \sup_{x \in A_{\lambda_i}} \mu_B(x) \tag{12}$$
where inf and sup denote the infimum and supremum of a set, respectively. Note that when only a single focal element is available (as in our case), the outer sum can be omitted entirely.
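The two steps above can be sketched in Python for the single-focal-element case. This is an illustrative reading of Eqs. 9 to 12, not the authors' code. The function name `belief_plausibility` is hypothetical. The sketch also assumes that the λ levels are the distinct non-zero membership values of A in ascending order, and that a point outside B has µ_B = 0.

```python
def belief_plausibility(mu_a, mu_b, mass=1.0):
    """Return (bel(B), pl(B)) induced by a single fuzzy focal element A.

    mu_a, mu_b: membership values of A and B over the same points of X.
    mass: basic probability assigned to A (e.g. the scaled user feedback).
    """
    # Assumed lambda levels: distinct non-zero memberships of A, ascending.
    levels = sorted({m for m in mu_a if m > 0.0})
    bel = pl = 0.0
    prev = 0.0  # lambda_0 = 0
    for lam in levels:
        # Lambda cut A_lambda (Eq. 9): indices with membership >= lambda.
        cut = [i for i, m in enumerate(mu_a) if m >= lam]
        weight = lam - prev  # share of m(A) given to this cut (Eq. 10)
        bel += weight * min(mu_b[i] for i in cut)  # infimum term of Eq. 11
        pl += weight * max(mu_b[i] for i in cut)   # supremum term of Eq. 12
        prev = lam
    return mass * bel, mass * pl


# Hypothetical three-point example; the third point is not part of B.
mu_a = [1.0, 0.6, 0.2]
mu_b = [1.0, 0.6, 0.0]
bel_b, pl_b = belief_plausibility(mu_a, mu_b, mass=0.8)
```

In this example the widest cut contains a point that B excludes, so it contributes nothing to the belief but still counts fully towards the plausibility, giving bel(B) ≈ 0.512 and pl(B) ≈ 0.8.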
The salient advantage of DS theory is its framework for assigning a belief measure to the entire fuzzy set, which can be decomposed amongst its elements and subsequently used for deriving the belief (and plausibility) functions for the same. In the proposed method, this framework is exploited to capture a user’s feedback on a scale of [0 10], where 0 and 10 indicate total disagreement and perfect agreement with the model, respectively.
The fuzzy subset B comprises the set of points chosen randomly by sampling the normalized Kaufmann’s index as discussed in Section 4.2. The user’s feedback is interpreted as the belief measure (Eq. 7) by scaling. It is reasonable to assume that, irrespective of the source of evidence (individual user), a higher mass (user feedback) indicates greater compliance with the learned model. The feedback (f) in the kth iteration is scaled to the unit interval [0 1] as follows
$$mass_k = f/10 \tag{13}$$
The belief and plausibility functions of B are derived using Eqs. 11 and 12 and subsequently used to update the user model as discussed in the next section.
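The selection of B (Section 4.2) and the feedback scaling of Eq. 13 can be sketched as follows. This is a hedged illustration: all function names are hypothetical, the Kf index is taken in its quadratic (Euclidean distance) form, and the per-point expected Kf values are supplied as made-up inputs rather than computed from Eq. 6. Sampling without replacement is also an assumption, made so that the m displayed points are distinct.

```python
import math
import random


def kaufmann_index(mu):
    """Quadratic index of fuzziness between mu and its nearest crisp set."""
    crisp = [1.0 if m >= 0.5 else 0.0 for m in mu]
    dist = math.sqrt(sum((m - s) ** 2 for m, s in zip(mu, crisp)))
    return 2.0 / math.sqrt(len(mu)) * dist


def normalize(values):
    """Scale values by their sum so that they form a probability distribution."""
    total = sum(values)
    return [v / total for v in values]


def sample_points(weights, m, rng):
    """Draw m distinct point indices with probability proportional to weight."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(min(m, len(remaining))):
        w = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def scale_feedback(f):
    """Map a user rating on the 0 to 10 scale to a mass in [0, 1] (Eq. 13)."""
    return f / 10.0


# Hypothetical expected Kf values for five candidate points.
expected_kf = [0.10, 0.40, 0.25, 0.05, 0.20]
weights = normalize(expected_kf)
subset_b = sample_points(weights, m=3, rng=random.Random(0))
mass_k = scale_feedback(7)
```

Points with a larger expected Kf value are more likely to be shown, but no point is excluded outright as long as its weight is non-zero.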
4.4 UPDATION OF THE USER MODEL
Updation of the user model comprises three major steps, as discussed in the following points
1. Fuzzy Parameter Tuning:- The belief interval, i.e., [bel(B) pl(B)], represents the uncertainty associated with the hypothesis B. A higher belief and a narrower belief interval therefore suggest stronger evidence in support of the hypothesis. Taking this into account, the parameter α of $\mu_A$ is updated in the kth iteration according to the following equation
$$\alpha_k = bel(B)/(pl(B) - bel(B)) \tag{14}$$
2. Concept Modification:- The updation of the user’s model $C_k$ in the kth iteration is similar to the baseline, except that the user feedback in two successive iterations is compared in order to minimize inconsistency in modeling
$$C_k = \begin{cases} C_{k-1} \cup \mu_A(x), & mass_k \ge mass_{k-1} \\ C_{k-1} \cap (1 - \mu_A(x)), & mass_k < mass_{k-1} \end{cases} \tag{15}$$
3. Convergence criterion:- The model is assumed to have converged if two successive iterations produce no changes in the fuzzy membership values. The process is then terminated.
$$\mu_A^k(x) = \mu_A^{k-1}(x) \tag{16}$$
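The update steps of Eqs. 14 to 16 can be sketched in Python as below. The function names are hypothetical, and the sketch adds three assumptions that the text does not state: fuzzy union and intersection are taken as pointwise max and min, a small `eps` guards against division by zero when pl(B) equals bel(B), and convergence is tested with a numerical tolerance rather than exact equality.

```python
def update_alpha(bel, pl, eps=1e-9):
    """Eq. 14: belief divided by the width of the belief interval.

    eps is an assumed guard for the degenerate case pl == bel.
    """
    return bel / max(pl - bel, eps)


def update_concept(concept, mu_a, mass_k, mass_prev):
    """Eq. 15: widen or narrow the concept depending on the feedback trend."""
    if mass_k >= mass_prev:
        # Fuzzy union, taken here as the pointwise maximum.
        return [max(c, m) for c, m in zip(concept, mu_a)]
    # Fuzzy intersection with the complement, taken as the pointwise minimum.
    return [min(c, 1.0 - m) for c, m in zip(concept, mu_a)]


def converged(mu_k, mu_prev, tol=1e-6):
    """Eq. 16, relaxed to an assumed tolerance on each membership value."""
    return all(abs(a - b) <= tol for a, b in zip(mu_k, mu_prev))


# Hypothetical values for one iteration.
alpha_k = update_alpha(bel=0.512, pl=0.8)
widened = update_concept([0.5, 0.2], [0.9, 0.1], mass_k=0.8, mass_prev=0.6)
narrowed = update_concept([0.5, 0.2], [0.9, 0.1], mass_k=0.4, mass_prev=0.6)
```

A rating that does not drop between iterations widens the concept, while a drop narrows it, which is how successive feedback values are compared in this reading of Eq. 15.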
The rationale behind the choice of the proposed convergence criterion is intuitive. As shown in Figure 2, the acquisition process operates as a feedback system, adapting the fuzzy set based on the user response (feedback). On the verge of complete acquisition, when the acquired model closely approximates the user’s concept, fluctuations in the fuzzy membership values are expected to be minimal. A feedback in any subsequent iteration which causes a steep variation in the membership function can be regarded as self-contradictory or inconsistent with the user’s concept learned so far. This implies that the method models the user response as a first-order Markov process. However, higher-order models could reasonably be applied for improved accuracy.
5 EXPERIMENTS
All experiments were conducted using the MATLAB software. A set of n = 24 points, randomly scattered around a fixed reference point, was simulated. Twenty users individually participated in the concept acquisition process and interacted using the Q/A session. Two sets of experiments, i.e., the baseline and the proposed method, were carried out individually for each user as discussed in Sections 3 and 4, respectively. In the baseline experiment, a single question (point) was selected in each iteration by maximizing the expected Kaufmann’s index (Eq. [6]). The model parameters were updated accordingly [4] using Eqs. [1-2] until the expected Kf value crossed the threshold of 0.4. In the proposed method, m = 6 (empirically chosen) points were selected in each iteration for user display by random sampling of normalized Kf values, as discussed in Section 4.2. This was followed by deriving the belief measures and updation of the user model (Eqs. [9-16]) until the convergence criterion was satisfied. In order to display integer labels for each point displayed on screen (refer to Figure 6), the range of membership values was divided into 8 (empirically chosen) uniform intervals ($z_1, \ldots, z_8$) in descending order. The integer label (indicative of the degree of nearness) to be displayed on screen for a given point ($x_i$) was given by
$$lab(x_i) = 10 - j, \quad z_{j-1} < \mu_A(x_i) \le z_j \tag{17}$$
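The labelling rule of Eq. 17 can be sketched in Python under one stated interpretation: the observed range of membership values is split into 8 equal-width intervals, interval j = 1 holds the highest memberships, and a point in interval j receives the label 10 − j. The function name `nearness_labels` and the handling of interval boundaries are assumptions, not details taken from the paper.

```python
def nearness_labels(mu, n_intervals=8):
    """Assign integer nearness labels 10 - j from uniform membership intervals.

    Interval j = 1 covers the highest memberships (assumed reading of the
    'descending order' in the text), so labels run from 9 down to 2.
    """
    hi, lo = max(mu), min(mu)
    width = (hi - lo) / n_intervals
    labels = []
    for m in mu:
        if width == 0.0:
            j = 1  # all memberships equal: a single interval
        else:
            # Index of the interval counted from the top, clamped to the last.
            j = min(int((hi - m) / width) + 1, n_intervals)
        labels.append(10 - j)
    return labels


# Hypothetical membership values for four points.
labels = nearness_labels([0.8, 0.75, 0.45, 0.0])
```

Because the intervals are derived from the current range of membership values rather than from fixed thresholds, the labels rescale automatically as the membership function is adapted.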
This labelling strategy could also overcome the drawbacks of the naive thresholding scheme discussed in point 3 of Section 3.
5.1 VALIDATION OF PREDICTED MODELS BY LINEAR REGRESSION
In order to validate the accuracy of the acquired models, a separate experiment was conducted in which users were asked to manually label each point of five randomly generated datasets. The assigned labels were compared with the predicted labels derived from the corresponding user’s proximity model by linear regression. In other words, the predicted labels were approximated in terms of the assigned labels in a least-squares sense. The method is described as follows. Let $X^j = \{x_i^j, y_i^j\}_{i=1}^{24}$ be the $j$th dataset ($j = 1, 2, \ldots, 5$). Let $r_i^j$, $p_i^j$ be the user-defined and predicted labels for the $i$th point in the $j$th dataset, respectively. Without loss of generality, the averages of the labels, given by $r_i = \frac{1}{5}\sum_{j=1}^{5} r_i^j$ and $p_i = \frac{1}{5}\sum_{j=1}^{5} p_i^j$, were considered for the regression model defined by $P = \alpha R + \beta$, where $P = [p_1, p_2, \ldots, p_{24}]^T$ and $R = [r_1, r_2, \ldots, r_{24}]^T$ are column vectors composed of the average predicted and user-defined labels, while α and β are linear regression parameters. The parameters derived by minimizing the least squares are given by
$$A = (\hat{R}^T \hat{R})^{-1} \hat{R}^T P \tag{18}$$
where $\hat{R} = [R \;\; \mathbf{1}]$, $A = [\alpha \;\; \beta]^T$ and $\mathbf{1} = [1, 1, \ldots]^T$ is a vector of ones.
Figure 3. Figure showing the regression models corresponding to (a) Best and (b) Worst cases. The x-axis and y-axis represent the user-defined labels (R) and the predicted labels from the corresponding user’s proximity model (P), respectively. The red and black lines represent the line of best fit and the line of perfect fit (P=R), respectively.
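The least-squares fit of Eq. 18 can be sketched in plain Python. For a single regressor with an intercept, the normal-equation solution reduces to the closed-form slope and intercept used below. The function names are hypothetical, and computing the mean squared error between predicted and user-defined labels is an assumed reading of the error measure.

```python
def fit_line(r, p):
    """Least-squares slope and intercept of the model P = alpha * R + beta."""
    n = len(r)
    mean_r = sum(r) / n
    mean_p = sum(p) / n
    s_rr = sum((ri - mean_r) ** 2 for ri in r)
    s_rp = sum((ri - mean_r) * (pi - mean_p) for ri, pi in zip(r, p))
    slope = s_rp / s_rr
    intercept = mean_p - slope * mean_r
    return slope, intercept


def mean_squared_error(r, p):
    """Mean squared deviation of predicted labels from user-defined labels."""
    return sum((pi - ri) ** 2 for ri, pi in zip(r, p)) / len(r)


# Hypothetical averaged labels for four points.
user_labels = [1.0, 2.0, 3.0, 4.0]
predicted_labels = [1.5, 2.5, 3.5, 4.5]
slope, intercept = fit_line(user_labels, predicted_labels)
mse = mean_squared_error(user_labels, predicted_labels)
```

A slope close to 1 and an intercept close to 0 correspond to a line of best fit that nearly coincides with the line of perfect fit P = R.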
6 RESULTS & DISCUSSION
Figure 4 shows the individual user feedback (Eq. [13]) to the initial model (blue bars), the baseline proximity model (green bars) and the proposed proximity model (red bars) after a complete acquisition process. With the exception of a few outliers, the proposed model outperforms the baseline significantly in most cases. It also performs considerably better than the initial model. It is interesting to note that in a number of cases the baseline model performs worse than the initial model, thereby validating the typical drawbacks of the baseline highlighted in Section 3.
Figure 4. Figure showing individual user feedback for the proximity models. (x-axis: User Number; y-axis: User’s Feedback; legend: Initial Model, Baseline Model, Proposed Model.)
The high user ratings assigned to the final proposed model further corroborate the significance of utilizing the non-binary user responses in support of the model using DS theory. Figure 5 shows the number of iterations (i.e., questions asked) required for modeling individual users for the baseline method (red) and the proposed method (blue). A significant difference in the required number of iterations can be observed. On average, at least 12 iterations were required for the baseline method, compared with only 4 iterations for the proposed method. This shows that the proposed method is efficient and user-friendly.
Figure 5. Figure showing number of iterations required per user. (x-axis: User Number; y-axis: Number of iterations; legend: Baseline, Proposed Approach.)
Figure 3 shows the regression models corresponding to the best and worst fits. The deviation between the line of best fit (red) (obtained by Eq. [18]) and the line of perfect fit (black) is a measure of the errors of the predicted models. The mean squared errors of prediction obtained in the best and worst cases were 0.32 and 0.71, respectively. The low error range is suggestive of the efficiency of the acquired models. This is also apparent from the significant number of points intersecting the line of perfect fit.
A typical screenshot of the interaction model is shown in Figure 6. The red bubbles represent the fuzzy set A of points at fixed distances from the reference (represented by a red cross). The number enclosed in each bubble represents the degree of nearness of the point on a scale interpreted as ((9-8) - very near, (7-5) - near, (4-3) - not so near, (<3) - far). Each of these integer labels corresponds to a lambda-cut derived by thresholding the fuzzy set A as defined by Eq. [17]. Assignment of a label to each point was determined by its crisp membership in the corresponding lambda-cut. The poor performance of the baseline model can be noticed from the large number of inconsistent labels on screen. In contrast, the final model learned using the proposed method is free from such anomalies.
7 CONCLUSION
In this paper we proposed the acquisition of spatial proximity relation
from individual users using Dempster-Shafer Theory for fuzzy sets.
The proximity models derived from a large number of users revealed
that the proposed approach performed efficiently and effectively in
capturing a user’s concept of nearness. Through a comparative study,
involving subjective tests from individual users, it was demonstrated
that the proposed approach outperformed the baseline method (using
only fuzzy sets) both in terms of the number of iterations and quality of models obtained. Future work may include exploring agent-based systems or reinforcement learning for acquiring proximity models.
Acknowledgements
The authors are grateful to everyone who willingly participated in
the user interaction process.
REFERENCES
[1] Arthur P. Dempster, ‘A generalization of Bayesian inference’, Journal of
the Royal Statistical Society, 30(2), 205–247, (1968).
[2] A. Kaufmann, Introduction to the theory of fuzzy subsets, Academic
Press New York, 1975.
[3] Matteo Palmonari and Davide Bogni, ‘Commonsense spatial reasoning
about heterogeneous events’, in Proceedings of the 1st International
Workshop on Stream Reasoning, (2009).
[4] Vincent B. Robinson, ‘Interactive machine acquisition of a fuzzy spatial
relation’, Computers & Geosciences, 16, 857–872, (1990).
[5] Timothy J. Ross, Fuzzy Logic with Engineering Applications, Wiley publishers, 2008.
[6] Glenn Shafer, A Mathematical Theory of Evidence, Princeton University
Press, 1976.
[7] Xiaobai Yao and Jean-Claude Thill, ‘Spatial queries with qualitative locations in spatial information systems’, Computers, Environment and
Urban Systems, 30(4), 485–502, (2006).
[8] John Yen, ‘Generalizing the Dempster-Shafer theory to fuzzy sets’, IEEE
Transactions on Systems Man and Cybernetics, 20, 559–570, (1990).
Figure 6. Figure showing screen-shots of a user’s acquired proximity models using (a) Baseline and (b) Proposed method. The red bubbles represent points in the fuzzy set A. The red cross denotes the reference. The integer label enclosed within a bubble indicates its degree of nearness to the reference. (Axes: X, Y; legend: (9−8) very near, (7−5) near, (4−3) not so near, (<3) far.)