
A Strategy Selection Framework
for Adaptive Prefetching
in Visual Exploration
Punit R. Doshi, Geraldine E. Rosario, Elke A. Rundensteiner,
and Matthew O. Ward
Computer Science Department
Worcester Polytechnic Institute
Supported by NSF grant IIS-0119276.
Presented at SSDBM2003, July 10, 2003.
Motivation
• Why visually explore data?
– Ever-increasing data set sizes make data exploration infeasible
– Possible solution: Interactive Data Visualization - humans can detect certain patterns better and faster than data mining tools
• Why cache and prefetch?
– Interactive visualization tools do not scale well, yet we need real-time response
Example Visual Exploration Tool: XmdvTool
[Figure: flat display, data hierarchy, and hierarchical display]
Example Visual Exploration Tool: XmdvTool
[Figure: drill-down and roll-up with structure-based brushes 1 and 2, each linked to a parallel coordinates display]
Characteristics of a Visualization Environment Exploited for Prefetching
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
[Figure: structure-based brush navigation - move up/down, move left/right]
Overview of Prefetching
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
These characteristics mean the user's next request can be predicted with high accuracy, and the idle time gives the prefetcher time to prefetch.
[Figure: on a new user query the display is served from the cache, while during idle time the prefetcher fetches predicted data from the DB into the cache]
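The flow on this slide can be illustrated, under stated assumptions, as a small event loop. The sketch below is not XmdvTool's code; the dict-based cache and the helper names (fetch_from_db, predict_next, on_user_idle, on_user_query) are hypothetical.

```python
# A minimal sketch (not XmdvTool's actual code) of prefetching during idle time:
# while the user studies the display, the predicted next region is loaded from the
# database into the cache so that the next query can often be answered locally.

cache = {}

def fetch_from_db(region):
    """Placeholder for the slow database fetch of one hierarchy region."""
    return f"data for region {region}"

def on_user_idle(current_region, predict_next):
    """Runs during idle time: prefetch the region the strategy predicts next."""
    predicted = predict_next(current_region)
    if predicted not in cache:
        cache[predicted] = fetch_from_db(predicted)  # warm the cache ahead of the request

def on_user_query(region):
    """Runs on a new user query: hit the cache if the prediction was correct."""
    if region not in cache:
        cache[region] = fetch_from_db(region)        # miss: the user waits on the DB
    return cache[region]
```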
Static Prefetching Strategies
• Random Strategy: predict each of the four neighboring regions with equal probability (1/4).
• Direction Strategy: after a move from region (m-1) to region m, predict region (m+1), i.e. continue in the same direction.
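As a concrete illustration of these two strategies (assuming, for simplicity, that brush regions are indexed by integers and that each region knows its neighbors), a minimal sketch:

```python
import random

# Illustrative sketches of the two static strategies; neither is taken from XmdvTool.

def random_strategy(neighbors):
    """Random: predict one of the neighboring regions with equal probability
    (1/4 each when there are four neighbors: up, down, left, right)."""
    return random.choice(neighbors)

def direction_strategy(previous, current):
    """Direction: if the user moved from region (m-1) to region m,
    predict region (m+1), i.e. assume the movement continues the same way."""
    return current + (current - previous)

print(random_strategy([1, 3, 10, 12]))  # any neighbor, each with probability 1/4
print(direction_strategy(4, 5))         # -> 6
```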
Drawbacks of Static Prefetching
• Lacks a feedback mechanism → generates predictions independent of past performance.
• Different users have different exploration patterns → no single strategy will work best for all users.
• A user's pattern may be changing within the same session → a single strategy may not be sufficient within one user session.
This calls for Adaptive Prefetching – changing prediction behavior in response to changing data access patterns.
Types of Adaptive Prefetching
• Fine-tuning one strategy:
– Change the parameter values of one strategy over time, depending on past performance
• Strategy selection among several strategies:
– Given a set of strategies, allow the choice of strategy to change over time within the same session, depending on past performance
Strategy Selection
Requirements for strategy selection:
1. Set of strategies to select from
2. Performance measures
3. Fitness function
4. Strategy selection policy
Set of Strategies & Performance Measures

Strategies and the performance measures tracked for each:

  Strategy       #Correctly Predicted   #Not Predicted   #Mis-Predicted
  No Prefetch
  Random
  Direction

Each user request is classified by comparing what the prefetcher predicted with what the user required:

                                  Required by user: Yes   Required by user: No
  Predicted by prefetcher: Yes    Correctly predicted     Mis-predicted
  Predicted by prefetcher: No     Not predicted
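The counts in this table can be maintained with simple per-strategy bookkeeping. The sketch below is an assumed implementation, not code from the paper; the `counts` structure and the `record` helper are hypothetical names.

```python
from collections import defaultdict

# For each user request we compare what the prefetcher predicted with what the user
# actually required, and update #CP, #NP, and #MP for the strategy that was active.

counts = defaultdict(lambda: {"CP": 0, "NP": 0, "MP": 0})

def record(strategy, predicted_regions, required_region):
    """Update the performance counts for one user request handled by `strategy`."""
    for region in predicted_regions:
        if region != required_region:
            counts[strategy]["MP"] += 1    # predicted but not required: mis-predicted
    if required_region in predicted_regions:
        counts[strategy]["CP"] += 1        # predicted and required: correctly predicted
    else:
        counts[strategy]["NP"] += 1        # required but not predicted: not predicted
```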
Fitness Function

For each strategy (No Prefetch, Random, Direction), the counts #CP, #NP, and #MP are combined into a local average misclassification cost:

\[ \text{misclassCost} = \frac{C_{NP} \cdot \#NP + C_{MP} \cdot \#MP}{\#CP + \#NP + \#MP} \]

where C_NP is the cost of no prediction, C_MP is the cost of a mis-prediction, and C_NP + C_MP = 1.

Other fitness functions:
• global average misclassification cost
• local average response time
• global average response time
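A direct translation of this fitness function into code might look as follows; the equal costs C_NP = C_MP = 0.5 are an assumed choice (the slide only requires that the two costs sum to 1).

```python
def misclass_cost(num_cp, num_np, num_mp, c_np=0.5, c_mp=0.5):
    """misclassCost = (C_NP * #NP + C_MP * #MP) / (#CP + #NP + #MP)."""
    total = num_cp + num_np + num_mp
    if total == 0:
        return 0.0                          # no requests observed yet
    return (c_np * num_np + c_mp * num_mp) / total
```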
Fitness Function Definitions

Global Average:
\[ \text{globalAvg}_t = \frac{\sum_{i=1}^{t} \text{misClassCost}_i}{t} \]

Local Average (using exponential smoothing):
\[ \text{localAvg}_t = \alpha \cdot \text{misClassCost}_t + (1 - \alpha) \cdot \text{localAvg}_{t-1} \]
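A minimal sketch of both averages, assuming the per-interval misclassification costs are collected in a list; the smoothing weight alpha = 0.3 and seeding the local average with the first cost are assumptions of this example.

```python
def global_avg(costs):
    """globalAvg_t: plain mean of all misclassification costs observed so far."""
    return sum(costs) / len(costs)

def local_avg(costs, alpha=0.3):
    """localAvg_t = alpha * misClassCost_t + (1 - alpha) * localAvg_{t-1}
    (exponential smoothing, so recent intervals weigh more)."""
    avg = costs[0]                          # seed with the first observation
    for cost in costs[1:]:
        avg = alpha * cost + (1 - alpha) * avg
    return avg
```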
Strategy Selection Policy

  Strategy      #Correctly Predicted   #Not Predicted   #Mis-Predicted   Local Avg. Misclassification Cost
  No Prefetch   12                     38               86               0.5
  Random        10                     116              148              0.4
  Direction     4                      125              107              0.3
  Overall       26                     279              341              0.4

Strategy selection policies:
1. Best
2. Proportionate
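The two policies could be sketched as follows over the per-strategy local average misclassification costs (lower is better); turning cost into fitness as 1 - cost for the Proportionate policy is an assumption of this example, not something specified on the slide.

```python
import random

def select_best(local_avg_cost):
    """Best: always pick the strategy with the lowest local average misclassification cost."""
    return min(local_avg_cost, key=local_avg_cost.get)

def select_proportionate(local_avg_cost):
    """Proportionate: pick a strategy with probability proportional to its fitness
    (here fitness = 1 - cost, which works because the cost lies in [0, 1])."""
    strategies = list(local_avg_cost)
    weights = [1.0 - local_avg_cost[s] for s in strategies]
    return random.choices(strategies, weights=weights, k=1)[0]

# Using the local average costs from the table above:
costs = {"No Prefetch": 0.5, "Random": 0.4, "Direction": 0.3}
print(select_best(costs))           # -> "Direction"
print(select_proportionate(costs))  # Direction is most likely, but not guaranteed
```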
Performance Evaluation
Setup:
• XmdvTool as testbed
• 14 real user traces analyzed
• User traces were analyzed for:
  • Tendency to move in the same direction
  • Frequency of movement
  • Size of sample focused on
• 3 user types: random-starers, indeterminates, directional-movers
We will show:
• Detailed analysis and results for 2 user traces
• Summary results for all user types
Directional User: Navigation Patterns Over Time
[Figure: % directional vs. time and # queries vs. time, over 29 one-minute snapshots]
• Average 73% directional
• Average 70 queries/min
• Navigation pattern changes over time
Directional User: Navigation Patterns Over Time
[Figure: regions visited per 1-minute snapshot, and brush movements (level, xycenter, extents) over time]
• Move up or down, then move left to right to left
Directional User: Directional prefetcher is best
• Its selection matched the mostly directional navigation pattern.
• Any kind of prefetching is better than none.
… but SelectBest is even better
[Figure: % of times each strategy (no-prefetch, random, directional) was selected vs. time]
• SelectBest chose Directional & No-Prefetching.
• No-Prefetching was selected when #queries/min is high & %dir is low.
Directional User: Other performance measures
[Figure: %Not Predicted vs. time and %Mis-Predicted vs. time for DIR, RANDOM, NO PREFETCH, and BEST]
• Misclassification cost = trade-off between %NP & %MP.
• SelectBest gave low %NP and high %MP.
Directional User: Other performance measures
[Figure: %Correctly Predicted vs. time and total response time vs. time for DIR, RANDOM, NO PREFETCH, and BEST]
• SelectBest gave the best %CP & response time, but this will not always be the case.
• Choice of fitness function is important.
Indeterminate User: Navigation Patterns Over Time
[Figure: regions visited per 1-minute snapshot, and brush movements (level, xycenter, extents) over time]
• Average 50% directional
• Average 40 queries/min
• Pattern changes over time
• Move right then perturb up & down; move left then perturb up & down.
Indeterminate User: SelectBest is better
[Figure: % of times each strategy (no-prefetch, random, directional) was selected vs. time]
• SelectBest chose Random & No-Prefetching.
• No-Prefetching was selected when #queries/min is high & %dir is low.
Summary Across All User Types
[Figure: global average misclassification cost (averaged) and normalized response time (averaged), per cluster (Random-Starers, Indeterminates, Directional-Movers) and per strategy (No Prefetch, Random, Direction, Best)]
• Experiments repeated 3x and averaged.
• Reduced prediction error for random-starers and directional-movers.
• No improvement in response time.
Related Work
• Adaptive Prefetching
  • Strategy Refinement – Davidson98, Tcheun97, Curewitz93, Kroeger96, Palpanas99
  • Learning – Agrawal95, Swaminathan00
• Adaptation Concepts – Mitchell99, Waldspurger94, Avnur00
• Performance Measures – Joseph97, Weiss25, Mitchell99
• Database support for Interactive Applications – Stolte02, Tioga96
Observations
• Prefetching is better than no prefetching.
• Different users have different navigation patterns, and the same user has varying navigation patterns within the same session.
• No single prefetcher works best in all cases.
• Strategy selection allows the prefetcher to adapt.
• Performance of strategy selection depends on the fitness function being optimized.
Contributions
• The first study of adaptive prefetching in the context of visual data exploration
• A proposed framework for adaptive prefetching via strategy selection, as opposed to the common approach of strategy refinement
• Empirical results showing the benefits of strategy selection over a wide range of user navigation traces
That’s all folks
XmdvTool Homepage:
http://davis.wpi.edu/~xmdv
[email protected]
Code is free for research and education.
Contact author: [email protected]