
Finding and Re-Finding
Through Personalization
Jaime Teevan
MIT, CSAIL
David Karger (advisor), Mark Ackerman, Sue Dumais, Rob Miller (committee),
Eytan Adar, Christine Alvarado, Eric Horvitz, Rosie Jones, and Michael Potts
Thesis Overview
• Supporting Finding
– How people find
– How individuals find
– Personalized finding tool
• Supporting Re-Finding
– How people re-find
– Finding and re-finding conflict
– Personalized finding and re-finding tool
Supporting Re-Finding
• How people re-find (query log analysis)
– People repeat searches
– Look for old and new
• Finding and re-finding conflict (query log analysis)
– Result changes cause problems
• Personalized finding and re-finding tool (the Re:Search Engine)
– Identify what is memorable (memorability study)
– Merge in new information
Related Work
• How people re-find
– Know a lot of meta-information [Dumais]
– Follow known paths [Capra]
• Changes cause problems re-finding
– Dynamic menus [Shneiderman]
– Dynamic search result lists [White]
• Relevance relative to expectation [Joachims]
Query Log Analysis
• Previous log analysis studies
– People re-visit Web pages [Greenberg]
– Query logs: Sessions [Jones]
• Yahoo! log analysis
– 114 people over the course of a year
– 13,060 queries and their clicks
• Can we identify re-finding behavior?
• What happens when results change?
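A minimal sketch of how such re-finding behavior might be identified in a log. The (user, time, query, clicked URL) record layout and the normalization are illustrative assumptions, not the study's actual pipeline:

```python
from collections import defaultdict

def refinding_rates(log):
    """log: (user, time, query, clicked_url) tuples; this layout and the
    normalization are assumptions, not the study's actual code."""
    past_queries = defaultdict(set)    # user -> queries issued so far
    past_clicks = defaultdict(set)     # user -> URLs clicked so far
    repeat_queries = repeat_clicks = total = 0
    for user, when, query, url in sorted(log, key=lambda e: e[1]):
        total += 1
        q = " ".join(query.lower().split())      # light normalization
        if q in past_queries[user]:
            repeat_queries += 1                  # re-issued an earlier query
        if url in past_clicks[user]:
            repeat_clicks += 1                   # re-clicked an earlier result
        past_queries[user].add(q)
        past_clicks[user].add(url)
    return repeat_queries / max(total, 1), repeat_clicks / max(total, 1)
```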
Re-Finding Common
[Figure: overlap of repeat queries and repeat clicks. Repeat click: 40% of queries; repeat query: 33% of queries; both a repeat query and a repeat click: 26% of queries; further annotations (86%, and 87% and 38% of repeat queries) break the repeat queries down.]
Change Reduces Re-Finding
• Results change rank
• Change reduces probability of repeat click
– No rank change: 88% chance
– Rank change: 53% chance
• Why?
– Gone?
– Not seen?
– New results are better?
Change Slows Re-Finding
• Look at time to click as a proxy for ease
• Rank change → slower repeat click
– Compared with the initial search’s time to click
– No rank change: re-click is faster
– Rank change: re-click is slower
• Changes interfere with re-finding
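One way to run that comparison, sketched with an assumed record format: pair each repeat click with the user's initial click on the same result, then compare elapsed times split by whether the result moved:

```python
def reclick_time_deltas(records):
    """records: (initial_secs, repeat_secs, rank_changed) per repeat click,
    an assumed format pairing each re-click with the user's first click."""
    mean = lambda xs: sum(xs) / len(xs) if xs else float("nan")
    moved = [r - i for i, r, changed in records if changed]
    stable = [r - i for i, r, changed in records if not changed]
    # Negative delta: the re-click was faster than the initial click.
    return {"rank changed": mean(moved), "no rank change": mean(stable)}
```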

“Pick a card, any card.”
[Card-trick demo: six cards are shown (Case 1 through Case 6); when the cards reappear, the chosen card is gone, because in fact every card has changed.]
People Forget a Lot
Change Blindness
[Images: “old” and “new” versions of the same scene; viewers routinely miss the difference.]
We still need magic!
Memorability Study
• Participants issued a self-selected query
• After an hour, asked to fill out a survey
• 129 people remembered something
Memorability a Function of Rank
[Chart: P(Remem|R,C), the probability a result is remembered, against rank R (1 to 10) for clicked (C) and not-clicked results, probability axis 0 to 0.8; memorability falls off with rank and is higher for clicked results.]
Remembered Results Ranked High
[Scatter plot: remembered rank vs. actual rank, ranks 1 to 10; remembered results cluster at high ranks.]
Re:Search Engine Architecture
[Diagram: the query from the Web browser (user client) goes both to the live search engine, which returns a fresh result list, and to the index of past queries, which scores each stored query (query 1 through query n) and pulls the matching lists (result list 1 through result list n) from the result cache. The merge step combines the fresh and cached lists, consulting the user interaction cache, and the merged result list is returned to the browser.]
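A skeletal rendering of this flow; every name below is hypothetical, and the query matching and merging are placeholders for the components detailed on the next slides:

```python
class ReSearchEngine:
    """Hypothetical sketch of the request flow; not the actual implementation."""

    def __init__(self, search):
        self.search = search       # function: query string -> fresh result list
        self.past_queries = {}     # user -> past queries (index of past queries)
        self.result_cache = {}     # (user, query) -> result list shown earlier
        self.clicks = {}           # (user, query) -> clicks (interaction cache)

    def handle(self, user, query):
        fresh = self.search(query)
        past = self._best_past_match(user, query)
        if past is not None:
            old = self.result_cache[(user, past)]
            fresh = self._merge(old, fresh, self.clicks.get((user, past), set()))
        self.past_queries.setdefault(user, []).append(query)
        self.result_cache[(user, query)] = fresh
        return fresh

    def record_click(self, user, query, result):
        self.clicks.setdefault((user, query), set()).add(result)

    def _best_past_match(self, user, query):
        # Placeholder: exact match; the real index scores every stored query.
        return query if query in self.past_queries.get(user, []) else None

    def _merge(self, old, new, clicked):
        # Placeholder: keep clicked (likely memorable) old results, then fill
        # remaining slots with unseen new results; the merge slides refine this.
        kept = [r for r in old if r in clicked]
        return kept + [r for r in new if r not in kept][: 10 - len(kept)]
```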
Components of Re:Search Engine
• Index of past queries
• Result cache
• User interaction cache
• Merge algorithm
[Diagram: each component highlighted in the architecture from the previous slide.]
Index of Past Queries
• Studied how queries differ
– Log analysis
– Survey of how people remember queries
• Unimportant: case, stop words, word order
• Likelihood of re-finding decreases with time
• Get the user to tell us if they are re-finding
– Encourage recognition, not recall
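Those observations suggest a matcher along these lines; the word-set (Jaccard) similarity, the stop-word list, and the exponential decay with its 30-day half-life are all assumptions layered on the talk's findings:

```python
import time

STOP_WORDS = {"the", "a", "an", "of", "and", "in", "for"}   # illustrative subset

def normalize(query):
    # Case, stop words, and word order were not reliably remembered,
    # so strip all three before comparing queries.
    return frozenset(w for w in query.lower().split() if w not in STOP_WORDS)

def past_query_score(query, past_query, past_time, now=None, half_life_days=30.0):
    """Score a stored query as a re-finding candidate. The Jaccard overlap
    and the exponential decay (and its half-life) are assumptions; the talk
    says only that re-finding likelihood decreases with time."""
    a, b = normalize(query), normalize(past_query)
    if not (a and b):
        return 0.0
    overlap = len(a & b) / len(a | b)                 # word-set similarity
    age_days = ((now or time.time()) - past_time) / 86400
    return overlap * 0.5 ** (age_days / half_life_days)
```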
Merge Algorithm
• Benefit-of-new-information score
– How likely a new result is to be useful…
– …in a particular rank
• Memorability score
– How likely an old result is to be remembered…
– …in a particular rank
• Choose the list that maximizes memorability and the benefit of new information
Benefit of New Information
• Ideal: use the search engine’s score
• Approximation: use rank
• Results that are ranked higher are more likely to be seen
– Greatest benefit comes from keeping highly ranked new results ranked highly
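As a concrete stand-in (the talk motivates the idea but gives no formula), the benefit of placing a new result into a slot might fall off with both the engine's rank for the result and the slot's position:

```python
def benefit_of_new_info(engine_rank, slot):
    """Hypothetical benefit score, 0-based arguments: a result the engine
    ranks highly earns the most benefit when shown in a highly visible slot."""
    return 1.0 / ((engine_rank + 1) * (slot + 1))
```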
Memorability Score
• How memorable is a result?
• How likely is it to be remembered at a particular rank?
[Charts: the memorability-by-rank curve and the remembered-vs-actual-rank scatter plot from the memorability study.]
Choose Best Possible List
• Consider every combination
• Include at least three old and three new results
• Min-cost network flow problem (sketched below)
[Diagram: a source s feeds benefit nodes b1 through b10 (new results) and memorability nodes m1 through m10 (old results); these connect to the ten result slots, which drain into a sink t.]
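The core assignment in that flow problem can be sketched with an off-the-shelf solver: scipy's linear_sum_assignment chooses which ten of the old and new candidates fill the ten slots to maximize total score. The score functions below are the hypothetical stand-ins from the previous slides, and the at-least-three-old/three-new lower bounds, which the full min-cost-flow formulation encodes on its arcs, are noted but not enforced here:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def memorability(orig_rank, slot):
    # Hypothetical stand-in: an old result scores best when kept near its
    # original (remembered) rank; real values come from the memorability study.
    return 1.0 / ((orig_rank + 1) * (1 + abs(orig_rank - slot)))

def benefit_of_new_info(engine_rank, slot):
    # Hypothetical stand-in from the benefit-of-new-information slide.
    return 1.0 / ((engine_rank + 1) * (slot + 1))

def merge(old, new, slots=10):
    """Assign old and new results to slots, maximizing the summed scores."""
    score = np.zeros((len(old) + len(new), slots))
    for i in range(len(old)):                 # rows 0..len(old)-1: old results
        for s in range(slots):
            score[i, s] = memorability(i, s)
    for j in range(len(new)):                 # remaining rows: new results
        for s in range(slots):
            score[len(old) + j, s] = benefit_of_new_info(j, s)
    rows, cols = linear_sum_assignment(score, maximize=True)
    ranking = [None] * slots
    for r, s in zip(rows, cols):
        ranking[s] = old[r] if r < len(old) else new[r - len(old)]
    # The thesis also requires at least three old and three new results;
    # the min-cost-flow formulation enforces that with arc lower bounds,
    # which this simplified sketch omits.
    return ranking

print(merge([f"Old {i}" for i in range(1, 11)],
            [f"New {j}" for j in range(1, 11)]))
```

An assignment solver suffices here because each candidate-slot pair has an independent score; the flow view earns its keep once the old/new lower bounds must be enforced.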
Evaluation
• Does merged list look unchanged?
– List recognition study
• Does merging make re-finding easier?
– List interaction study
• Is search experience improved overall?
– Longitudinal study
List Interaction Study
• 42 participants
• Two sessions, a day apart
– 12 tasks each session (“stomach flu”)
• Tasks based on queries selected from the log analysis
– Session 1 (“Symptoms of stomach flu?”)
– Session 2
• Re-finding (“Symptoms of stomach flu?”)
• New-finding (“What to expect at the ER?”)
List Interaction Study: Experimental Conditions
• Six re-finding tasks
– Original result list
– Dumb merging
– Intelligent merging
• Six new-finding tasks
– New result list
– Dumb merging
– Intelligent merging
[Example lists: a dumb merge interleaves old and new results arbitrarily (Old 5, New 1, Old 1, Old 7, New 2, New 3, New 4, Old 4, New 5, New 6); an intelligent merge keeps memorable old results in place (Old 1, Old 2, Old 4, New 1 through New 6, Old 10).]
Measures
• Performance
– % correct
– Time
• Subjective
– Task difficulty
– Result quality
Experimental Conditions
• Six re-finding tasks
– Original result list
– Dumb merging
– Intelligent merging
• Six new-finding tasks
– New result list
– Dumb merging
– Intelligent merging
[Callouts on the slide: “Faster, fewer clicks, more correct answers, and easier!” and “Similar to Session 1”.]
Results: Re-Finding

                 Original    Dumb    Intelligent
% correct           99%       88%        96%
Time (seconds)      38.7      70.9       45.6
Task difficulty     1.57      1.79       1.53
Result quality      3.61      3.42       3.70
List same?          76%       60%        76%

• Intelligent merging better than Dumb
• Almost as good as the Original list
Results: New-Finding

                 New        Dumb    Intelligent
% correct           73%       74%        84%
Time (seconds)    139.3     153.8      120.5
Task difficulty    2.51      2.72       2.61
Result quality     3.38      2.94       3.19
List same?          38%       50%        61%

• Knowledge re-use can help
• No difference between New and Intelligent
Results: Summary
• Re-finding
– Intelligent merging better than Dumb
– Almost as good as the Original list
• New-finding
– Knowledge re-use can help
– No difference between New and Intelligent
• Intelligent merging best of both worlds
Conclusion
• How people re-find
– People repeat searches
– Look for old and new
• Finding and re-finding conflict
– Result changes cause problems
• Personalized finding and re-finding tool
– Identify what is memorable
– Merge in new information
Future Work
• Improve and generalize model
– More sophisticated measures of memorability
– Other types of lists (inboxes, directory listings)
• Effectively use model
– Highlight change as well as hide it
• Present change at the right time
– This talk’s focus: what and how
– What about when to display new information?
Thesis Overview
• Supporting Finding
– How people find
– How individuals find
– Personalized finding tool
• Supporting Re-Finding
– How people re-find
– Finding and re-finding conflict
– Personalized finding and re-finding tool
Thank You!
Jaime Teevan
[email protected]
David Karger (advisor), Mark Ackerman, Sue Dumais, Rob Miller (committee),
Eytan Adar, Christine Alvarado, Eric Horvitz, Rosie Jones, and Michael Potts