Designing Better Shopbots

Designing a Better Shopbot
(with Karen Clay, Ramayya Krishnan,
and Kartik Hosanagar)
Alan Montgomery
Associate Professor
Carnegie Mellon University
Graduate School of Industrial Administration
e-mail: [email protected]
web: http://www.andrew.cmu.edu/user/alm3
Cornell Marketing Workshop, 26 January 2000
What is a shopbot?
Shopping Robot’s automatically search
a large number of stores for a specific
product
Example:
John Grisham’s The Brethren, list price
$27.95, prices range between $13.49
(buy.com) and $50.75
(totalinformation.com)
Design involves:
–
–
–
–
Computer science (agents)
Economics (value of price search)
Marketing (consumer behavior)
Statistical Modeling (uncertainty)
2
1
Outline
•
•
•
•
•
•
The State of Shopbots
Improving Shopbot Design
Modeling Consumer Utility
Application to Online Bookstores
Simulation Results
Conclusions
3
The State of Shopbots
2
Online Retail Shopping
1
2
3
4
5
6
7
8
9
10
13
32
All Digital Media
Retail
amazon.com
americangreetings.com
webstakes.com
barnesandnoble.com
mypoints.com
bizrate.com
directhit.com
cdnow.com
ticketmaster*
apple.com
dealtime.com
mysimon.com
bottomdollar.com
Millions of
Unique
Monthly % of Web
Visitors
Users
80,097
100.0
53,485
66.8
14,464
18.1
7,719
9.6
5,314
6.6
5,281
6.6
5,269
6.6
5,050
6.3
3,952
4.9
3,857
4.8
3,602
4.5
3,421
4.3
3,131
3.9
1,915
2.4
670
0.8
Source:
July 2000
5
Shopbot Trends
6
3
Problems with Shopbots
• Less than 10% of shoppers use shopbots, this is up
from last year.
• Why don’t more people use shopbots?
– Lack of awareness
– Lack of benefit (not enough price variation)
– Lack of information about book (no reviews), you must
already know the book
– Slow response time (the modal time for pricescan and
dealpilot is 45 seconds, amazon is <2 seconds)
– Poor interface, displays too much information
7
The returns of search
Source:
Brynjolfsson
and Smith
(2000)
8
4
Distribution of Response
Times
0.02
D
e
n
s
0.01
i
t
y
0
30
50
70
90 110 130
Response time
for evenbetter.com
150 170 190
TIME
9
Improving Shopbot Design
5
Current ShopBot Design
• Enter a book and
evenbetter.com will search
30+ bookstores and return
all searches ordered by price
• Yahoo currently lists over
100 bookstores
• Makes search quick and
simple
• Search is valuable
– Average range is $12
– Amazon lowest only 5% of
time (and dropping quickly)
11
Shopbot Operational Flow
Start
Stop retrieval
process
Predict prices
at each store
Determine
which stores
to ask for an
offer
Previous
offers
Determine
which offers
to present to
customer
Display
selected
offers
Begin
retrieval
threads for
each store
End
12
6
Operational Decisions
• Which stores to search?
Shopbots can form prior expectations about prices that can help
eliminate searching at high price stores
• How long to wait?
About 5% of store requests time out, also it may be better to
interrupt searches at a certain point
• Which offers to present?
It is very cognitively taxing for consumers to have to search
through scores of offers. Consumer research tells us they will use
less efficient comparison rules.
13
Model limitations
• Treat this as a batch job
Problem becomes a sequential decision process
• Consumer already knows what they want
Imagine designing a shopbot that could find things the consumer did not
identify (make tradeoffs in broader product classes)
• We know consumer preferences
Allow for random component, but assume that part-worths of utility
function are known
• Do not explicitly consider shopbot costs
We consider shopbot costs only to the extent they impact waiting time (and
therefore utility)
• Do not explicitly consider shopbot profits
We presume the shopbot wants to maximize consumer utility Þ maximize
purchase probabilities, however the shopbot may want to lead consumers
14
to purchase specific alternatives
7
Modeling Consumer Utility
Modeling Consumer
Interaction with a Shopbot
• Use a compensatory utility model to determine
consumer’s tradeoff between price, delivery, tax, and
waiting time
• Consider the cognitive costs that a consumer incurs
in making comparisons
• Use past information from previous web retrievals to
intelligently retrieve prices
Compare this approach with IR models that assume
noncompensatory rules
16
8
Utility Model
• Usual additive utility model for the ith product given
P alternatives with A attributes in the set:
ui*
N
ui = ∑ β ijaij + ε i − ξ ⋅ T − ω ⋅ Q − λ ⋅ C
j =1
attributes: price,
delivery time, etc.
waiting time
=max(t1,…,tM)
system
overhead to
process
requests
cognitive costs
µ (A-1)(P-1)
17
Utility of the Choice Set
• Utility of basket with P choices from M alternatives:
U = max( u(1) ,..., u( P ) )
= max( u*(1) − ξ ⋅ T − ω ⋅ Q − λ ⋅ C ,K, u(*P ) − ξ ⋅ T − ω ⋅ Q − λ ⋅ C )
= max( u*(1) ,K , u(*P ) ) − ξ ⋅ T − ω ⋅ Q − λ ⋅ ( A − 1)( P − 1)
present the best offers,
ordered observations
18
9
Formal Problem
• Sequential Optimization – solved backwards
max* E [max( U p )] s.t. p ≤ r ≤ q
q,p,t
q
r
p
t*
Variables:
offers to query
offers retrieved
offers presented
time to interrupt query
19
Example
• Shopbot can search the following stores:
Price1 ~ N(10,1), Price 2 ~ N(11,1), Price3 ~ N(12,1)
• Query stores 1 & 2
q=[1 1 0],
t=[8 12 10]
• Interrupt query at 10 seconds
t*=10, r=[1 0 0]
• Present offer from store 1 to customer
p=[1 0 0],
U=6
Objective: Maximize utility of the set offered to consumer
Current Solution: q=[1 1 … 1], t*=30, p=[1 1 … 1]
20
10
Proposed Solution
Sequential Optimization Problem:
M Which offers should be presented, given the retrieval
set
L When should they retrievals be interrupted, given the
queries were made
K Which stores should be queried
21
3. Which offers to present?
Assume that utility errors follows an extreme value
distribution with parameters (0, ?). Usual
multinomial logit model. Implies that the maximum
also has has an extreme value distribution:
{
}
A
 P

max( u , K, u ) = θ ln ∑ exp u R − i+1:R / θ  + θγ , u i = ∑ β ij aij
 i =1

j =1
*
(1)
*
( P)
22
11
Which offers to present?
Can now determine expected utility (conditioned on
retrieval information set)
{
}
 P

E [U ] = θ ln  ∑ exp u R −i +1:R / θ  + θγ − ξ ⋅ T − ω ⋅ Q − λ ⋅ ( A − 1)( P − 1)
 i =1

Solution:
Start with P*=R
Stop if E[U|P-1]<E[U|P]
Let P=P-1
23
Special Case
Suppose prices are all identical, how many offers to
present?
θ
P* =
λ( A − 1)
Example
variability (gain to
added item)
cognitive cost
– Average book generates 10 utils with s.d. of 2 utils, i.e.,
(U=9.1,?=1.6), Book has 4 attributes (A=4)
– ?=.1 ” P*=5.3, ?=.2 ” P*=2.7
– Offer set of 20 books: 8.7 utils
– Offer set of 5 books: 11 utils
24
12
Implications
Do not show all the results! Utility can decline
quickly as a result of added items.
-3.5
-4
-4.5
Utility
-5
-5.5
-6
-6.5
4
6
8
10
Number of offers presented
25
2. How long to wait?
Just because a store is queried doesn’t imply that it
will respond, there is a probability ? i of no response,
and let ti represent time to retrieve offer. If ti <t*
then observation is censored.
Probability of no response:
[
τ i = ηi + (1 − ηi ) Pr ti > t *
]
Assume the probability of response independent
across stores and also retrieved offer.
26
13
How long to wait?
Must now evaluate utility over all possible sets of
retrievals based on the queries made.
∑ (∏τ
ι ∈Ω
ιi
i
)
(1 − τ i )ιi ⋅ E[Uι ]
O is set of all possible combinations, dimension is 2Q .
Combinatorial explosion, Q=10 yields 1,024
combinations, Q=30 yields one billion.
27
Implications
Suppose times are gamma distribution (mean=1.5,
std=2.7), E[max of 10 variates]=7, chance of
retrieval=95%.
-3.46
-3.47
Utility
-3.48
-3.49
5
-3.51
10
15
20
Time
Act aggressively and truncate response
28
14
1. Which stores to query?
At this stage the shopbot does not know the price it
will retrieve, however it can guess (in fact pretty
well).
Order stores based on prior expectations, and the
problem now becomes how many stores to query?
29
Example
Suppose there are three stores that may be queried
and prices are normally distributed:
Utility1 ~ N(1,1), Utility2 ~ N(0, s 2), Utility 3 ~ N(0, s 2)
If you could only select two stores which ones will
yield the expected maximum utility?
Choose {1,2} if s <3.67, otherwise {2,3}
E[max( Utility1 ,Utility2 )]=:X M()/L)+:YM(-)/L)+LN()/L)
Where )=:X-:Y and L2=FX2+FY 2-2DFXFY
30
15
Which stores to query?
Unfortunately, while normality may be a good distributional
assumption in practice, theoretical properties of its order
statistics are not tractable.
A reasonable approximation is to assume that utility (prices) are
logistically distributed. Furthermore to yield an analytical result
we assume that the stores are i.i.d.. After some work (and
approximation) we get the following stopping rule:
Q≥
σ
ω + λ ( A − 1)( PR*+1 − PR* )
Increase set size as the gains to search are higher (s) and
reduce them as the disutility of waiting time increases.
31
Implications
Suppose there are 10 stores (utility ranges from -4 to
-5, std=1)
-2.4
-2.6
Utility
-2.8
4
6
8
10
-3.2
-3.4
Number of stores searched
Larger query size is better, but at a declining rate
32
16
Application to Online
Bookstores
Empirical Components
• Predicting Price Changes
• Response Times for Searching
• Consumer Utility Part-worths
34
17
Data
• Automated agents collected data from 2 shopbots
and several individual stores
• August 99 - January 00
• 600 books
– NY Times Bestsellers
– Randomly selected ISBNs
– Computer Books
35
Price changes of Book5 - hardcover, fiction
$18.00
$16.00
$14.00
$10.00
$8.00
Dropped out of
best sellers on
Sep 19
$6.00
Moving back
in best sellers
on Sep 26
$4.00
Dropped out of
best sellers on
Oct 10
$2.00
$0.00
Au
g1
31
Au 999
g2
3
Au 1999
g2
91
99
Se
9
p4
1
99
Se
p1 9
01
9
Se
p 2 99
11
99
9
Oc
t1
19
Oc 99
t8
1
O c 999
t1
31
Oc 999
t1
91
O c 999
t2
41
Oc 9 9 9
t2
91
No 999
v5
No 1999
v1
01
99
No
v1 9
51
99
9
Price
$12.00
amazon.com
barnesandnoble.com
b o r d e r s . c o m 36
18
Time between price change
Number of days between price changes follows an
exponentially declining distribution
F
r
e
q
u
e
n
c
y
40
20
0
0
25
50
75
100
125
150 37175
TIME
Predicting Prices
• Predict days between price changes using a Negative
Binomial Model with parameters (γ,δ), where:
ln(γ) = α0 + α0 days_since_bestseller_change
+ α 0 1BookStreet + α 0 amazon + α 0 bn
+ α 0 buy.com + α 0 borders
• Given that prices have changed predict the
magnitude of price change using an autoregressive
model.
RelPrice(t) = β0 + β1 RelPrice(t-1) + β2 uphard
+ β3 uppaper + β 4 downhard + β 5 downpaper
38
19
Summary of Results
• Amazon and Barnesandnoble responding quickly to
change in bestseller status
• Amazon shows some price leadership, but for the
most part weak relationships between price changes
at stores
• When books move onto the bestseller list prices drop
(more for hardcovers)
• When books move off the bestseller list prices rise
39
Correlation between Actual
and Predicted Prices
Price Collection Frequency Correlation
Only once (initial time)
Once every 30 days
Once every 14 days
Once every 7 days
Once every 3 days
Once every day
.30
.82
.91
.95
.99
1.00
40
20
2000
3000
80% of the time
store responds
within 2 seconds
0
1000
Count of price changes
4000
5000
Response Times
0
20
40
60
80
Seconds to respond to request
41
Modeling Response Times
• Mixture of no response and a gamma distribution
with parameters (a, b).
? 95% probability that store will not respond (within 90
seconds)
? Other responses well described by Gamma(.3,5)
3.5
3
2.5
2
1.5
1
0.5
1
2
3
4
5
6
42
Time
21
Predicted Price Plot
43
Shopbot Choice Model
Parameter
Total Price
Item Price
Shipping Price
U.S. Tax
Delivery Average
Delivery “n/a”
“Big 3”
Amazon
BarnesandNoble
Borders
Estimate
-.19
-.37
-.43
-.02
-.37
($1.00)
($1.95)
($2.26)
($.10)
($1.94)
.48
.17
.27
($2.52)
($.89)
($1.42)
44
22
Waiting/Cognitive Costs
• Disutility of waiting one second = $.01 (?=.01)
– If you make $70,000/year, your time is worth $.01/second
• Overhead for launching a thread to search an online
store = 10 milliseconds (? =.0001)
• Cost to make a comparison = $.05 (?=.05)
– To compare 5 items with 4 attributes implies a total
cognitive cost of $.60
45
Simulation Results
23
Which stores to query?
Store
Mean Std. Dev.
buy.com
booksamillion.com
Bookbuyer's Outlet
Borders.com
alldirect.com
Amazon
barnesandnoble.com
AlphaCraze.com
Fatbrain
Books.com
HamiltonBook.com
BCY Book Loft
kingbooks.com
A1 Books
.52
.59
.62
.62
.63
.63
.63
.64
.65
.70
.70
.72
.73
.75
.10
.12
.13
.13
.05
.13
.13
.09
.15
.09
.07
.07
.04
.06
Store
MeanStd. Dev.
varsitybooks.com
1bookstreet
bigwords.com
WordsWorth
booksnow.com
Cherryvalleybooks
Rainy Day Books
Rutherfords
Classbook.com
Baker's Dozen online
Book Nook Inc.
Codys Books
computerlibrary.com
page1book.com
.75
.76
.77
.83
.88
.89
.89
.89
.96
.99
.99
.99
.99
.99
.05
.13
.06
.10
.06
.02
.05
.05
.06
.06
.05
.06
.06
.07
47
Delivery Options for
BarnesandNoble.com
Service
U.S. Postal Service
Standard Ground
FedEx Second Day
UPS 2nd Day Air
FedEx Overnight
Delivery
5-9 days
4-7 days
3-4 days
3-4 days
2-3 days
Cost
$3.95
$3.99
$7.95
$7.99
$10.95
48
24
10
13.7 secs
to search
28 stores
6
8
10.9 secs
to search
14 stores
6.2 secs
to search
4 stores
4
Expected time to search
12
14
How long to wait?
0
10
20
Number of stores
49
Best set to offer
(if prices known)
Store
1BookStreet.com
Amazon.com
Buy.com
Borders.com
Service
USPS Parcel Post
USPS Priority Mail
Standard Shipping
Standard
Delivery
6-21 days
5-10 days
n/a
5-10 days
Price
$15.19
$12.59
$10.39
$12.39
Cost
$0
$3.95
$3.95
$3.90
Total
$15.19
$16.54
$14.34
$16.29
50
25
Best set to offer
(if prices are forecasted)
Probability Store
Still
include
these
Chance of
including
these
82% 1BookStreet.com
56% Amazon.com
51% Buy.com
42% Borders.com
34% barnesandnoble.com
28% barnesandnoble.com
26% booksamillion.com
21% Fatbrain.com
17% AlphaCraze.com
13% Bookbuyer’s Outlet
Service
Delivery
E[Price]
USPS Parcel Post
USPS Priority Mail
Standard Shipping
Standard
Standard Ground
U.S. Postal Service
Standard Ground
UPS Ground
USPS Special Rate
Standard
6-21 days
5-10 days
n/a
5-10 days
4-7 days
5-9 days
N/A
4-8 days
5-15 days
n/a
$
$
$
$
$
$
$
$
$
$
15.19
12.59
10.39
12.39
12.59
12.59
11.79
12.99
12.79
12.39
Cost E[Total]
$
$
$
$
$
$
$
$
$
$
3.95
3.95
3.90
3.99
3.95
3.95
3.95
3.50
4.50
$
$
$
$
$
$
$
$
$
$
15.19
16.54
14.34
16.29
16.58
16.54
15.74
16.94
16.29
16.89
51
-3.0
Utility
-2.5
-2.0
Effects on Utility
-3.5
Optimal with Prices Unknown
Optimal with Prices Known
Querty/Present All
0
5
10
15
Number of Stores
20
25
52
26
Findings
• Present small number of the best alternatives (~4 items)
• Only need to search small number of stores if prices known with
certainty
• If prices are unknown, it’s best to continue searching at up to
14 stores. Since not all results presented (and little
computational cost) go ahead and search more stores.
• Using traditional shopbot best to only search a couple of stores
• Probability consumer will prefer optimal shopbot over current
shopbot >99.9% or $2.00, or worth $.20 compared with best
result when all results presented from 2 stores
What happens if customers less sensitive to cognitive costs?
53
Utility
-2.0
Utility
-1.5
-1.0
(with Low Cognitive Costs)
-2.5
Optimal with Prices Unknown
Optimal with Prices Known
Querty/Present All
0
5
10
15
Number of Stores
20
25
54
27
Conclusions
Summary
• Intelligent design of shopbots can dramatically
increase the utility that consumers garner from their
use
• Instead of passively searching, can incorporate
information about utility and price expectations to
speed up search and satisfaction
• Incorporates cognitive effort, compensatory utility
functions, and information retrieval
56
28
An improved shopbot?
• Ask users for filtering
questions about preferences
or use information from
previous history
• Appropriately balance the
cost of asking for the
information with its benefits
• Allow further search
• Better understand how
consumers perceive waiting
time based on expectations,
provide ‘filler’ tasks
57
Future Directions
• Learning from past purchases and designing active shopbots
(versus the passive design presented)
• Need better understanding of how to quantify cognitive costs
and effects of waiting
• Train shopbots to be proactive in seeking out good deals. If
bestseller status changed today & shopbot knows a store
responds to status change in 2 days, it can make
recommendations (“wait 2 days and price at amazon likely to be
less by $10”)
• Identify baskets of products or more complex products like
travel (airline tickets, car rental, hotel, etc.)
• Applications to information goods (e.g., news stories,
recommender systems, search engines)
58
29
Reflections
on E-Commerce Research
• Most of the existing marketing literature focuses on
describing how consumers behave
• What is really needed it prescribing how agents
should behave so they can work with and/or replace
consumers
Can yield some new insights into old problems (e.g.,
how do consumers shop?)
59
Store Choice Analogy
Which stores to search?
How long to search?
Which products to present?
Which store to shop at?
How far to travel?
How does assortment affect choice?
60
30
Vernon Hills, Illinois
Hawthorn Plaza, 1993-Present
More than 12 Stores Selling
Grocery & Drug Products
61
31