Intertemporal Price Discrimination with Complementary Products:
An Empirical Analysis of E-book and E-reader
Hui Li
February, 2014
Abstract
The impact of e-books on print book sales and the dispute over e-book pricing rights have received increasing attention in the publishing industry. This paper estimates a dynamic demand model of consumer e-reader adoption, book purchase, and format choice. The estimated demand model is then used to study intertemporal price discrimination (IPD) for a monopolist that sells a pair of complementary goods. This setting has rarely been studied before but has become increasingly relevant in online business. Using a unique individual-level transaction data set from the years 2008-2012, I find that consumers have heterogeneous reading tastes and price elasticities, as well as extra format utility from e-book reading. Taking supply-side prices as exogenously given, counterfactual simulation shows that on average 28% of e-book sales come from cannibalizing online print book sales and 72% come from market expansion. I then solve for the monopoly platform's optimal intertemporal price discrimination policies for the e-reader and e-books when both the firm and consumers are forward-looking. Besides the price skimming incentive in the traditional IPD literature, the existence of e-books as a complementary good provides an extra penetration incentive and changes the optimal pricing, profitability, and welfare results from IPD. In particular, it is not always profitable for the firm to practice IPD on the complementary good. This implies that the agency model - where publishers set e-book prices - can sometimes be more profitable for the platform than the wholesale model.
1 Introduction
Platforms sell complementary hardware and software and conduct intertemporal price discrimination in many industries these days, especially in digital and online business. The market for e-books and e-readers is one of the most prominent but understudied settings. There has been significant growth in e-reading since Amazon launched its first e-reader, the Kindle, in 2007. E-book sales in the U.S. reached $90.3 million in 2011, an increase of 202% compared to 2010¹. Amazon.com, the largest online seller of print books and e-books, reported that e-book sales surpassed its total hardcover sales in July 2010, and surpassed its total print sales as of April 1, 2011². There has long been a dispute, however, over e-book pricing rights between publishers and platforms³. As the e-reader price is always set by the platforms, the question boils down to how profitability and welfare are affected when platforms are able to price e-books as well.
Publishers and platforms have different pricing incentives. Platforms are concerned about both e-reader and book sales. By controlling both prices they can use two price instruments to price discriminate. Publishers only earn profit from print books and e-books, so they are concerned that if platforms are able to set e-book prices, then the pricing will be too low and their print book revenue will be cannibalized heavily. The supply-side pricing decisions depend critically on the demand-side relationship between e-books and print books. Specifically, how much of e-book sales comes from cannibalizing print books, and how much comes from market expansion? Measuring these effects is important both for firms seeking to make optimal decisions and for policy makers seeking to understand the potential welfare effects of different e-book pricing arrangements.
¹ The Association of American Publishers February 2011 Sales Report, http://www.publishers.org/press/30/
² http://news.cnet.com/amazon-kindle-books-outselling-all-print-books/8301-17938_105-20064302-1.html
³ For example, in April 2012, the United States Department of Justice sued Apple and five major book publishers, accusing them of colluding to raise e-book prices (http://mediadecoder.blogs.nytimes.com/2012/04/11/justice-files-suit-against-apple-and-publishers-over-e-book-pricing/).
I address two sets of research questions in this paper. The first set of questions is on the demand side. Does the e-book channel impose any cannibalization or market expansion effects on traditional paperbacks in the online book market? How are these effects influenced by the e-reader price, the heterogeneity in consumer reading tastes, and the format preference across book categories? Cannibalization is defined as those e-book sales that would have been paperback purchases in the absence of the e-book format. Market expansion is defined as e-book sales purely created by the presence of the e-book format: these sales would not have taken place as print books. The second set of questions is on the supply side. I solve for the optimal e-reader and e-book pricing strategies for a monopoly platform given the demand estimates. What is the optimal intertemporal price discrimination (IPD) strategy with a pair of complementary products? How does a change in the firm's ability to practice IPD on the complementary good affect profitability and social welfare?
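As a minimal sketch of the two demand-side definitions above, suppose a counterfactual simulation gives total purchases with and without the e-book format available. The function below splits e-book sales into a cannibalized part (print sales displaced by e-books) and a market-expansion part (the remainder). The function name, argument names, and input numbers are hypothetical and are not taken from the paper's data or estimation code; the inputs are chosen only so that the arithmetic reproduces a 28%/72% split.

```python
def decompose_ebook_sales(print_without, print_with, ebooks_with):
    """Split e-book sales into cannibalization and market expansion shares.

    print_without: print books bought when no e-book format exists
    print_with:    print books bought when the e-book format exists
    ebooks_with:   e-books bought when the e-book format exists
    All three are totals over the same set of consumers (assumed inputs).
    """
    if ebooks_with <= 0:
        raise ValueError("no e-book sales to decompose")
    # Cannibalization: print sales displaced by e-books, capped at e-book sales.
    cannibalized = min(max(print_without - print_with, 0), ebooks_with)
    # Market expansion: e-book sales that would not have occurred as print sales.
    expansion = ebooks_with - cannibalized
    return cannibalized / ebooks_with, expansion / ebooks_with


# Hypothetical totals, used only to illustrate the arithmetic.
cannibalization_share, expansion_share = decompose_ebook_sales(
    print_without=1000, print_with=860, ebooks_with=500
)
print(f"cannibalization: {cannibalization_share:.0%}, expansion: {expansion_share:.0%}")
```

The two shares sum to one by construction, so reporting either one pins down the other.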
This paper is the first to use representative individual-level observations of actual purchase data to structurally estimate the degree of cannibalization and market expansion in the e-book market. In particular, I explicitly model the dynamic e-reader adoption decision and allow consumers to self-select based on their heterogeneous tastes, both of which are missing in the current literature on the interaction between the Internet and brick-and-mortar economies. I start from the micro foundation of utility maximization at the individual level. Compared to the extant literature, where only one publisher in a particular policy setting is studied, my data cover a broader range of consumers, book titles, and genres. On the supply side, there are few empirical studies on intertemporal price discrimination. Lazarev (2013) finds that IPD hurts welfare in the airline industry, while Hendel and Nevo (2013) study storable goods markets in an oligopoly setting and find the opposite result. In terms of IPD with complementary goods, there are virtually no (theoretical) welfare results. My paper aims to fill this gap. Finally, the paper contributes to the limited empirical literature on dynamic pricing problems where both the firm and consumers are forward-looking and the demand setup is realistic and relevant.
I first model book purchase, format choice, and device adoption decisions in the demand model, taking supply-side prices as given. On the device adoption side, in every period forward-looking consumers choose to buy a Kindle, upgrade, or wait. The flow utility of owning a device comes purely from book purchases: consumers have an enlarged choice set that includes both paperbacks and e-books. On the book purchase side, I model both the consumer's quantity and format choices across book categories. Within the same category, the two formats share the same category-specific taste but differ in price and format utility. The device adoption side and the book purchase side are linked because (1) consumers take current and future book purchase utility into account when buying a device; and (2) device adoption status affects the consumer's book choice set, in that consumers cannot read an e-book unless they own the device. In the supply-side optimal pricing problem, I use the estimated demand system to solve for a pure-strategy Markov-perfect Nash equilibrium (MPNE) in which consumers' expectations over future states are consistent with the firm's optimal strategy. I compare the scenario where the firm conducts IPD on both the e-reader and e-books to the scenario where the firm can conduct IPD only on the e-reader.
My model estimates allow me to quantify the degree of cannibalization and market expansion. Taking supply-side prices as exogenously given, counterfactual simulation shows that 28% of e-book sales come from cannibalizing online print book sales and 72% come from market expansion. The magnitudes of the two effects are affected by the device price and e-book prices. Interestingly, I find that the substitution patterns differ across book categories. Consumers prefer casual books in e-book format, compared to the other two categories, lifestyle and practical. On the supply side, besides the price skimming incentive in the traditional IPD literature, the existence of e-books as a complementary good provides an extra penetration incentive and changes the optimal pricing, profitability, and welfare results from IPD. Surprisingly, when the e-book price increases, consumers can be better off because of the drop in the firm's ability to practice IPD on the e-reader. When the firm can practice IPD on both the e-reader and e-books, it uses the product with the higher elasticity of demand to "invest" and the product with the lower elasticity of demand to "harvest". In particular, it is not always profitable for the firm to practice IPD on the complementary good. This implies that the agency model - where publishers set e-book prices - can sometimes be more profitable for the platform than the wholesale model. In the early stage, when there is a relatively larger proportion of avid readers, it is more important for Amazon to adopt the wholesale contract and control e-book prices. As the consumer composition evolves, Amazon may be better off practicing IPD on the device only, and publishers can seize this opportunity to persuade Amazon to adopt the agency model.
In the next section I present a literature review. In Section 3, I describe the data set and the data patterns that motivate my model specification. Section 4 presents the dynamic demand model of both e-readers and books. Identification strategies and the estimation method are discussed in Section 5. I present the estimation results and model fit in Section 6. The supply-side optimal intertemporal price discrimination problem is solved in Section 7. Section 8 concludes.
2 Literature Review
This paper contributes to the literature on intertemporal price discrimination by studying a setting with a complementary good. With only one good, theoretical work shows that either price skimming or a single constant price can be optimal, depending on the distribution of consumer tastes and the firm's ability to commit (Stokey 1979, 1981). Empirically, Nair (2007) numerically solves the dynamic pricing problem of video game providers with both the firm and consumers forward-looking. In such a one-good setting, price skimming is the optimal strategy. This paper differs from the extant IPD literature in that the firm has another, competing price incentive - penetration pricing - given the complementarity of the goods. A penetration incentive is also present in Liu (2010), where the network effect gives video game console firms an incentive to invest in early adopters. However, the penetration incentives induced by complementarity and by the network effect are different: the network effect is exogenous and only indirectly affects the firm's profit. The complementary product, in my case, is both endogenously determined by the firm and enters the firm's profit. So besides the two competing incentives, the firm can strategically use both the primary product and the complementary product to conduct IPD. Another difference from the current literature is the role of consumer heterogeneity. An important feature of my model is that primary product adoption is a self-selection process based on the taste for the complementary product. Early primary product adopters generate more complementary product revenue. Consumer heterogeneity matters in the model not only because it offers price discrimination incentives, but also because it requires that the firm maximize the profit from both the primary and complementary products given the taste information revealed in the self-selection process.
This paper also contributes to the study of welfare effects in the empirical price discrimination literature. The impact of price discrimination on welfare is theoretically ambiguous (Robinson, 1933). In terms of intertemporal price discrimination, we know little empirically about the potential benefits and costs. Exceptions include Nair (2007) on console video games, Hendel and Nevo (2013) on storable goods markets, and Lazarev (2013) on airline tickets. While it is becoming increasingly common for a firm to sell both a primary product and a complementary product in online business, there is virtually no empirical paper that studies the welfare impact of IPD in this setting. My paper aims to fill this gap by solving a full dynamic equilibrium model with both the firm and consumers forward-looking.
The demand-side model quantifies the degree of cannibalization and market expansion from e-books on print book sales in the online market. There have been studies on interactions between the Internet and brick-and-mortar economies. Cannibalization has been found between online newspapers and physical ones (Gentzkow 2007), YouTube viewing and television viewing (Waldfogel 2007), file sharing and record sales (Oberholzer-Gee, Strumpf 2007), and PDF and print formats (Kanna, Pope and Jain 2009). In terms of e-books and print books, Hu and Smith (2011) find that delaying the release of e-books causes an insignificant change in overall hardcover sales, but a significant decrease in e-book sales. None of these, however, takes the device adoption decision into account. Device adoption is a self-selection process that reveals consumer heterogeneity. As consumer heterogeneity is the foundation for price discrimination, it is crucial to model the dynamic decision of e-reader purchase for both the cannibalization analysis on the demand side and the intertemporal price discrimination analysis on the supply side.
The demand model shares features with dynamic models of technology adoption (e.g. Melnikov 2000; Song and Chintagunta 2003; Gowrisankaran, Rysman 2012), and the demand estimation follows mixed discrete/continuous models (e.g. Dubin and McFadden 1984; Economides, Seim and Viard 2008). The complementary good setup, where the platform offers both a primary good and a complementary good, is similar to other industries such as video games and consoles (e.g. Lee 2013). The supply-side dynamic pricing model with both the firm and consumers forward-looking is related to the literature on Markov-perfect equilibrium models of dynamic pricing (e.g. Ericson and Pakes 1995, Goettler and Gordon 2011). Given that both the firm and consumers are forward-looking, the distribution of heterogeneous consumer preferences is endogenously determined through pricing, as current pricing affects future demand and consumer expectations. Prices are equilibrium outcomes of a game played between forward-looking consumers, who strategically delay purchases to take advantage of lower prices in the future, and a forward-looking firm that takes this consumer behavior into account in formulating its optimal pricing policy.
3 Data Description and Reduced Form Results
3.1 The U.S. E-book Industry
Although dedicated e-book readers were on the market as early as the late 1990s, the e-book industry did not experience rapid growth in the beginning. There were no ideal e-readers available and e-book format standards were conflicting. E-books were also not well accepted by publishers or by avid book readers, who were generally older and less tech-savvy. The industry started to grow when Sony introduced electronic paper display technology in 2004 and its e-book reader in 2006. Just one year later, Amazon released its e-reader, the Kindle. It sold e-books at a substantial discount: most new books were sold at $9.99, compared to the $26 average retail print book price. Amazon's existing relationship with publishers also enabled it to offer a wide variety of e-books. The number of e-books available in the Kindle Store increased from 126,630 in 2008 to 1,429,500 in 2012⁴. Both the low prices and the large availability of e-books helped Amazon establish its dominant position and boosted the industry from a $20.0 million business in 2006 to $969.9 million in 2011⁵. Amazon offers a new generation of Kindle every year. The price of a particular Kindle model drops over time, while the most popular version each year has increasing quality and a stable price of around $139-$199. There are competitors: Barnes and Noble released its e-reader, the Nook, in November 2009, and Apple started selling e-books on the iPad in April 2010. The Nook is a dedicated e-reader with screen, size, functionality, and price comparable to the Kindle's. The iPad, on the other hand, separated itself from the e-readers with its multi-purpose nature and a much higher price. While Amazon's market share remains in the range of 60%-70%, the other two players captured 25% of the market at their peak⁶.
In terms of e-book pricing, Amazon and publishers signed a wholesale contract in the early stage, under which Amazon set e-book prices and paid a wholesale price to the publisher. The contract scheme switched to the agency model initiated by Apple in 2010: publishers set e-book prices and the platform took a 30% share of book revenue. The e-book prices of New York Times bestsellers increased from $9.99 to $12.99-$14.99 after the contract change. In April 2012, the United States Department of Justice sued Apple and five major book publishers, accusing them of colluding to raise e-book prices. The contract scheme has since been changed back to the wholesale contract.
⁴ http://ilmk.wordpress.com/category/analysis/snapshots/
⁵ Source: Association of American Publishers
⁶ More supporting evidence is in Appendix C.
In my model, the market is defined as the total online book market, including both Amazon.com and other websites. Amazon.com is a monopolist in terms of e-reading and sells its e-reader and e-books to consumers, while consumers can buy paperbacks on both Amazon and other websites. I focus on the Amazon Kindle because it was the dominant e-reader during the sample years 2008-2012⁷. I abstract away from offline book sales, such as those at local book stores, for data availability reasons. Also, while total book sales in 2011 were $13.7 billion, sales on Amazon.com were $7.96 billion⁸. As Amazon.com is the major book seller for both e-books and paperbacks, my model is able to cover a large portion of the market⁹. In practice, the impact of e-books on traditional print book sales is two-fold: on one hand, book readers switch from offline purchases to online purchases; on the other hand, online book shoppers switch from other websites to Amazon.com. The cannibalization and market expansion effects I am able to capture here should be interpreted within the latter scope: the impact of Amazon's e-books on the total online book business. If offline purchases are also considered, the estimated cannibalization can be larger. Amazon.com lists a print book and its e-book version side by side on the product pages. This is a good setting for studying substitution patterns because buyers can easily become aware of the competing e-book offers. I assume that consumers read e-books only on e-readers, not on other screens such as PCs and tablets. Survey results show that the e-reader is the dominant device used and that consumers using a dedicated e-reader contribute the most to e-book consumption. More supporting evidence is in Appendix C.
3.2 Data Description
I combine three unique data sets, two of which come from comScore. The comScore Web Behavior Database captures detailed browsing and buying behavior of 100,000 Internet users across the United States at the domain level. The panel is based on a random sample from a cross-section of more than 2 million global Internet users who have given comScore explicit permission to confidentially capture their Web-wide activity. It is weighted so that the distribution of demographics matches that of the comparable universe population. ComScore conducts various validity tests to verify its representativeness of the Internet user population.
The first is an individual-level data set of consumer online book purchase histories from 2008 to 2012 gathered by comScore. Each consumer is identified by a machine id, which indicates the device he uses to access the website. The population universe is the online shoppers sampled by comScore. Among them, around 40% buy at least one book in a year. For these book buyers, I observe the purchase time, the book title, the price, and the quantity. For the non-buyers, the quantity of books bought is zero. The data include demographics such as household income, age, family size, zip code, etc. There are 54,155 book purchase records made by 15,751 households over the 5 years. I drop the consumers who buy more than 25 books a year (the 99% quantile) in my estimation. The households are re-sampled every year, so I treat the 5-year data set as cross-sectional¹⁰ and define the time period to be a year. Table 1 shows summary statistics of the book buyers' demographics. The non-buyers have similar characteristics. There is no significant difference between shoppers on and outside Amazon.com.
The book format choice - whether consumers buy the titles in paperback or e-book format - is a key variable in my model. ComScore reports the format information starting from 2011, so I observe the format for the last two years of my sample. In those two years, 1,172 consumers bought at least one e-book and 3,870 e-books were bought in total.
⁷ The first Kindle was launched in November 2007, while the first Barnes & Noble Nook came onto the market in October 2009. Another competitor, Apple, started to sell e-books on iTunes after the introduction of the iPad in April 2010. So the Kindle enjoyed a two-year monopoly in the early stage. It is still the dominant e-reader now. According to the survey conducted by the Pew Research Center in January 2012, 62% of e-reader owners have a Kindle and 22% have a Nook. The third biggest player, Pandigital, accounts for only 2% of the market. For more information, see Appendix C.
⁸ http://www.fonerbooks.com/booksale.htm
⁹ The Kindle is the dominant e-reader in 2011. According to the survey conducted by the Pew Research Center in January 2012, 62% of e-reader owners have a Kindle and 22% have a Nook. The third biggest player, Pandigital, accounts for only 2% of the market.
¹⁰ Since the cross-sectional data set does not provide information prior to the sampled year, I take a probabilistic view of consumer device adoption status. The probability that a consumer owns a Kindle at the beginning of the period is the aggregate penetration rate of the Kindle.
I need more information about the books purchased in the first data set, the book genre in particular, to conduct the empirical analysis, so I collect another unique complementary data set. For each title purchased in the first data set, I collect price, rating, number of comments, ranking, genre¹¹, publishing date, and other book characteristics (e.g. ISBN, publisher, author) for both formats - paperback and e-book. These are publicly available on the Amazon website. In total, there are 91,487 pieces of title-format information. Merging the book characteristic data set with the individual book purchase records, I get the genres of the books purchased. I further group the genres into three categories - lifestyle, casual and practical¹² - and use this coarser definition in my empirical estimation. I do this because (1) genres within the same category intrinsically share a similar reading purpose, and there is substantial similarity in consumer purchase patterns, which are of major interest for the empirical analysis; and (2) average book characteristics such as prices are much closer within a category. I then aggregate the book purchase records by category-format to form the number of books purchased in each category-format for each consumer. For instance, consumer i bought two casual e-books and three practical paperbacks in 2011. I build my model to fit this observed purchase quantity. I average over the observed book prices within each category-format to obtain the sales-weighted prices used in the empirical analysis. Casual books are the cheapest, with an average e-book price of $8.1 and paperback price of $13.8. Practical books, not surprisingly, are the most expensive, with an average e-book price of $11.6 and paperback price of $27.5.
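The aggregation step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the record layout, the consumer identifiers, and all prices and quantities are hypothetical, and "sales-weighted price" is interpreted here as total spending divided by total units within a category-format cell.

```python
from collections import defaultdict

# Hypothetical purchase records: (consumer_id, category, book_format, price, quantity).
records = [
    ("c1", "casual", "ebook", 7.99, 2),
    ("c1", "practical", "paperback", 30.00, 3),
    ("c2", "casual", "ebook", 9.99, 1),
    ("c2", "casual", "paperback", 14.00, 1),
]


def aggregate(records):
    """Return per-consumer quantities and sales-weighted prices by category-format."""
    quantity = defaultdict(int)    # (consumer, category, format) -> books bought
    revenue = defaultdict(float)   # (category, format) -> total spending
    units = defaultdict(int)       # (category, format) -> total books sold
    for consumer, category, book_format, unit_price, qty in records:
        quantity[(consumer, category, book_format)] += qty
        revenue[(category, book_format)] += unit_price * qty
        units[(category, book_format)] += qty
    # Sales-weighted price: total spending divided by total units in the cell.
    avg_price = {cell: revenue[cell] / units[cell] for cell in units}
    return dict(quantity), avg_price


quantity, avg_price = aggregate(records)
print(quantity[("c1", "casual", "ebook")])
print(round(avg_price[("casual", "ebook")], 2))
```

Weighting by units sold means that heavily purchased titles move the category-format price more than rarely purchased ones, which is the sense of "sales-weighted" assumed in this sketch.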
The third data set contains individual-level Kindle purchase data for the years 2008-2012. I observe the time of purchase, the price, and the quantity. The individual demographics are also available. I use the price of the most popular Kindle version in each period in the empirical estimation. I also obtain the number of e-books available every year in the Kindle Store from a widely cited blog that is popular among Kindle lovers¹³.
To summarize, I build a model to fit individual book quantity and format choices, as well as Kindle purchase decisions. The time period is a year and I have 5 years of cross-sectional data. The book prices and the consumer's quantity choice in each category-format come from merging the first and second data sets. An observation is the quantity of books a consumer purchases in each category-format in a year. The individual-level Kindle purchase records come from the third data set. For the years 2008-2010, I observe the total number of books bought in each category. I do not know the format of the books bought or whether consumers are Kindle owners. For the years 2011-2012, I observe the number of books bought in each category-format.
3.3 Data Patterns
The model specification is strongly motivated by the data patterns I observe. In this section, I list a number of tabulation results and reduced-form regression results about consumer book purchase and Kindle adoption patterns.
¹¹ The genre definition closely follows what Amazon uses on its website. The original ten genres are (1) lifestyle and home related, including Lifestyle & Home, Fitness & Dieting, Cookbooks, Cooking, Food & Wine, Crafts, Hobbies & Home, Travel; (2) Business & Investing; (3) Fiction; (4) Children's Book; (5) Comics & Graphic Novels, Arts & Photography; (6) Professional & Technical, Computers & Technology; (7) Religion & Spirituality, Christian Books & Bibles; (8) Nonfiction; (9) Education & Reference, Textbooks, Medical Books; (10) Biographies & Memoirs.
¹² (1) The lifestyle category, including Lifestyle & Home, Cooking, Travel, Fitness & Dieting, Crafts, Hobbies & Home, Arts & Photography, Children's Book etc.; (2) the casual category, including Fiction, Science Fiction, Humor, Non-Fiction, Biographies & Memoirs etc.; (3) the practical category, including Computers & Technology, Business & Investing, Medical Books, Education & Reference etc.
¹³ http://ilmk.wordpress.com/category/analysis/snapshots/. It has taken monthly snapshots since 2009 and reports information such as the number of titles available, the number of free books/textbooks/magazines/newspapers, the number of books priced at $0.01-$50, $0.01-$10, $0.01-$2.98, and $2.99-$9.99, the price of New York Times Hardback Fiction Equivalents, etc. I check other press sources on the number of e-books available to validate the numbers on this blog. I linearly extrapolate the data to 2008 and average over the monthly data to form the yearly data.
(i) How many books do consumers buy per year?
Data Pattern 1: There is significant heterogeneity in consumers' general reading taste in this market.
In my data sample, 60% of the online shoppers do not buy any books. Among the book buyers, the 13.8% of consumers who buy more than five books a year account for nearly half (46.8%) of total book purchases. The histogram of the number of books bought by each book buyer per year is highly right-skewed, with most of the mass at low quantities, a median of 2, and a mean of 3.64.
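The concentration statistics behind Data Pattern 1 can be computed as in the sketch below. The purchase counts are hypothetical and chosen only for illustration; the function name and the threshold argument are likewise assumptions, not part of the paper's analysis.

```python
from statistics import mean, median


def heavy_buyer_stats(books_per_buyer, threshold=5):
    """Share of buyers above `threshold` books and their share of all purchases."""
    heavy = [n for n in books_per_buyer if n > threshold]
    buyer_share = len(heavy) / len(books_per_buyer)
    purchase_share = sum(heavy) / sum(books_per_buyer)
    return buyer_share, purchase_share


# Hypothetical annual purchase counts for ten book buyers.
counts = [1, 1, 1, 2, 2, 2, 3, 3, 6, 14]
buyer_share, purchase_share = heavy_buyer_stats(counts)
print(f"heavy buyers: {buyer_share:.1%} of buyers, {purchase_share:.1%} of purchases")
print(f"median = {median(counts)}, mean = {mean(counts)}")
```

In this toy sample, as in the data, a small group of heavy buyers accounts for a disproportionate share of purchases, and the mean lies above the median.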
(ii) What kind of books do consumers buy - categories and formats?
Data Pattern 2: Consumers are heterogeneous in their reading tastes across book categories, and these tastes are not correlated with observed household characteristics.
I tabulate the book categories each consumer has ever purchased in Table 2. Among the book buyers, 66.6% of the consumers buy only one category throughout the year. In general, consumers prefer one category over another for reasons that are unobservable to the econometrician. There is little correlation between the genres bought and the observed household characteristics. I use a finite mixture structure to capture the heterogeneous reading tastes across book categories in my model.
Data Pattern 3: There is strong substitution between e-books and paperbacks within the same category. Consumers prefer casual books in e-book format compared to the other two categories.
This suggests the importance of allowing for substitution between the two formats and for category-specific e-format utility in my model. Before looking into the details of the substitution pattern, let me first compare the two formats. E-books are generally cheaper. For 75.2% of the book titles, the e-book price is lower than the paperback price. The availability of e-books is increasing over time. The number of e-books available in the Kindle Store increases from 126,630 in 2008 to 1,429,500 in 2012¹⁴. As for a particular book title, 50% of the paperbacks have an e-book version in 2008 in my data, and the number increases to 70% in 2012¹⁵. Table 3 lists the number of e-books available and the e-book price as a percentage of the paperback price in the years 2008-2012. E-books become cheaper and more available over time. Furthermore, e-books enjoy extra advantages: 1) instant delivery: there is no shipping cost, which enables frequent shopping at zero cost; 2) lower management and storage costs, which make purchasing and storing books easier; 3) e-readers are easy to carry and convenient to read, especially while traveling, which potentially makes people read more and faster. Thus people may gain extra e-format utility from reading e-books. The low price, increasing availability, and the potential extra utility from e-reading serve as the foundation for substitution.
Now let us look at the substitution pattern between the two formats. I examine only those people who have
ever bought e-books in year 2011 and 2012. They are Kindle owners and have the choice to substitute. It turns
out that 98.66% of the households buy books of a particular category either in only one format, or not buying at
all, suggesting a strong substitution between the two formats of the same category. In Table 4, the rst column
is the percentage of households who choose to buy paperbacks in a particular category. The second column is the
percentage of households who choose to buy e-books for that category. An interesting observation is that casual
e-books are stronger substitutes of their paperback counterpart. Consumers might have dierent e-format utility
for dierent book categories. This can also be reected in the aggregate market share of book categories. In my
data sample, the market shares of lifestyle, casual and practical books in paperback format are 36%, 39% and
25% respectively, while the numbers in e-book format are 18%, 76% and 6%.
This result is consistent with the
ndings in Bounie et al. (2011): Among the bestselling print books, 23% are non-ctions, 27% are ctions, and
24% are practicals. The numbers in e-book market are 12%, 70%, and 8%. Intuitively, the convenience of e-reading
14 http://ilmk.wordpress.com/category/analysis/snapshots/
15 The number is calculated from the title-format book information
I collect from Amazon website.
It is also consistent with the
survey data. Survey conducted by Pew Institute released in April 2012 show that 70% of e-content consumers say the material they
want is always available or available most of the time (http://libraries.pewinternet.org/2012/04/04/the-rise-of-e-reading/).
7
Table 1: Summary Statistics of Household Characteristics
N
15751
Household size:
Median income range
$50,000 - $75,000
1
13.6%
Black
7.8%
2
29.5%
Hispanic
9.5%
Household oldest age:
21.0%
35.9%
18-35
19.6%
35-50
35.3%
Has children:
No
35.7%
50+
45.1%
Yes
64.3%
Census region of residence:
Notes :
3
4+
Internet connection speed:
Northeast
21.1%
Not broadband
1.4%
North Central
20.4%
Broadband
98.6%
South
34.2%
West
24.0%
The table summarizes the observed household demographics in the individual book purchase record data set.
Table 2: Tabulation of Book Categories Purchased Per Household

Categories ever purchased         Freq.      Percent
#1                                3,593      22.8
#2                                3,770      23.9
#3                                3,124      19.8
#1,#2                             1,798      11.4
#1,#3                             1,124      7.1
#2,#3                             1,060      6.7
#1,#2,#3                          1,282      8.1
Total number of households        15,751     100.0

Notes: Category #1, #2, and #3 correspond to lifestyle, casual, and practical.
Table 3: E-book Prices and Availability Over Time by Format

Year      Number of e-books        E-book price as a percentage
          in Kindle Store          of paperback price
2008      126,630                  63.3%
2009      301,630                  63.7%
2010      587,580                  59.5%
2011      958,280                  59.1%
2012      1,429,500                57.3%
is best reflected in casual reading. Books with many pictures (e.g. cookbooks in the lifestyle category) or books
that need in-depth reading (e.g. textbooks in the practical category) may even lead to disutility from e-reading.
(iii) Who is buying Kindle?
Data Pattern 4: Casual book buyers are more likely to buy Kindle.
Is there any difference between Kindle owners and non-owners? I tabulate the demographics and book purchase
patterns across the two groups of consumers in Table 5. Kindle owners are more likely to have higher income and
age. The numbers of lifestyle and practical books they buy are comparable to those of the non-owners, while they
buy substantially more casual books. The average number of casual books bought is 2.10 for a Kindle owner and
0.25 for a non-owner.
To further explore the factors that affect consumers' Kindle purchase choice, I run a probit regression using the
consumer book purchase data in years 2011 and 2012. The results are in Table 6. The dependent variable is whether
the consumer is a Kindle owner. In model specification (i), I regress it on the number of books the consumer buys
in each category. In model specification (ii), I regress it on the dummies for different purchase variety combinations.
I control for household demographics and year fixed effects. The results suggest that buying casual books is
positively correlated with Kindle adoption, while buying practical books reduces the probability of having one.
Consumers may self-select into buying Kindle. The reduced form regressions cannot distinguish causality from
correlation. I structurally model the adoption incentives in my model to help reveal the mechanism behind it.
To summarize, the data patterns I observe show the importance of capturing consumers' heterogeneous reading taste.
Consumers substitute between the two formats, and the substitution pattern differs across categories, suggesting
different e-format utilities across categories. This also potentially contributes to the decision of buying Kindle. The
model I build in the next section is strongly motivated by the data patterns I observe here.
4 Model Setup
The time period is a year. Every period, consumers make a Kindle adoption choice, and then choose how many
books to buy in each book category-format based on their device adoption status. Their expectation of the flow
utility from book purchase affects their device adoption decision. Consumers can always purchase paperbacks
whether they have a Kindle or not, but can only buy e-books when they have a Kindle. There is an outside option
of not buying books. Once a consumer buys a Kindle, he can upgrade to a new version later. The benefit of buying
a Kindle comes purely from e-book purchase: consumers will have an enlarged choice set including books in both
paperback and e-book formats. Consumers are forward-looking on device adoption decisions. In general, e-books
are cheaper and e-reading brings extra utility as it is more convenient. Consumers need to trade off between the
option value of buying the device and the disutility from paying the current device price. Intuitively, the two sides'
decisions are linked because (1) the ex-ante flow utility from book purchase affects the Kindle adoption decision; (2)
Kindle adoption status influences the book formats that consumers can choose from when buying books.
4.1 Book Purchase
There are three book categories indexed by $g = 1, 2, 3$, which stand for lifestyle, casual and practical
respectively. For each category, there are two formats: e-book and paperback. There are six products in total. I use
superscript $E$ to denote e-book and superscript $P$ for paperback. Let $\{q^P_{ig}, q^E_{ig}\}_{g=1,2,3}$ denote the
quantity of book category $g$ in paperback/e-book format, let $z$ denote the numeraire and $y_i$ denote household
income. Let $\{p^P_g, p^E_g\}_{g=1,2,3}$ denote the prices of category $g$ in paperback/e-book format.
Consumers having a Kindle choose optimal quantities of the six products $\{q^P_{ig}, q^E_{ig}\}_{g=1,2,3}$ - lifestyle
paperback, lifestyle e-book, casual paperback, casual e-book, practical paperback, practical e-book - to maximize their
utility
Table 4: Substitution Pattern Across Formats

%              Paperback Format      E-book Format
Lifestyle      23.1                  76.9
Casual         3.6                   96.4
Practical      38.4                  61.6

Notes: The table tabulates the behavior of consumers who have bought at least one e-book in year 2011 and 2012. The
first column is the percentage of households buying books in paperback format for a particular category every period. The
second column is the percentage of households buying books in e-book format for that category.

Table 5: Kindle owner vs. non-owner

Demographics                   Owners      Non-owners
Income     Low                 32.1%       34.3%
           Middle              34.6%       34.0%
           High                33.3%       31.7%
Age        Young               18.4%       23.9%
           Middle age          28.2%       32.8%
           Senior              53.4%       43.3%

# books bought
Category                       Owners      Non-owners
Lifestyle                      0.63        0.27
Casual                         2.10        0.25
Practical                      0.22        0.24
Table 6: Probit Regression on Kindle Ownership
Dependent Variable: Have Kindle in the Year

                                                  (i)           (ii)
Number of lifestyle book bought                   -0.0044
                                                  (0.0031)
Number of casual book bought                      0.0486**
                                                  (0.0027)
Number of practical book bought                   -0.0342**
                                                  (0.0037)
Dummy for buying casual book                                    0.2001**
                                                                (0.0161)
Dummy for buying practical book                                 -0.0467**
                                                                (0.0125)
Dummy for buying lifestyle and casual books                     0.1691**
                                                                (0.0133)
Dummy for buying lifestyle and practical books                  -0.0229
                                                                (0.0191)
Dummy for buying casual and practical books                     -0.0121
                                                                (0.0195)
Dummy for buying all categories                                 0.0190
                                                                (0.0205)
R-squared                                         0.138         0.127
N                                                 6800          6800

Notes: Robust standard errors are in parentheses. Only year 2011 and 2012 data are used because Kindle ownership is not
observed in earlier years. All regressions control for household characteristics like household income, household size,
household oldest age, census region of residence, having children or not, connection speed of Internet, racial background,
etc. None of the controls are significant except for the positively significant coefficient for the oldest age.
** Significant at 1 percent.
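$$ \max_{\{q^P_{ig}, q^E_{ig}\}_{g=1,2,3}} u_i = z + \sum_g \frac{1}{b_i} \left[ a^P_{ig} q^P_{ig} + a^E_{ig} q^E_{ig} - \frac{\left( q^P_{ig} + q^E_{ig} \right)^2}{2} \right] \quad \text{s.t.} \quad \sum_g \left( p^P_g q^P_{ig} + p^E_g q^E_{ig} \right) + z = y_i \tag{1} $$

where $a^P_{ig}$ and $a^E_{ig}$ are taste parameters of consumer $i$ on category $g$ in paperback/e-book format. $b_i$ is the price
coefficient. The numeraire price is normalized to be 1. There is perfect substitution between paperbacks and e-books
of the same category, while consumption across categories is independent. Solving for the optimal quantity of
books, we can get that for each category $g = 1, 2, 3$, the quantity pair $\{q^P_{ig}, q^E_{ig}\}$ should satisfy

$$ \left\{ q^P_{ig}, q^E_{ig} \right\} = \begin{cases}
\{0, 0\} & \text{if } p^P_g > \frac{a^P_{ig}}{b_i} \text{ and } p^E_g > \frac{a^E_{ig}}{b_i} \\[4pt]
\left\{ a^P_{ig} - b_i p^P_g, \; 0 \right\} & \text{if } p^P_g < \frac{a^P_{ig}}{b_i} \text{ and } p^E_g > \frac{a^E_{ig}}{b_i}, \\
 & \text{or } p^P_g < \frac{a^P_{ig}}{b_i} \text{ and } p^E_g < \frac{a^E_{ig}}{b_i} \text{ and } a^P_{ig} - b_i p^P_g > a^E_{ig} - b_i p^E_g \\[4pt]
\left\{ 0, \; a^E_{ig} - b_i p^E_g \right\} & \text{if } p^P_g > \frac{a^P_{ig}}{b_i} \text{ and } p^E_g < \frac{a^E_{ig}}{b_i}, \\
 & \text{or } p^P_g < \frac{a^P_{ig}}{b_i} \text{ and } p^E_g < \frac{a^E_{ig}}{b_i} \text{ and } a^P_{ig} - b_i p^P_g < a^E_{ig} - b_i p^E_g
\end{cases} \tag{2} $$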
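As an illustration, the decision rule in Equation (2) can be sketched in a few lines of Python. This is a minimal sketch for a single category, not the estimation code: the function name `optimal_quantities` and the `has_kindle` flag are hypothetical, and ties between the two formats (which Equation (2) leaves unspecified) are arbitrarily resolved in favor of the paperback.

```python
def optimal_quantities(a_p, a_e, b, p_p, p_e, has_kindle=True):
    """Return (q_paperback, q_ebook) for one category, following Equation (2).

    a_p, a_e: taste parameters for paperback / e-book
    b:        price coefficient
    p_p, p_e: paperback / e-book prices
    """
    # Quantity the consumer would buy of each format if it were the only one.
    net_p = a_p - b * p_p
    # Without a Kindle the e-book is not in the choice set (assumption of this sketch:
    # represent that by a net value of minus infinity).
    net_e = a_e - b * p_e if has_kindle else float("-inf")

    if net_p <= 0 and net_e <= 0:
        return 0.0, 0.0          # neither format is worth buying
    if net_p >= net_e:
        return net_p, 0.0        # paperback only (ties go to paperback by assumption)
    return 0.0, net_e            # e-book only


# Example with made-up numbers: the e-book is cheaper and slightly preferred.
print(optimal_quantities(5.0, 6.0, 0.5, 8.0, 5.0))                    # Kindle owner
print(optimal_quantities(5.0, 6.0, 0.5, 8.0, 5.0, has_kindle=False))  # non-owner
```

Because the two formats are perfect substitutes, the rule never returns positive quantities of both formats, which is the corner-solution property the quadratic specification is meant to deliver.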
The quadratic utility functional form allows for diminishing marginal utility from each book category and corner
solutions (zero consumption) (see footnote 16). Also, the optimal quantity is a linear function of prices. So if e-books
are cheaper, people will buy more e-books. Economides, Seim and Viard (2008) adopt a quadratic functional form in their
local phone service paper where local and regional calls are independently consumed. My specification differs in the sense
that I allow for substitution between the two formats of the same category, which is crucial for me to investigate
the cannibalization and market expansion effects (see footnote 17).
Consumers who do not have a Kindle cannot buy e-books. They have only three products to
choose from: lifestyle paperback, casual paperback, and practical paperback.

$$ \max_{\{q^P_{ig}\}_{g=1,2,3}} u_i = z + \sum_g \frac{1}{b_i} \left[ a^P_{ig} q^P_{ig} - \frac{\left( q^P_{ig} \right)^2}{2} \right] \quad \text{s.t.} \quad \sum_g p^P_g q^P_{ig} + z = y_i \tag{3} $$

The optimal quantity is

$$ q^P_{ig} = \begin{cases} a^P_{ig} - b_i p^P_g & \text{if } p^P_g < \frac{a^P_{ig}}{b_i} \\ 0 & \text{otherwise} \end{cases} \tag{4} $$

16 Notice that this model specification cannot generate a positive number of books bought for both formats at the optimum. In the data
set, 98.66% of the households buy books of a particular category in only one format, or do not buy at all. The remaining 1.34% of consumers
buy positive quantities of both formats. I assume that for those consumers, there are two shopping occasions in a period, so that the
event $(q^P, q^E)$ is treated as two independent events $(q^P, 0)$ and $(0, q^E)$. This would overestimate the substitution between
paperbacks and e-books as the true substitution pattern is weaker than perfect substitution.
17 I also try alternative specifications. CES utility cannot generate perfect substitution results as the optimal solution is always to
consume positive amounts of both formats, which contradicts the observed data pattern. Other polynomial specifications with more
parameters only lead to a more complicated functional form of the optimal quantity choice, but the general form remains. So I stay
with the quadratic utility functional form here.
I parametrize the individual-specific demand intercepts $a_{ig}$ as

$$ a^T_{ig} = \alpha^T_{ig} + v^T_{ig}, \quad \text{for } T = P, E \tag{5} $$

where the deterministic part is

$$ \alpha^T_{ig} = \rho_{ig} + \gamma_D D^{age}_i + \left[ \theta^E_g + \beta_{age} D^{age}_i + \beta_n \log n^E \right] \cdot 1\{\text{ebook}\} $$

The price coefficient $b_i$ is parametrized as

$$ b_i = b + \beta_{income} D^{income}_i $$

where $D^{income}_i$ is household income. This allows consumers' price coefficient to vary across income groups.
$\rho_{ig}$ is the reading taste or the fixed effect for category $g$, $D^{age}_i$ allows consumers' reading taste to differ by age,
$\theta^E_g$ is the e-format utility for category $g$, and $n^E$ is the number of e-books available, which is the only time-varying
component for a particular household. This model specification is motivated strongly by the data patterns I observe. There are
two parts of the utility: the first two terms are shared by the two formats in the same category, while the terms in
the bracket are e-book format exclusive.
First, households are heterogeneous in their observable and unobservable characteristics. The household's observable
characteristics include household income and household oldest age. The unobservable characteristics are
their category-specific reading taste $\rho_{ig}$ and an i.i.d. normally distributed error term $v^T_{ig}$ with mean zero and
variance $\sigma^2$ (see footnote 18). This is motivated by the observed data pattern: there is significant heterogeneity in consumer
reading taste for different categories and it is not correlated with observed household characteristics. The cross-sectional
data set does not allow me to estimate a set of reading taste fixed effects for each consumer, so I model
it using a finite mixture specification. To determine the number of different segments in the market, I proceed
by adding segments to the model until one of the segment sizes is not statistically different from zero. Besanko,
Dubé, and Gupta (2003) and Nair (2007) have taken a similar approach. The data reveal four distinct segments:
$\{\rho^{small}_1, \rho^{large}_2, \rho^{small}_3\}$, $\{\rho^{large}_1, \rho^{small}_2, \rho^{large}_3\}$, $\{\rho^{large}_1, \rho^{large}_2, \rho^{large}_3\}$ and
$\{\rho^{small}_1, \rho^{small}_2, \rho^{small}_3\}$, with population mass $\{m_1, m_2, m_3, 1 - m_1 - m_2 - m_3\}$ (see footnote 19).
There are 9 observed types (3 income groups times 3 age groups) and 4
unobserved types in total. The unobserved types and the observed types are independent.
Secondly, there is strong substitution between e-books and paperbacks according to observed data pattern 3.
Once people adopt a Kindle, they buy e-books 85.3% of the time, suggesting the potential existence of extra
e-format utility. I allow the e-format extra utility $\theta^E_g$ to be category-specific because I observe that consumers buy more
casual books in e-book format than the other two categories. The e-format extra utility can change with the log
number of e-books available $\log n^E$ and household age $D^{age}_i$. As the availability of e-books increases over time,
consumers are allowed to value the e-format more. Senior consumers are in general less tech-savvy and can have
different e-format utility compared with young consumers. All the components in the deterministic part of the taste
parameter $a_{ig}$ affect consumer book quantity choice. Recall that the demand function for each category is of the
form $a_{ig} - b_i p_g$, so people respond to a price drop of books in each category-format. $\beta_{income}$ is pinned down by the
variation of price elasticity across income groups.
18 In a robustness check, I relax the assumption that $v^T_{ig}$ is i.i.d. by allowing the error terms of the two formats within the same category
to be correlated. The implied cannibalization and market expansion effects, and price elasticities are very stable with respect to this
specification change. So I keep the i.i.d. assumption here.
19 The tabulation and reduced form results on Kindle adoption suggest that the most relevant relative reading taste is whether
consumers are casual reading lovers or not. Also the numbers of lifestyle and practical books bought are positively correlated. So
$\rho_{i2}$ can take two levels $\rho^{large}_2$ and $\rho^{small}_2$, and $\{\rho_{ig}\}_{g=1,3}$ can take two combinations of levels
$\{\rho^{small}_g\}_{g=1,3}$ and $\{\rho^{large}_g\}_{g=1,3}$.
Let subscript 0 denote not having a Kindle and subscript 1 denote having one. $f_{i0}$ is the ex-ante flow utility
from book purchase without Kindle and $f_{i1}$ is the flow utility with Kindle. Substituting the optimal book purchase
quantity and format decision into the utility function and taking expectation over the error terms $v_{ig}$ in $a_{ig}$, the
ex-ante indirect utilities are

$$ f_{i0} = y_i + \sum_g E\left[ \frac{\left( a^P_{ig} - b_i p^P_g \right)^2}{2 b_i} \;\middle|\; q^P_{ig} > 0 \right] \Pr\left( q^P_{ig} > 0 \right) \tag{6} $$

$$ f_{i1} = y_i + \sum_g E\left[ \frac{\left( a^T_{ig} - b_i p^T_g \right)^2}{2 b_i} \;\middle|\; q^T_{ig} > 0, \; q^{-T}_{ig} = 0 \right] \Pr\left( q^T_{ig} > 0, \; q^{-T}_{ig} = 0 \right), \quad T = P, E \tag{7} $$

The two equations above contain conditional expectations of a truncated normal error and its quadratic term,
where for Equation (7) the truncation point is a result of a maximization operator. I use a quadrature method to
calculate it. The details are in the appendix.
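To make the quadrature step concrete, the following minimal sketch evaluates one category's contribution to Equation (6), that is $E[(a - bp)^2 / (2b) \mid q > 0] \Pr(q > 0)$ with $a = \alpha + v$ and $v \sim N(0, \sigma^2)$. The paper's actual quadrature rule is described in its appendix, which is not reproduced here, so composite Simpson's rule is used purely as a stand-in; the function names, the number of steps and the truncation of the upper limit at roughly ten standard deviations are all assumptions of this sketch.

```python
import math


def normal_pdf(v, sigma):
    """Density of N(0, sigma^2) evaluated at v."""
    return math.exp(-0.5 * (v / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def expected_surplus(alpha, b, p, sigma, n_steps=2000, width=10.0):
    """One category's term in Equation (6), by composite Simpson's rule.

    Integrates (alpha + v - b*p)^2 / (2b) against the normal density over the
    region where the paperback is bought, i.e. v > b*p - alpha.
    n_steps must be even.
    """
    lower = b * p - alpha                      # purchase threshold for the taste shock
    upper = max(lower, 0.0) + width * sigma    # density is negligible beyond this point
    h = (upper - lower) / n_steps

    def integrand(v):
        return (alpha + v - b * p) ** 2 / (2.0 * b) * normal_pdf(v, sigma)

    total = integrand(lower) + integrand(upper)
    for k in range(1, n_steps):
        weight = 4.0 if k % 2 else 2.0         # Simpson weights 1, 4, 2, 4, ..., 4, 1
        total += weight * integrand(lower + k * h)
    return total * h / 3.0


# Example with made-up parameter values.
print(expected_surplus(alpha=3.0, b=0.5, p=4.0, sigma=1.0))
```

For this single-format case the integral also has a closed form, $[(c^2 + \sigma^2)\Phi(c/\sigma) + c\,\sigma\,\phi(c/\sigma)] / (2b)$ with $c = \alpha - bp$, which is a convenient check on the numerical routine. The two-format term in Equation (7) has no such simple form because the truncation point depends on the other format's shock.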
4.2 Device side
Consumers are forward-looking and hence can delay purchase. They also make upgrading decisions if they
already have a Kindle in stock. I assume that consumers only use one Kindle at a time and there is no resale value
for Kindle. I also assume that Amazon offers one Kindle version per period and I take the most popular version
of Kindle bought in the data (see footnote 20) as the version in my model. Given the ex-ante flow utilities of books with the current
device status and the offered one, consumers make Kindle adoption and upgrading decisions at the beginning of the
period. Consumers have 1-period perfect foresight about both the book side characteristics and device side Kindle
price and quality (see footnote 21). For years beyond my sample period 2008-2012, I assume that these variables stop evolving and
stay at the same value as in the last sample period. This is consistent with the commonly observed new technology
diffusion pattern as the e-book market grows relatively mature after five years.
A consumer who does not have a Kindle at the beginning of time $t$ receives utility

$$ u_{i0t} = \sigma^f f_{i0t} + \varepsilon_{i0t} \tag{8} $$

Here the flow utility from book purchase enters with a coefficient $\sigma^f$. The idiosyncratic shock $\varepsilon_{i0t}$ is an
independently and identically distributed extreme value type I error. I normalize the variance of the error to be 1 (see footnote 22).
20 In practice, Amazon launches a new generation of Kindle almost every year. Consumers are offered up to 2 generations of Kindle
every year, except for year 2012 where 3 generations (Kindle 3 Keyboard, Kindle Fire, Kindle Paperwhite) are on the market. The most
popular generation that consumers buy every year accounts for at least 70% of the sales. So I assume one product is offered at a time in my
model. Another reason is that in the policy implication section, I solve for Amazon's optimal Kindle pricing problem. Multi-product
firm pricing is computationally prohibitive. Goettler and Gordon (2011) also make this single-product assumption for computational
reasons.
21 In an earlier version of the paper, I use a rational expectation assumption where consumers expect that the variables follow an AR(1)
process. The coefficients in the AR(1) model are empirically estimated using the observed data. The results are robust to different
consumer expectation assumptions. This is because instead of varying a lot, the observed book variety, prices and Kindle prices follow
a nearly linear trend: the number of e-books available increases and the e-book prices drop steadily, and the Kindle price drops by $50
on average every year. The estimated variance of the error term in the AR(1) model is small, which reduces the difference between the
rational expectation assumption and the perfect foresight assumption. Intuitively, it is commonly observed that price drops and quality
increases in a digital industry. The e-book industry is relatively new to consumers, but they can form rational expectations over the
price and quality change based on their knowledge about other industries. Another reason to use the perfect foresight assumption is that
in the supply side counterfactuals, I solve for the optimal pricing problem with both consumers and the firm forward-looking. Consumers'
expectation about future prices and the firm's optimal pricing strategy given consumers' optimal adoption behavior need to be consistent
in equilibrium. Assuming perfect foresight enables the internal consistency of the model.
22 Notice that I can also normalize it in another way by dropping the coefficient $\sigma^f$ and estimating the variance of the error term.
The two approaches are equivalent and I take the first one. The coefficient $\sigma^f$ is thus identified by the observed variation in Kindle
adoption decisions, similar to how the variance of the error term is identified in the second normalization approach.
If she has a Kindle and does not upgrade this period, she receives utility

$$ u_{i0t} = \sigma^f f_{i1t} + k_{0t} + \varepsilon_{i0t} \tag{9} $$

If she buys a Kindle/upgrades in the current period, she receives utility

$$ u_{i1t} = \sigma^f f_{i1t} + k_{1t} - \alpha^{HW}_i P_t + \varepsilon_{i1t} \tag{10} $$

where $P_t$ is the Kindle price. $\alpha^{HW}_i = \alpha^{HW} + \beta^{HW}_{income} D^{income}_i$ is the price coefficient, which is allowed to
vary across income groups. $k_{0t}$ is the quality dummy for the Kindle she owns at the beginning of the period. $k_{1t}$ is the
quality dummy for the new version of Kindle offered in the current period. I use a log time trend $k_t = t_0 + t_1 \log t$ to
fit the quality variable over time. Consumers are assumed to have perfect foresight on both the price and quality.
I drop the type subscript $i$ and time subscript $t$ for notation simplicity. Let prime denote the next period value.
There are six values of book reading utilities for each type of consumers depending on their Kindle ownership
status: $\{\sigma^f f_0, \; \sigma^f f_1 + k_{2008}, \; \sigma^f f_1 + k_{2009}, \; \sigma^f f_1 + k_{2010}, \; \sigma^f f_1 + k_{2011}, \; \sigma^f f_1 + k_{2012}\}$, standing for no Kindle and
having the five generations of Kindle respectively.
For each type of consumer, the state space includes (1) current device adoption status, or the current holding
Kindle quality $\bar{k}$; (2) book prices and e-book variety $\Omega$, which enter the ex-ante flow utility from book reading $f_0$
and $f_1$; (3) offered Kindle price $P$ and quality $k$; (4) the idiosyncratic shocks on the device side
$\vec{\varepsilon} \equiv \{\varepsilon_0, \varepsilon_1\}$ (see footnote 23). Let $V(\bar{k}, \Omega, P, k, \vec{\varepsilon})$ denote the value function of a consumer with
current device status $\bar{k}$ at the beginning of the period. $D = 1$ indicates that she chooses to buy a Kindle/upgrade and
$D = 0$ indicates that she does not buy/upgrade. Define $\bar{\xi} = \sigma^f f_1(\Omega) + k_0$ as the current utility from reading books, and
$\xi = \sigma^f f_1(\Omega) + k_1$ as the offered utility from reading books. The Bellman equation is

$$ V(\bar{k}, \Omega, P, k, \vec{\varepsilon}) = \max \Big\{ \bar{\xi}(\bar{k}, \Omega) + \delta E\left[ V(\bar{k}, \Omega', P', k', \vec{\varepsilon}\,') \mid \bar{k}, \Omega, P, k, D = 0 \right] + \varepsilon_0, \;\; \xi(k, \Omega) - \alpha^{HW} P + \delta E\left[ V(k, \Omega', P', k', \vec{\varepsilon}\,') \mid \bar{k}, \Omega, P, k, D = 1 \right] + \varepsilon_1 \Big\} \tag{11} $$
The first element of the max operator is the choice-specific value function of waiting and the second is the
choice-specific value function of buying/upgrading. Conditional on buying/upgrading, the device adoption status in
the state space evolves deterministically from $\bar{k}$ to $k$. If not buying/upgrading, the device adoption status remains
at $\bar{k}$. The rest of the state space $\{\Omega, P, k\}$ evolves to $\{\Omega', P', k'\}$ based on the consumer's perfect foresight about
next period market characteristics. Notice that the Kindle quality $k$ and the book side characteristics $\Omega$ enter the
Bellman equation only through $\xi$. The Bellman equation can be rewritten as

$$ V(\bar{\xi}, \xi, P, \vec{\varepsilon}) = \max \Big\{ \bar{\xi} + \delta E\left[ V(\bar{\xi}, \xi', P', \vec{\varepsilon}\,') \mid \xi, P, D = 0 \right] + \varepsilon_0, \;\; \xi - \alpha^{HW} P + \delta E\left[ V(\xi, \xi', P', \vec{\varepsilon}\,') \mid \xi, P, D = 1 \right] + \varepsilon_1 \Big\} \tag{12} $$

where $\bar{\xi} = \xi(\bar{k}, \Omega)$ is the flow utility with the stock Kindle, $\xi = \xi(k, \Omega)$ is the flow utility with the currently
offered Kindle, and $\xi' = \xi(k', \Omega')$ is the flow utility with the future offered Kindle.
23 Notice that given $\Omega$ and the consumer type parameters, I can calculate the ex-ante flow utility from book reading $f_0$ and $f_1$ directly
without using the observed quantity choices. This is useful because I do not observe the corresponding book purchase history for each
individual in the Kindle adoption data set. Still, I can calculate their flow utility from book reading ex-ante and form their Kindle
adoption probability.
Assume $\vec{\varepsilon}$ is an independently
distributed extreme value type I error with density $g(\vec{\varepsilon})$. Let $EV(\cdot, \Omega) = \int_{\vec{\varepsilon}} V(\cdot, \Omega, \vec{\varepsilon}) \, dg_{\vec{\varepsilon}}$ denote the expectation
of the value function integrated over $\vec{\varepsilon}$. Then, applying the logit aggregation in Rust (1987) to the Bellman equation,
I can get the expected value function equation

$$ EV(\bar{\xi}, \xi, P) = \ln \Big\{ \exp\left( \bar{\xi} + \delta E\left[ V(\bar{\xi}, \xi', P') \mid \xi, P, D = 0 \right] \right) + \exp\left( \xi - \alpha^{HW} P + \delta E\left[ V(\xi, \xi', P') \mid \xi, P, D = 1 \right] \right) \Big\} \tag{13} $$
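There is one expected value function for each type at each period. For the years beyond my sample period, assume
that price and quality stay the same after year 2012:

$$ EV(\bar{\xi}, \xi, P) = \ln \Big\{ \exp\left( \bar{\xi} + \delta E\left[ V(\bar{\xi}, \xi, P) \mid \xi, P, D = 0 \right] \right) + \exp\left( \xi - \alpha^{HW} P + \delta E\left[ V(\xi, \xi, P) \mid \xi, P, D = 1 \right] \right) \Big\} \tag{14} $$

The probability of buying a Kindle or upgrading is

$$ \Pr\left( D = 1 \mid \bar{\xi}, \xi, P \right) = \frac{A}{A + B} $$

$$ A = \exp\left( \xi - \alpha^{HW} P + \delta E\left[ V(\xi, \xi', P') \mid \xi, P, D = 1 \right] \right) $$

$$ B = \exp\left( \bar{\xi} + \delta E\left[ V(\bar{\xi}, \xi', P') \mid \xi, P, D = 0 \right] \right) $$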
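The stationary case in Equation (14) is simple enough to solve by hand-rolled value function iteration, which may help in reading the recursion. The sketch below is illustrative only: it treats one consumer type facing a constant offer $(\xi, P)$ forever, the function names are hypothetical, and the numerical values in the example are made up. When the consumer already holds the offered device ($\bar{\xi} = \xi$), Equation (14) reduces to $EV = \xi + \delta EV + \ln(1 + e^{-\alpha^{HW} P})$, which gives a closed form used as the continuation value after purchase.

```python
import math


def logsumexp2(x, y):
    """Numerically stable log(exp(x) + exp(y))."""
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def solve_stationary_ev(xi_bar, xi, price, alpha_hw, delta, tol=1e-12, max_iter=10000):
    """Solve Equation (14) for a consumer holding xi_bar when (xi, price) never change.

    Returns (ev_wait, ev_new): the expected value with the current device and the
    expected value once the offered device is owned.
    """
    # Closed form for the state in which the held device equals the offered one.
    ev_new = (xi + math.log1p(math.exp(-alpha_hw * price))) / (1.0 - delta)
    buy_value = xi - alpha_hw * price + delta * ev_new

    ev = 0.0
    for _ in range(max_iter):
        ev_next = logsumexp2(xi_bar + delta * ev, buy_value)   # right-hand side of (14)
        if abs(ev_next - ev) < tol:
            return ev_next, ev_new
        ev = ev_next
    raise RuntimeError("value function iteration did not converge")


def purchase_probability(xi_bar, xi, price, alpha_hw, delta):
    """Pr(D = 1) = A / (A + B), written in terms of the log odds for stability."""
    ev_wait, ev_new = solve_stationary_ev(xi_bar, xi, price, alpha_hw, delta)
    log_a = xi - alpha_hw * price + delta * ev_new
    log_b = xi_bar + delta * ev_wait
    return 1.0 / (1.0 + math.exp(log_b - log_a))


# Example with made-up values: a lower device price raises the adoption probability.
print(purchase_probability(1.0, 1.5, 150.0, 0.02, 0.9))
print(purchase_probability(1.0, 1.5, 100.0, 0.02, 0.9))
```

The iteration converges because the right-hand side of Equation (14) is a contraction in the expected value with modulus at most $\delta$. In the non-stationary sample years the same update would be applied backward from the stationary solution, one year at a time.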
Intuitively, consumers are motivated to buy a Kindle for three reasons: the current period book purchase need,
a desirable current device price, and the option value of device adoption. To see this, take the difference of the two
choice-specific value functions:

$$ \xi - \bar{\xi} - \alpha^{HW} P + \delta \left( E\left[ V(\xi, \xi', P') \mid \xi, P, D = 1 \right] - E\left[ V(\bar{\xi}, \xi', P') \mid \xi, P, D = 0 \right] \right) $$

The first term, $\xi - \bar{\xi}$, represents the benefit from an enlarged book format choice set in the current period for a Kindle
non-owner. For an upgrader, it represents the benefit from the Kindle quality increase. If the utility increase is relatively
high compared to the current utility, she is more likely to buy one/upgrade in that period. The second term indicates
that consumers will respond to a Kindle price drop. The third term is the option value, which can be seen as the
discounted utility gain from Kindle in the future. This allows consumers to self-select in the sense that consumers
who gain more from having a Kindle will buy one earlier.
4.3 Likelihood Function
The probabilities that enter the likelihood function come from both the device side and book side. The total
log likelihood is $L = L^{Kindle} + L^{book}$.
Device Side
For each observed type of consumers (9 = 3 age groups * 3 income groups), I take the number of people shopping
on Amazon.com every year in my data sample $N$ as the potential market size for Kindle. Denote by $n_{1t}$ the observed
number of consumers buying/upgrading Kindle at time $t$ from the Kindle purchase data set. Denote by $n_{0t} = N - n_{1t}$
the number of consumers who do not buy/upgrade in period $t$.
The model gives the probability of buying/upgrading conditional on the current stock of Kindle $\bar{\xi}$ and the
current period offer $\xi$ and $P$: $\Pr(D = 1 \mid \bar{\xi}, \xi, P)$, or $\Pr(D = 1 \mid \bar{\xi}, t)$ as there is only one set of $\xi, P$ for each $t$. For
each type of consumer, it is a 5 by 6 matrix - the rows are the years 2008-2012 and the columns are the Kindle
ownership status from not having a Kindle to having the five versions of Kindle respectively. This conditional
purchase probability matrix gives us the probability of purchase for each year/ownership status. Based on this, I
can calculate the probability of being in each ownership status at each period $\Pr(\bar{\xi}, t)$. It is again a 5 by 6 matrix.
Finally, the probability that I observe a purchase at period $t$ is

$$ \Pr(D = 1 \mid t) = \sum_{\bar{\xi}} \Pr\left( D = 1 \mid \bar{\xi}, t \right) \Pr\left( \bar{\xi}, t \right) $$

The device part of the log likelihood function for each type of consumer is

$$ L^{Kindle} = \sum_{t=2008}^{2012} \Big[ n_{1t} \log\left[ \Pr(D = 1 \mid t) \right] + n_{0t} \log\left[ 1 - \Pr(D = 1 \mid t) \right] \Big] $$

Summing over the 9 observed types and integrating over the 2 unobserved types, we get the final log likelihood function.
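The bookkeeping behind $\Pr(\bar{\xi}, t)$ and $L^{Kindle}$ can be sketched as follows for a single consumer type. This is a hedged illustration rather than the paper's code: the function names are hypothetical, and the sketch assumes that every consumer starts the first sample year without a Kindle and that a buyer in year $t$ ends up holding the generation offered in year $t$.

```python
import math


def purchase_prob_by_year(cond_prob):
    """Unconditional Pr(D = 1 | t) for each year.

    cond_prob[t][s] is Pr(D = 1 | ownership status s, year t), where status 0 means
    no Kindle and status s >= 1 means owning the generation offered in year s - 1.
    With 5 years this is the 5 by 6 matrix described in the text.
    """
    n_years = len(cond_prob)
    n_status = n_years + 1
    # Assumption of this sketch: nobody owns a Kindle before the first year.
    share = [1.0] + [0.0] * n_years
    probs = []
    for t in range(n_years):
        p_t = sum(cond_prob[t][s] * share[s] for s in range(n_status))
        probs.append(p_t)
        # Non-buyers keep their status; all buyers move to the newly offered generation.
        share = [share[s] * (1.0 - cond_prob[t][s]) for s in range(n_status)]
        share[t + 1] += p_t
    return probs


def kindle_log_likelihood(cond_prob, n_buy, n_total):
    """Device-side log likelihood for one type: n_buy[t] buyers out of n_total per year."""
    probs = purchase_prob_by_year(cond_prob)
    return sum(
        n1 * math.log(p) + (n_total - n1) * math.log(1.0 - p)
        for n1, p in zip(n_buy, probs)
    )


# Example with made-up probabilities: only non-owners buy, each year with probability 0.5.
example = [[0.5, 0.0, 0.0, 0.0, 0.0, 0.0] for _ in range(5)]
print(purchase_prob_by_year(example))
```

In the full likelihood this quantity would be computed separately for each observed type and then mixed over the unobserved types using the population masses.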
Book Side
In the data, I observe people not buying or buying different numbers of books in each category-format. The model
generates the probabilities of each possible purchase pattern through the conditions that the error terms should
satisfy. Define the realized error terms given quantity choice as $v(q^P_{ig}) \equiv q^P_{ig} + b_i p^P_g - \alpha^P_{ig}$ and
$v(q^E_{ig}) \equiv q^E_{ig} + b_i p^E_g - \alpha^E_{ig}$. Define the thresholds of worth buying as $\bar{v}^P_{ig} \equiv b_i p^P_g - \alpha^P_{ig}$ and
$\bar{v}^E_{ig} \equiv b_i p^E_g - \alpha^E_{ig}$. Equation (2) implies that for Kindle owners, the error terms $v^P_{ig} = a^P_{ig} - \alpha^P_{ig}$ and
$v^E_{ig} = a^E_{ig} - \alpha^E_{ig}$ satisfy

$$ \begin{cases}
v^P_{ig} < \bar{v}^P_{ig}, \;\; v^E_{ig} < \bar{v}^E_{ig} & \text{if } q^P_{ig} = 0, \; q^E_{ig} = 0 \\[4pt]
v^P_{ig} = v(q^P_{ig}) > \max\left\{ \bar{v}^P_{ig}, \; v^E_{ig} + \bar{v}^P_{ig} - \bar{v}^E_{ig} \right\} & \text{if } q^P_{ig} > 0, \; q^E_{ig} = 0 \\[4pt]
v^E_{ig} = v(q^E_{ig}) > \max\left\{ \bar{v}^E_{ig}, \; v^P_{ig} - \bar{v}^P_{ig} + \bar{v}^E_{ig} \right\} & \text{if } q^P_{ig} = 0, \; q^E_{ig} > 0
\end{cases} \tag{15} $$

Intuitively, people choose paperbacks because a paperback is worth buying in itself ($v^P_{ig} > \bar{v}^P_{ig}$) and it is better than
the e-book ($v^P_{ig} > v^E_{ig} + \bar{v}^P_{ig} - \bar{v}^E_{ig}$), and vice versa. Cannibalization happens if the paperback is worth buying in itself
but is less attractive than the e-book. Market expansion happens if the paperback is not worth buying anyway while the e-book is.
Equation (4) implies that for consumers without a Kindle, the demand error $v^P_{ig} = a^P_{ig} - \alpha^P_{ig}$ satisfies

$$ \begin{cases}
v^P_{ig} = v(q^P_{ig}) > \bar{v}^P_{ig} & \text{if } q^P_{ig} > 0 \\[4pt]
v^P_{ig} \le \bar{v}^P_{ig} & \text{if } q^P_{ig} = 0
\end{cases} \tag{16} $$

Assume the error terms are independently normally distributed with mean zero and variance $\sigma^2$. I write out
the likelihood function $L^{book}$ based on conditions (15) and (16). The details are in Appendix A1.
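The verbal definitions of cannibalization and market expansion given above translate directly into a classification rule for a Kindle owner's choice in one category. The sketch below only restates those definitions in code; the function name and the string labels are hypothetical, and the numbers in the example are invented.

```python
def classify_ebook_purchase(a_p, a_e, b, p_p, p_e):
    """Label a Kindle owner's choice in one category.

    a_p, a_e: realized taste parameters (deterministic part plus error) per format
    b:        price coefficient
    p_p, p_e: paperback / e-book prices
    """
    net_p = a_p - b * p_p    # positive when the paperback is worth buying in itself
    net_e = a_e - b * p_e    # positive when the e-book is worth buying in itself

    if net_e <= 0 or net_e <= net_p:
        return "no e-book purchase"
    # The e-book is bought. It displaces a paperback only if the paperback
    # would have been bought in the absence of the e-book.
    return "cannibalization" if net_p > 0 else "market expansion"


# Examples with made-up values.
print(classify_ebook_purchase(5.0, 6.0, 0.5, 8.0, 5.0))   # paperback also worth buying
print(classify_ebook_purchase(3.0, 6.0, 0.5, 8.0, 5.0))   # paperback not worth buying
```

Aggregating these labels over simulated draws of the error terms is one way to read the cannibalization and market expansion shares reported in the counterfactuals, although the paper's exact simulation procedure is not shown in this section.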
5 Identification and Estimation
5.1 Identification
There are 23 parameters to estimate:
(1) book side parameters: $\{\rho^{small}_g\}_{g=1,2,3}$, $\{\rho^{large}_g\}_{g=1,2,3}$, $\gamma_D$, $\{\theta^E_g\}_{g=1,2,3}$, $\beta_{age}$, $\beta_n$, $b$, $\beta_{income}$, $\sigma$, $m_1$, $m_2$, $m_3$;
(2) device side parameters: $\alpha^{HW}$, $\beta^{HW}_{income}$, $\sigma^f$, $t_0$, $t_1$.
For the book side parameters, I adopt a finite mixture setup to capture the unobserved heterogeneity in the
category-specific tastes. The number of books bought in each category identifies these parameters. The dispersion
in the book purchase pattern across categories helps identify the population mass of the unobserved types. The
variation in the number of books bought across age groups identifies $\gamma_D$. In terms of the taste component specific
to the e-book format, the e-format utility for each category $\theta^E_g$ is pinned down by the number of e-books bought.
The changing substitution pattern of the two formats over time identifies the coefficient for e-book availability $\beta_n$,
as the number of e-books available is the only time-varying variable in the taste specification. The variation in the
number of e-books bought across age groups identifies $\beta_{age}$. All the remaining variation in purchase pattern is
explained by the variance of the error term $\sigma^2$. Finally, recall that the demand function for each category is of the
form $a_{ig} - b_i p_g$, so people respond to a price drop of books in each category-format. $\beta_{income}$ is pinned down by the
variation of price elasticity across income groups.
For the device side parameters, different types of consumers have different flow utility from books $f$. Given
the same price and quality of Kindle each period, the numbers of Kindles bought by different types of consumers
identify $\sigma^f$ and $\beta^{HW}_{income}$. For the same type of consumer, the adoption probability in response to time-varying prices
pins down the price coefficient $\alpha^{HW}$. The qualities captured by the log time trend are identified from two sources:
(1) cross-sectionally, the different adoption/upgrading probabilities for each type of consumers with different Kindle
stock; (2) intertemporally, the adoption/upgrading probabilities for each type of consumers.
Several parameters are identified from both the device and the book data. The observed Kindle sales not
only identify the device side parameters, but also help identify book side parameters through the heterogeneity of
consumers. As I observe the income and age of each Kindle buyer, I can link it to the book reading utility $f$ because
different income and age groups differ in their book side parameters. So the income-specific price coefficients and
age-specific taste coefficients are identified both by the observed number of books bought across income-age groups,
and the different Kindle adoption patterns across income-age groups.
The identification of upgrading comes from the observed e-book purchase and Kindle sales. In a world with
no upgrading, Kindle sales and penetration rate are one-for-one. Given the taste for e-books, an extra Kindle
sold directly leads to extra units of e-books sold. Upgrading is identified if Kindle sales in later years do not lead
to a proportional increase in e-book sales. Furthermore, the demographics of Kindle buyers over time offer clues
about returning consumers. The early adopters of Kindle in the data are more likely to have higher income and
age. Without upgrading, the change in the demographic composition should be monotonic as the consumer pool is
exhausted. Upgraders can be identified if the income and age of Kindle adopters in later years look similar to those in the
early years.
5.2 Computation
I use a nested fixed point (NFXP) algorithm. The book purchase side is a static model where people make purchase and format
choices. The device adoption side is a dynamic model and I need to solve the Bellman equation. For each iteration,
I calculate the expected value functions in the inner loop and use MLE in the outer loop.
The estimation proceeds as follows:
(i) Start with the book purchase side. Given a set of parameter guesses and the observed book purchase quantity
and format, calculate the book purchase probabilities and the flow utilities associated with the current device
adoption status $\bar{\xi}$, the offered status $\xi$ and the next period status $\xi'$.
(ii) Feed the flow utilities to the device adoption side and solve the dynamic programming problem. The expected
value function is a fixed point of the equation, solved by value function iteration. The tastes are heterogeneous and
type-specific, so the flow utilities and expected value functions are calculated separately by type every period.
(iii) Given the expected value function, calculate the conditional device adoption probability.
(iv) Combine the probabilities from book purchase and device adoption to form the total likelihood. Search over
the parameter space using the simplex algorithm in Matlab to get the maximizer of the likelihood function.
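The nesting in steps (ii) to (iv) can be illustrated with a deliberately small toy problem in Python. This is a sketch under strong simplifying assumptions, not the estimation routine: there is a single consumer type, the flow utilities `xi_bar` and `xi` are taken as given rather than computed from the book side, prices are constant forever as in Equation (14), only the device price coefficient is estimated, and a one-dimensional golden-section search stands in for the simplex search. All function names and numerical values are hypothetical.

```python
import math


def buy_prob(alpha_hw, xi_bar, xi, price, delta=0.9, tol=1e-12, max_iter=10000):
    """Inner loop: solve the stationary fixed point, then return Pr(D = 1)."""
    ev_new = (xi + math.log1p(math.exp(-alpha_hw * price))) / (1.0 - delta)
    buy = xi - alpha_hw * price + delta * ev_new
    ev = 0.0
    for _ in range(max_iter):
        wait = xi_bar + delta * ev
        m = max(wait, buy)
        ev_next = m + math.log(math.exp(wait - m) + math.exp(buy - m))
        if abs(ev_next - ev) < tol:
            break
        ev = ev_next
    return 1.0 / (1.0 + math.exp(xi_bar + delta * ev_next - buy))


def log_likelihood(alpha_hw, data, xi_bar, xi):
    """Binomial log likelihood; data is a list of (price, n_buyers, n_consumers)."""
    total = 0.0
    for price, n_buy, n_all in data:
        p = buy_prob(alpha_hw, xi_bar, xi, price)   # inner loop is re-solved at every guess
        total += n_buy * math.log(p) + (n_all - n_buy) * math.log(1.0 - p)
    return total


def estimate_alpha(data, xi_bar, xi, lo=0.001, hi=0.1, iters=60):
    """Outer loop: golden-section search for the likelihood-maximizing price coefficient."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if log_likelihood(c, data, xi_bar, xi) > log_likelihood(d, data, xi_bar, xi):
            hi = d      # the maximizer lies in [lo, d]
        else:
            lo = c      # the maximizer lies in [c, hi]
    return 0.5 * (lo + hi)


# Toy check: generate expected purchase counts at a known coefficient and recover it.
true_alpha = 0.02
data = [(p, 1000.0 * buy_prob(true_alpha, 1.0, 1.5, p), 1000.0) for p in (139.0, 259.0, 399.0)]
print(estimate_alpha(data, 1.0, 1.5))
```

The point of the sketch is the structure: every evaluation of the likelihood in the outer search triggers a fresh solution of the fixed point in the inner loop, which is what makes the full multi-type, multi-parameter version computationally demanding.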
I discretize the state space $\xi$ using 31 grid points over its range. Values off the grid points are interpolated
using a cubic spline. Convergence of the fixed point is evaluated at the grid points. To get an initial guess of
the parameter values, I first maximize the likelihood of book purchase conditional on the observed e-reader adoption
status. This step does not require solving the dynamic programming problem because the book side is static. It
yields consistent, but potentially inefficient, estimates of the book purchase side parameters. In the second step,
using the estimates from the first step, I maximize the device side likelihood holding the book side parameters
fixed. The two sets of parameter values are combined to serve as a hot start for the full MLE.
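The nested structure described above can be illustrated with a deliberately small sketch. It is not the paper's model: it assumes a single consumer type, a stationary device price, one unknown parameter (the price coefficient `alpha`), and hypothetical values for the discount factor and the flow benefit of ownership. The inner loop solves the non-owner's value function by iteration; the outer loop maximizes a binomial likelihood, with a golden-section search standing in for the simplex search used in the paper.

```python
import math

DELTA = 0.9      # per-period discount factor (assumed for illustration)
EULER = 0.5772   # mean of the type-1 extreme value shock
BENEFIT = 1.0    # per-period flow utility of owning the device (hypothetical)


def choice_values(alpha, price, tol=1e-11, max_iter=100_000):
    """Inner loop: value function iteration for a non-owner facing a fixed price."""
    ev_owner = (BENEFIT + EULER) / (1.0 - DELTA)      # ownership is absorbing
    v_adopt = BENEFIT - alpha * price + DELTA * ev_owner
    ev_wait = 0.0
    for _ in range(max_iter):
        v_wait = DELTA * ev_wait
        top = max(v_wait, v_adopt)                    # stabilise the log-sum-exp
        updated = EULER + top + math.log(
            math.exp(v_wait - top) + math.exp(v_adopt - top))
        if abs(updated - ev_wait) < tol:
            return DELTA * updated, v_adopt
        ev_wait = updated
    raise RuntimeError("value function iteration did not converge")


def adoption_probability(alpha, price):
    """Logit probability that a non-owner adopts this period."""
    v_wait, v_adopt = choice_values(alpha, price)
    return 1.0 / (1.0 + math.exp(v_wait - v_adopt))


def log_likelihood(alpha, data):
    """data: iterable of (price, number of non-owners, number of adopters)."""
    total = 0.0
    for price, n, k in data:
        p = adoption_probability(alpha, price)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def estimate_alpha(data, lo=0.0, hi=5.0, tol=1e-7):
    """Outer loop: golden-section search for the likelihood maximiser."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = log_likelihood(c, data), log_likelihood(d, data)
    while b - a > tol:
        if fc > fd:                       # maximiser lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = log_likelihood(c, data)
        else:                             # maximiser lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = log_likelihood(d, data)
    return 0.5 * (a + b)


if __name__ == "__main__":
    # Synthetic data built from expected adoption counts at alpha = 0.5.
    data = [(p, 1000.0, 1000.0 * adoption_probability(0.5, p)) for p in (1.0, 2.0, 3.0)]
    print(estimate_alpha(data))
```

Every trial value of the parameter triggers a fresh solution of the fixed point, which is why a good starting value (the hot start described above) matters for computation time in the full model.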
6 Estimation Results and Model Fit
6.1 Demand Estimates
Table 7 displays the MLE estimates of the coefficients. All of them have the expected signs and are significant.
The first observation is that consumers are highly heterogeneous in their genre-specific reading taste and price
elasticity. Type 1 consumers with $\{\rho_1^{small}, \rho_2^{large}, \rho_3^{small}\}$, who have higher reading taste for "casual" books, are
6.35% of the population, and type 2 consumers with $\{\rho_1^{large}, \rho_2^{small}, \rho_3^{large}\}$, who prefer lifestyle and practical
reading, are 5.03% of the population. Only 2.33% of the population are type 3 with $\{\rho_1^{large}, \rho_2^{large}, \rho_3^{large}\}$, who
have high reading taste on all genres, and the remaining 86.29% are type 4, who have low reading taste on all genres.
Results in the next section show that the first three types have much in common in terms of demand elasticities,
cannibalization and market expansion effects. To make the discussion easier, I call types 1, 2, and 3, who have
high taste on at least one genre, avid readers and type 4 general readers. In terms of the price coefficient,
households with higher income have a lower price coefficient for both device and books. The model allows consumers
to self-select into buying Kindle based on their heterogeneous reading taste and price elasticity.
The second observation is that for all types of consumers, there are different category-specific e-format utilities.
In particular, people gain utility from buying casual reading books such as fiction and memoirs in e-book format, whereas
they incur disutility from buying practical books such as textbooks in e-book format. This is intuitive: the convenience of e-reading is best
reflected in casual reading. For books with many pictures, like cookbooks in the lifestyle category, and
books that require in-depth reading, like textbooks in the practical category, e-books may not be a good option. This
finding has managerial implications for e-book pricing in that publishers can set e-book prices by genre to maximize
total book revenue. Another result is that people do value the increasing availability of e-books. The coefficient
on the log number of e-books available in the Kindle Store is significantly positive. As more and more e-books become
available, people are more likely to benefit from e-reading, adopt Kindle, and buy e-books. This is in line with
the industry observation that e-book sales grow at triple-digit rates as e-books become more available. Finally, the
coefficient on household age is significantly positive, indicating that older people read more books. The interaction
of age with the e-format indicates that seniors dislike e-reading.
One result is noteworthy. There is considerable consumer heterogeneity in genre-specific reading taste and price
coefficient, which is informative for the supply side. Avid readers are especially important for e-book sales because
they respond more to Kindle and e-book price changes. Higher income people have a lower price coefficient and are
more likely to adopt Kindle. The number of avid readers and high income consumers who have not bought any
device shrinks over time. This evolution of the demand composition will affect the optimal pricing strategy. It will also
play a crucial role in the magnitude of the cannibalization and market expansion effects, which in turn affects total book
revenue for both publishers and Amazon.
6.2 Model Fit
The fit of the aggregate predictions is essential for the counterfactual analysis on both the cannibalization
Table 7: Parameter Estimates of the Demand System
Variable
Device
Book
Notes :
αHW price coefficient
βincome household income
σf flow utility coefficient
ρsmall
genre-specific reading
2
ρlarge
2
ρsmall
1
ρsmall
3
ρlarge
1
ρlarge
3
γD household age
m1
m2
m3
taste
finite mixture mass
Estimate
s.e.
0.0042**
5.4801e-4
-0.0002**
4.3322e-5
4.0748**
0.5004
2.5518**
1.3850e-4
8.0934**
0.1791
2.8297**
0.0032
4.1801**
0.0161
8.1492**
0.0221
8.7160**
0.0174
0.0100**
1.4668e-4
0.0635**
0.0124
0.0503**
0.0123
0.0233*
0.0123
θ1E e-format utility: lifestyle
θ2E e-format utility: casual
θ3E e-format utility: practical
βn log # of available books
βage household age
-3.0421**
0.0607
0.0272**
2.5951e-4
-0.0043**
2.5167e-4
b price coefficient
βincome household income
σ s.d. of normal distribution
0.2412**
2.2655e-4
-0.0050**
2.282e-5
1.4112**
1.4252e-4
-0.5936**
0.0052
1.5242**
2.4339e-4
Log Likelihood
-113,041
# Book Obs.
124,536
# Device Obs.
15,567
Details of the model are given in the text. Household age takes value 1 if the age range is below 35, value 2 if
the age range is 35-50, and value 3 if the age range is above 50. Household income takes value 1 if the annual income
is below $35,000, value 2 if the income range is $35,000-$75,000, and value 3 if the income is above $75,000. The
two price coefficients are positive because they enter the model negatively. Consumers are assumed to have perfect foresight
on both Kindle price and the quality dummies.
** Significant at the 1 percent level, * significant at the 5 percent level
and market expansion effects, and the optimal Kindle and e-book pricing. I check the model predictions for Kindle
sales and the book sales in each category-format. I simulate the model 100 times and calculate the average value of
each measure. Table 8 shows the model fit of the aggregate device sales and book purchases over time. For device
adoption, the model fits both the level and the trend over time, as consumers respond to time-varying
Kindle price, e-book price, e-book availability, and the quality of Kindle. For the aggregate quantity of books
purchased in each category-format, the model prediction is on average within 5% of the observed value.
The heterogeneity in consumer reading taste and the category-specific e-format utility help pin down the market
size of each category and the substitution between paperback and e-book within category.
Table 8: Model Fit: Device Adoption and Book Quantity/Format Choice

# Kindles Sold
Year         2008   2009   2010   2011   2012
Observed       33     76    174    263    249
Predicted      25     87    197    272    316

# Books Sold, 2008-2010
                         Lifestyle   Casual   Practical
Year 2008   Observed          4828     6204        3151
            Predicted         4556     5743        3327
Year 2009   Observed          4568     5963        3239
            Predicted         4609     5690        3327
Year 2010   Observed          4372     5440        3285
            Predicted         4270     5333        3234

# Books Sold, 2011-2012
                         Lifestyle             Casual                Practical
                         Paperback   E-book    Paperback   E-book    Paperback   E-book
Year 2011   Observed          4226      169         5406      858         3433       56
            Predicted         4583      262         5229      926         3060       75
Year 2012   Observed          3834      740         4649     2608         3351      231
            Predicted         4534      745         4969     2342         3413      206
6.3 Price Elasticities
For the purpose of computing the benefits and costs of intertemporal price discrimination, the key is the
heterogeneity in price sensitivity and demand elasticity.
I first calculate the own- and cross-elasticity of demand for e-books. Within each category, own-elasticity is
the quantity response of a format to a change in its own price, and cross-elasticity is the quantity response of the other
format to a price change of the focal format. Given the current paperback and e-book prices, I perturb the
e-book price uniformly across categories by 1% and calculate the corresponding percentage change in total e-book and paperback
sales. I calculate the own- and cross-elasticities for all the books within each category. The overall
elasticity is the sales-weighted average of the category-specific elasticities. The results are summarized
in Table 9. The numbers in parentheses indicate the 95% confidence interval.
The own-elasticity of demand for e-books is -1.23 and the cross-elasticity of demand is 1.07. This means that if
the e-book price increases by 1%, e-book sales will decrease by 1.23% and paperback sales will increase by 1.07%. The
own- and cross-elasticities I get are comparable in magnitude to the extant literature (e.g. Hu and Smith 2011).
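The elasticity calculation described above amounts to simple arithmetic, sketched below. The function names are hypothetical. The category own-elasticities are the values reported in Table 9; using the default simulated e-book totals from Table 11 as the sales weights is an assumption made here for illustration, since the text does not state which sales figures serve as weights. Under that assumption the weighted average comes out close to the reported overall own-elasticity.

```python
def elasticity(q_base, q_perturbed, price_change_pct=1.0):
    """Percentage change in quantity per one percent change in price."""
    return 100.0 * (q_perturbed - q_base) / q_base / price_change_pct


def sales_weighted_average(elasticities, sales):
    """Aggregate category-level elasticities using sales as weights."""
    return sum(e * s for e, s in zip(elasticities, sales)) / sum(sales)


# Own-elasticities by category (lifestyle, casual, practical) from Table 9.
own_by_category = [-1.38, -0.93, -4.06]
# Assumed weights: default e-book totals by category from Table 11.
ebook_sales = [608, 2097, 194]

overall_own = sales_weighted_average(own_by_category, ebook_sales)
```

The casual category dominates the weights, which is why the overall figure sits much closer to the casual elasticity than to the practical one.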
Table 9: Price Elasticities of E-books and Kindle

E-book
              Own-elasticity (CI)      Cross-elasticity (CI)
Overall       -1.23 (-1.38, -1.07)     1.07 (0.77, 1.36)
Lifestyle     -1.38 (-1.68, -1.08)     0.72 (0.15, 1.30)
Casual        -0.93 (-1.11, -0.75)     1.91 (1.26, 2.55)
Practical     -4.06 (-4.45, -3.68)     1.03 (0.70, 1.37)

Kindle
              Low income                              Middle income                   High income
Type          1         2         3         4         1       2       3       4       1       2       3       4
Young         -1.1437   -1.1926   -1.0845   -1.2651   -1.08   -1.12   -1.02   -1.19   -1.01   -1.05   -0.96   -1.11
Middle age    -1.1439   -1.1925   -1.0845   -1.2652   -1.08   -1.12   -1.02   -1.19   -1.01   -1.05   -0.96   -1.11
Senior        -1.1439   -1.1925   -1.0846   -1.2650   -1.08   -1.12   -1.02   -1.19   -1.01   -1.05   -0.96   -1.11
Overall       -1.16

Notes: 21.30%, 35.85%, and 42.85% of the population are in the young, middle-age, and senior groups respectively.
30.21%, 34.15%, and 35.64% of the population are in the low, middle, and high income groups respectively. Type 4 is the general reader and
types 1, 2, and 3 are avid readers.
Interestingly, the own-elasticity for casual e-books is the lowest in magnitude and the cross-elasticity for casual e-books is the
highest. This suggests that consumers prefer casual e-books and consider them a stronger substitute for print
books. This is consistent with the intuition that the convenience of e-reading is best reflected in casual reading.
When Amazon or publishers set e-book prices, they may benefit from setting them by book category, as consumers
respond differently to price changes across genres.
Table 9 also presents the price elasticity of Kindle across consumer types. Consumers with higher income have
lower price elasticities. Avid readers are less price sensitive than general readers. The effect of age on price elasticity
is mixed, as seniors enjoy reading more in general but dislike e-reading. In terms of how heterogeneous
consumers are across types: (1) the heterogeneity in the unobserved types contributes the most to the heterogeneity
in the Kindle price elasticity, with avid readers on average 10% less price sensitive than general readers; (2)
the higher income group is on average 5% less price elastic than the lower income group; (3) age differences
change the price elasticity only very mildly.
The difference in price elasticity affects the probability and order of Kindle adoption. Comparing across
time and consumers, avid readers have a higher probability of adopting Kindle and adopt at an earlier time. Consumers
are more likely to buy as the Kindle price drops and quality increases over time. In terms of upgrading, the probability
of upgrading is higher for consumers who own an earlier generation of Kindle. People are more likely to upgrade
over time as Kindle quality increases.
6.4 Cannibalization and Market Expansion
One of the research questions that motivates this paper is whether the introduction of e-books has a cannibalization
or market expansion effect on print book sales. Given my demand estimates, I simulate the market without the e-book
option. The difference between the simulated total paperback sales and the observed one is the cannibalization
effect of e-books on paperbacks. The market expansion effect explains the rest of the observed e-book sales. In the
data, e-books contribute 15.8% of total book sales. Counterfactual results show that 28.3% of e-book sales
come from cannibalizing paperbacks while 71.7% come from market expansion. This means that publishers' worry
about cannibalization does make sense. But there is also a positive side of the story, as more books are sold in the
presence of the e-book format. The industry also finds the market expansion effect of e-books to be prominent. Jeff
Bezos, the CEO of Amazon, said in an interview with the BBC [24]: "What we find is that when people buy a Kindle
they read four times as much as they did before they bought the Kindle. But they don't stop buying paper books.
Kindle owners read four times as much, but they continue to buy both types of books."

[24] http://www.forbes.com/sites/kellyclay/2012/10/12/amazon-confirms-it-makes-no-profit-on-kindles/

Table 10: Cannibalization: Price and Quality Decomposition
                      cannibalization (%)   total e-book (#)   cannibalization (#)   market expansion (#)   # Kindle owner
default                             28.30               2899                   820                   2079              897
only price diff                     27.38               2969                   813                   2156              879
only quality diff                   35.65               1353                   482                    871              796
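The decomposition defined above is an accounting identity, and a small sketch makes the arithmetic explicit. The function name and the two paperback sales levels are hypothetical; only their difference (820) and the e-book total (2899) are taken from the default row of Table 10.

```python
def decompose_ebook_sales(ebook_sales, paperback_without_ebooks, paperback_with_ebooks):
    """Split e-book sales into cannibalized paperbacks and market expansion."""
    cannibalized = paperback_without_ebooks - paperback_with_ebooks
    expansion = ebook_sales - cannibalized
    return {
        "cannibalization": cannibalized,
        "market_expansion": expansion,
        "cannibalization_rate": cannibalized / ebook_sales,
        "expansion_rate": expansion / ebook_sales,
    }


# Hypothetical paperback levels whose difference matches Table 10 (820 units).
result = decompose_ebook_sales(ebook_sales=2899,
                               paperback_without_ebooks=25000,
                               paperback_with_ebooks=24180)
```

With these inputs the two shares are roughly 28% and 72%, in line with the default scenario reported in the text.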
Analysis by Consumer Type and Genre
Analyzing the cannibalization and market expansion effects by consumer type and genre reveals further information
behind the aggregate figure. The cannibalization rates for types 1, 2, and 3 are 54%, 45% and 58% respectively.
While avid readers - types 1, 2, and 3 - have on average a 52% cannibalization rate, general readers have only an 8%
cannibalization rate. This is because cannibalization is defined as the percentage of e-books that would have been
bought in print format. As avid readers are more likely to buy paperbacks when e-books do not exist, it is
more likely that the e-books they buy cannibalize paperbacks. For general readers, if they ever buy any
e-books, it is much more likely to be a market expansion effect.
For different book categories, casual books have the lowest cannibalization rate of 25%, while the number is
35% for lifestyle and 38% for practical. So casual books such as fiction have a stronger market expansion
effect and a weaker cannibalization effect on print books because of the positive e-format utility.
The reasons for substitution between e-book and paperback are (1) book prices are different; (2) there is extra
e-format utility, or a quality difference. It is interesting to decompose the cannibalization and market expansion effects
into the price and quality impacts. I conduct another counterfactual under three scenarios: the default
scenario with both price and quality differences between formats, the case where the quality difference is dropped,
and the case where the price difference is dropped. Table 10 displays the overall comparison. If
there is only a price difference, the cannibalization rate drops mildly and the total number of e-books sold increases
mildly. Changes are much larger when there is only a quality difference: the cannibalization rate increases from
28% to 36% while the total number of e-books sold drops by more than half. The conclusion is that the price
advantage of the e-book format plays a more important role in determining the cannibalization rate and substitution
pattern.
Table 11 shows the decomposition of the price and quality impacts by genre. In the absence of the price advantage, the
number of e-books sold is only 37% of the default level for lifestyle and 7% for practical books, indicating that the price advantage
is particularly important for those two categories since they have negative e-format quality. Without the quality
disadvantage/advantage, practical books would sell 6.7 times as many copies while casual books would sell only 40% of their
default volume. So the quality difference is still important for genres with significant e-format quality. Aggregating over
the genres makes this less obvious. Finally, the price difference leads to a larger difference in the cannibalization rate than
quality does for lifestyle, and the opposite is true for practical. In general, price and quality have significantly
different impacts on different genres.
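The genre-level ratios quoted above follow directly from the e-book totals in Table 11. The short check below reproduces them; the dictionary layout and function name are hypothetical, while the numbers are the "total e-books (#)" entries of Table 11.

```python
# Total e-books sold by genre and scenario, taken from Table 11.
TABLE_11_TOTALS = {
    "lifestyle": {"default": 608, "only_price_diff": 830, "only_quality_diff": 227},
    "casual": {"default": 2097, "only_price_diff": 844, "only_quality_diff": 1113},
    "practical": {"default": 194, "only_price_diff": 1295, "only_quality_diff": 14},
}


def relative_sales(genre, scenario):
    """E-book sales in a scenario relative to the default scenario."""
    totals = TABLE_11_TOTALS[genre]
    return totals[scenario] / totals["default"]


# No price advantage (only the quality difference remains).
lifestyle_no_price = relative_sales("lifestyle", "only_quality_diff")
practical_no_price = relative_sales("practical", "only_quality_diff")
# No quality difference (only the price difference remains).
practical_no_quality = relative_sales("practical", "only_price_diff")
casual_no_quality = relative_sales("casual", "only_price_diff")
```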
Comparative Analysis: Vary E-book and Kindle Price
Cannibalization and market expansion effects are affected by the device and e-book prices. If the e-book price
rises, demand for e-books drops and the book purchase behavior of a Kindle owner converges to that of a non-owner.
Holding Kindle adoption and the paperback price ($18.20) fixed at the observed level, comparative statics analysis
shows that cannibalization and market expansion effects shrink as e-books become more expensive.
Table 11: Cannibalization: By Genre
Genre       Scenario            cannibalization (%)   total e-books (#)   cannibalization (#)   market expansion (#)
lifestyle   default                           35.26                 608                   214                    393
            only price diff                   34.92                 830                   290                    540
            only quality diff                 47.65                 227                   108                    119
casual      default                           25.40                2097                   533                   1565
            only price diff                   35.01                 844                   295                    548
            only quality diff                 32.95                1113                   367                    746
practical   default                           37.88                 194                    74                    121
            only price diff                   17.57                1295                   227                   1067
            only quality diff                 55.55                  14                     8                      6

It is noteworthy that the magnitude of the cannibalization and market expansion effects is also affected by the Kindle
price. Consumers who do not have a Kindle cannot buy e-books, so there is no cannibalization or market expansion for
them. We can expect that a higher rate of device adoption enhances both the cannibalization and market expansion
effects.
During years 2010-2012, platforms such as Amazon.com did not have the pricing right over e-books and could
only control device prices. It seems that publishers had full control over e-book pricing, so they could fully limit
the negative effect from cannibalization. However, variation in the Kindle price itself can generate different degrees
of cannibalization and market expansion effects, which is out of the publishers' control. I run a counterfactual by
changing the Kindle price from 50% to 150% of the observed prices every year and compare the magnitude of the
effects. The e-book price is fixed at the observed value. As the Kindle price drops, more consumers adopt Kindle and
the total cannibalization and market expansion effects become stronger. An interesting observation is that the per
owner effects increase as Kindle becomes more expensive, which at first sight looks like an upward-sloping demand
curve for e-books. This is because Kindle adoption is a self-selection process and consumers who adopt Kindle at
a higher price are those who buy more books and have higher cannibalization and market expansion effects. The
model is able to capture this self-selection.
So far, I have discussed the implications of the demand side estimates for price elasticity, cannibalization and market
expansion effects, and comparative statics on how varying the Kindle and e-book prices exogenously changes the
results. Based on the estimated demand responses to price changes, Amazon can solve its optimal pricing
problem. In the next section, I take the estimated demand system and model the supply side of the market.
7 Intertemporal Price Discrimination Analysis: Implications for Pricing
and Welfare
As the demand parameters are estimated without imposing supply-side restrictions, it is possible to compute the
supply side optimal strategies and provide managerial implications. Benkard (2004), Dubé, Hitsch, and Manchanda
(2005), Nair (2007), and Liu (2010) take a similar approach. Given the estimated demand system [25], I solve for
Amazon's optimal Kindle and e-book pricing problems with both the firm and consumers forward-looking. The impact
of intertemporal price discrimination on profitability and welfare is discussed. In particular, how does the existence
of the complementary good affect the IPD on the primary good? Is it always better for the firm to practice IPD
on the complementary good in addition to IPD on the primary good?
There has been a long dispute over the e-book pricing right in the publishing industry. Amazon initially signed
a wholesale contract with the publishers and conducted IPD on both Kindle and e-books. The contract scheme
switched to the agency model after Apple came into play, and platforms including Amazon lost the pricing right on
e-books. The analysis in this section helps understand the impact of e-books on the IPD of Kindle, as well as the
profitability and welfare implications. The two scenarios - when Amazon can practice IPD on both Kindle and
e-books, and when Amazon can only practice IPD on Kindle - are comparable to the wholesale contract and the agency contract.
Comparison between the two scenarios illustrates whether the e-book pricing right is always helpful for the profitability of
the firm. It also sheds light on which contract scheme can be more welfare-improving.

[25] The demand side components used in the supply side are: (a) the number of paperbacks bought by a Kindle non-owner,
$q^{P0}$; (b) the number of paperbacks/e-books bought by a Kindle owner, $q^{P1}(p^E)$ and $q^{E}(p^E)$; (c) the flow utility from
book purchase for a Kindle non-owner, $f_0$, and owner, $f_1(p^E)$. These are all functions of the e-book price (except for $q^{P0}$
and $f_0$), and are not functions of the Kindle price.
To get the demand response to an e-book price change, I simulate the demand system for 50 grid points of the e-book price. This price range
covers up to 5 times the current e-book price. I save the corresponding (a)-(c) variables and use a spline to numerically approximate them
as functions of the e-book price. They are the demand functions for each book format. I substitute them into the supply side when
solving for the optimal e-book and Kindle prices. The cannibalization and market expansion effects between e-books and print books
affect the demand for e-books and the profit from selling the complementary good in a pre-determined way, so they will not change the IPD
implications discussed below.
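The grid-and-spline approximation of the demand functions described in footnote 25 can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: the function `simulated_ebook_demand` is a hypothetical closed-form stand-in for the simulated demand system, the current e-book price of 10 is hypothetical, and a natural cubic spline is used because the paper does not specify the spline's boundary conditions.

```python
import bisect
import math


def fit_natural_cubic_spline(xs, ys):
    """Second derivatives of the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm: forward elimination on the tridiagonal system.
    for k in range(1, n - 1):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    second = [0.0] * (n + 1)   # natural boundary: zero curvature at both ends
    for k in range(n - 2, -1, -1):
        second[k + 1] = (rhs[k] - h[k + 1] * second[k + 2]) / diag[k]
    return second


def evaluate_spline(xs, ys, second, x):
    """Evaluate the fitted spline at a price x inside the grid range."""
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    h = xs[i + 1] - xs[i]
    left, right = xs[i + 1] - x, x - xs[i]
    return ((second[i] * left ** 3 + second[i + 1] * right ** 3) / (6.0 * h)
            + (ys[i] / h - second[i] * h / 6.0) * left
            + (ys[i + 1] / h - second[i + 1] * h / 6.0) * right)


def simulated_ebook_demand(price):
    """Hypothetical stand-in for the simulated e-book demand of a Kindle owner."""
    return 10.0 * math.exp(-0.15 * price)


CURRENT_PRICE = 10.0   # hypothetical current e-book price
# 50 grid points covering up to 5 times the current price.
grid = [CURRENT_PRICE * 5.0 * (i + 1) / 50.0 for i in range(50)]
demand_on_grid = [simulated_ebook_demand(p) for p in grid]
curvature = fit_natural_cubic_spline(grid, demand_on_grid)


def q_e(price):
    """Spline approximation of e-book demand as a function of the e-book price."""
    return evaluate_spline(grid, demand_on_grid, curvature, price)
```

Once fitted, `q_e` can be evaluated at any candidate price the optimizer tries, so the demand system does not need to be re-simulated inside the supply side problem.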
For my purpose and scope, I will not attempt an exhaustive investigation of the conditions under which IPD is
optimal. Instead, I focus on the empirical results in the e-book industry. As IPD strategies are often intertwined with
quality improvements and cost reductions (arising from economies of scale, learning-by-doing effects, etc.), I abstract
from the quality improvement and cost decline of Kindle in reality to analyze the IPD incentives and impacts. The
e-book price solved for is not category-specific. I assume that the firm changes e-book prices uniformly across genres,
so the price can be interpreted as a sales-weighted average price. I make one simplification of the demand side to
fit it into the supply side. There are 9 observed types (3 income groups times 3 age groups) and 4 unobserved types
of consumers in the demand side model. The most significant heterogeneity is in the unobserved type, as discussed in
the price elasticity analysis. In particular, I combine types 1, 2 and 3, who share similar elasticities of demand, and
call them avid readers. Type 4 is still the general reader. I keep these two unobserved types of consumers and average
across observed types. This greatly reduces the dimension of the state space, from 36 dimensions to 2 dimensions,
while keeping the fundamental heterogeneity in consumer taste. Finally, as in the demand side estimation,
the publishers' role is taken as exogenous. Paperback price, wholesale prices of both formats, and the content provision
of e-books are taken from the observed values. The model neglects the vertical negotiation and interaction between
publishers and the platform, which is interesting but beyond the scope of this paper.
7.1 Model Setup
In this section, I set up the consumer's and the firm's dynamic problems and present the equilibrium definition.
Consumer Problem
The notation from the demand side model carries through to the supply side. Let $q^{P0}$, $q^{E}$, $q^{P1}$ denote
the quantity of paperbacks bought by a Kindle non-owner, and e-books/paperbacks bought by a Kindle owner
respectively. Upgrading is still allowed - existing Kindle owners will have a higher flow utility from book purchase,
but their upgrading decisions will not affect the book revenue generated because the penetration rate of Kindle is
unchanged. To keep the model tractable, I do not distinguish between Kindle ownership status of different Kindle
generations. There are two device ownership statuses - having a Kindle and not having one. Let the Kindle adoption status be
$\iota = 0, 1$, where 0 stands for not having a Kindle and 1 stands for having one. The state space for the consumer problem includes the consumer's
Kindle adoption status, the Kindle price $P$, and the e-book price $p^E$. The Bellman equation for a
consumer who does not have a Kindle at the beginning of the period is
$$EV(0, P, p^E) = \ln\Big( \exp\big(\sigma^f f_{i0} + \delta E[V(0, P', p^{E\prime}, \vec{\varepsilon}) \mid P, p^E, D_i = 0]\big) + \exp\big(\sigma^f f_{i1}(p^E) - \alpha_i^{HW} P + \delta E[V(1, p^{E\prime}, P', \vec{\varepsilon}) \mid p^E, D_i = 1]\big) \Big)$$
I now impose the assumption that consumers have perfect foresight regarding future prices. The firm's state space
contains only the number of consumers who have not bought a Kindle at the beginning of the current period, $\Delta$. It
is 2-dimensional, with the number of avid readers on one dimension and the number of general readers on the other.
The ownership distribution is relevant to the consumer since it affects the firm's current and future prices. I assume
that consumers observe it merely as a convenient way to impose rational expectations of future prices. Rationality
requires consumers to act as if they condition on the ownership distribution since it influences future prices through
the firm's policy function. In equilibrium, the Kindle price is $P = P(\Delta)$ and the e-book price is $p^E = p^E(\Delta)$. The price expectations of
consumers will be consistent with the pricing policy chosen by the firm. That is, consumers will correctly anticipate
that the firm will set the prices $P' = P(\Delta')$ and $p^{E\prime} = p^E(\Delta')$ when facing the future state $\Delta'$. I can re-write the
consumer's problem with $\Delta$ in the state space instead of $P$ and $p^E$.
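$$EV(0, \Delta) = \ln\Big( \exp\big(\sigma^f f_{i0} + \delta E[V(0, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 0]\big) + \exp\big(\sigma^f f_{i1}(p^E(\Delta)) - \alpha_i^{HW} P(\Delta) + \delta E[V(1, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 1]\big) \Big) \qquad (17)$$
Since Kindle adoption is an absorbing state, we have
$$E[V(1, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 1] = E[V(1, \Delta', \vec{\varepsilon}\,') \mid \Delta]$$
and
$$E[V(1, \Delta', \vec{\varepsilon}\,') \mid \Delta] = \sigma^f f_{i1}(p^E(\Delta)) + \delta E[V(1, \Delta', \vec{\varepsilon}\,') \mid \Delta] + E(\varepsilon_{i1}) \qquad (18)$$
Here the mean of the error is the Euler constant, $E(\varepsilon_{i1}) = 0.5772$.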
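The probability of adopting a Kindle given that the consumer has not adopted one at the beginning of the period
is
$$\Pr(D_i = 1 \mid \iota_i = 0, \Delta) = \frac{B}{A + B} \qquad (19)$$
$$A = \exp\big(\sigma^f f_{i0} + \delta E[V(0, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 0]\big)$$
$$B = \exp\big(\sigma^f f_{i1}(p^E(\Delta)) - \alpha^{HW} P(\Delta) + \delta E[V(1, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 1]\big)$$
where $A$ is the exponentiated value of not adopting and $B$ is the exponentiated value of adopting, so the adoption probability is the logit share of $B$.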
Similarly, the probability for upgraders is the logit share of the adoption option,
$$\widetilde{\Pr}(D_i = 1 \mid \iota_i = 0, \Delta) = \frac{B}{A + B} \qquad (20)$$
$$A = \exp\big(\sigma^f \tilde{f}_{i0} + \delta E[\tilde{V}(0, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 0]\big)$$
$$B = \exp\big(\sigma^f f_{i1}(p^E(\Delta)) - \alpha^{HW} P(\Delta) + \delta E[\tilde{V}(1, \Delta', \vec{\varepsilon}\,') \mid \Delta, D_i = 1]\big)$$
where the upgraders have a higher flow utility from book reading, $\tilde{f}_{i0} = \kappa f_{i0}$, than non-owners, $f_{i0}$ [26]. Notice that
although upgraders contribute to Kindle profit, the penetration rate of Kindle does not change. Their behavior does not
affect the state space, which is defined as the number of consumers who have not bought any Kindle.
The state space evolves deterministically as
$$\Delta' = \Delta \, [1 - \Pr(D = 1 \mid \iota = 0, \Delta)] \qquad (21)$$
The aggregate demand for Kindle is $\sum_i \Delta_i \cdot \Pr(D_i = 1 \mid \iota_i = 0, \Delta)$.
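The adoption probability and the law of motion for the state can be sketched in a few lines. The function names and the choice-specific values below are hypothetical; the population shares of avid and general readers (13.71% and 86.29%) are the type shares reported in Section 6.1. The adoption probability is computed as the logit share of the adoption option.

```python
import math


def adoption_probability(v_wait, v_adopt):
    """Logit probability of adopting, given the values of waiting and adopting."""
    return 1.0 / (1.0 + math.exp(v_wait - v_adopt))


def advance_state(non_owners, probabilities):
    """Return current Kindle demand and next-period non-owners by type."""
    demand = sum(d * p for d, p in zip(non_owners, probabilities))
    next_non_owners = [d * (1.0 - p) for d, p in zip(non_owners, probabilities)]
    return demand, next_non_owners


# State: mass of non-owners by type (avid readers, general readers).
state = [0.1371, 0.8629]
# Hypothetical choice-specific values (wait, adopt) for each type.
probs = [adoption_probability(0.0, 0.0), adoption_probability(0.0, -2.0)]
demand, next_state = advance_state(state, probs)
```

Because avid readers adopt with a higher probability, their share among the remaining non-owners shrinks from one period to the next, which is the composition effect the firm's pricing policy has to take into account.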
Firm Problem
The state space for the firm's problem contains the number of consumers of each type who have not bought a Kindle
at the beginning of the current period, $\Delta$. Define the probability of adopting Kindle for a non-owner as
$\mathrm{Pr1}(\Delta) \equiv \Pr(D = 1 \mid \iota = 0, \Delta)$ and the probability for upgraders as
$\widetilde{\mathrm{Pr1}}(\Delta) \equiv \widetilde{\Pr}(D = 1 \mid \iota = 0, \Delta)$. The Bellman equation of
the firm is
$$EW(\Delta) = \max_{P} \; \pi(P, p^E, \Delta) + \delta E[W(\Delta') \mid P, \Delta]$$
where
$$\pi(P, p^E, \Delta) = profitKindle + (revbook1 + revbook0)$$
Here "profitKindle" stands for profit from Kindle sales, "revbook1" stands for book revenue from Kindle owners,
and "revbook0" stands for book revenue from Kindle non-owners. The demand for Kindle equals the current market
size times the probability of adoption, $\Delta \cdot \mathrm{Pr1}(\Delta)$, for first-time buyers and $(1 - \Delta) \cdot \widetilde{\mathrm{Pr1}}(\Delta)$ for upgraders. The
profit from Kindle sales is this demand multiplied by the price margin. The number of Kindle owners equals the
number of owners by the end of last period, $\Delta_0 - \Delta$ ($\Delta_0$ is the initial market size at time 0), plus the new adopters
this period, $\Delta \cdot \mathrm{Pr1}(\Delta)$. The number of Kindle non-owners equals the current market size times the probability of
not adopting Kindle this period, $\Delta [1 - \mathrm{Pr1}(\Delta)]$. Denoting the wholesale prices of paperbacks and e-books by $w^P$
and $w^E$, we have
$$profitKindle = \Big[ \underbrace{\Delta \cdot \mathrm{Pr1}(\Delta)}_{\text{\# new buyers}} + \underbrace{(1 - \Delta) \cdot \widetilde{\mathrm{Pr1}}(\Delta)}_{\text{\# upgraders}} \Big] \cdot [P(\Delta) - c]$$
$$revbook1 = \underbrace{[\Delta_0 - \Delta + \Delta \cdot \mathrm{Pr1}(\Delta)]}_{\text{\# cumulative Kindle owners}} \cdot (revdiff)$$
$$revbook0 = \underbrace{\Delta \cdot [1 - \mathrm{Pr1}(\Delta)]}_{\text{\# cumulative Kindle non-owners}} \cdot (p^P - w^P) \, q^{P0}$$
where
$$revdiff \equiv (p^E - w^E) \, q^{E}(p^E(\Delta)) + (p^P - w^P) \, q^{P1}(p^E(\Delta)) - (p^P - w^P) \, q^{P0}$$
represents the book profit difference generated by a Kindle owner and a Kindle non-owner. The Kindle cost and book wholesale price
information used in the model is imputed from the news press [27].

[26] In the calculation, I set $\kappa = 2.5$. I vary the value and the results are robust to the change.
[27] (1) I assume that in reality, the cost of manufacturing Kindle drops at the same rate as that of computer parts. From
the news, the cost of Kindle 2 is $185 in year 2009 (http://www.isuppli.com/Teardowns/News/Pages/Amazon-Kindle-Fire-Costs$201-70-to-Manufacture.aspx).
The costs for other years are imputed accordingly. I abstract from the cost drop in the calculation
and use the average cost of Kindle over the five years, $149.5, as the cost in the dynamic pricing problem. (2) The average
print book wholesale price is chosen to be $15 and the average e-book wholesale price is $12. According to industry-wide practice
(http://www.salon.com/2013/07/01/everything_you_need_to_know_about_the_great_e_book_price_war/), the publisher sets a
cover price for a book and sells its print version to a retailer at a discount (typically 50 percent). A new book with a hardcover list
price of $29.95 would be given an e-book price of $23.95, 20 percent less, to account for the publisher's savings in printing, binding
and distribution. The publisher would sell that e-book to Amazon for $12, and the print book's wholesale price is $15.

The F.O.C.s with respect to the Kindle price and the e-book price yield
$$\frac{\partial W(\Delta)}{\partial P} = \frac{\partial \pi(P, p^E, \Delta)}{\partial P} + \delta W'(\Delta') \frac{\partial \Delta'}{\partial P} = 0$$
$$\frac{\partial W(\Delta)}{\partial p^E} = \frac{\partial \pi(P, p^E, \Delta)}{\partial p^E} + \delta W'(\Delta') \frac{\partial \Delta'}{\partial p^E} = 0$$
From the first F.O.C., we can get
$$\underbrace{\Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial P} [P(\Delta) - c] + \Delta \cdot \mathrm{Pr1}(\Delta) + (1 - \Delta) \frac{\partial \widetilde{\mathrm{Pr1}}(\Delta)}{\partial P} [P(\Delta) - c] + (1 - \Delta) \cdot \widetilde{\mathrm{Pr1}}(\Delta)}_{\text{static Kindle revenue change}}$$
$$+ \underbrace{\Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial P} \cdot [revdiff]}_{\text{static book revenue change}} - \underbrace{\Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial P} \{\delta W'(\Delta')\}}_{\text{dynamic future state change}} = 0 \qquad (22)$$
where
$$\frac{\partial \mathrm{Pr1}(\Delta)}{\partial P} = -\mathrm{Pr1}(\Delta) [1 - \mathrm{Pr1}(\Delta)] \Big[ \alpha_i^{HW} - \delta \Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial P} EV_1(0, \Delta') \Big]$$
and similarly for the upgraders. The
pricing incentives for Kindle are reflected in the first-order condition above. Statically, if the Kindle price is lower, more
people will buy Kindle. Given that a Kindle owner generates more total book revenue than a non-owner, this will
also increase the total book revenue. So in terms of the current period profit, Amazon needs to trade off a
loss from selling Kindle against a gain in book revenue. Dynamically, (1) an increase in the demand for Kindle today
will reduce the market size for Kindle tomorrow; (2) given that consumers are heterogeneous in the probability of
adopting Kindle, the composition of the two types of consumers in the market for Kindle evolves over time and is
endogenous to Amazon's pricing policy. The evolution of the composition further leads to a change in demand elasticity;
(3) given that consumers have perfect foresight of the Kindle price, a change in $P$ affects the next period state $\Delta'$, and
thereby affects consumer expectations of the next period price $P'$. The effect of a Kindle price change on the market composition and the
continuation value of the firm is represented in the last term.
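The static part of the trade-off can be made concrete by evaluating the three components of the flow profit defined in the firm's problem. The sketch below is a scalar, single-type simplification with the market size normalized to one. The Kindle cost ($149.5), paperback price ($18.20) and wholesale prices ($15 and $12) are the values stated in the paper; the adoption probabilities, book quantities, and the Kindle and e-book prices in the example are hypothetical.

```python
def flow_profit(kindle_price, ebook_price, non_owners, pr_adopt, pr_upgrade,
                q_e, q_p1, q_p0, initial_market=1.0, kindle_cost=149.5,
                paperback_price=18.20, wholesale_paper=15.0, wholesale_ebook=12.0):
    """Return (profitKindle, revbook1, revbook0) for one period."""
    new_buyers = non_owners * pr_adopt
    upgraders = (1.0 - non_owners) * pr_upgrade
    profit_kindle = (new_buyers + upgraders) * (kindle_price - kindle_cost)

    paper_margin = paperback_price - wholesale_paper
    # Book profit difference between a Kindle owner and a non-owner.
    rev_diff = ((ebook_price - wholesale_ebook) * q_e
                + paper_margin * q_p1 - paper_margin * q_p0)
    rev_book1 = (initial_market - non_owners + new_buyers) * rev_diff
    rev_book0 = non_owners * (1.0 - pr_adopt) * paper_margin * q_p0
    return profit_kindle, rev_book1, rev_book0


# Hypothetical inputs: 80% of the market has not adopted yet.
profit_kindle, rev_book1, rev_book0 = flow_profit(
    kindle_price=200.0, ebook_price=13.0, non_owners=0.8,
    pr_adopt=0.1, pr_upgrade=0.05, q_e=2.0, q_p1=1.5, q_p0=2.0)
total_profit = profit_kindle + rev_book1 + rev_book0
```

Lowering `kindle_price` below `kindle_cost` makes the first component negative while, through a higher adoption probability, it raises the number of owners who generate `rev_diff`; this is the static trade-off described above.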
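From the second F.O.C., we can get
$$\underbrace{\Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial p^E} [P(\Delta) - c] + (1 - \Delta) \frac{\partial \widetilde{\mathrm{Pr1}}(\Delta)}{\partial p^E} [P(\Delta) - c]}_{\text{static Kindle revenue change}} + \underbrace{\Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial p^E} \cdot [revdiff]}_{\text{static book revenue change for new adopters}} - \underbrace{\Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial p^E} \{\delta W'(\Delta')\}}_{\text{dynamic future state change}}$$
$$+ \underbrace{[\Delta_0 - \Delta + \Delta \, \mathrm{Pr1}(\Delta)] \cdot \Big[ q^{E}(p^E(\Delta)) + (p^E - w^E) \frac{\partial q^{E}(p^E(\Delta))}{\partial p^E} + (p^P - w^P) \frac{\partial q^{P1}(p^E(\Delta))}{\partial p^E} \Big]}_{\text{static book revenue change for existing owners}} = 0 \qquad (23)$$
where
$$\frac{\partial \mathrm{Pr1}(\Delta)}{\partial p^E} = \mathrm{Pr1}(\Delta) [1 - \mathrm{Pr1}(\Delta)] \Big[ K - \delta \Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial p^E} EV_1(0, \Delta') \Big]$$
and
$$K = \sigma^f \frac{\partial f_{i1}(p^E(\Delta))}{\partial p^E} - \delta \Delta \frac{\partial \mathrm{Pr1}(\Delta)}{\partial p^E} EV_1(1, \Delta')$$
Similarly for the upgraders. Again, the first-order condition helps understand the trade-off in e-book pricing. Statically,
an e-book price change affects current period Kindle adoption and the book revenue generated by new adopters and
existing owners. Dynamically, a change in $p^E$ also affects the next period state $\Delta'$, and in turn consumer expectations
of the next period price $p^{E\prime}$.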
Equilibrium Definition
I consider pure-strategy Markov-perfect Nash equilibrium (MPNE). Compared to that of Ericson and Pakes (1995), the model allows both the consumer and the firm to be forward-looking. The equilibrium here requires that the consumer's expectation over the future state is consistent with the firm's optimal strategy. The equilibrium is defined as the set \(\{V^*, W^*, P^*, p^{E*}, \Delta'^*\}\), which includes the equilibrium value function for the consumer, the firm's value function and optimal pricing policy functions for both Kindle and e-book, and the belief about the next period state. Both the consumer and the firm are forward-looking with rational expectations, in that the belief is consistent with the realized next period state given that the consumer and the firm behave optimally: (1) consumers solve the optimal Kindle adoption policy in equations 17, 18 and 19; (2) the firm solves the optimal pricing policy characterized by the F.O.C.s in equations 22 and 23; (3) consumers' belief about the next period state \(\Delta'\) in equations 17, 18 and 19 is consistent with the firm's optimal policy and the evolution of the state space; (4) the firm's belief about the next period state \(\Delta'\) in equations 22 and 23 is consistent with the consumers' Kindle adoption decision.
The computation algorithm is in Appendix B.
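The consistency requirement between beliefs and realized states can be read as a fixed point in the belief about the next period state. The Python sketch below only illustrates that idea. The logistic adoption rule, its parameter values, the law of motion in which the next period state equals the current state times one minus the adoption probability, and the names `adoption_prob`, `realized_next_state` and `solve_belief` are all assumptions made for this example; they are not the model's equations 17 to 23 and not the algorithm of Appendix B.

```python
import math

def adoption_prob(belief_next, base=-1.0, slope=1.0):
    """Hypothetical logistic adoption probability given a belief about the next state."""
    return 1.0 / (1.0 + math.exp(-(base + slope * belief_next)))

def realized_next_state(delta, belief_next):
    """Assumed law of motion: consumers who do not adopt today stay in the market."""
    return delta * (1.0 - adoption_prob(belief_next))

def solve_belief(delta, tol=1e-12, max_iter=1000, damping=0.5):
    """Iterate on the belief until it coincides with the realized next-period state."""
    belief = delta  # initial guess: the state does not change
    for _ in range(max_iter):
        realized = realized_next_state(delta, belief)
        if abs(realized - belief) < tol:
            return realized
        # Damped update keeps the iteration stable.
        belief = damping * belief + (1.0 - damping) * realized
    raise RuntimeError("belief iteration did not converge")

delta_next = solve_belief(0.8)
```

At the returned value, the belief that consumers hold about the next period state reproduces itself once they act on it, which is the sense of rational expectations used in the equilibrium definition.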
7.2 Result Analysis
The welfare effect of price discrimination by a monopolist is an open question in the literature. In third degree price discrimination (e.g. Robinson 1933, Schmalensee 1981, and Aguirre et al. 2010), the standard intuition is that prices are lower in the weak market, where the demand is more price sensitive, and higher in the strong market compared with the non-discriminatory case. So while the seller's profitability is guaranteed, some consumers are better off and others are worse off. The overall impact is not obvious. These theoretical results apply to intertemporal price discrimination as well, once we reinterpret, following Robinson's terminology, the demand of avid readers with lower elasticity as the strong demand and the demand of general readers as the weak demand. The difference is that while no arbitrage across consumers can be exogenously assumed in third degree price discrimination, consumers facing IPD can arbitrage intertemporally by delaying purchase, so the price that the firm can charge is restricted and the profitability of IPD for the firm is not even guaranteed. If the firm chooses to IPD, price skimming is optimal. Compared to the non-discriminatory case, some high valuation consumers switch from buying in the first period to buying in the second period, and some lower valuation consumers become new buyers in the second period. The cost of IPD for the firm is to sell to those switching high valuation consumers at a lower price. The gain of IPD is to sell to the high valuation consumers who still buy in the first period at a higher price and to expand the market to the new lower valuation buyers. In terms of total welfare, the switching consumers lead to a negative misallocation effect and new buyers create an output effect. The overall impact of IPD is an empirical question, and in particular depends on the relative elasticity of the demand across consumer types (Aguirre et al. 2010). A necessary condition for welfare improvement is an increase in quantity sold. A sufficient condition for an increase in the quantity sold is that the demand is more convex in the weak market than in the strong market (Aguirre et al. 2010).
In the context of a monopolist selling a pair of complementary goods, the picture is more complex. The existence of the complementary good provides an extra incentive of penetration pricing for the primary good. Given that the firm has both skimming and penetration incentives, the optimal pricing strategy need not be decreasing. Relatively few theoretical studies have discussed the feasibility and profitability in this setting (e.g. Leung 1997, Koh 2006), and there is no empirical paper, especially on the welfare impact of the primary good IPD. This paper aims to fill this gap. Another issue is that the complementary good price is exogenously fixed in the two theoretical papers. In practice, it is becoming increasingly common for monopoly firms to set both the durable and non-durable good prices, especially in online business. Theoretical literature provides little guidance on conducting IPD on both goods. I use a dynamic model with both firm and consumers forward-looking to solve for the optimal pricing policy, profitability and welfare implications of IPD on both goods.
In this section, I evaluate the impact of IPD with complementary goods on profitability and welfare. I first consider the implication of my estimates for a monopolist selling a pair of complementary products, but only IPD on the primary good and not on the complementary good. The impact of the complementary good on primary good IPD is discussed. I then consider the case with IPD on both goods, where there are no theoretical studies.
Table 12: Demand Curvature of Primary Good: Vary Complementary Good Price

e-book price    $16     $13     % change
avid            4.06    3.89    -4.2%
general         4.35    4.32    -0.7%
Primary good IPD with constant price complementary good
How does the complementary good change primary good IPD? The intuition is that the complementary good affects the demand curvature of the primary good. The fundamental reason why firms price discriminate is that consumers are heterogeneous in their demand elasticity. Demand curvature \(\alpha = -p q''/q'\) is particularly relevant (Aguirre et al. 2010). I calculate demand curvatures of the two consumer types for the primary good under two different e-book prices (curvature is evaluated at the same primary good price $150). The first finding is that both avid readers and general readers have convex demand, and the demand is more convex for general readers. A sufficient condition for an increase in the quantity sold is that the demand is more convex in the weak market than in the strong market (Aguirre et al. 2010). Here the weak market corresponds to the general readers and the strong market corresponds to the avid readers. So in this setup, a price drop will lead to an increase in output. The second observation is that when the e-book price decreases from $16 to $13, demand is less convex for both types, which means that the firm is stronger in both markets. In particular, the curvature change is larger for the strong market. It implies that a lower e-book price induces a larger difference in demand convexity between the two types. As the IPD ability is associated with the heterogeneity in demand curvature, a lower complementary good price increases the demand curvature difference of the primary good and enhances the firm's IPD ability on the primary good.
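As a small numerical illustration of the curvature measure, the Python sketch below evaluates \(\alpha = -p q''/q'\) by central finite differences and recomputes the percentage changes reported in Table 12. The constant-elasticity demand used as a check is a hypothetical example, not the estimated demand system, and the function names `curvature` and `pct_change` are introduced only for this sketch.

```python
def curvature(q, p, h=1e-3):
    """Demand curvature alpha = -p * q''(p) / q'(p) by central finite differences."""
    step = h * p
    q_prime = (q(p + step) - q(p - step)) / (2.0 * step)
    q_double_prime = (q(p + step) - 2.0 * q(p) + q(p - step)) / step ** 2
    return -p * q_double_prime / q_prime

def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

# Hypothetical constant-elasticity demand q(p) = p^(-eps); its curvature equals eps + 1.
eps = 2.0
alpha_check = curvature(lambda p: p ** (-eps), 150.0)

# Percentage changes implied by the curvature values in Table 12.
avid_change = pct_change(3.89, 4.06)
general_change = pct_change(4.32, 4.35)
```

The gap between the two types' curvatures widens from 4.35 - 4.06 = 0.29 at an e-book price of $16 to 4.32 - 3.89 = 0.43 at $13, which is the larger difference in demand convexity referred to in the text.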
To see how the existence of the complementary good affects the optimal pricing policy of the primary good, profitability and welfare, I solve the dynamic pricing model with different flat complementary good prices exogenously given. The optimal pricing policy functions are in Figure 1. The x- and y-axes are the state space, which are the numbers of avid and general readers in the market respectively. We can see that (1) the existence of the complementary good provides an extra penetration pricing incentive to the primary product. Given both price skimming and penetration incentives, the firm's optimal IPD price is actually increasing over time as more consumers adopt the primary product. This is in contrast to the traditional IPD literature where the optimal price is always decreasing over time; (2) as the complementary product price gets lower, the firm invests less in the primary product by increasing the price level. This is consistent with the intuition discussed earlier: a lower complementary product price enhances the firm's IPD ability on the primary product, so the firm can price discriminate more harshly by investing less in the primary product.
The impact of a flat rate complementary product on profitability and welfare is in Table 13. For each variable, I take the average value across the state space and use e-book price $16.5 as the baseline. The baseline values are normalized to be 100 and other cases are relative to the baseline. The profitability, or producer surplus, is measured by the discounted total profits of the firm, which is the value function of the firm. The (expected) consumer surplus is calculated as \(CS = \sum_i n_i \times (EV_i + \bar{\varepsilon}) / \alpha^{HW}\), where \(n_i\) is the number of consumers of type \(i\), \(EV_i\) is the expected value function and \(\bar{\varepsilon}\) is the expected value of the shock. Since the device side flow utility that enters the expected value function contains utility from both device and book consumption, consumer welfare is computed from the device side alone. The first observation is that the impact of the complementary good on profitability is not monotonic. Producer surplus increases first and drops later as the complementary good becomes more expensive. This is because the firm cares about profits from both the primary good and the complementary good. As complementary good price and profit increase, primary good price and profit drop, leading to a non-monotonic overall profit. It also implies that in the setting of complementary goods, the profitability of IPD on the primary good is affected not only by the demand curvature, but also by the price of the complementary good. The second conclusion is that
Figure 1: Optimal Primary Good Pricing Policy: Flat Rate Complementary Good
*Notes: when interpreting the level of optimal Kindle price here, the readers should bear in mind that the cost drop
and quality increase of Kindle in reality are not modeled. The optimal price solved in the model can be seen as
a quality-adjusted mark-up plus a constant cost. To test whether the estimated demand system and the modeled
supply system are quantitatively plausible, I solve for the optimal pricing path of Kindle with decreasing cost (the
cost information is imputed from the industrial facts). The model-predicted Kindle price drops from $350 in 2008
to $80 in 2012, which is comparable to the observed Kindle price in reality.
Table 13: Profitability and Welfare: Flat Rate Complementary Good

e-book price    PS (Profitability)    CS        TS
$16.5           100                   100       100
$17             99.73                 100.04    99.86
$17.5           99.81                 100.12    99.94
interestingly, consumer surplus increases as the price of the complementary good increases. This is because a higher complementary good price makes the firm's IPD ability on the primary product weaker, so that consumers pay a lower price for the primary good. The overall impact is positive. In practice, people worry that a high e-book price will hurt consumer surplus, but it also induces a lower e-reader price and the quantity sold is higher for both e-readers and e-books. Consumers may benefit from a relatively higher e-book price. The last observation is that the total surplus change is not monotonic since the profitability is non-monotonic.
IPD on both primary and complementary product
In the last section, I discuss the case with a flat rate complementary good to show the intuition for the impact of the complementary good on the IPD of the primary good. Now I relax the assumption on the complementary good price and allow the firm to conduct IPD on both the primary and the complementary good. The first question is, what is the optimal pricing policy for the firm given both skimming and penetration incentives and two price instruments? In the last section, where only the primary product price is solved, the firm can use the primary good price to "harvest" or "invest". In the current problem with both primary good and complementary good IPD, the price of the complementary good also potentially serves as a tool for both "invest" and "harvest". It is not obvious how the firm should choose the instruments to "invest" and "harvest".
Traditionally, the firm's IPD behavior depends on the relative demand curvatures of the heterogeneous consumers. It turns out that in my case with a complementary good, it also depends on the relative demand elasticities of the primary product and the complementary product. The firm uses the product with higher elasticity of demand to "invest" and the product with lower elasticity of demand to "harvest". I explore the pricing incentives by comparing the firm's optimal pricing decisions under the current demand estimates and the case with a higher demand elasticity of books. As consumers respond more to e-book price changes, a lower e-book price can attract new consumers more easily compared to a low Kindle price. At the same time, setting a high Kindle price can also exploit consumers with relatively low elasticity of demand for Kindle. So when the demand elasticity of books is higher, the firm switches from investing in the primary product to harvesting in the primary product, and invests more in the complementary product (see footnote 28). The result on which product to harvest/invest is comparable to the finding in Koh (2006). He uses a Cobb-Douglas utility function and shows that if the weight placed on the consumption of the durable good in the utility function is sufficiently close to 1, the optimal price of the durable good will be decreasing over time. The weight here indicates the substitution, or relative elasticity of the demand for the two goods, and is a special case of my finding.
The second question is how allowing for complementary good IPD affects pricing and welfare. I compare the results from IPD on both products with IPD on the primary product only. Figure 2 plots the optimal pricing policies for both products, the producer surplus (profitability), the consumer surplus and the total surplus with the state space on the x- and y-axes. We can see that (1) if the firm can do complementary good IPD, the optimal pricing policy for the primary good switches from investing in the avid readers to investing in the general readers. The complementary good is used to invest in avid readers. So IPD on both goods allows the firm to coordinate across two price instruments and better price discriminate between the two consumer types. (2) Interestingly, the profitability of the firm is not guaranteed to be larger under the double IPD case. In particular, when the initial condition (the starting point in the state space) is such that the market consists of a relatively larger proportion of general readers, IPD on the primary product only is actually more profitable than IPD on both goods. In other words, whether complementary good IPD increases total profitability depends on the relative share of the heterogeneous consumers. It is not always the case that having an extra price instrument improves profitability, even though it can potentially enhance the IPD ability on the primary product. The managerial implication is discussed in the next section. (3) Consumer surplus is higher under the double IPD case when the relative share of general readers is higher. This is the opposite of the result on firm profitability. (4) The total welfare is lower when the firm conducts IPD on both products, rather than only on the primary product. The loss for the firm outweighs the gain for the consumers when there are relatively more general readers, and the loss for the consumers outweighs the gain for the firm when there are relatively more avid readers. In general, welfare and profitability results are ambiguous in the traditional IPD literature. In my empirical model, the profitability of complementary good IPD only holds under some conditions, while the total welfare loss always holds.
28 The finding of the investing behavior on the complementary product in both cases is consistent with the industry observation: Amazon featured its low-price e-books for as cheap as $0.99. The percentage of books priced from $0.01 to $50 that are under $10 fluctuates around 88% since the introduction of Kindle Store in year 2007. The number of free books offered reaches 54,567 as of February 2013 (http://ilmk.wordpress.com/category/analysis/snapshots/). The low e-book price shows Amazon's effort in investing in new e-book buyers.
Managerial Implications
As discussed in the last two sections, the existence of the complementary product provides an extra penetration pricing incentive for the primary product. By changing the demand curvature of the primary product across consumer types, the complementary product enhances the IPD ability on the primary product. Interestingly, consumers can benefit from a higher flat complementary good price. A higher e-book price can trigger a stronger investment strategy on the primary good for the firm, and consumer surplus increases as the device becomes cheaper. So when evaluating the welfare impact of a price increase on e-books, the firm's pricing incentive from the interaction between the device and e-books should also be accounted for.
When IPD on the complementary good is allowed, the firm needs to choose which price instruments to use for harvesting and investment purposes. Not only the relative demand curvature of the primary good across consumer types matters, but also the relative elasticity of demand between the primary and the complementary good. The results show that IPD on the complementary product does not necessarily increase the profitability of the firm. This finding helps understand the dispute over e-book pricing rights in the book industry. The case with IPD on both goods is comparable to the wholesale model between the publishers and the platform, where the platform sets both the device price and the e-book price. The case with only primary good IPD and a fixed exogenous complementary product price is comparable to the agency model, where the platform only has control over the device price. The contract initially proposed by Amazon is the wholesale contract, in the hope that controlling both device and e-book prices can allow better price discrimination and a higher profit. It turns out that if the relative share of general readers is high, Amazon is better off with IPD only on the device. So if publishers seek to get the e-book pricing right back by switching to the agency model, there is room for negotiation as Amazon can get more profit even without controlling the e-book price. On the other hand, if the relative share of avid readers is high, it is better for Amazon to have both device and e-book pricing rights in its hands. Given that consumers self-select into buying the device and avid readers adopt earlier, the market contains relatively more avid readers in the early stage and relatively more general readers later. So it is more important for Amazon to adopt the wholesale contract and control the e-book price in the early stage. As the consumer composition evolves, Amazon may be better off by practicing IPD on the device only, and publishers can seize this opportunity to persuade Amazon into the agency model.
Figure 2: The Impact of Allowing IPD on Complementary Product
*Notes: Single indicates the case with IPD on only the primary product; double indicates the case with IPD on
both goods.
8 Conclusion
Many important questions in economics hinge on the extent to which new goods affect existing ones. Examples of new goods such as radio, movies, PCs, and file sharing suggest that these relationships can be highly uncertain ex ante. In this paper, I analyze the impact of e-books on online print book sales. I find that, taking supply side prices as exogenously given, 28% of e-book sales on Amazon come from cannibalizing online print book sales and 72% come purely from market expansion. The magnitude of the two effects depends on both the Kindle price and the e-book price. Interestingly, book categories differ in their format substitution pattern. Consumers prefer to buy casual reading books in e-book format compared to the other two categories, lifestyle and practical. Price difference across formats has a stronger impact on the rate of cannibalization than quality difference does. Two modeling decisions are important in this analysis: (1) Accounting for consumer heterogeneity in reading taste is critical, as different types of consumers generate different book revenue and respond to e-book price differently; (2) Taking into account the dynamic device adoption decision is necessary as it allows self-selection based on heterogeneous consumer taste.
The pricing and welfare implications of intertemporal price discrimination for firms selling complementary goods are explored in the counterfactuals. The existence of the complementary good changes the firm's IPD incentive on the primary good through its impact on the demand curvature across consumer types. A lower complementary good price enhances the firm's primary good IPD ability. Consumers may benefit from a price increase in the complementary good because the firm will price discriminate less in the primary good. When the firm can also IPD on the complementary good, the profitability is not necessarily higher. The results on profitability and consumer surplus depend on the relative share of consumer types in the initial condition. In particular, the firm can be better off without IPD on the complementary good when the market consists of a relatively large proportion of general readers. There is room for negotiation on e-book pricing rights between publishers and the platform, as the monopoly platform can sometimes get more profit even without controlling the e-book price.
There are also limitations. First, I only consider online book purchases, and the cannibalization and market expansion effects are only measured with respect to Amazon Kindle. In practice, offline print book sales are affected by the introduction of e-books as consumers shift from offline book stores to online sellers. The effect magnitudes estimated in my model should be interpreted only within the online book market. Second, platforms face competition in practice. Amazon and Barnes & Noble compete for device owners and book buyers. Consumers are locked in to a particular e-reader once they buy the device because e-books are not compatible across e-readers (for instance, Kindle e-books cannot be read on Nook). There is considerable heterogeneity in reading taste and price elasticity among consumers, and the composition of consumers not having a Kindle evolves over time. This will affect the optimal pricing strategy for Amazon and its competitors. My model provides a baseline monopoly case for further supply side analysis. Third, the supply side model considers only the optimal pricing strategies. A richer model would incorporate the innovation and quality choice of Kindle for Amazon to account for the quality impact on welfare.
References
[1] Aguirre, Inaki, Simon Cowan and John Vickers. 2010. Monopoly Price Discrimination and Demand Curvature. American Economic Review, Volume 100, Number 4, September, pp. 1601-1615.
[2] Dubin, Jeffrey A., and Daniel L. McFadden. 1984. An Econometric Analysis of Residential Electric Appliance Holdings and Consumption. Econometrica, vol. 52(2), pages 345-62.
[3] Ericson, R., and A. Pakes. 1995. Markov-perfect industry dynamics: A framework for empirical work. The Review of Economic Studies, 62(1), 53-82.
[4] Economides, Seim, Viard. 2008. Quantifying the Benefits of Entry into Local Phone Service. RAND Journal of Economics, vol. 39(3), pages 699-730.
[5] Gowrisankaran, Rysman. 2012. Dynamics of Consumer Demand for New Durable Goods. Journal of Political Economy.
[6] Gentzkow, Matthew A. 2007. Valuing New Goods in a Model with Complementarity: Online Newspapers. American Economic Review, 97(3), pp. 713-44.
[7] Goettler, Gordon. 2011. Does AMD spur Intel to innovate more? Journal of Political Economy, 119(6), 1141-1200.
[8] Gowrisankaran, Gautam, Marc Rysman and Minsoo Park. 2010. Measuring Network Effects in a Dynamic Environment. NET Institute Working Paper No. 10-03. Available at SSRN: http://ssrn.com/abstract=1647037.
[9] Hendel, I., and A. Nevo. 2013. Intertemporal Price Discrimination in Storable Goods Markets. American Economic Review.
[10] Hu, Yu (Jeffrey), and Michael D. Smith. 2011. The Impact of Ebook Distribution on Print Sales: Analysis of a Natural Experiment. mimeo.
[11] Kannan, P. K., Barbara Kline Pope, Sanjay Jain. 2009. Pricing Digital Content Product Lines: A Model and Application for the National Academies Press. Marketing Science, Lead Article, Vol. 28, No. 4, pp. 620-636.
[12] Koh, W. 2006. The Microfoundations of Intertemporal Price Discrimination. Economic Theory, Vol. 27, No. 2, pp. 393-410.
[13] Lee, Robin. 2013. Vertical Integration and Exclusivity in Platform and Two-Sided Markets. American Economic Review.
[14] Lazarev, John. 2013. The Welfare Effects of Intertemporal Price Discrimination: An Empirical Analysis of Airline Pricing in U.S. Monopoly Markets. Working paper.
[15] Leung, H. M. 1997. Intertemporal price discrimination and consumer demand. Journal of Economics, 65, 19-40.
[16] Liu, H. 2010. Dynamics of pricing in the video game console market: skimming or penetration? Journal of Marketing Research, 47(3), 428-443.
[17] Melnikov, O. 2000. Demand for differentiated durable products: The case of the US computer printer market. Manuscript. Department of Economics, Yale University.
[18] Nair, H. 2007. Intertemporal price discrimination with forward-looking consumers: Application to the US market for console video-games. Quantitative Marketing and Economics, 5, 239-292.
[19] Oberholzer-Gee, F., and K. Strumpf. 2007. The Effect of File Sharing on Record Sales: An Empirical Analysis. Journal of Political Economy, Vol. 115, No. 1, pp. 1-42.
[20] Robinson, Joan. 1933. The Economics of Imperfect Competition. Macmillan and Co.
[21] Schmalensee, Richard. 1981. Output and Welfare Implications of Monopolistic Third-Degree Price Discrimination. American Economic Review, Vol. 71, No. 1 (March), pp. 242-247.
[22] Stokey, N. L. 1979. Intertemporal Price Discrimination. The Quarterly Journal of Economics, 93(3), 355-371.
[23] Stokey, N. 1981. Rational Expectations and Durable Goods Pricing. The Bell Journal of Economics, 12(1), 112-128.
[24] Varian, Hal. 1985. Price Discrimination and Social Welfare. American Economic Review, Vol. 75, No. 4, September, pp. 870-875.
[25] Waldfogel, Joel. 2007. 'Lost' on the Web: Does Web Distribution Stimulate or Depress Television Viewing? NBER Working Papers 13497, National Bureau of Economic Research, Inc.
Appendix
A1. Likelihood Calculation
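To simplify notation, I drop the \(i\) and \(g\) subscripts for now.
Case 1: The book buyer does not have a Kindle.
Given the assumption that
\[
\begin{cases}
v^P = v(q^P) > \bar{v}^P & \text{if } q^P > 0 \\
v^P \le \bar{v}^P & \text{if } q^P = 0
\end{cases}
\]
we have that the probability of buying \(q^P > 0\) number of books is
\[
f\left(v^P = v(q^P) \mid v^P > \bar{v}^P\right) \Pr\left(v^P > \bar{v}^P\right) = \frac{1}{\sigma}\,\phi\!\left(\frac{v(q^P)}{\sigma}\right)
\]
and the probability of buying \(q^P = 0\) is
\[
\Pr\left(v^P < \bar{v}^P\right) = \Phi\!\left(\frac{\bar{v}^P}{\sigma}\right).
\]
Here \(v^P\) is normally distributed with mean zero and variance \(\sigma^2\), and \(\phi(\cdot)\) and \(\Phi(\cdot)\) are the pdf and cdf of the standard normal distribution.
So the contribution to the likelihood for a non-Kindle owner is
\[
l^{\text{non-owner}} = \prod_g \left[ \mathbf{1}\{q_g^P > 0\}\,\frac{1}{\sigma}\,\phi\!\left(\frac{v(q_g^P)}{\sigma}\right) + \mathbf{1}\{q_g^P = 0\}\,\Phi\!\left(\frac{\bar{v}_g^P}{\sigma}\right) \right]
\]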
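The Case 1 contribution can be evaluated directly from the two expressions above. The Python sketch below is only an illustration: the function name `non_owner_likelihood` and its argument layout (one entry per purchase occasion \(g\)) are assumptions made for this example, and the inputs \(v(q_g^P)\) and \(\bar{v}_g^P\) are taken as already computed.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def non_owner_likelihood(quantities, v_realized, v_bar, sigma):
    """Product over occasions g of the non-owner likelihood contributions."""
    lik = 1.0
    for q, v_q, vb in zip(quantities, v_realized, v_bar):
        if q > 0:
            # Positive purchase: density of the realized error term.
            lik *= norm_pdf(v_q / sigma) / sigma
        else:
            # No purchase: probability that the error stays below the threshold.
            lik *= norm_cdf(vb / sigma)
    return lik
```

In estimation one would typically work with the logarithm of this product to avoid numerical underflow when the number of occasions is large.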
Case 2: The book buyer has a Kindle. Similarly, based on
\[
\begin{cases}
v^P < \bar{v}^P,\ v^E < \bar{v}^E & \text{if } q^P = 0,\ q^E = 0 \\
v^P = v(q^P) > \max\{\bar{v}^P,\ v^E + \bar{v}^P - \bar{v}^E\} & \text{if } q^P > 0,\ q^E = 0 \\
v^E = v(q^E) > \max\{\bar{v}^E,\ v^P - (\bar{v}^P - \bar{v}^E)\} & \text{if } q^P = 0,\ q^E > 0
\end{cases}
\]
we have the probability that \(q^P = 0, q^E = 0\) is
\[
\Pr(\{0, 0\}) = \Pr\left(v^P < \bar{v}^P\right) \Pr\left(v^E < \bar{v}^E\right)
\]
The probability that \(q^P > 0, q^E = 0\) is
\[
\Pr(\{q^P, 0\}) = f\left(v^P = v(q^P) \mid v^P > \max\{\bar{v}^P, v^E + \bar{v}^P - \bar{v}^E\}\right) \cdot \Pr\left(v^P > \max\{\bar{v}^P, v^E + \bar{v}^P - \bar{v}^E\}\right)
\]
It is a conditional probability of a truncated normal where the truncation point is a result of a maximization operator. I calculate it by using the quadrature method. The details are in Appendix A2.
The probability that \(q^P = 0, q^E > 0\) is
\[
\Pr(\{0, q^E\}) = f\left(v^E = v(q^E) \mid v^E > \max\{\bar{v}^E, v^P - (\bar{v}^P - \bar{v}^E)\}\right) \cdot \Pr\left(v^E > \max\{\bar{v}^E, v^P - (\bar{v}^P - \bar{v}^E)\}\right)
\]
So the contribution to the likelihood for a Kindle owner is
\[
l^{\text{owner}} = \prod_g \left[ \mathbf{1}\{0, 0\} \Pr(\{0, 0\}) + \mathbf{1}\{q_g^P, 0\} \Pr(\{q_g^P, 0\}) + \mathbf{1}\{0, q_g^E\} \Pr(\{0, q_g^E\}) \right]
\]
Notice that the individual book purchase data set does not provide the Kindle ownership status information, but the book format information is available for years 2011-2012. For those two years, I assume that a consumer is a Kindle owner if I observe him buying e-books. One year is a relatively long period to rule out the possibility that an actual Kindle owner does not buy any e-book in my sample. For years 2008-2010, I take a probabilistic point of view. Consumers have a Kindle at the beginning of the period with probability
\[
\Pr\{\iota_t = 1\} = \frac{\sum_{\tau = 2007}^{t-1} n_{1\tau}}{N}
\]
and do not have one with probability \(\Pr\{\iota_t = 0\} = 1 - \Pr\{\iota_t = 1\}\). The book part of the log likelihood function for these years is
\[
L_t^{\text{book}} = \log\left[ \Pr\{\iota_t = 0\} \sum_i \Pr\{\text{type} = i\}\, l_{it}^{\text{non-owner}} + \Pr\{\iota_t = 1\} \sum_i \Pr\{\text{type} = i\}\, l_{it}^{\text{owner}} \right]
\]
For years 2011-2012, the log likelihood function is
\[
L_t^{\text{book}} = \log\left[ \mathbf{1}\{\iota_t = 0\} \sum_i \Pr\{\text{type} = i\}\, l_{it}^{\text{non-owner}} + \mathbf{1}\{\iota_t = 1\} \sum_i \Pr\{\text{type} = i\}\, l_{it}^{\text{owner}} \right]
\]
A2. Probability Calculation
I need to calculate this probability:
\[
\Pr(\{q^P, 0\}) = f\left(v^P = v(q^P) \mid v^P > \max\{\bar{v}^P, v^E + \bar{v}^P - \bar{v}^E\}\right) \cdot \Pr\left(v^P > \max\{\bar{v}^P, v^E + \bar{v}^P - \bar{v}^E\}\right)
\]
where \(v^P\) and \(v^E\) are i.i.d. normally distributed error terms with mean 0 and variance \(\sigma^2\). \(v(q^P)\), \(\bar{v}^P\), and \(\bar{v}^E\) are known deterministic parts, and define \(\Delta \equiv \bar{v}^P - \bar{v}^E\) for notation simplicity.
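Step (i)
Compared to calculating the density directly, it is easier to start with a cdf:
\[
\Pr\left(v^P \le v(q^P) \mid v^P > \max\{\bar{v}^P, v^E + \Delta\}\right) \cdot \Pr\left(v^P > \max\{\bar{v}^P, v^E + \Delta\}\right)
\]
Denote event \(A\) as \(v^P \le v(q^P)\), event \(B\) as \(v^P > \bar{v}^P\), event \(C\) as \(v^P > v^E + \Delta\), and event \(D\) as \(\bar{v}^P > v^E + \Delta\). Then the above probability can be written as
\[
\Pr(A \mid B \cap C) \Pr(B \cap C)
\]
Notice that event \(B \cap D\) implies \(C\), and event \(C \cap \neg D\) implies \(B\). Event \(B\) and event \(D\) are independent.
For the first component,
\begin{align*}
\Pr(A \mid B \cap C) &= \Pr(A \mid B \cap C \cap D) \Pr(D) + \Pr(A \mid B \cap C \cap \neg D) \Pr(\neg D) \\
&= \Pr(A \mid B \cap D) \Pr(D) + \Pr(A \mid C \cap \neg D) \Pr(\neg D) \\
&= \frac{\Pr(A \cap B \cap D)}{\Pr(B \cap D)} \Pr(D) + \frac{\Pr(A \cap C \cap \neg D)}{\Pr(C \cap \neg D)} \Pr(\neg D)
\end{align*}
where \(\Pr(B) = 1 - \Phi\left(\frac{\bar{v}^P}{\sigma}\right)\), \(\Pr(D) = \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right)\), \(\Pr(\neg D) = 1 - \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right)\), \(\Pr(B \cap D) = \Pr(B) \Pr(D)\) and
\begin{align*}
\Pr(A \cap B \cap D) &= \left[ \Phi\left(\frac{v(q^P)}{\sigma}\right) - \Phi\left(\frac{\bar{v}^P}{\sigma}\right) \right] \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right) \\
\Pr(A \cap C \cap \neg D) &= \int_{\bar{v}^P + 1}^{v(q^P)} \left[ \Phi\left(\frac{x - \Delta}{\sigma}\right) - \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right) \right] dF_x \\
&= \int_{\bar{v}^P + 1}^{v(q^P)} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x - \Phi\left(\frac{v(q^P)}{\sigma}\right) \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right) \\
\Pr(C \cap \neg D) &= \int_{\bar{v}^P}^{+\infty} \left[ \Phi\left(\frac{x - \Delta}{\sigma}\right) - \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right) \right] dF_x \\
&= \int_{\bar{v}^P}^{+\infty} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x - \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right) \left[ 1 - \Phi\left(\frac{\bar{v}^P}{\sigma}\right) \right]
\end{align*}
For the second component,
\begin{align*}
\Pr(B \cap C) &= \Pr(B \cap C \mid D) \Pr(D) + \Pr(B \cap C \mid \neg D) \Pr(\neg D) \\
&= \Pr(B \cap C \cap D) + \Pr(B \cap C \cap \neg D) \\
&= \Pr(B \cap D) + \Pr(C \cap \neg D) \\
&= \Pr(B) \Pr(D) + \Pr(C \cap \neg D) \\
&= \int_{\bar{v}^P}^{+\infty} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x
\end{align*}
In all,
\begin{align*}
CDF\left(v(q^P)\right) &= \Pr(A \mid B \cap C) \Pr(B \cap C) \\
&= \left[ \frac{(a - b)\,c}{1 - b} + \frac{I_1 - ac}{I_2 - c(1 - b)} (1 - c) \right] I_2
\end{align*}
where \(a = \Phi\left(\frac{v(q^P)}{\sigma}\right)\), \(b = \Phi\left(\frac{\bar{v}^P}{\sigma}\right)\), \(c = \Phi\left(\frac{\bar{v}^P - \Delta}{\sigma}\right)\), \(I_1 = \int_{\bar{v}^P + 1}^{v(q^P)} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x\) and \(I_2 = \int_{\bar{v}^P}^{+\infty} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x\).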
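Step (ii)
Back to the original interest
\[
f\left(v^P = v(q^P) \mid v^P > \max\{\bar{v}^P, v^E + \bar{v}^P - \bar{v}^E\}\right) \cdot \Pr\left(v^P > \max\{\bar{v}^P, v^E + \bar{v}^P - \bar{v}^E\}\right)
\]
Taking into account the fact that the book quantity observed can only be of integer value, the final probability of interest to calculate is
\begin{align*}
& \Pr\left(v(q^P) \le v^P < v(q^P + 1) \mid v^P > \max\{\bar{v}^P, v^E + \Delta\}\right) \cdot \Pr\left(v^P > \max\{\bar{v}^P, v^E + \Delta\}\right) \\
&= CDF\left(v(q^P + 1)\right) - CDF\left(v(q^P)\right) \\
&= \left[ \frac{\tilde{a} c}{1 - b} + \frac{\tilde{I}_1 - \tilde{a} c}{I_2 - c(1 - b)} (1 - c) \right] I_2
\end{align*}
where \(\tilde{a} = \Phi\left(\frac{v(q^P + 1)}{\sigma}\right) - \Phi\left(\frac{v(q^P)}{\sigma}\right)\) and \(\tilde{I}_1 = \int_{v(q^P)}^{v(q^P + 1)} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x\).
There are two integrals to calculate
\begin{align*}
\tilde{I}_1 &= \int_{v(q^P)}^{v(q^P + 1)} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x \\
I_2 &= \int_{\bar{v}^P}^{+\infty} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x
\end{align*}
I use Gauss-Chebychev quadrature for the first one and Gauss-Laguerre quadrature for the second.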
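A minimal Python sketch of the two quadrature rules follows, assuming NumPy is available. It is an illustration rather than the code used for estimation: the function names `i1_tilde` and `i2`, the node counts, and the changes of variables that map each integral onto the standard Gauss-Chebyshev and Gauss-Laguerre domains are choices made for this example.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    """Standard normal distribution function, applied elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(z, dtype=float) / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density, applied elementwise."""
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def integrand(x, delta, sigma):
    """Phi((x - delta) / sigma) times the N(0, sigma^2) density, i.e. Phi(.) dF_x."""
    return norm_cdf((x - delta) / sigma) * norm_pdf(x / sigma) / sigma

def i1_tilde(lower, upper, delta, sigma, n=64):
    """Gauss-Chebyshev rule for the integral over the finite range [lower, upper]."""
    t, w = np.polynomial.chebyshev.chebgauss(n)
    x = 0.5 * (upper - lower) * t + 0.5 * (upper + lower)
    # The rule carries the weight 1 / sqrt(1 - t^2), so multiply it back in.
    values = w * np.sqrt(1.0 - t * t) * integrand(x, delta, sigma)
    return float(0.5 * (upper - lower) * np.sum(values))

def i2(v_bar, delta, sigma, n=40):
    """Gauss-Laguerre rule for the integral from v_bar to +infinity."""
    y, w = np.polynomial.laguerre.laggauss(n)
    # The rule carries the weight exp(-y); shift x = v_bar + y and multiply by exp(y).
    return float(np.sum(w * np.exp(y) * integrand(v_bar + y, delta, sigma)))
```

Here `i1_tilde(v_q, v_q_plus_1, delta, sigma)` approximates \(\tilde{I}_1\) and `i2(v_bar, delta, sigma)` approximates \(I_2\); the accuracy of both depends on the number of nodes, which should be checked against a finer rule in any application.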
A3. Flow Utility Calculation
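The taste parameter is \(a^T_{ig} = \alpha^T_{ig} + v^T_{ig}\) for \(T = P, E\). Define the realized error terms given quantity choice as \(v(q^P_{ig}) \equiv q^P_{ig} + b p^P_g - \alpha^P_{ig}\) and \(v(q^E_{ig}) \equiv q^E_{ig} + b p^E_g - \alpha^E_{ig}\). Define the threshold of worth buying as \(\bar{v}^P_{ig} \equiv b p^P_g - \alpha^P_{ig}\) and \(\bar{v}^E_{ig} \equiv b p^E_g - \alpha^E_{ig}\).
(1) The ex-ante flow utility from book reading for a Kindle non-owner is
\begin{align*}
f_{i0} &= y_i + \sum_g E\left[ \frac{\left(a^P_{ig} - b_i p^P_g\right)^2}{2 b_i} \,\middle|\, q^P_{ig} > 0 \right] \Pr\left(q^P_{ig} > 0\right) \\
&= y_i + \frac{1}{2 b_i} \sum_g E\left[ \left(v^P_{ig} - \bar{v}^P_{ig}\right)^2 \,\middle|\, v^P_{ig} - \bar{v}^P_{ig} > 0 \right] \Pr\left(v^P_{ig} - \bar{v}^P_{ig} > 0\right)
\end{align*}
where \(X \equiv v^P_{ig} - \bar{v}^P_{ig} \sim N\left(-\bar{v}^P_{ig}, \sigma^2\right)\) and \(\Pr\left(v^P_{ig} - \bar{v}^P_{ig} > 0\right) = 1 - \Phi\left(\frac{\bar{v}^P_{ig}}{\sigma}\right)\). From the truncated normal distribution properties, we know that
\begin{align*}
E\left[X^2 \mid X > 0\right] &= Var(X \mid X > 0) + \left[E(X \mid X > 0)\right]^2 \\
&= \sigma^2 \left[1 - \lambda(\alpha)\left(\lambda(\alpha) - \alpha\right)\right] + \left[-\bar{v}^P_{ig} + \sigma \lambda(\alpha)\right]^2
\end{align*}
where \(\alpha = \frac{\bar{v}^P_{ig}}{\sigma}\) and \(\lambda(\alpha) = \frac{\phi(\alpha)}{1 - \Phi(\alpha)}\). This is a closed form solution.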
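The closed form can be checked numerically. The short Python sketch below implements the expression for \(E[X^2 \mid X > 0]\) with \(X \sim N(-\bar{v}, \sigma^2)\); the function name `second_moment_truncated` is introduced here for illustration only.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def second_moment_truncated(v_bar, sigma):
    """E[X^2 | X > 0] for X ~ N(-v_bar, sigma^2), using the closed form above."""
    alpha = v_bar / sigma
    # Inverse Mills ratio; 1 - Phi(alpha) loses precision for very large alpha.
    lam = norm_pdf(alpha) / (1.0 - norm_cdf(alpha))
    variance = sigma ** 2 * (1.0 - lam * (lam - alpha))
    mean = -v_bar + sigma * lam
    return variance + mean ** 2
```

For example, with \(\bar{v} = 0\) and \(\sigma = 1\) the variable is a standard normal truncated at zero, and the function returns 1 up to rounding, the second moment of a half-normal.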
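(2) The ex-ante flow utility from book reading for a Kindle owner is
$$\begin{aligned}
f_{i1} &= y_i + \sum_g \sum_{T = P, E} E\left[\frac{\left(a^T_{ig} - b_i p^T_g\right)^2}{2 b_i} \,\Big|\, q^T_{ig} > 0,\, q^{-T}_{ig} = 0\right] \Pr\left(q^T_{ig} > 0,\, q^{-T}_{ig} = 0\right) \\
&= y_i + \frac{1}{2 b_i} \sum_g \Big\{ E\left[\left(v^P_{ig} - \bar v^P_{ig}\right)^2 \,\Big|\, v^P_{ig} > \max\{\bar v^P_{ig},\, v^E_{ig} + \bar v^P_{ig} - \bar v^E_{ig}\}\right] \Pr\left(v^P_{ig} > \max\{\bar v^P_{ig},\, v^E_{ig} + \bar v^P_{ig} - \bar v^E_{ig}\}\right) \\
&\qquad + E\left[\left(v^E_{ig} - \bar v^E_{ig}\right)^2 \,\Big|\, v^E_{ig} > \max\{\bar v^E_{ig},\, v^P_{ig} + \bar v^E_{ig} - \bar v^P_{ig}\}\right] \Pr\left(v^E_{ig} > \max\{\bar v^E_{ig},\, v^P_{ig} + \bar v^E_{ig} - \bar v^P_{ig}\}\right) \Big\}
\end{aligned}$$
I need
$$E\left[v^T \mid v^T > \max\{\bar v^T,\, v^{-T} + \bar v^T - \bar v^{-T}\}\right] \quad \text{and} \quad E\left[\left(v^T\right)^2 \mid v^T > \max\{\bar v^T,\, v^{-T} + \bar v^T - \bar v^{-T}\}\right] .$$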
Notice that the conditional expectation follows
$$E\left[X \mid H\right] = \int_{-\infty}^{+\infty} x \cdot f(x \mid H)\, dx, \qquad E\left[X^2 \mid H\right] = \int_{-\infty}^{+\infty} x^2 \cdot f(x \mid H)\, dx .$$
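I already calculated the conditional density $f(x \mid H)$ and the probability in Appendix 1. They are
$$\Pr\left(v^T > \max\{\bar v^T,\, v^{-T} + \bar v^T - \bar v^{-T}\}\right) = \int_{\bar v^T}^{+\infty} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x$$
$$f\left(v^T = x \mid v^T > \max\{\bar v^T,\, v^{-T} + \bar v^T - \bar v^{-T}\}\right) =
\begin{cases}
\dfrac{a' c}{1 - b} + \dfrac{I_1' - a' c}{I_2 - c(1 - b)}\,(1 - c) & \text{if } x > \bar v^T \\
0 & \text{otherwise}
\end{cases}$$
where $a' = \frac{1}{\sigma} \phi\left(\frac{x}{\sigma}\right)$, $b = \Phi\left(\frac{\bar v^T}{\sigma}\right)$, $c = \Phi\left(\frac{\bar v^T - \Delta}{\sigma}\right)$, $I_1' = \Phi\left(\frac{x - \Delta}{\sigma}\right) f_x(x)$ and $I_2 = \int_{\bar v^T}^{+\infty} \Phi\left(\frac{x - \Delta}{\sigma}\right) dF_x$. $f_x$ and $F_x$ are the pdf and cdf of $N\left(0, \sigma^2\right)$. Given the conditional density, I calculate the expectations using Gauss-Hermite quadrature.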
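A minimal sketch of this Gauss-Hermite step is given below, using the conditional density exactly as written above. It is an illustration under stated assumptions rather than the original code: the function name and node count are hypothetical, and for self-containment $I_2$ is also approximated with the same Gauss-Hermite nodes here, whereas the text evaluates it by Gauss-Laguerre quadrature.

```python
import math
import numpy as np

# Standard normal cdf, applied elementwise (built on math.erf).
norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def conditional_moments(vbar, delta, sigma, n=64):
    """Approximate E[X | H], E[X^2 | H] and I_2 with Gauss-Hermite quadrature."""
    t, w = np.polynomial.hermite.hermgauss(n)   # weight exp(-t^2) on the real line
    x = math.sqrt(2.0) * sigma * t              # nodes for X ~ N(0, sigma^2)
    w = w / math.sqrt(math.pi)                  # now E[g(X)] is approx. sum(w * g(x))

    above = (x > vbar).astype(float)            # density is zero below the threshold
    Phi = norm_cdf((x - delta) / sigma)
    b = float(norm_cdf(vbar / sigma))
    c = float(norm_cdf((vbar - delta) / sigma))
    I2 = np.sum(w * above * Phi)                # Pr(H), approximated on the same nodes

    # Conditional density divided by the N(0, sigma^2) pdf, term by term as in the text:
    # a'c/(1-b) + (I1' - a'c)(1-c)/(I2 - c(1-b)), with a' = f_x(x) and I1' = Phi * f_x(x).
    ratio = above * (c / (1.0 - b) + (Phi - c) * (1.0 - c) / (I2 - c * (1.0 - b)))

    m1 = np.sum(w * ratio * x)
    m2 = np.sum(w * ratio * x ** 2)
    return m1, m2, I2
```

Because the density has a jump at $\bar v^T$, Gauss-Hermite converges more slowly when the threshold falls in the bulk of the distribution, so the number of nodes may need to be increased in that case.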
B. Computation
The numerical algorithm contains an inner loop and an outer loop. The inner loop solves the firm and consumer maximization problems along with the next period state space given a value function guess. The outer loop updates the value function guess and iterates until convergence.
For the single-pricing problem where only the optimal Kindle price is solved, for each iteration $k = 1, 2, ...,$
1. Guess the value functions for the firm and consumers, $V^{k-1}, W^{k-1}$.
2. Given the value function guess, simultaneously solve the firm's first-order conditions in equations 22 and 23 at each state. Since the first-order conditions depend on consumers' current choices and next period $\Delta'$, which in turn depend on their rational expectations of $\Delta'$, I solve for a fixed point in $\Delta'$ such that consumers' expectations for $\Delta'$ are realized according to equation 21. In particular, to solve for the fixed point, first guess the next period state space $\Delta'^{m-1}$ and the firm's optimal pricing policy $P^{m-1}$, where $m$ is the iteration number for the fixed point inner loop. Given the guess, solve consumers' problem in equations 17, 18 and 19 to get the updated next period state space $\Delta'^{m}$. Given the updated $\Delta'^{m}$, solve the firm's first-order conditions in equations 22 and 23 at each state and get the updated optimal pricing policy $P^{m}$. Check convergence of $|\Delta'^{m} - \Delta'^{m-1}|$ and $|P^{m} - P^{m-1}|$. If converged, let $\Delta'^{k}$ and $P^{k}$ denote this fixed point. This is the solution to the inner loop given the value function guess $V^{k-1}, W^{k-1}$.
3. Update the value functions given the firm's policy and next period state space and denote them $V^{k}, W^{k}$.
4. Check for convergence of the outer loop, $|V^{k} - V^{k-1}|$ and $|W^{k} - W^{k-1}|$. If convergence is not achieved, return to step 1.
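The control flow of steps 1 to 4 can be sketched as follows. This is a toy illustration only: `consumer_step`, `firm_step`, and `update_values` are hypothetical placeholder maps standing in for equations 17-19, equations 22-23, and the value function update, and the discount factor, one-dimensional grid, and linear interpolation are simplifying assumptions made so that the example runs on its own.

```python
import numpy as np

BETA = 0.5                          # toy discount factor (assumption)
GRID = np.linspace(0.0, 1.0, 20)    # 20 grid points on a single state dimension

def consumer_step(price, delta_guess, W):
    """Placeholder for equations 17-19: expected next state -> realized next state."""
    delta_new = 0.5 * GRID + 0.2 * delta_guess - 0.05 * price + 0.01 * W
    return np.clip(delta_new, GRID[0], GRID[-1])

def firm_step(delta_next, V):
    """Placeholder for the firm's first-order conditions (equations 22-23)."""
    return 1.0 + 0.3 * delta_next + 0.01 * V

def inner_loop(V, W, tol=1e-10, max_iter=1000):
    """Step 2: fixed point in (next-period state, price) for a given value guess."""
    delta_next, price = GRID.copy(), np.ones_like(GRID)
    for _ in range(max_iter):
        delta_new = consumer_step(price, delta_next, W)
        price_new = firm_step(delta_new, V)
        gap = max(np.max(np.abs(delta_new - delta_next)),
                  np.max(np.abs(price_new - price)))
        delta_next, price = delta_new, price_new
        if gap < tol:
            return delta_next, price
    raise RuntimeError("inner loop did not converge")

def update_values(V, W, delta_next, price):
    """Step 3: toy flow payoffs plus discounted, interpolated continuation values."""
    profit = price * (1.0 - delta_next)
    utility = delta_next
    V_new = profit + BETA * np.interp(delta_next, GRID, V)
    W_new = utility + BETA * np.interp(delta_next, GRID, W)
    return V_new, W_new

def solve_model(tol=1e-8, max_iter=1000):
    """Steps 1-4: outer loop over value function guesses."""
    V, W = np.zeros_like(GRID), np.zeros_like(GRID)      # step 1: initial guess
    for k in range(1, max_iter + 1):
        delta_next, price = inner_loop(V, W)             # step 2
        V_new, W_new = update_values(V, W, delta_next, price)   # step 3
        gap = max(np.max(np.abs(V_new - V)), np.max(np.abs(W_new - W)))
        V, W = V_new, W_new
        if gap < tol:                                    # step 4
            return V, W, delta_next, price, k
    raise RuntimeError("outer loop did not converge")
```

Calling `solve_model()` returns the converged value functions, the inner-loop fixed point, and the number of outer iterations. The placeholder maps are chosen to be contractions so that both loops converge; the actual first-order conditions carry no such guarantee.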
The double-pricing problem is similar except that an extra set of first-order conditions for e-book pricing is simultaneously solved along with the ones for Kindle pricing. The optimal pricing policy for e-book is also updated and converged in the inner loop. In all the computation, I discretize the state space into 20 grid points on both dimensions and use a cubic spline to interpolate between $\Delta$ grid points for the value functions and policy functions. This is because solving the firm's first-order condition requires differentiable continuation values. The convergence is checked at the grid points.
C. Industry facts
I focus on the Amazon Kindle because it was the dominant e-reader during the years 2008-2012. According to a survey conducted by the Pew Research Center in January 2012, 62% of e-reader owners had a Kindle and 22% had a Nook. The third biggest player, Pandigital, accounted for only 2% of the market. I also assume that e-book reading is done on a Kindle and not on other devices, and that consumers need to buy a Kindle before buying any e-books. In practice, people may read e-books on multiple screens: dedicated e-readers, PCs, iPads and smartphones. But according to a survey conducted by the Book Industry Study Group in 2011, the e-reader was the dominant device: in March 2011, 60% of consumers read e-books on e-readers, 16% on PCs, 15% on iPads, and only 9% on smartphones. Figures 3, 4, and 5 display supporting evidence about this industry.
Figure 3: Kindle Dominates the E-reader Market
Figure 4: Kindle is the Dominant Device Across All Kinds of Screens: Books Read
Figure 5: Kindle is the Dominant Device Across All Kinds of Screens: Device Users