Electronic Commerce Research and Applications 11 (2012) 275–289
Journal homepage: www.elsevier.com/locate/ecra

Online user reviews, product variety, and the long tail: An empirical investigation on online software downloads

Wenqi Zhou¹, Wenjing Duan⇑
Department of Information Systems & Technology Management, School of Business, The George Washington University, 2201 G Street, NW, Washington, DC 20052, United States

Article info
Article history: Received 24 November 2010; Received in revised form 1 December 2011; Accepted 11 December 2011; Available online 21 December 2011
Keywords: Long tail; Superstar; Online user reviews; Product variety; Word-of-mouth; Software download; Quantile regression

Abstract
Our study examines the impact of both a demand side factor (online user reviews) and a supply side factor (product variety) on the long tail and superstar phenomena in the context of online software downloading. The descriptive analysis suggests a significant superstar download pattern and also the emergence of the long tail.
Using the quantile regression technique, we find a significant interaction effect between online user reviews and product variety on software download. We find that the impacts of both positive and negative user reviews are weakened as product variety goes up. In addition, the increase in product variety reduces the impact of user reviews on popular products more than it does on niche products. After taking the interaction effect into account, we find that the overall impact of the increased product variety helps niche products to get more downloads. These results highlight the importance of considering the intricate interplay between demand side and supply side factors in the long tail and online word-of-mouth research.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

In traditional brick-and-mortar stores, vendors are advised to apply an intensive advertising strategy, i.e., highlight the hits (superstars), and sellers have long recognized the high risks of investing in new product development. The underlying belief behind such strategies is the well-recognized Pareto principle applied to sales distribution, namely the 80/20 rule (Brynjolfsson et al. 2011). This rule says that the cumulative sales from the most popular products (the top 20%) account for approximately 80% of total sales. In essence, the superstar effect represents a concentrated consumption pattern in which a relatively small number of very popular products account for the majority of sales (Frank and Philip 1995, Rosen 1981). Such a superstar effect still prevails in online markets, as documented in recent studies (Brynjolfsson et al. 2010b, Duan et al. 2009). However, the significantly increased product choices and information made possible by the Internet and electronic markets provide the opportunity for more niche products to be discovered and adopted. Anderson (2006) first coined the term “long tail” (see Fig.
1), which envisages that more niche products offered exclusively in online stores better satisfy consumers’ diversified preferences and thus have the potential to outgrow the demand for those popular products often sold through traditional channels. Recent studies have applied Anderson’s original long tail prediction to both offline and online contexts, and have even extended it to purely online investigations to examine the change in demand distribution over time (Brynjolfsson et al. 2010a, Elberse and Oberholzer-Gee 2008, Ghose and Gu 2007, Tan and Netessine 2009, Tucker and Zhang 2007, Zhao et al. 2008). Thus, the more thorough and essential definition of the long tail effect, which we use in this research, describes the change in the consumption pattern when more niche products are being selected and the demand is shifting from the hits to the niches over time (Anderson 2006, Elberse and Oberholzer-Gee 2008).

⇑ Corresponding author. Tel.: +1 202 994 3217; fax: +1 202 994 5830. E-mail addresses: [email protected] (W. Zhou), [email protected] (W. Duan).
1 Tel.: +1 202 994 2454; fax: +1 202 994 5830.
1567-4223/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.elerap.2011.12.002

There are both supply side and demand side factors that could contribute to the long tail formation (Brynjolfsson et al. 2006, 2010b). Major supply side factors include increased availability and variety of products on the Internet resulting from virtually unlimited “shelf space,” as well as make-to-order production and digital distribution, which significantly reduce the costs for producers and retailers (Brynjolfsson et al. 2006, 2010b). Key demand side factors include abundant online product information, such as consumer reviews and recommendations, and powerful search and sampling tools on e-commerce websites (Brynjolfsson et al. 2010b, 2011).
The predicted shift of user choices to niche products has already attracted much attention from academics examining the change in consumption patterns on the Internet. Those studies mainly focus on the demand side, and their results are inconsistent. Some researchers believe that lower consumer search costs, resulting from online feedback and recommendation systems, contribute to the reduced concentration of sales of popular products, resulting in the long tail phenomenon (Clemons et al. 2006, Hervas-Drane 2009, Maryanchyk 2008, Oestreicher-Singer and Sundararajan 2009). Meanwhile, other empirical studies have advocated the superstar effect because of lower consumer search costs (Fleder and Hosanagar 2009, Ghose and Gu 2007, Zhao et al. 2008). Ghose and Gu (2007) found that search costs for price information are lower for hit products compared to niche products. Fleder and Hosanagar (2009) suggest that selection-biased recommendation systems may help reduce the sales diversity because these systems tend to recommend products with more historical data, i.e., the popular products.

Fig. 1. Long tail distribution curve. Source: Long Tail Blog, http://www.longtail.com.

In contrast to these studies, which primarily focus on either demand side or supply side justifications for the skewed shape of user choices, we focus in this research on both a key supply side factor, product variety, and a critical demand side factor, online user reviews, to study their effects on the formation of the long tail and superstar phenomena.
Using a data set of online software downloads, we first undertake a descriptive analysis to examine changes in distribution patterns of software downloads over time, which is followed by a rigorous quantile regression analysis to examine how these two factors and their interplay influence user choices of software with different levels of popularity. Our descriptive analysis suggests a coexistence of the superstar and long tail effects. We still observe a small number of very popular products dominating the demand, but a larger number of hits are receiving fewer individual downloads over time. In the meantime, we also observe that significantly more niche products are available and are being chosen by users. More importantly, the demand is shifting from the hits to the niches over time. The quantile regression results show a significant interaction effect between online user reviews and product variety on the skewed shape of the download distribution. The interaction effect demonstrates that consumers’ reliance on online user reviews to choose products is significantly influenced by the quantity of products available. Specifically, we find that the impacts of both positive and negative user reviews are weakened as product variety goes up. In addition, the increase in product variety reduces the impact of user reviews on popular products more than it does on niche products. As a result, the impact of user reviews on user choices does not show a well-defined trend over various product popularities, and the magnitude of the impact depends on the level of product variety. On the other hand, our empirical analyses suggest a clear trend of the impact of product variety on the distribution of user choices, even after taking into account its interaction effect with online user reviews.
Increased product variety has a greater positive impact on tail products than it does on popular products, leading to the long tail formation, regardless of whether the products receive positive or negative user reviews.

Our study, to the best of our knowledge, is the first to investigate the impacts of both online user reviews (demand side factor) and product variety (supply side factor), and particularly their interaction effect, on the formation of the long tail and superstar phenomena. The quantile regression technique allows us to investigate these influences across the entire spectrum of the distribution of user choices by estimating models of conditional quantile functions from covariates. We find empirical evidence that omitting the interaction effect between product variety and user reviews could lead to misleading inferences about the word-of-mouth (WOM) effect. Hence, this paper contributes to the long tail research by offering a new perspective on the interaction effect between demand side and supply side factors in the interpretation of the long tail and superstar predictions of online consumption patterns. The intertwined and complex relationships among user reviews, product variety, and product popularity also offer possible explanations for extant mixed results on the impact of online user reviews, thus contributing to both Information Systems (IS) and Marketing research by re-examining the influence of online user reviews.

This paper, as the first attempt to examine the long tail phenomenon in the Internet software market, also adds to the research on long tail e-commerce by expanding the boundary of long tail research to a very important territory. Many extant long tail studies focus on cultural goods (e.g., books and movies), upon which Anderson (2006) originally built the concept of the long tail. However, the investigation of the demand pattern in the software market is also vital, given its rapid growth in the online marketplace.
In 2013, the global software market is forecast to have a value of $457 billion, an increase of 50.5% since 2008, of which the US accounts for 42.6%.² As noted in Anderson’s (2006) illustrations, which included eBay, Google, and Salesforce (a customer relationship management (CRM) software service provider), the long tail could certainly manifest in the software industry. The economics of the software industry, with the help of online recommendation systems, could be significantly influenced by three primary long tail forces, which we can summarize as “make it, get it out there, and help me find it,” particularly because most software programs can be produced and delivered on the Internet with almost zero marginal cost. However, unlike books and movies, software adoptions usually come with much greater risks and complexities resulting from installation, learning, and long-term maintenance. Therefore, the conclusions drawn from cultural goods may suffer from poor validity if directly extended to the software market, which calls for independent investigations from researchers.

The rest of this paper proceeds as follows. We review the related literature and develop the conceptual framework in the next section. We then describe the data and analyze the empirical model. In the last section, we discuss the results and implications, and conclude the paper by addressing the limitations and identifying areas for future research.

2. Literature review and conceptual framework

2.1. The long tail

The long tail phenomenon in the distribution of product sales was first observed through the comparison between offline and online sales (Anderson 2006, Brynjolfsson et al. 2006, Tucker and Zhang 2007). Recent long tail studies in e-commerce have extended the shifts predicted in user choices from the hits to the niches to purely online channels (Clemons et al.
2006, Fleder and Hosanagar 2009, Ghose and Gu 2007, Hervas-Drane 2009, Maryanchyk 2008, Oestreicher-Singer and Sundararajan 2009, Zhao et al. 2008). In essence, the long tail phenomenon describes the change in consumption patterns and thus can only be examined by comparing consumer demand distribution shifts over time in pure online channels (Brynjolfsson et al. 2010b, 2011, Elberse and Oberholzer-Gee 2008).

2 Software: Global Industry Guide (2009).

The long tail phenomenon can be investigated and identified from the following perspectives. First, the widely agreed-on feature of the long tail effect is that a longer tail should be emerging, i.e., more niche products are being consumed over time (Anderson 2006, Brynjolfsson et al. 2011, Elberse and Oberholzer-Gee 2008, Oestreicher-Singer and Sundararajan 2009). In addition, the long tail consumption pattern is also expected to include a relatively fatter tail, i.e., the demand is shifting from the hits to the niches over time. Anderson (2006), who first officially coined the term “long tail,” pointed out that the theory of the long tail in economics is that demand is “shifting away from a focus on a relatively small number of ‘hits’ at the head of the demand curve and toward a huge number of ‘niches’ in the tail.” Specifically, there are three scenarios that lead to a relatively fatter tail compared to the head. First, the demand for the niches increases while the demand for the hits decreases; second, the demand for the niches increases at a greater rate than the increase in demand for the hits; and third, in what is also the most plausible scenario, the demand for all products generally decreases, but the decrease for the hits is more pronounced (Elberse and Oberholzer-Gee 2008). Given the largely expanded product variety on the Internet (Brynjolfsson et al.
2006, 2010a,b, 2011; Tan and Netessine 2009), we expect that the sales of an individual product would decrease, regardless of its popularity, as many long tail studies show (e.g., Brynjolfsson et al. 2010a,b; Elberse and Oberholzer-Gee 2008; Tan and Netessine 2009). In this case, when the decrease in demand for the niches is less significant than the decrease in demand for the hits, the demand actually shifts from the hits to the niches, resulting in a relatively fatter tail. Following the literature, in this study we examine the long tail phenomenon from two perspectives: longer and fatter tail, with the fatter tail resulting from the shift in demand from the hits to the niches.

The specific measures to examine the long tail phenomenon require appropriately categorizing products into the hits and the niches, which has raised disagreement among researchers. Many of them followed Anderson’s (2006) approach by adopting the absolute measure (Brynjolfsson et al. 2003, 2010a). They identified the long tail by comparing the demand shares accounted for by the hits and the niches differentiated by the amount of sales, e.g., the Top 100 products as the hits and products ranked beyond 100 as the niches. Others examine the long tail phenomenon using the relative measure, i.e., percentage of sales, to evaluate the change in the distribution curve (Elberse and Oberholzer-Gee 2008, Tan and Netessine 2009). Tan and Netessine (2009) argued that the relative measure is more appropriate because it controls for the significant increase in the number of products over time or across channels. Nevertheless, Brynjolfsson et al. (2010b), in a comprehensive review of extant long tail studies, pointed out that both the absolute and relative measures have their strengths and weaknesses in comparing the demand shares for both the hits and the niches, depending on the study settings.
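To make the distinction between the two measures concrete, the following minimal Python sketch computes the demand share of the hits under an absolute cutoff (a fixed number of top-ranked products) and under a relative cutoff (a fixed fraction of the catalog). The function names, the cutoffs, and the download counts are hypothetical and chosen only for illustration; they are not taken from the data used in this study.

```python
from typing import Sequence


def absolute_share(downloads: Sequence[float], top_n: int) -> float:
    """Demand share of the top_n best-ranked products (absolute measure)."""
    ranked = sorted(downloads, reverse=True)
    return sum(ranked[:top_n]) / sum(ranked)


def relative_share(downloads: Sequence[float], top_fraction: float) -> float:
    """Demand share of the top `top_fraction` of products (relative measure)."""
    ranked = sorted(downloads, reverse=True)
    # Number of products counted as hits grows with the size of the catalog.
    k = max(1, int(round(len(ranked) * top_fraction)))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical weekly download counts: the catalog doubles from 5 to 10 products.
period1 = [100, 50, 25, 10, 5]
period2 = [100, 50, 25, 10, 5, 5, 3, 1, 1, 0]

for name, data in (("period 1", period1), ("period 2", period2)):
    print(name,
          "top-2 share:", round(absolute_share(data, 2), 3),
          "top-20% share:", round(relative_share(data, 0.2), 3))
```

In this toy example the share of the top two products falls as the catalog grows, while the share of the top 20% of products rises, because the relative cutoff admits more products into the head. This is one way to see why the two measures can point in different directions when product variety changes.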
In contrast to the long tail effect, the identification of the superstar effect involves only an observation of the concentration of demand distribution, as demonstrated both in its original description of the Pareto principle and the recent long tail research. The superstar effect has been consistently defined as the consumption pattern in which a small number of popular products account for the majority of sales (Elberse and Oberholzer-Gee 2008, Rosen 1981, Tucker and Zhang 2007, Zhao et al. 2008), which shows a concentrated demand distribution toward the hits. In light of such a definition, even as the tail of the demand distribution becomes longer and fatter over time, a small number of products could still dominate the sales and thus could still demonstrate the superstar effect. Therefore, the long tail and superstar phenomena are not necessarily conflicting and could coexist in descriptions of various attributes of a consumption pattern. Although the long tail indicates the shift of demand from the hits to the niches, the very popular products can still dominate market demand at the same time.

Supporting such arguments, Elberse and Oberholzer-Gee (2008) found empirical evidence for the coexistence of the long tail and superstar phenomena by examining overall video sales through both online and offline channels. They found that the number of obscure products increases almost twice every week; meanwhile, the demand for niche products decreases at a much less pronounced rate than the demand for hits, which denotes the long tail phenomenon. They also observed that an even smaller number of video titles accounts for the majority of sales, which indicates that the superstar phenomenon also still prevails. Tucker and Zhang (2007) provided some explanations for such coexistence using an empirical comparison of consumers’ click-through behavior.
Examining clickthroughs between the catalog and Internet channels of a website that provides online wedding service vendor lists, the study showed that sales from superstar products are enhanced by attracting new demand without cannibalizing the demand for niche products.

2.2. Online user reviews and product variety

The widely adopted online user feedback systems allow consumers to exchange their evaluations and experiences on the Internet and thus to automate and amplify the digital WOM process (Duan et al. 2009). Nevertheless, the conclusions regarding the WOM effect across products with various popularities are not consistent. An earlier theoretical development by Bakos (1997) has predicted that online WOM recommendations would help consumers find the less popular goods that nevertheless match their preferences. Online product feedback and recommendations have been viewed as important demand side factors that can reduce consumer search costs in the pursuit of niche products (Brynjolfsson et al. 2006). Therefore, most long tail studies regard the digital WOM effect as an influence on information transformation, which contributes to the long tail formation on the demand side (Brynjolfsson et al. 2011). Hervas-Drane (2009) analytically showed that a recommendation system functions as a taste matching mechanism to help consumers get product information from others with similar preferences; as such, it reduces sales concentration. Oestreicher-Singer and Sundararajan (2009) provided empirical evidence that a stronger and broader influence of recommendations on Amazon.com results in a flatter sales distribution. In particular, online user reviews, as one of the most important formats of online conversations for measuring WOM (Godes and Mayzlin 2004), are also believed to contribute to the long tail formation as an important demand side factor in recent empirical studies (Clemons et al. 2006, Duan et al.
2009, Hervas-Drane 2009, Maryanchyk 2008, Oestreicher-Singer and Sundararajan 2009, Zhu and Zhang 2010). For instance, Clemons et al. (2006) showed that receiving the most positive reviews helps new products grow more quickly in the marketplace. Zhu and Zhang (2010) found that online user reviews are more influential for less popular video games whose players have more years of Internet experience. By examining informational cascades in the context of online software adoption, Duan et al. (2009) demonstrated that online user reviews have an increasingly positive and significant impact on the adoption of less popular products, although they also found that popular products become more popular regardless of user review ratings.

On the other hand, superstar proponents take a different view of the WOM effect across products with various popularities. The classic superstar effect proposed by Rosen (1981) argued that communication technologies could result in a concentration of the consumption of high quality supplies. Consumers are more likely to infer product quality from previous users’ purchase decisions, which then results in only choosing from a few highly ranked products. This superstar effect facilitated by communication technologies has also been shown to be strengthened by the Internet and other digital distribution channels (Duan et al. 2009, Goldmanis et al. 2009, Tucker and Zhang 2007). For example, in opposition to the long tail prediction from Brynjolfsson et al. (2006), Goldmanis et al. (2009) found that the decline in consumer search costs due to the introduction of e-commerce leads to the superstar phenomenon in market share of both travel agencies and bookstores.
Fleder and Hosanagar (2009), in considering the lower consumer search costs resulting from e-commerce, conducted a simulation to show that recommendations based on sales and ratings can lead to a reduction in sales diversity. Supporting these findings, other studies predict that online user reviews strengthen the superstar consumption pattern (Fleder and Hosanagar 2009, Zhao et al. 2008). Zhao et al. (2008) found that positive user reviews have a stronger impact on the hits than on the niches and that negative user reviews hurt niche products more, which suggests the superstar effect of online user reviews.

In contrast to online user reviews, increased product variety is consistently viewed as an important long tail factor on the supply side in the literature (Anderson 2006, Brynjolfsson et al. 2006). The Internet has made it more feasible and efficient to provide consumers a much larger selection of products through online channels than through brick-and-mortar stores (Brynjolfsson et al. 2003). Stocking more products for online channels requires much lower inventory costs than for traditional brick-and-mortar stores. While physical stores encounter space limitations and various costs, including logistics and holding costs, online stores can maintain a very large inventory in a centralized warehouse in a much less expensive location. The lower inventory cost of online channels is even more pronounced for digital products, for which adding one more product requires only one more line in the product database. Significantly more niche products, which may not have been available in physical stores before, are accessible in online channels. Such availability of more varieties of products is more likely to meet more consumers’ niche preferences and thus increase the sales of niche products, leading to the long tail formation in the consumption pattern.

2.3. Conceptual framework

Fig.
2 depicts our conceptual model, which illustrates the proposed effect of online user reviews, product variety, and their interaction on online consumers’ choices of products with various popularities. We define the interaction effect between online user reviews and product variety as a relationship in which the degree of a consumer’s reliance on online user reviews to make product choices depends on the level of product variety. Previous studies have identified the interaction effect between online user reviews and contextual variables on product sales (Cheema and Papatla 2010, Zhu and Zhang 2010). For example, Zhu and Zhang (2010) applied the classic psychological choice models proposed by Hansen (1976) to the e-commerce setting and argued that consumers’ reliance on user reviews is affected by contextual variables. Contextual variables are conventionally defined as the factors describing the context where the brain processes occur, out of which consumers subsequently form their behavioral responses. These variables may include store size, promotion event, product category, and so forth (Hansen 1976). In the context of online video games, Zhu and Zhang (2010) identified “product characteristics” as the contextual variable that interacts with online user reviews to influence sales. Cheema and Papatla (2010) also echoed Hansen’s theory (1976) by pointing out that the importance of online information for influencing Internet purchases depends on product category. Similar to these studies, we argue that online user reviews and an important contextual variable (product variety) interact to influence user choices, potentially having various impacts across products with various popularities.

Product variety has been recognized as the supply side factor influencing the diversity of user choices. Product variety information is readily available in almost every online store and serves as an important information resource for consumers’ decision-making. Brynjolfsson et al.
(2011) suggested that product variety should be considered together with demand side factors in examining changes in consumption patterns, implying the potential interaction effect between online user reviews and product variety. Their study highlighted the importance of keeping product variety constant while examining the long tail effect of the demand side factors resulting from lower consumer search costs enabled by the Internet channel. Online user reviews, as one of the major demand side factors (Brynjolfsson et al. 2006), thus might influence user choices, depending on the number of products offered. A few recent long tail studies have controlled for product variety during their study periods, thus losing the opportunity to reveal the possible influence of changes in product variety (Brynjolfsson et al. 2011, Ghose and Gu 2007, Tucker and Zhang 2007). Although extant research has not formally investigated it, these prior studies nevertheless suggest that the influence of online user reviews may depend on their intricate interplay with product variety.

Fig. 2. Conceptual framework.

In addition, extant work suggests that online user reviews have different impacts on products that have different levels of popularity (Duan et al. 2009, Zhao et al. 2008, Zhu and Zhang 2010). In this study, we investigate the impact of both online user reviews and product variety, along with their interaction effect, at different locations of the demand distribution. The findings are expected to provide a more comprehensive interpretation of the impact of online user reviews and product variety on the heterogeneity of user choices. We also expect our study to shed light on the mixed results documented in previous online WOM studies. In addition, we aim to provide a more thorough understanding of the role that product variety plays in consumers’ online decision-making, thus contributing to resolution of the current disagreements between long tail and superstar advocates (Anderson
2006; Brynjolfsson et al. 2006, 2010b; Goldmanis et al. 2009). Furthermore, the potentially very complex relationships between the demand and supply side factors make this question primarily an empirical one, which we take the first initiative to investigate in this study.

3. Data

Our data are from CNET Download.com (CNETD), which is a collection of more than 30,000 free or free-to-try software programs for Windows, Mac, mobile devices, and webware. CNETD, as a part of the CNET network, is a leading and representative online platform for software download. CNETD lists approximately 20 large groups of software programs with approximately 5–20 categories in each group. In addition to providing the overall product description, CNETD also shows download counts for each software program posted and solicits user reviews. The user review system includes detailed comments and an overall evaluation, using a five-star user rating system. CNETD also provides editorial reviews for selected software programs (usually the popular ones). Reviews are summarized by rating on a scale of one to five, with one being the lowest and five being the highest.

CNETD provides an ideal environment for this study. First, in addition to the description of product features, it clearly provides the information about the number of products offered for each category. We measure product variety by the number of software programs listed in a specific category. Product evaluation is also readily available from user reviews for each product, with both detailed comments and a one to five star rating. These data lay the foundation for our study, allowing us to test how online user reviews, interacting with the level of product variety, influence market demand concentration. Specifically, we use the cumulative average user rating (on a scale of 1–5) as a measure for online user reviews, which is the most prominent user feedback information displayed on CNETD for each product. All the information on CNETD is updated on a daily basis, which enables us to compile a longitudinal dataset to analyze the dynamics of software downloads. Second, all the software programs listed on CNETD can be downloaded without any charge; therefore, the price effect on the demand side is controlled by default. This advantage comes at a cost, however, because of the possibility that consumers’ download behavior on CNETD may be different from their actual software purchase behavior. Nevertheless, because users are often required to make significant commitments and considerable efforts to use free and free-to-try software programs, free-to-download software programs on CNETD might not be substantially different from other online products (Duan et al. 2009).

Our sample consists of four software categories, which include popular downloaded software categories as well as a diversified coverage of software programs with different application purposes. The categories are: Antivirus Software, Digital Media Player, Download Manager, and File Compression. Our sample is composed of weekly data for two periods: from December 2004 to July 2005 (period 1) and from August 2007 to February 2008 (period 2). The time interval across the two periods offers us the unique opportunity to observe and compare the variations of software download patterns. The two periods are fairly comparable. Both periods encompass similar time frames: period 1 is eight months long and period 2 is seven months long. In addition, during the two periods, CNETD made no fundamental changes in terms of the interface design, user rating system, CNET rating system, or search options.

The number of software programs listed in each category differs considerably, from approximately 60 to 450, as shown in Table 1. The difference reflects the distinctive environment in each category, and we define each category as a single market (Duan et al. 2009). The following information has been extracted for every software program listed in each category: software name, date added, total downloads, last week downloads, average user rating, and CNET rating. We also collect software characteristics, including operating system requirements, file size, publisher, license (free or free-to-try), and price when its license is free-to-try. Table 2 presents the variable definitions, descriptions, and explanations of measurement.

Table 2
Description of key variables.

Variable             Description and measure
WEEKLYDOWNLOAD_it    Weekly number of downloads of software i at week t
TOTALDOWNLOAD_it     Cumulative number of downloads of software i at week t
USERRATING_it        Average user rating for software i at week t (one to five scale with half points)
CNETRATINGD_it       A dummy variable indicating whether software i receives a CNET rating at week t
WEEKLYRANK_it        The rank of software i at week t by weekly downloads
WEEKLYVARIETY_it     The total number of software programs listed in the category at week t
FREEPRICED_it        A dummy variable indicating whether software i is free-to-try at week t
AGE_it               Days since software i has been posted

4. Empirical methodology and results

4.1. Descriptive analysis of software download distribution

To investigate the software download pattern on CNETD and its change over time, we conduct the following descriptive analysis for each of the four categories across two periods to examine the distribution pattern of the number of weekly downloads. We use the number of weekly downloads to capture user demand in this study, which is similar to some prior studies that have also used consumers’ incremental demand in studying superstar and long tail phenomena (Elberse and Oberholzer-Gee 2008). We define Q_a as the lower a-th quantile of the weekly download distribution.
As a result, the top α% most popular products are simply those whose weekly downloads exceed Q(1 − α) of the download distribution.

Table 1
Product variety for sample periods 1 and 2.
Variable               Mean (P1|P2)     SD (P1|P2)     Min. (P1|P2)   Max. (P1|P2)
Antivirus software     106.47|226.96    16.45|26.09    82|132         137|249
Digital media player   174.78|437.46    13.36|50.83    157|242        203|466
Download manager       119.78|213.31    5.14|32.68     112|150        131|256
File compression       66.91|174.35     8.03|21.73     60|95          86|193
Notes: P1 denotes period 1 and P2 denotes period 2. The data are based on weekly aggregation.

Fig. 3. Distributions of average weekly number of downloads. Panels: Antivirus Software, Digital Media Player, Download Manager, File Compression; x-axis: Popularity; y-axis: Average Weekly Download Share. Notes: P1 refers to period 1 and P2 refers to period 2.

To first examine whether the superstar phenomenon prevails over time, we follow the conventional approach of using the relative measure by examining the demand shares accounted for by products with different popularities (Brynjolfsson et al. 2011; Fleder and Hosanagar 2009). Specifically, we report the download share of software titles in different quantiles Qα by calculating the corresponding percentages of average weekly downloads in each period (see footnote 3). The result suggests that a small number of very popular products dominate the demand in both periods.
For example, we find that in each of the four categories in both periods, the top 10% most popular products, whose weekly downloads exceed Q90 of the weekly download distribution, account for more than 80% of the overall downloads. Moreover, the superstar phenomenon seems to be more pronounced in period 2, according to this measure. The download shares accounted for by the hits in all four categories increase in period 2 compared to period 1. For instance, in the category of Digital Media Player, the download share of the top 10% most popular products increases by 12.70 percentage points, from 83.04% in period 1 to 95.74% in period 2. The increase in the download share of the top 1% most popular products is even more dramatic, as large as 21.96 percentage points, from 32.82% in period 1 to 54.78% in period 2. For a better illustration, Fig. 3 plots this distribution for both periods in each category against the overall popularity, starting with the most popular software on the left side (the “head” of the distribution) and ending with the least popular software on the right side (the “tail” of the distribution). This relative measure plot seems to demonstrate a more pronounced superstar download pattern, in which the distribution becomes more asymmetrical over time with a sharper peak.

Footnote 3. The detailed report is too lengthy to be included in the paper but is available upon request.

We now turn to the question of whether the long tail emerges in period 2. We first look into the longer tail attribute of the long tail phenomenon by examining the changes in the number of software programs in both head and tail over the two periods. For simplicity, we denote the bottom α% least popular products, whose weekly downloads fall within Qα of the weekly download distribution, as TAILα. Similarly, the top (1 − α)% most popular products, whose weekly downloads exceed Qα of the weekly download distribution, are denoted by HEAD(1 − α). For example, the top 25% most popular products are denoted by HEAD25.
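The head and tail definitions above can be illustrated with a short sketch. The Python snippet below is a minimal illustration and not the authors' code: the function names, the order-statistic definition of the lower quantile, and the download counts are all assumptions made for this example.

```python
import math

def lower_quantile(values, alpha_percent):
    """Lower alpha-th quantile (alpha in percent), taken as an order statistic."""
    ordered = sorted(values)
    # Position of the smallest value with at least alpha percent of products at or below it.
    k = max(1, math.ceil(alpha_percent * len(ordered) / 100))
    return ordered[k - 1]

def head_share(values, top_percent):
    """Share of all downloads captured by products above Q(100 - top_percent)."""
    cutoff = lower_quantile(values, 100 - top_percent)
    return sum(v for v in values if v > cutoff) / sum(values)

def tail_products(values, bottom_percent):
    """Products whose downloads are at or below Q(bottom_percent)."""
    cutoff = lower_quantile(values, bottom_percent)
    return [v for v in values if v <= cutoff]

# Hypothetical weekly download counts for ten products in one category.
weekly_downloads = [5000, 1200, 300, 90, 40, 25, 12, 8, 3, 1]

print("Download share of the top 10%:", round(head_share(weekly_downloads, 10), 3))
print("Products in TAIL20:", tail_products(weekly_downloads, 20))
```

With real data, the same two functions would be applied week by week and then averaged within each period, which is one plausible reading of how the shares and counts reported in this section are aggregated.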
Fig. 4 portrays the average weekly numbers of software programs at a series of popularity levels, showing the product variety in both head and tail across the two periods in each category. Fig. 4 indicates that there are more products in the tail in period 2 than in period 1 in each category, indicating a longer tail in the download pattern. For example, in the category of Antivirus Software, the average number of the bottom 25% least popular products is 26 in period 1, and it increases to 57 in period 2. Among these tail products, many of the software programs are virtually unknown (e.g., “Yes AntiVirusTool NetsKy-P” and “DiamondCS WormGuard”). Nevertheless, even these more obscure products are chosen by some users.

Similarly, to illustrate the relatively fatter tail attribute of the long tail, we report the average weekly numbers of downloads at a series of popularity levels in Fig. 5, after applying a natural log transformation to present results on a comparable scale. We find a substantial download drop in both head and tail from period 1 to period 2, which a two-sample t-test shows to be significant. This result implies that, on average, each individual software program gets fewer downloads over time, regardless of its popularity. Therefore, to examine whether the relatively fatter tail exists, we investigate the average decrease of average weekly downloads over the two periods for both the hits and the niches, with results reported in Table 3. In each category, the decrease in demand for the niches is shown to be much less pronounced than the decrease in demand for the hits, indicating a shift in demand from the hits to the niches. For example, in the category of Antivirus Software, the weekly downloads of the bottom 1% least popular products decrease by 3 in period 2, on average, while the decrease in weekly downloads for the top 1% most popular products is much larger in period 2, as large as 909,768. Figs. 4 and 5, along with Table 3, show that the software download pattern exhibits a longer and relatively fatter tail in period 2, demonstrating the long tail phenomenon.

Fig. 4. Average weekly number of software programs in head and tail. Panels: Antivirus Software, Digital Media Player, Download Manager, File Compression; x-axis: Week; y-axis: Number of Software Programs. Notes: PiWj refers to the jth week in period i.

Table 3
Decrease of average weekly downloads over two periods for both hits and niches.
          Antivirus software   Digital media player   Download manager   File compression
TAIL1     3                    2                      2                  2
TAIL5     11                   9                      7                  5
TAIL10    20                   13                     11                 9
TAIL25    54                   31                     20                 20
HEAD25    82,522               8,333                  2,604              26,842
HEAD10    192,691              16,527                 5,360              60,768
HEAD5     349,704              26,408                 8,001              104,433
HEAD1     909,768              80,840                 11,990             205,426

We also notice some interesting results at the individual product level from Fig. 4, which complement our superstar observation demonstrated in Fig. 3 using the pure relative measure. Fig. 4 shows that in each category more products are not only in the tail but also in the head. For example, in the category of Digital Media Player, the average number of the top 10% most popular products increases from 18 in period 1 to 52 in period 2. These findings, along with the results in Fig. 5, suggest that a larger number of hits receive fewer individual downloads in period 2. Therefore, although overall the superstar effect seems to be more pronounced over time as shown in Fig. 3, hit products are actually facing more intense competition and getting less popular individually.
This result is consistent with our observation of the long tail phenomenon that demand seems to be shifting to the niche products. Overall, we observe the coexistence of superstar and long tail download patterns.

Similar to most extant studies, the first part of our descriptive analysis for identifying the superstar effect adopts the relative measure and looks into the classic distribution curve of user choices shown in Fig. 3 (Brynjolfsson et al. 2011; Fleder and Hosanagar 2009). However, our two long tail measures, which examine the longer and relatively fatter tail, are slightly different from either the relative measure or the absolute measure. Our measures keep the format of the relative measure by using the relative popularity of products to separate them into head and tail. This allows us to control for the largely increased product variety in period 2 (see footnote 4). At the same time, our measures use the absolute values of both the number of software programs (product variety) and the number of downloads, instead of examining the consumption shares commonly used in previous studies. Rather than choosing between a pure relative measure and a pure absolute measure, our analysis of the long tail phenomenon offers a more complete view from various perspectives. Our analysis also provides support for the argument of Brynjolfsson et al. (2010b) that one possible reason for the inconsistent observations on consumer demand pattern in the literature could be the biased choices of different measures.

Although our initial descriptive analysis suggests the presence of both the superstar and long tail phenomena, it provides limited statistical evidence. In the next section, a more rigorous empirical analysis is conducted to generate more insights for understanding the factors that might drive such phenomena.
Footnote 4. A two-sample t-test conducted on a time series of weekly product variety for the two periods (each period's data as one sample) shows that the average weekly product variety in period 2 is significantly greater than that in period 1. The detailed report on the t-test results is available upon request.

Fig. 5. Average weekly number of downloads in head and tail. Panels: Antivirus Software, Digital Media Player, Download Manager, File Compression; x-axis: Week; y-axis: Average Weekly Downloads. Notes: PiWj refers to the jth week in period i.

4.2. Quantile regression model

To investigate how online user reviews and product variety interact to influence user choices of software programs with different popularities, we use the quantile regression methodology. The widely used least-squares regression, such as Ordinary Least Squares (OLS), models the conditional mean and assumes that the noise around the mean of the dependent variable is normally distributed. Its estimates can thus only capture the mean effect of covariates on the dependent variable. In contrast, quantile regression examines how covariates influence the entire distribution of the dependent variable. However, it has not been broadly implemented in IS research. In fact, the quantile regression method fits well into most long tail studies by directly coping with the highly skewed distribution of user choices. The quantile regression model specifies the conditional quantile of the dependent variable as a linear function of covariates. By examining a series of quantiles, researchers are able to assess and uncover the different impacts of covariates at various locations of the dependent variable distribution.

Our descriptive analysis has verified a heavily skewed distribution of software downloads, which suggests that employing the quantile regression technique is more appropriate than the conventional least-squares regression model. Specifically, quantile regression is particularly helpful in this study to investigate empirically whether and how the heterogeneity of user choices is influenced by online user reviews (demand side factor) and product variety (supply side factor). The general form of the quantile regression is expressed as:

\[ Q_{\alpha}(y \mid x) = x'\beta(\alpha) \tag{1} \]

where $Q_{\alpha}(y \mid x)$ denotes the αth quantile of the distribution of the dependent variable y, and x denotes the vector of covariates. The key observation of the dependent variable in our data is the number of weekly downloads (WEEKLYDOWNLOADit), which captures user demand as previously discussed. We apply a natural log transformation on the number of weekly downloads (Log (WEEKLYDOWNLOADit)) and use it as the dependent variable in our quantile regression models. The log transformation has the advantage of reducing the nonconstant variance and converting the value to a magnitude that is comparable to other variables.

Table 4
Descriptive statistics of key variables.
Variable                  Mean      SD       Min.      Max.
Antivirus software (N = 6162)
LOG(WEEKLYDOWNLOAD)it     4.24      2.55     0         14.07
USERRATINGRit             -1.61     1.74     -3        2
WEEKLYVARIETYCt           0         22.67    -96.80    20.20
CNETRATINGDit             0.07      0.25     0         1
LOGTOTALit                9.23      2.56     0         17.89
WEEKLYRANKit              115.61    67.61    1         243.50
AGEit                     428.47    453.41   0         2336
FREEPRICEDit              0.70      0.46     0         1
Digital media player (N = 11,810)
LOG(WEEKLYDOWNLOAD)it     3.57      2.28     0         12.31
USERRATINGRit             -2        1.54     -3        2
WEEKLYVARIETYCt           0         40.33    -200.24   23.76
CNETRATINGDit             0.09      0.29     0         1
LOGTOTALit                9.13      2.34     0         17.90
WEEKLYRANKit              222.09    129.91   1         452
AGEit                     617.70    536.08   0         3457
FREEPRICEDit              0.47      0.50     0         1
Download manager (N = 5796)
LOG(WEEKLYDOWNLOAD)it     3.85      2.23     0         12.26
USERRATINGRit             -1.76     1.67     -3        2
WEEKLYVARIETYCt           0         31.29    -67.15    38.85
CNETRATINGDit             0.22      0.42     0         1
LOGTOTALit                9.18      2.27     0         17.87
WEEKLYRANKit              109.06    65.28    1         249.50
AGEit                     591.55    569.69   0         2365
FREEPRICEDit              0.66      0.48     0         1
File compression (N = 4749)
LOG(WEEKLYDOWNLOAD)it     3.66      2.33     0         12.86
USERRATINGRit             -1.98     1.70     -3        1.50
WEEKLYVARIETYCt           0         17.40    -81.57    16.43
CNETRATINGDit             0.24      0.42     0         1
LOGTOTALit                9.04      2.43     0         19
WEEKLYRANKit              89.32     52.07    1         190.50
AGEit                     577.63    489.84   0         2546
FREEPRICEDit              0.61      0.49     0         1

In addition, the “monotone equivariance” property of quantile regression, which does not hold for least-squares regression, allows us to perfectly re-interpret the fitted quantile regression models for the original variable (WEEKLYDOWNLOADit) from the transformed variable (Log (WEEKLYDOWNLOADit)) (Koenker and Gilbert 1978). Eq. (1) can be estimated using the following linear form:

\[ \log(Q_{\alpha it}) = \beta_0(\alpha) + \sum_{j} \beta_j(\alpha)\, x_{ijt} + e_{it}(\alpha) \tag{2} \]

$Q_{\alpha it}$ denotes the αth quantile of weekly downloads of software i at week t, and $\beta_0(\alpha)$ is a constant term. $x_{ijt}$ denotes the value of the jth covariate for software i at week t, and $e_{it}$ is the error term.
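Two ideas in this subsection can be checked numerically on a toy sample. The sketch below is illustrative only and is not taken from the paper: it relies on the standard characterization of a sample quantile as the minimizer of the asymmetric "check" loss, which underlies quantile regression but is not spelled out in the text, and the data and function names are hypothetical. It shows that the median passes through the log transformation unchanged (monotone equivariance), while the mean does not.

```python
import math

def check_loss(residual, q):
    """Asymmetric absolute loss: weight q on positive residuals, 1 - q on negative ones."""
    return residual * (q - (1.0 if residual < 0 else 0.0))

def sample_quantile(values, q):
    """Return the data point that minimizes the total check loss at quantile level q."""
    return min(sorted(values),
               key=lambda c: sum(check_loss(v - c, q) for v in values))

# Hypothetical, heavily skewed weekly download counts.
downloads = [3, 12, 45, 150, 900, 4000, 20000]
logs = [math.log(d) for d in downloads]

median_raw = sample_quantile(downloads, 0.5)
median_log = sample_quantile(logs, 0.5)
mean_raw = sum(downloads) / len(downloads)
mean_log = sum(logs) / len(logs)

# Quantiles commute with the (monotone) log transform; means do not.
print("median of raw data:      ", median_raw)
print("exp(median of log data): ", math.exp(median_log))
print("mean of raw data:        ", mean_raw)
print("exp(mean of log data):   ", math.exp(mean_log))
```

A full quantile regression replaces the single constant with a linear function of covariates and minimizes the same loss, which in practice would be done with an established statistical package rather than by hand.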
In our context, average user ratings range from 1 to 5; thus a 3-star user rating can be defined as a neutral review and all the other levels can be defined as extreme reviews (either positive or negative). We note that a minor linear transformation on user ratings would help differentiate the rating level of user reviews, thus making the interpretation of the coefficients more intuitive. Instead of including USERRATINGit in the model, we consider (USERRATINGit − 3). For parsimony, we name the new variable USERRATINGRit. If the ratings are neutral/positive/negative (i.e., equal to/above/below point 3), USERRATINGRit is zero/positive/negative, respectively. Chevalier and Mayzlin (2006) found that 1-star user reviews hurt sales more than 5-star user reviews benefit sales. To assess the nonlinear impact of user reviews of different rating levels, we also include a quadratic term of USERRATINGRit, denoted by USERRATINGRSQit.

To test the interaction effect between online user reviews and product variety, we include an interaction term WEEKLYVARIETYCt × USERRATINGRit (Kenny et al. 1998). WEEKLYVARIETYCt refers to the centered number of software programs at week t. We demean the number of software programs (WEEKLYVARIETYt) to treat zero as a meaningful value of product variety for better interpretations of both the simple effect and the interaction effect (Aiken and West 1991, Judd and McClelland 1989). A single term, WEEKLYVARIETYCt, is thus also included instead of WEEKLYVARIETYt. Therefore, the components related to user ratings in our model can be expressed as $\beta_1\,\mathit{USERRATINGR}_{it} + \beta_2\,\mathit{USERRATINGRSQ}_{it} + \beta_3\,\mathit{WEEKLYVARIETY}^{C}_{t} \times \mathit{USERRATINGR}_{it}$. The estimation of β3 determines the significance of the interaction effect between online user reviews and product variety.
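The variable construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the input values are hypothetical, and only the transformations themselves (subtracting 3 from the rating, squaring, demeaning variety, and multiplying the two) come from the text.

```python
def build_review_terms(user_rating, weekly_variety, mean_variety):
    """Build the review-related regressors for one software-week observation."""
    rating_r = user_rating - 3.0                 # USERRATINGR: 0 means a neutral 3-star rating
    variety_c = weekly_variety - mean_variety    # WEEKLYVARIETYC: 0 means average variety
    return {
        "USERRATINGR": rating_r,
        "USERRATINGRSQ": rating_r ** 2,
        "WEEKLYVARIETYC": variety_c,
        "INTERACTION": variety_c * rating_r,
    }

# Hypothetical observation: a 4.5-star product in a week with 250 listed programs,
# in a category whose average weekly variety is 200.
terms = build_review_terms(user_rating=4.5, weekly_variety=250, mean_variety=200)
print(terms)

# A neutral rating zeroes out every review term, whatever the variety level.
neutral = build_review_terms(user_rating=3.0, weekly_variety=250, mean_variety=200)
print(neutral)
```

Centering both variables in this way is what lets each coefficient be read at a meaningful reference point: β1 and β2 describe the effect of ratings at average variety, and β4 describes the effect of variety for a neutrally rated product.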
Hence, if the interaction effect between online user reviews and product variety is indeed present, the simple effect of user reviews, measured by $\beta_1\,\mathit{USERRATINGR}_{it} + \beta_2\,\mathit{USERRATINGRSQ}_{it}$, cannot represent the actual impact of user reviews in most situations with various product variety levels. In these cases, the total impact of user reviews should be measured by $\beta_1\,\mathit{USERRATINGR}_{it} + \beta_2\,\mathit{USERRATINGRSQ}_{it} + \beta_3\,\mathit{WEEKLYVARIETY}^{C}_{t} \times \mathit{USERRATINGR}_{it}$.

Following previous studies, we use the cumulative number of downloads (TOTALDOWNLOADit) to control for network effects (Brynjolfsson and Kemerer 1996, Duan et al. 2009, Gallaugher and Wang 2002), which are considered to be particularly prominent in the software industry. We also include product age AGEit and the quadratic term of product age AGESQit to control for product diffusion (Duan et al. 2009). AGEit captures the linear part of the diffusion process, and AGESQit approximates the nonlinear component. Using these two variables allows us to reasonably control for product diffusion while maintaining an adequate degree of freedom for analysis (Duan et al. 2009). A dummy variable FREE-
Table 5 Correlation matrix of key variables. Variable Antivirus software 1. LOG(WEEKLYDOWNLOADit) 2. USERRATINGRit 3. 4. 5. 6. 7. 8. WEEKLYVARIETY Ct CNETRATINGDit LOGTOALit WEEKLYRANKit AGEit FREEPRICEDit Digital media player 1. LOG(WEEKLYDOWNLOADit) 2. USERRATINGRit 3. 4. 5. 6. 7. 8. WEEKLYVARIETY Ct CNETRATINGDit LOGTOALit WEEKLYRANKit AGEit FREEPRICEDit Download manager 1. LOG(WEEKLYDOWNLOADit) 2. USERRATINGRit 3. 4. 5. 6. 7. 8. WEEKLYVARIETY Ct CNETRATINGDit LOGTOALit WEEKLYRANKit AGEit FREEPRICEDit File compression 1. LOG(WEEKLYDOWNLOADit) 2. USERRATINGRit 3. 4. 5. 6. 7. 8.
WEEKLYVARIETY Ct CNETRATINGDit LOGTOALit WEEKLYRANKit AGEit FREEPRICEDit 1 2 3 1 0.656 1 0.004 0.026 1 0.438 0.811 0.947 0.374 0.271 0.276 0.732 0.661 0.104 0.224 0.007 0.014 0.181 0.004 0.002 1 0.650 1 0.004 0.004 1 0.378 0.803 0.926 0.040 0.228 0.341 0.688 0.604 0.248 0.181 0.003 0.002 0.163 0.010 0.002 1 0.647 1 0.034 0.012 1 0.159 0.677 0.941 0.344 0.223 0.269 0.675 0.621 0.134 0.107 0.011 0.021 0.244 0.032 2.3E04 1 0.630 1 0.01 0.018 1 0.283 0.823 0.935 0.179 0.169 0.172 0.677 0.582 0.086 0.130 0.010 0.003 0.177 0.012 0.020 4 5 6 7 8 1 0.427 0.335 0.068 0.070 1 0.764 0.025 0.205 1 0.372 0.267 1 0.050 1 1 0.457 0.285 0.213 0.105 1 0.731 0.319 0.138 1 0.027 0.221 1 0.022 1 0.349 0.170 0.307 3.14E04 1 0.611 0.119 0.019 1 0.340 0.198 1 0.157 1 0.374 0.240 0.302 0.027 1 0.744 0.133 0.032 1 0.143 0.191 1 0.035 1 Author's personal copy 284 W. Zhou, W. Duan / Electronic Commerce Research and Applications 11 (2012) 275–289 Table 6 OLS and quantile regression estimations. b1 USERRATINGR b2 USERRATINGRSQ b3 INTERACTION b4 WEEKLYVARIETYC b5 LOGTOTAL Antivirus software OLS 0.036 (0.008)*** Q10 0.022 (0.007)*** Q20 0.022 (0.007)*** Q30 0.020 (0.008)** Q40 0.011 (0.008) Q50 1.7E4 (0.008) Q60 0.007 (0.009) Q70 0.021 (0.011)** Q80 0.079 (0.016)*** Q90 0.132 (0.024)*** Q99 0.226 (0.035)*** 0.008 (0.004)** 0.006 (0.003)** 0.014 (0.003)*** 0.015 (0.003)*** 0.013 (0.004)*** 0.010 (0.003)*** 0.015 (0.003)*** 0.021 (0.004)*** 0.043 (0.005)*** 0.073 (0.008)*** 0.116 (0.014)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.004 (0.000)*** 0.005 (0.000)*** 0.004 (0.000)*** 0.004 (0.001)*** 0.019 (0.000)*** 0.018 (0.001)*** 0.015 (0.001)*** 0.013 (0.001)*** 0.011 (0.001)*** 0.010 (0.001)*** 0.009 (0.001)*** 0.008 (0.001)*** 0.006 (0.001)*** 0.006 (0.001)*** 0.005 (0.001)*** Digital media player OLS 0.054 (0.015)*** Q10 0.020 (0.011)* Q20 0.054 (0.015)*** Q30 0.094 (0.014)*** Q40 0.096 (0.013)*** Q50 0.172 
(0.022)*** Q60 0.343 (0.028)*** Q70 0.389 (0.019)*** Q80 0.460 (0.027)*** Q90 0.451 (0.038)*** Q99 0.821 (0.069)*** 0.009 (0.006) 0.001 (0.004) 0.009 (0.006) 0.023 (0.006)*** 0.020 (0.005)*** 0.039 (0.008)*** 0.087 (0.009)*** 0.100 (0.006)*** 0.120 (0.009)*** 0.092 (0.013)*** 0.167 (0.029)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.001 (0.001) Download manager OLS 0.035 (0.011)*** Q10 0.033 (0.011)* Q20 0.013 (0.009) Q30 0.018 (0.008)** Q40 0.030 (0.009)*** Q50 0.048 (0.013)*** Q60 0.100 (0.016)*** Q70 0.168 (0.022)*** Q80 0.217 (0.029)*** Q90 0.194 (0.034)*** Q99 0.297 (0.144)** 3E4 (0.005) 0.004 (0.007)*** 0.005 (0.004) 0.010 (0.003)*** 0.017 (0.003)*** 0.026 (0.005)*** 0.044 (0.006)*** 0.066 (0.008)*** 0.071 (0.011)*** 0.054 (0.011)*** 0.011 (0.047) File compression OLS 0.131 (0.018)*** 0.032 (0.007)*** b6 AGE b7 AGESQ b8 FREEPRICED b9 CNETRATINGD b10 WEEKLYRANK Estimated parameter (N = 6162) 0.035 1E04 3E07 (0.004)*** (0.000) (0.000) 0.029 7.1E05 1.4E08 (0.004)*** (0.000)** (0.000) 0.048 7.8E05 4.8E10 (0.004)*** (0.000)** (0.000) 0.057 1.4E4 2.1E08 (0.005)*** (0.000)*** (0.000) 0.067 1.8E4 3.6E08 *** *** (0.005) (0.000) (0.000)** 0.081 2.5E4 5.5E08 (0.005)*** (0.000)*** (0.000)*** 0.094 2.9E4 7.4E08 (0.005)*** (0.000)*** (0.000)*** 0.112 4E4 1.2E07 (0.006)*** (0.000)*** (0.000)*** 0.136 0.001 1.7E07 (0.005)*** (0.000)*** (0.000)*** 0.206 0.001 3.0E07 (0.010)*** (0.000)*** (0.000)*** 0.246 0.001 5.7E07 (0.014)*** (0.000)*** (0.000)*** 0.036 (0.014)*** 0.014 (0.00) 0.017 (0.012)* 0.019 (0.011)* 0.016 (0.010) 0.011 (0.011) 0.014 (0.010) 0.027 (0.009)*** 0.018 (0.014) 0.046 (0.020)** 0.054 (0.054) 0.031 (0.026) 0.059 (0.021)*** 0.100 (0.030)*** 0.187 (0.031) *** 0.301 (0.087)*** 0.787 (0.091) *** 1.126 (0.107) *** 1.801 (0.111) *** 2.268 (0.193) *** 2.781 (0.094) *** 2.631 (0.099) *** 0.033 (0.000)*** 0.033 (0.000)*** 0.033 (0.000)*** 
0.032 (0.000)*** 0.032 (0.000)*** 0.031 (0.000)*** 0.031 (0.000)*** 0.030 (0.000)*** 0.030 (0.000)*** 0.030 (0.000)*** 0.031 (0.001)*** 0.006 (0.000)*** 0.007 (0.000)*** 0.006 (0.000)*** 0.006 (0.000)*** 0.005 (0.001)*** 0.004 (0.001)*** 0.003 (0.001)*** 0.003 (0.001)*** 0.002 (0.001)*** 0.002 (0.001)*** 0.002 (0.002) Estimated parameter (N = 11,810) 0.093 4.0E04 2.55E07 (0.005)*** (0.000)*** (0.000)*** 0.065 2.5E4 4.8E08 (0.006)*** (0.000)*** (0.000)*** 0.093 4.2E4 9.0E08 (0.005)*** (0.000)*** (0.000)*** 0.112 4.7E4 9.6E08 (0.006)*** (0.000)*** (0.000)*** 0.130 0.001 1.3E07 (0.006)*** (0.000)*** (0.000)*** 0.154 0.001 1.4E07 (0.009)*** (0.000)*** (0.000)*** 0.185 0.001 1.3E07 (0.008)*** (0.000)*** (0.000)*** 0.238 0.001 1.9E07 (0.010)*** (0.000)*** (0.000)*** 0.303 0.001 2.4E07 (0.008)*** (0.000)*** (0.000)*** 0.314 0.002 3.5E07 (0.008)*** (0.000)*** (0.000)*** 0.354 0.002 5.9E07 (0.011)*** (0.000)*** (0.000)*** 0.006 (0.009)*** 0.010 (0.009) 0.006 (0.009) 0.048 (0.010)*** 0.040 (0.010)*** 0.034 (0.012)*** 0.018 (0.011) 0.001 (0.010) 0.037 (0.013)*** 0.060 (0.014)*** 0.103 (0.026)*** 0.152 (0.033) 0.070 (0.020)*** 0.152 (0.033)*** 0.251 (0.023)*** 0.263 (0.022)*** 0.358 (0.047)*** 0.417 (0.061)*** 0.757 (0.070)*** 1.301 (0.069)*** 1.236 (0.043)*** 0.824 (0.148)*** 0.013 (0.000)*** 0.013 (0.000)*** 0.013 (0.000)*** 0.013 (0.000)*** 0.013 (0.000)*** 0.012 (0.000)*** 0.012 (0.000)*** 0.011 (0.000)*** 0.01 (0.000)*** 0.011 (0.000)*** 0.010 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.003 (0.000)*** 0.003 (0.000)*** 0.004 (0.000)*** 0.005 (0.000)*** 0.007 (0.001)*** 0.013 (0.000)*** 0.012 (0.000)*** 0.011 (0.000)*** 0.010 (0.000)*** 0.009 (0.000)*** 0.008 (0.000)*** 0.007 (0.000)*** 0.007 (0.001)*** 0.005 (0.001)*** 0.003 (0.001)*** 0.002 (0.002) Estimated parameter (N = 5796) 0.043 2E4 2.98E07 (0.006)*** (0.000)*** (0.000)*** 0.044 1.1E4 3.6E08 (0.004)*** (0.000)*** (0.000)** 0.052 1.4E4 5.6E08 
(0.005)*** (0.000)*** (0.000)*** 0.058 2.0E4 7.9E08 (0.005)*** (0.000)*** (0.000)*** 0.073 3.0E4 1.1E07 (0.005)*** (0.000)*** (0.000)*** 0.095458 4.1E4 1.5E07 (0.006)*** (0.000)*** (0.000)*** 0.111 5.2E4 1.9E07 (0.006)*** (0.000)*** (0.000)*** 0.137 0.001 2.2E07 (0.010)*** (0.000)*** (0.000)*** 0.165 0.001 2.7E07 (0.014)*** (0.000)*** (0.000)*** 0.248 0.001 4.3E07 (0.008)*** (0.000)*** (0.000)*** 0.095 0.001 2.5E07 (0.025)*** (0.000)*** (0.000)** 0.060 (0.015)*** 0.057 (0.010)*** 0.056 (0.010)*** 0.074 (0.009)*** 0.086 (0.009)*** 0.096 (0.010)*** 0.122 (0.013)*** 0.135 (0.014)*** 0.173 (0.020)*** 0.165 (0.029)*** 0.113 (0.072) 0.043 (0.019)** 0.020 (0.011)* 0.002 (0.013) 0.004 (0.010) 0.009 (0.010) 0.035 (0.013)*** 0.047 (0.014)*** 0.038 (0.020)* 0.047 (0.026)* 0.079 (0.028)*** 0.160 (0.054)*** 0.030 (0.000) *** 0.030 (0.000)*** 0.030 (0.000)*** 0.029 (0.000)*** 0.029 (0.000)*** 0.029 (0.000)*** 0.028 (0.000)*** 0.028 (0.000)*** 0.028 (0.000)*** 0.027 (0.000)*** 0.035 (0.001)*** 0.004 (0.000)*** 0.016 (0.001)*** Estimated parameter (N = 4749) 0.049 3E4 3.35E07 (0.007)*** (0.000)*** (0.000)*** 0.060 (0.015)*** 0.116 (0.021)*** 0.035 (0.000)*** Author's personal copy 285 W. Zhou, W. 
Duan / Electronic Commerce Research and Applications 11 (2012) 275–289 Table 6 (continued) Q10 Q20 Q30 Q40 Q50 Q60 Q70 Q80 Q90 Q99 b1 USERRATINGR b2 USERRATINGRSQ b3 INTERACTION b4 WEEKLYVARIETYC b5 LOGTOTAL b6 AGE b7 AGESQ b8 FREEPRICED b9 CNETRATINGD b10 WEEKLYRANK 0.164 (0.014)*** 0.183 (0.017)*** 0.170 (0.016)*** 0.189 (0.017)*** 0.237 (0.018)*** 0.282 (0.015)*** 4.673 (0.016)*** 4.573 (0.018)*** 3.989 (0.039)*** 3.868 (0.039)*** 0.047 (0.006)*** 0.060 (0.007)*** 0.050 (0.007)*** 0.059 (0.007)*** 0.080 (0.007)*** 0.100 (0.006)*** 0.368 (0.006)*** 0.396 (0.008)*** 0.367 (0.016)*** 0.185 (0.017)*** 0.004 (0.000)*** 0.003 (0.000)*** 0.003 (0.000)*** 0.003 (0.000)*** 0.002 (0.000)*** 0.003 (0.001)*** 0.130 (0.001)*** 0.141 (0.001)*** 0.154 (0.001)*** 0.050 (0.001)*** 0.015 (0.001)*** 0.013 (0.001)*** 0.011 (0.001)*** 0.010 (0.001)*** 0.010 (0.001)*** 0.008 (0.002)*** 0.004 (0.001)*** 0.003 (0.001)*** 0.003 (0.002) 0.003 (0.001) 0.060 (0.008)*** 0.078 (0.007)*** 0.100 (0.007)*** 0.125 (0.008)*** 0.162 (0.008)*** 0.195 (0.010)*** 0.005 (0.009)*** 0.005 (0.006)*** 0.003 (0.014)*** 0.001 (0.007)*** 3.4E4 (0.000)*** 3.3E4 (0.000)*** 4.2E4 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.239 (0.000)*** 0.264 (0.000)*** 0.347 (0.000)*** 0.468 (0.000)*** 5.8E08 (0.000)*** 4.0E08 (0.000)*** 6.0E08 (0.000)** 9.6E08 (0.000)*** 1.5E07 (0.000)*** 2.0E07 (0.000)*** 0.001 (0.000)*** 0.001 (0.000)*** 0.002 (0.000)*** 0.002 (0.000)*** 0.063 (0.013)*** 0.056 (0.013)*** 0.053 (0.013)*** 0.058 (0.014)*** 0.048 (0.016)*** 0.041 (0.015)*** 2.4E07 (0.020)** 2.9E07 (0.018) 3.6E07 (0.028) 5.7E07 (0.034)*** 0.158 (0.021)*** 0.167 (0.019)*** 0.171 (0.016)*** 0.200 (0.021)*** 0.238 (0.020)*** 0.272 (0.020)*** 0.039 (0.022)*** 0.027 (0.029)*** 0.018 (0.045)*** 0.154 (0.041)*** 0.035 (0.000)*** 0.035 (0.000)*** 0.034 (0.000)*** 0.033 (0.000)*** 0.032 (0.000)*** 0.031 (0.000)*** 0.285 (0.000)*** 0.281 (0.000)*** 0.120 (0.001)*** 0.301 (0.001)*** Notes: In each category, the first 
row presents the OLS estimation results and the remainder reports the quantile regression estimation results. Due to limited space, only the results of selected quantiles are reported. Standard errors in parentheses. * p < .10. ** p < .05. *** p < .01.

PRICEDit is used to control for the license difference of the software. The CNETD editorial staff reviews some of the software programs (less than 20%), with an emphasis on the more popular ones, and the editors use a five-star rating system similar to the user review system. We thus use a dummy variable CNETRATINGDit to control for the impact of the availability of a CNET rating. Finally, WEEKLYRANKit is also included to control for the influence of product popularity information, as well as for the potential herding effect (Duan et al. 2009, Tucker and Zhang 2007). Therefore, our final quantile regression model can be expressed as follows:

\[
\begin{aligned}
\log(\mathit{WEEKLYDOWNLOAD}_{\alpha it}) ={}& \beta_0(\alpha) + \beta_1(\alpha)\,\mathit{USERRATINGR}_{it} + \beta_2(\alpha)\,\mathit{USERRATINGRSQ}_{it} \\
&+ \beta_3(\alpha)\,\mathit{WEEKLYVARIETY}^{C}_{t} \times \mathit{USERRATINGR}_{it} + \beta_4(\alpha)\,\mathit{WEEKLYVARIETY}^{C}_{t} \\
&+ \beta_5(\alpha)\,\log(\mathit{TOTALDOWNLOAD}_{it}) + \beta_6(\alpha)\,\mathit{AGE}_{it} + \beta_7(\alpha)\,\mathit{AGESQ}_{it} \\
&+ \beta_8(\alpha)\,\mathit{FREEPRICED}_{it} + \beta_9(\alpha)\,\mathit{CNETRATINGD}_{it} + \beta_{10}(\alpha)\,\mathit{WEEKLYRANK}_{it} + e_{it}(\alpha)
\end{aligned}
\tag{3}
\]

The simple impact of product variety is measured by β4. If the interaction effect indicated by the significance of β3 is present, overall, one additional product added to CNETD would result in a download change of $\beta_4 + \beta_3\,\mathit{USERRATINGR}_{it}$. By looking into the estimations of β1–β4 over a series of quantiles α, we would be able to derive the impacts of online user reviews and product variety on the skewed shape of the software download distribution. Specifically, products that have more weekly downloads, namely those in higher quantiles (larger α) of the weekly download distribution, are more popular.
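The overall effect of one additional product, β4 + β3 × USERRATINGRit, can be traced for different rating levels with a few lines of code. The sketch below is illustrative only: the coefficient values are hypothetical placeholders chosen for readability and are not the estimates reported in Table 6, and the function name is an assumption.

```python
def variety_effect(beta3, beta4, user_rating_r):
    """Change in log weekly downloads from one more listed product, at a given centered rating."""
    return beta4 + beta3 * user_rating_r

# Hypothetical coefficients for a single quantile (not estimates from the paper).
BETA3 = -0.004   # interaction between centered variety and centered rating
BETA4 = 0.010    # simple effect of centered variety

for rating_r in (-2.0, 0.0, 2.0):   # 1-star, 3-star (neutral), and 5-star products
    effect = variety_effect(BETA3, BETA4, rating_r)
    print(f"USERRATINGR = {rating_r:+.1f}: effect of one more product = {effect:+.3f}")
```

Under these placeholder values, a negative β3 means that added variety helps poorly rated products more than highly rated ones; with estimated coefficients, the same calculation would be repeated at each quantile α to compare hits and niches.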
Therefore, by comparing βj(α) over quantiles, we could determine whether the influence of covariate x_ijt is stronger on the hits than on the niches and could further determine whether this influence contributes to the superstar or the long tail phenomenon.

4.3. Quantile regression results

We estimate the quantile regression models using the aggregated weekly observations for a series of quantiles including the 5th, 10th, ..., 95th, and 99th, in each of the four categories in period 2, during which the coexistence of long tail and superstar phenomena has been identified by our descriptive analysis (see footnote 5). Tables 4 and 5 present the descriptive statistics and correlation matrix of our weekly CNETD data. To facilitate the interpretation of the quantile regression estimation results shown in Table 6, Fig. 6 provides quantile plots for some key independent variables for each category. We plot each quantile regression estimator of these variables for α (quantile), ranging from the 5th to the 99th, as the solid black curve. The gray area illustrates the 95% confidence interval. For all categories, most estimators are significant with narrow confidence intervals across all quantiles, which shows good overall model fits.

We first examine the significance of the interaction effect between online user reviews and product variety because this interaction is closely related to our further inferences regarding the impact of both online user reviews and product variety. As expected, the coefficient of WEEKLYVARIETYCt × USERRATINGRit (β3) is significantly negative, which supports the proposed interaction effect between user reviews and product variety. In addition, we notice that this coefficient has a slightly downward slope across quantiles. Given the significance of this interaction effect, when WEEKLYVARIETYCt is not zero (i.e., product variety is not at its mean), the overall impact of online user reviews can be expressed by $\beta_1\,\mathit{USERRATINGR}_{it} + \beta_2\,\mathit{USERRATINGRSQ}_{it} + \beta_3\,\mathit{WEEKLYVARIETY}^{C}_{t} \times \mathit{USERRATINGR}_{it}$.
Thus, the magnitude of the impact of user reviews is contingent on the level of product variety. The coefficients of $\mathrm{USERRATING}^{R}_{it}$ ($\beta_1$) and $\mathrm{USERRATING}^{RSQ}_{it}$ ($\beta_2$) are both significantly positive for most quantiles and are increasing over the quantiles. This result suggests that the simple effect of positive user reviews is positive and stronger for more popular products. However, the opposite signs and opposite trends across quantiles of the simple effect ($\beta_1\,\mathrm{USERRATING}^{R}_{it} + \beta_2\,\mathrm{USERRATING}^{RSQ}_{it}$) and the interaction effect ($\beta_3\,\mathrm{WEEKLYVARIETY}^{C}_{t} \times \mathrm{USERRATING}^{R}_{it}$) suggest that the total impact of positive user reviews, indicated by the summation of the two terms, does not have a definite trend over product popularity. In other words, the effect depends on the level of product variety. Without knowing the specific level of product variety, we cannot determine whether positive user reviews reduce the heterogeneity of the consumption pattern, leading to the superstar phenomenon, or whether they help shift the demand from the hits to the niches, leading to the long tail phenomenon.

Fig. 6. Quantile plots for key independent variables.

⁵ We also extend our empirical examination to additional categories, which generates qualitatively similar results. The results are available upon request.

Similarly, in terms of negative user reviews, we start by examining their simple effect at each possible value of $\mathrm{USERRATING}^{R}$ for negative user reviews (i.e., −0.5, −1, −1.5, −2), because of the opposite signs of $\mathrm{USERRATING}^{R}_{it}$ and $\mathrm{USERRATING}^{RSQ}_{it}$. The simple effect of negative user reviews is shown to be negative and becomes more negative across quantiles.
Nevertheless, the opposite signs and opposite trends across quantiles of the simple effect and the interaction effect with product variety also suggest that the negative impact of negative user reviews on user choices of products does not have a definite trend over different levels of product popularity. In sum, these results show that the way online user reviews influence the superstar and long tail phenomena cannot be determined from the reviews alone; the result depends on the level of product variety. Omitting the interaction effect between online user reviews and product variety could result in misleading inferences, because the simple effect alone does not capture the total impact. This finding helps explain the mixed results regarding the impact of online user reviews on the skewed shape of user choices in extant studies, because most works examine the impact in various contexts but do not consider the influence of product variety (Duan et al. 2009, Fleder and Hosanagar 2009, Hervas-Drane 2009, Maryanchyk 2008, Oestreicher-Singer and Sundararajan 2009, Zhao et al. 2008, Zhu and Zhang 2010).

In addition, the interaction effect between online user reviews and product variety indicates that an increase in product variety weakens the impact of online user reviews. In terms of positive user reviews, both the negative value of $\beta_3\,\mathrm{WEEKLYVARIETY}^{C}_{t} \times \mathrm{USERRATING}^{R}_{it}$ and the positive simple effect indicate that higher product variety leads to a less positive impact from positive user reviews. More product variety likewise weakens the impact of negative user reviews, because $\beta_3\,\mathrm{WEEKLYVARIETY}^{C}_{t} \times \mathrm{USERRATING}^{R}_{it}$ takes a positive value in that case. In other words, higher product variety always dilutes the impact of user reviews, regardless of their rating levels. The magnitude discrepancy between the impacts of positive and negative user reviews, measured by $2\,\mathrm{USERRATING}_{it}\,(\beta_2 + \beta_3\,\mathrm{WEEKLYVARIETY}^{C}_{t})$, is thus reduced by the increase in product variety.
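The sign argument for the dilution effect can be written compactly. As a sketch, let $r$ stand for the centered rating and $V$ for the centered product variety (shorthand introduced here for readability, not notation from the model), and assume that the squared-rating regressor equals $r^2$:

```latex
\[
\Delta(r, V) = \beta_1 r + \beta_2 r^2 + \beta_3 V r,
\qquad
\frac{\partial \Delta}{\partial V} = \beta_3 r .
\]
\[
\beta_3 < 0 \;\Rightarrow\;
\begin{cases}
\dfrac{\partial \Delta}{\partial V} < 0 & \text{if } r > 0 \text{ (positive reviews: the gain shrinks as variety grows)},\\[2ex]
\dfrac{\partial \Delta}{\partial V} > 0 & \text{if } r < 0 \text{ (negative reviews: the loss shrinks as variety grows)}.
\end{cases}
\]
```

Under these assumptions, a negative $\beta_3$ moves the review effect toward zero from either side as variety increases, which is the dilution pattern described in the text.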
Moreover, the slightly downward trend of the interaction effect ($\beta_3$) suggests that the weakening effect of product variety on the influence of user reviews is more pronounced for popular products than for niche products. This result suggests that, when they have more product choices, consumers may depend less on individual product reviews, especially for popular products. This result is somewhat counterintuitive because we expect consumers to be more dependent on others’ feedback to make decisions when facing a very large pool of choices. However, recent studies do identify the herding effect among consumers for popular products, regardless of user reviews (Duan et al. 2009). Thus, a possible explanation for the result would be that when product variety increases, consumers may simply follow previous consumers’ choices without resorting to user reviews.

In terms of product variety, the coefficients of $\mathrm{WEEKLYVARIETY}^{C}_{t}$ ($\beta_4$) are significantly positive, with larger values in the lower quantiles. This result shows that the simple effect of product variety has a stronger positive impact on niche products than on hit products. To evaluate the overall impact of product variety, we conduct a series of estimations for $\beta_3\,\mathrm{USERRATING}^{R}_{it} + \beta_4$ at each possible value of $\mathrm{USERRATING}^{R}_{it}$, i.e., −2, −1.5, ..., 1.5, 2. We find that the total effect of product variety on software downloads is also positive and shows a trend similar to its simple effect. In sum, enhanced product variety leads to the development of a relatively fatter tail in the download pattern by helping generate more demand for the niches regardless of the rating level of user reviews, which contributes to the long tail formation. As proposed in many long tail theories (e.g., Brynjolfsson et al.
2011), providing more product choices satisfies niche preferences of some consumers, who would otherwise have to choose popular products despite their suboptimal match. Notably, for products with positive user reviews, product variety might even have a negative impact on very popular products, which echoes the “overchoice” effect argued by Gourville and Soman (2005).

As a comparison, we also conduct the OLS regression estimation, and the results of the key covariates are shown in the first row of Table 6. The OLS estimators of $\mathrm{USERRATING}^{R}_{it}$ ($\beta_1$) are significantly different from the corresponding median estimators of the quantile regression in all four categories, which suggests that the impact of online user reviews on the download distribution ($\mathrm{WEEKLYDOWNLOAD}_{it}$) is heavily skewed, and OLS regression might not be able to provide a consistent estimation for the mean effect. Moreover, in the categories of Digital Media Player and Download Manager, the OLS estimators of the coefficients of $\mathrm{USERRATING}^{RSQ}_{it}$ ($\beta_2$) are both insignificant, whereas the quantile regression estimates are significant. This OLS result could generate a misleading conclusion, namely that the impact of user reviews with different rating levels is uniform; this conclusion contradicts the results from extant studies (Chevalier and Mayzlin 2006). Thus, the concern again is raised that OLS regression might not be suitable when dealing with a highly skewed distribution of the dependent variable. In addition, OLS provides only one snapshot of the overall picture of how user reviews and product variety influence software downloads, which fails to explain the shape of the entire download distribution. Such a comparison empirically highlights the advantages of using the quantile regression over least-squares regression in long tail studies.
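The concern about summarizing a skewed outcome by its mean can be illustrated with a small simulation. This is a sketch on synthetic data only: the log-normal distribution, its parameters, and the variable names are assumptions made for illustration and are not taken from the CNETD data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical weekly download counts drawn from a heavy-tailed distribution.
downloads = [random.lognormvariate(3.0, 1.5) for _ in range(5000)]

mean_downloads = statistics.mean(downloads)

# Nine cut points: the 10th, 20th, ..., 90th percentiles.
deciles = statistics.quantiles(downloads, n=10)
median_downloads = deciles[4]
p90_downloads = deciles[8]

# Share of products whose downloads fall below the sample mean.
share_below_mean = sum(d < mean_downloads for d in downloads) / len(downloads)

print(f"mean={mean_downloads:.1f} median={median_downloads:.1f} "
      f"p90={p90_downloads:.1f} share_below_mean={share_below_mean:.2f}")
```

In a sample like this, a few very large values pull the mean well above the median, so most products sit below the mean. A single mean-based coefficient therefore describes neither the typical niche product nor the hits, whereas estimates at several quantiles can be read separately.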
Facilitated by this more systematic approach, the intertwined and complicated relationships between online user reviews and product variety and their impact on software downloads could offer possible explanations and interpretations for the inconsistent findings in extant research regarding the long tail and superstar effects of either demand side or supply side factors.

5. Discussion, limitations, and future research

The objective of this paper is to demonstrate how both demand side and supply side factors in the online shopping environment influence user choices, thus cumulatively contributing to changes in online consumption patterns. We conduct our analysis in the context of online software downloading that demonstrates the overwhelming amount of product choices available in the current electronic marketplaces. Our results provide a more comprehensive understanding of the underlying mechanism of long tail and superstar phenomena by differentiating the levels of both user review ratings and product popularity, as well as recognizing the interaction effect between user reviews and product variety.

The interaction effect identified between product variety and online user reviews provides a new perspective for understanding long tail and superstar effects. The interaction effect suggests a more complicated relationship between online user reviews and the heterogeneity of user choices. The increase in product variety weakens the impact of both positive and negative user reviews, and this weakening effect is more pronounced on popular products than on niche products. Thus, suggesting and predicting a unified trend in the impact of user reviews on consumption pattern across various product popularity levels without considering the number of products offered would lead to inaccuracies and biases.
The intricate and entangled mechanism, in which online user reviews interact with product variety, contributes to the complexity of and difficulty in the efforts to measure and understand the demand pattern in this line of research. Therefore, the interaction effect helps explain the coexistence of superstar and long tail phenomena observed in this study. In contrast to online user reviews, we are able to identify a well-defined trend of the impact of product variety across various product popularities. Overall, product variety influences user choices in favor of the long tail formation, regardless of the rating level of user reviews. This result is consistent with one of the key arguments of long tail advocates—that a more extensive assortment of product choices could better match consumers’ diversified preferences (Anderson 2006, Brynjolfsson et al. 2003). It also helps to explain the long tail phenomenon identified in our descriptive analysis. Our results generate some new insights that help reconcile the divergent findings regarding the different impacts of online user reviews on the skewed distribution shape of user choices. One possible explanation for the mixed results on the impact of online user reviews across different product popularity levels could be the failure to consider their interplay with product variety. Previous studies have been conducted in various contexts, with different levels of product variety during their data collection periods. The interaction effect we identify between online user reviews and product variety indicates that user reviews’ contribution to the skewed distribution of user choices depends on the specific product variety level within the study context. In addition, this paper provides a potential explanation for different opinions regarding the mean impact of online user reviews on user choices by pointing out that their impact varies with product popularity (Chen et al. 2004, Chevalier and Mayzlin 2006, Duan et al. 
2008, Liu 2006). Chevalier and Mayzlin (2006) found that the 5-star reviews improve sales whereas 1-star reviews hurt sales, and the impact of 1-star reviews is greater than the impact of 5-star reviews. The increase in valence of online user reviews results in more sales. In contrast to this significant impact attributed to online user reviews, Chen et al. (2004), Liu (2006) and Duan et al. (2008) argued that user reviews do not influence user choices. This inconsistency could be caused by a failure to analyze the impact of user reviews in a uniform way across the entire spectrum of user choices.

This paper also contributes to the understanding of product assortment strategies for online platforms. Most extant studies agree that enhanced product variety benefits sales because of the much lower inventory costs in online stores. However, Gourville and Soman (2005) identified the “overchoice” effect, which refers to the negative impact of an increased product assortment on consumer choices. Their key argument is that large amounts of complex information cause cognitive overload and anticipation of regret. The impact is so profound that customers might decide not to make any choice at all. Our results provide another perspective to interpret this potential negative impact of product variety. The only case in which the increased product variety might result in fewer downloads by consumers occurs in a small group of very popular products that have received positive user reviews. In essence, this finding echoes the key argument of long tail advocates, who claim that consumers with diversified preferences might make suboptimal choices when facing a limited selection. In this suboptimal case, very popular products with highly positive user feedback are probably favored by some consumers as substitutes for their best matches.
If these consumers have preferences for certain niche products, as Anderson (2006) argued, overwhelmingly high product variety would eventually shift these consumers’ final choices from the hits to the niches that better satisfy their preferences. This finding also helps explain the observation identified in our descriptive analysis that superstar products are individually getting less popular over time. Finally, the results of this study also provide some managerial implications. Many companies are starting to consider leveraging the impact of the emerging niche market. Schmidt (2005), Google’s CEO, referenced the long tail in his statement about Google’s mission: serving both the ‘‘individual contributors, the small business, the company where Joe or Bob is the CEO, the CIO, the CFO, and the worker and the support person—a one person company, a two-person company, a three-person company’’ and a very large number of customers as well. From the perspective of social planners, because limitations on store space are less of a concern through Internet channels, making more products available is always preferred to increase consumer welfare (Brynjolfsson et al. 2003). However, the observations from our descriptive analysis suggest that retailers should be cautious about switching their strategy from highlighting the hits to promoting the niches because superstars still dominate the demand. Merchants need to pay careful attention when making decisions about entering the niche market. Although the niche products do meet some consumers’ demand, the evidence that they can gain a larger market share in this more competitive market is lacking. In addition, the identified interaction effect suggests that retailers should adjust the product selection criteria to their marketing strategies. 
Retailers that face capacity limits and thus focus on marketing superstars should carefully select high-quality products or tilt their selections toward mainstream consumer tastes to stimulate positive user reviews. For online retailers whose products seriously suffer from an excess of negative user feedback, providing more products would be a promising solution to diminish the negative impact of the feedback.

One limitation of this study is that the data do not allow us to quantify the potential benefits from the coexistence of long tail and superstar phenomena. Bentley et al. (2009) analytically concluded that the optimal inventory for digital retailers is far from infinite because of the diminishing sales as they proceed further into the tail products. An interesting extension of the current study is to examine whether the niche market improves at the expense of the hit market.

In addition, we use a dummy variable to control for the existence of expert ratings, but do not consider the valence of expert reviews because of the lack of an adequate sample. Only a very small portion of software products (less than 20%) is reviewed by CNETD. The quantile regression results in each category show that the existence of expert reviews can have either a positive influence or a negative influence on software downloading. This influence is stronger with respect to the more popular products. One interesting extension of this study in future research is to examine the impact of expert ratings on the whole distribution of consumer choices. Extant studies focused primarily on either expert reviews or user reviews. Future research, therefore, also calls for a more comprehensive investigation of how online WOM information generated by reviewers of different identities affects consumers’ decision-making.

In addition, evidence from the literature seems to suggest that product types may play a role in explaining the superstar and long tail phenomena.
Of course, the data used in this study are in the online software downloading market; yet, our analyses may very likely yield additional findings and implications in other industries, which definitely calls for extended investigations in future research that include more product types in one study. Furthermore, the use of free or free-to-try products in this study may lead to concerns about the generalizability of our results. Online users might treat decisions about free or free-to-try software programs less seriously than they make decisions to purchase more expensive products. However, most software programs require users to learn new interfaces and explore new functionalities, and even free or free-to-try software programs often require significant commitments from users (Duan et al. 2009). For example, Digital Media Player software is often used to manage media files, and users need to input extensive information about various files and invest significant effort to process these files. From this point of view, free-to-download software programs are not substantially different from other software products that online users have to purchase. Finally, our study considers informational effect on consumer choices in only one third-party website, the CNETD. Consumers might search multiple information resources online before making product choices, especially for purchase decisions, which leads to the important future research direction of examining how information from multiple online resources (e.g., both retail and third-party websites) influences consumer choices and ultimately product sales. Acknowledgements The authors thank Refik Soyer, the reviewers and conference participants of the 2009 China Summer Workshop on Information Management (CSWIM 2009), International Conference on Information Systems (ICIS 2009), and seminar participants at George Washington University for valuable comments on this research. All errors are our own. References Aiken, L. 
S., and West, S. G. Multiple Regression: Testing and Interpreting Interactions. Sage Publications, Thousand Oaks, CA, 1991.
Anderson, C. The Long Tail: Why the Future of Business Is Selling Less of More. Hyperion Press, New York, NY, 2006.
Bakos, Y. Reducing buyer search costs: implications for electronic marketplaces. Management Science, 43, 12, 1997, 1676–1692.
Bentley, R. A., Ormerod, P., and Madsen, M. E. Shelf space strategy in long-tail markets. Physica A: Statistical Mechanics and its Applications, 388, 5, 2009, 691–696.
Brynjolfsson, E., and Kemerer, C. F. Network externalities in microcomputer software: an econometric analysis of the spreadsheet market. Management Science, 42, 12, 1996, 1627–1647.
Brynjolfsson, E., Hu, Y., and Smith, M. Consumer surplus in the digital economy: estimating the values of increased product variety at online booksellers. Management Science, 49, 11, 2003, 1580–1596.
Brynjolfsson, E., Hu, Y., and Smith, M. From niches to riches: the anatomy of the long tail. Sloan Management Review, 47, 4, 2006, 67–71.
Brynjolfsson, E., Hu, Y., and Smith, M. The long tail: the changing shape of Amazon’s sales distribution curve. Working paper, SSRN, 2010a. http://ssrn.com/abstract=167999.
Brynjolfsson, E., Hu, Y., and Smith, M. Long tails vs. superstars: the effect of information technology on product variety and sales concentration patterns. Information Systems Research, 21, 4, 2010b, 736–747.
Brynjolfsson, E., Hu, Y., and Simester, D. Goodbye Pareto principle, hello long tail: the effect of search costs on the concentration of product sales. Management Science, 57, 8, 2011, 1373–1386.
Cheema, A., and Papatla, P. Relative importance of online versus offline information for internet purchases: the effect of product category and Internet experience. Journal of Business Research, 63, 9–10, 2010, 979–985.
Chen, P. Y., Wu, S. Y., and Yoon, J. The impact of online recommendations and consumer feedback on sales. In Proceedings of the 25th International Conference on Information Systems (ICIS 2004), Washington, DC, December 12–14, 2004, 711–724.
Chevalier, J. A., and Mayzlin, D. The effect of word of mouth on sales: online book reviews. Journal of Marketing Science, 43, 3, 2006, 345–354.
Clemons, E., Gao, G., and Hitt, L. When online reviews meet hyperdifferentiation: a study of the craft beer industry. Journal of Management Information Systems, 23, 2, 2006, 149–171.
Duan, W., Gu, B., and Whinston, A. B. The dynamics of online word-of-mouth and product sales—an empirical investigation of the movie industry. Journal of Retailing, 84, 2, 2008, 233–242.
Duan, W., Gu, B., and Whinston, A. B. Informational cascades and software adoption on the internet: an empirical investigation. MIS Quarterly, 33, 1, 2009, 23–48.
Elberse, A., and Oberholzer-Gee, F. Superstars and underdogs: an examination of the long tail phenomenon in video sales. Working paper no. 07-120, Harvard Business School, Cambridge, MA, 2008.
Fleder, D., and Hosanagar, K. Blockbuster culture’s next rise or fall: the impact of recommender systems on sales diversity. Management Science, 55, 5, 2009, 697–712.
Frank, R. H., and Philip, J. C. The Winner-Take-All Society. The Free Press, New York, NY, 1995.
Gallaugher, J. M., and Wang, Y. M. Understanding network effects in software markets: evidence from web server pricing. MIS Quarterly, 26, 4, 2002, 303–327.
Ghose, A., and Gu, B. Search costs, demand structure and long tail in electronic markets: theory and evidence. Working paper no. 06-19, NET Institute, 2007.
Godes, D., and Mayzlin, D. Using online conversations to study word of mouth communication. Marketing Science, 23, 4, 2004, 545–560.
Goldmanis, M., Hortaçsu, A., Syverson, C., and Emre, O. E-commerce and the market structure of retail industries. The Economic Journal, 120, 545, 2009, 651–682.
Gourville, J. T., and Soman, D. Overchoice and assortment type: when and why variety backfires. Marketing Science, 24, 3, 2005, 382–395.
Hansen, F. Psychological theories of consumer choices. Journal of Consumer Research, 3, 3, 1976, 117–142.
Hervas-Drane, A. Word of mouth and tasting matching: a theory of long tail. Working paper no. 07-14, NET Institute, 2009.
Judd, C. M., and McClelland, G. H. Data Analysis: A Model Comparison Approach. Harcourt Brace Jovanovich, San Diego, CA, 1989.
Kenny, D. A., Kashy, D. A., and Bolger, N. Data analysis in social psychology. In D. Gilbert, S. Fiske, and G. Lindzey (eds.). Handbook of Social Psychology, Vol. 1, McGraw-Hill, Boston, MA, 1998, 233–265.
Koenker, R., and Gilbert, B. Regression quantiles. Econometrica, 46, 1, 1978, 33–50.
Liu, Y. Word of mouth for movies: its dynamics and impact on box office revenue. Journal of Marketing, 70, 3, 2006, 74–89.
Maryanchyk, I. Are ratings informative signals? The analysis of the Netflix data. Working paper no. 08-22, NET Institute, 2008.
Oestreicher-Singer, G., and Sundararajan, A. Recommendation networks and the long tail of electronic commerce. Working paper no. 09-03, NET Institute, 2009.
Rosen, S. The economics of superstars. American Economic Review, 71, 5, 1981, 845–858.
Schmidt, E. Presentation, Annual Stockholders’ meeting, Google Inc., May 12, 2005.
Tan, T. F., and Netessine, S. Is Tom Cruise threatened? Using Netflix prize data to examine the long tail of electronic commerce. Working paper, Wharton Business School, University of Pennsylvania, Philadelphia, PA, 2009.
Tucker, C., and Zhang, J. J. Long tail or steep tail? A field investigation into how online popularity information affects the distribution of customer choices. Working paper no. 4655-07, Massachusetts Institute of Technology, Cambridge, MA, 2007.
Zhao, X., Gu, B., and Whinston, A. B. The influence of online word-of-mouth long tail formation: an empirical analysis. In Proceedings of the Conference on Information Systems and Technology (CIST 2008), Washington, DC, October 11–12, 2008.
Zhu, F., and Zhang, X. Impact of online consumer reviews on sales: the moderating role of product and consumer characteristics. Journal of Marketing, 74, 2, 2010, 133–148.