2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation

Interval Estimation of the Herfindahl-Hirschman Index Under Incomplete Market Information

Marta Flamini
Faculty of Engineering, Università Telematica Internazionale UNINETTUNO, Roma, Italy
m.fl[email protected]

Maurizio Naldi
Dpt. of Computer Science and Civil Engineering, Università di Roma Tor Vergata, Roma, Italy
Email: [email protected]

Abstract: An interval estimate is provided for the Herfindahl-Hirschman Index (HHI) when the knowledge about the market is incomplete, and we know just the largest market shares. Two rigorous bounds are provided for the HHI, without any further assumptions. Though the interval gets wider as the sum of the known market shares gets smaller, the estimate proves to be quite tight even when the fraction of the market that we do not know in detail is as high as 30%. This robustness is shown through three examples, considering respectively a set of real data and two sets of synthetic data, with the company sizes (a proxy for market shares) following respectively a Zipf law and a Pareto distribution.

Keywords: Herfindahl-Hirschman Index; HHI; Zipf law; Pareto distribution; Market structure

978-1-4799-4923-6/14 \$31.00 © 2014 IEEE. DOI 10.1109/UKSim.2014.66

I. INTRODUCTION

The analysis of the structure of a market is a relevant issue. It means understanding how the market is divided among the companies, and whether it is characterized by significant competition or is instead dominated by a few companies or even a single company. The analysis of the degree of concentration (or competition, its counterpart) is a key driving force for the industrial policy of many administrations. This analysis is often conducted by considering the Herfindahl-Hirschman Index (HHI) as a parameter measuring the degree of concentration [1].

The use of the HHI encompasses many industrial sectors and serves many application purposes. Just to name a few examples, it has been considered in [2] to measure the degree of competition in the DSL (Digital Subscriber Line) telecommunication access market, as an indicator of the harmony of accounting measurement practices (i.e., whether companies use the same accounting practice) in [3], to investigate bank pricing in [4], and to measure concentration in banking systems in [5].

The computation of the HHI requires however the precise knowledge of the market shares of all the companies, so that 100% of the market is accounted for. It is often the case that we do not know all of them. For example, the market may include a large number of small manufacturers, for which we do not know precisely the market shares, which may be tiny by themselves but add up to a large fraction of the market. In those cases, we may nevertheless wish to obtain an estimate of the HHI.

A point estimate for the HHI in the presence of incomplete market data has been provided in [6]. Being based on a combinatorial analysis to assess the probability that a customer chooses one of the manufacturers, that estimate assumed however that the market share was represented by an integer number. A further attempt has been made by Nauenberg et al. [7], adopting the Bradford distribution to fit the incomplete data and estimate the unknown market shares. An upper and a lower bound for the HHI have been proposed in the context of market power acquisition by Kanagala et al. [8]. However, those bounds relied on quite specific assumptions, which may not be met in most cases. The lower bound was computed under the hypothesis that the dominant market participant had 20% of market share and the rest of the total market share was uniformly distributed among N − 1 participants. The upper bound adopted instead the hypothesis that the market was made of 5 companies, each having a 20% share.

Here we propose a new approach to estimate the HHI when we do not know all the market shares. Instead of providing a point estimate, we aim at an interval estimate. On the basis of the only assumption that we know the largest market shares, without any further assumption on the probability model that generates the market distribution, we derive both a lower and an upper bound for the HHI, which serve as the two ends of our interval estimate. Unlike [8], these bounds do not rely on any specific or arbitrary assumption, and are therefore tight bounds, valid under any market condition. We apply these bounds to three cases, considering respectively a set of real data and two sets of synthetic data, where we assume that the sales follow respectively a Zipf law and a Pareto probability distribution. We show that the interval estimate is quite tight even when the unknown market shares add up to a significant fraction of the overall market.

The paper is organized as follows. In Section II, we define the HHI and provide examples of its computation. We derive the bounds on the HHI in Section III and apply them in Section IV.

II. THE HERFINDAHL-HIRSCHMAN INDEX

The Herfindahl-Hirschman Index is a well established tool to assess the degree of competition in a market (or, conversely, the closeness to a monopolistic market structure). In this section, we provide its definition, the relation between its values and the market structure, and some examples of actual values in several contexts.

If we consider a market where N companies operate, and indicate the market share of the i-th company by $s_i$ ($i = 1, 2, \ldots, N$ and $0 < s_i \le 1$), the Herfindahl-Hirschman Index is

\[ \mathrm{HHI} = \sum_{i=1}^{N} s_i^2, \tag{1} \]

where the following obvious constraint holds:

\[ \sum_{i=1}^{N} s_i = 1. \tag{2} \]

The HHI is a well established indicator of the degree of competition in the market, with low values indicating a high degree of competition and, conversely, higher values betraying closeness to a monopoly. Though it is a normalized index in the definition (1) employing market shares, its range is not the whole [0, 1] interval. In fact, we can see that the maximum value is 1 (when the market is made of a single company, which has a market share $s_1 = 1$), but the minimum value, which holds when the market is equally shared by all the companies (perfect competition), is

\[ \min \mathrm{HHI} = \sum_{i=1}^{N} \left(\frac{1}{N}\right)^2 = N \left(\frac{1}{N}\right)^2 = \frac{1}{N}, \tag{3} \]

so that the HHI varies within the range

\[ \frac{1}{N} \le \mathrm{HHI} \le 1. \tag{4} \]

Though an exact mapping of HHI values onto concentration degree is not possible, given the qualitative nature of the concentration concept, the classification of Table I is rather established.

Table I. HHI VALUES AND DEGREE OF CONCENTRATION

HHI            Concentration degree
< 0.01         negligible concentration
0.01 ÷ 0.15    absence of concentration phenomena
0.15 ÷ 0.25    moderate concentration
> 0.25         strong concentration

We now provide an example of a real-world evaluation of the HHI. We consider the market structure of mobile telephony services. The data for 2011 in Italy, reported by the Italian Telecommunications Regulatory Authority AGCOM in its Annual report, are shown in Table II [9]. On the basis of these data, we can easily compute the HHI, which is 0.30785. According to the typical classification reported above, this market therefore exhibits a strong degree of concentration.

Table II. MARKET SHARES OF MOBILE NETWORK OPERATORS IN ITALY (2011)

Operator          Market share [%]
Vodafone          37.1
Telecom Italia    35.4
Wind              19.7
H3G               7.8
Total             100

III. BOUNDS ON HHI

In Section II, we have provided the definition of the HHI and described its application in several contexts. As stated in the Introduction, it is often the case that one does not have access to detailed information concerning all the companies' market shares. Though the precise HHI value cannot be computed in those cases, it is desirable to have at least an estimate of the true value. In this section, we derive upper and lower bounds for the HHI, which can serve as an interval estimate of the actual HHI value.

In many cases, we know just the market shares of the most important market players. This may happen, for example, if the market includes many small manufacturers (or service providers, if we are considering a market for services), or if the complete data are simply not available.

Suppose that we know just the market shares of the M largest companies in the market. Without loss of generality, we label the market shares in non-increasing order, so that $s_1 \ge s_2 \ge \cdots \ge s_M \ge s_{M+1} \ge \cdots \ge s_N$. Then we know $s_1, s_2, \ldots, s_M$, while we don't know $s_{M+1}, s_{M+2}, \ldots, s_N$. Since all the market shares are positive numbers, the true HHI is larger than the partial sum of the squares of the known market shares:

\[ \mathrm{HHI} = \sum_{i=1}^{N} s_i^2 > \sum_{i=1}^{M} s_i^2. \tag{5} \]

That partial sum is therefore itself a (rough) lower bound for the HHI. A tighter lower bound as well as an upper bound can be found by introducing the residual sum of squares

\[ \Delta = \sum_{i=M+1}^{N} s_i^2 > 0, \tag{6} \]

so that

\[ \mathrm{HHI} = \sum_{i=1}^{M} s_i^2 + \Delta. \tag{7} \]

We now have to identify lower and upper bounds for Δ. Since we know that the lowest value for the HHI takes place when all the companies have the same market share, as in Equation (3), this is true for the residual sum of squares as well, so that the minimum value of Δ takes place when

\[ s_i = \frac{1 - \sum_{j=1}^{M} s_j}{N - M}, \qquad i = M+1, M+2, \ldots, N, \]

which gives

\[ \min \Delta = (N - M) \left( \frac{1 - \sum_{i=1}^{M} s_i}{N - M} \right)^2 = \frac{\left(1 - \sum_{i=1}^{M} s_i\right)^2}{N - M}, \tag{8} \]

which provides the following lower bound inequality for the HHI:

\[ \mathrm{HHI} > \sum_{i=1}^{M} s_i^2 + \frac{\left(1 - \sum_{i=1}^{M} s_i\right)^2}{N - M}. \tag{9} \]

By a similar reasoning, we can obtain an upper bound. In fact, we know that the highest values of the HHI are obtained when the market is concentrated in as few hands as possible. Likewise, the higher values for Δ are obtained when the residual market shares are concentrated in as few companies as possible. However, we cannot always consider all the residual market share $1 - \sum_{i=1}^{M} s_i$ to be concentrated in a single company (namely the (M+1)-st), since by the ordering imposed we have $s_{M+1} \le s_M$. By introducing the residual market share

\[ R = 1 - \sum_{i=1}^{M} s_i, \tag{10} \]

we have two cases, depending on the direction of the inequality $R \lessgtr s_M$.

If $R \le s_M$, we can assign all the residual market share to the (M+1)-st company, since the constraint $s_{M+1} = R \le s_M$ holds true. This means that the remaining companies (i.e., those labelled M+2, M+3, ..., N) have a zero market share, so that the market is actually made of M+1 companies rather than N. However, we can always approximate this situation with a full market of N companies, assuming that the smallest N − M − 1 have a market share → 0. The resulting maximum value for Δ is

\[ \max \Delta = \sum_{i=M+1}^{N} s_i^2 = (s_{M+1})^2 = \left(1 - \sum_{i=1}^{M} s_i\right)^2, \tag{11} \]

and the resulting upper bound for the HHI is provided by the following inequality:

\[ \mathrm{HHI} < \sum_{i=1}^{M} s_i^2 + \left(1 - \sum_{i=1}^{M} s_i\right)^2. \tag{12} \]

On the other hand, if $R > s_M$, we can assign the following residual market shares, so that $Q = \lfloor R / s_M \rfloor$ companies have a market share equal to $s_M$:

\[ s_i = \begin{cases} s_M & i = M+1, \ldots, M+Q \\ R - s_M Q & i = M+Q+1 \\ 0 & i = M+Q+2, \ldots, N \end{cases} \tag{13} \]

With this Δ-maximizing assignment, we have

\[ \max \Delta = \sum_{i=M+1}^{M+Q} s_i^2 + \left(1 - \sum_{i=1}^{M} s_i - \sum_{i=M+1}^{M+Q} s_i\right)^2 = s_M^2 Q + \left(1 - \sum_{i=1}^{M} s_i - s_M Q\right)^2, \tag{14} \]

and the resulting upper bound for the HHI

\[ \mathrm{HHI} < \sum_{i=1}^{M} s_i^2 + s_M^2 Q + \left(1 - \sum_{i=1}^{M} s_i - s_M Q\right)^2. \tag{15} \]

Depending on the direction of the inequality $R \lessgtr s_M$, we end up therefore with two different ranges for the HHI, with the first one being valid when $R \le s_M$ and the second one when $R > s_M$. The set of bounds is reported in Table III.

Table III. BOUNDS ON HHI

Type                  Bound
Lower                 $\sum_{i=1}^{M} s_i^2 + \left(1 - \sum_{i=1}^{M} s_i\right)^2 / (N - M)$
Upper ($R \le s_M$)   $\sum_{i=1}^{M} s_i^2 + \left(1 - \sum_{i=1}^{M} s_i\right)^2$
Upper ($R > s_M$)     $\sum_{i=1}^{M} s_i^2 + s_M^2 Q + \left(1 - \sum_{i=1}^{M} s_i - s_M Q\right)^2$

IV. TIGHTNESS OF BOUNDS

In Section III, we have derived both the lower and the upper bound on the HHI, which allow us to obtain an interval estimate for the HHI. In order to be of use, the interval associated with such an estimate should be as narrow as possible. In this section, we evaluate the tightness of those bounds in three situations. We first consider real data, and then synthetic data obtained from two models typically found for the distribution of company sizes: the Zipf and Pareto distributions.

A. Real data

We consider the market of mobile phones. According to the data reported in Table IV, provided by Gartner [10], the market is made of 10 major players, but a large slice, making up roughly 1/3 of the whole sales, is fragmented among a host of minor manufacturers. We cannot compute the exact HHI value in this case. We resort therefore to the bounds obtained in the previous section.

Table IV. SALES OF MOBILE PHONES IN 2012

Brand       Share [%]
Samsung     22.0
Nokia       19.1
Apple       7.5
ZTE         3.9
LG          3.3
Huawei      2.7
TCL         2.1
RIM         2.0
Motorola    1.9
HTC         1.8
Others      33.7

Since we know M = 10 market shares, but we don't know the total number N of manufacturers, we cannot use the lower bound (9), but rather the less tight (5). We obtain

\[ \min \mathrm{HHI} = \sum_{i=1}^{10} s_i^2 = 0.0954. \tag{16} \]

As to the upper bound, the lowest known market share is $s_{10} = 0.018$, and the residual market share is R = 0.337, so that $R > s_{10}$, and we fall in the second upper bound case of Table III. Since $Q = \lfloor R / s_{10} \rfloor = 18$, the upper bound is

\[ \max \mathrm{HHI} = \sum_{i=1}^{10} s_i^2 + 18 s_{10}^2 + \left(1 - \sum_{i=1}^{10} s_i - 18 s_{10}\right)^2 = 0.0954 + 0.00583 + 0.000169 = 0.1014. \tag{17} \]

The interval estimate for the HHI is then

\[ 0.0954 < \mathrm{HHI} < 0.1014, \tag{18} \]

which places that market in the 0.01 ÷ 0.15 band of Table I, i.e., among those with no concentration phenomena.

B. The Zipf model

The generalized Zipf law states that, if we rank a collection of N subjects in non-increasing order according to their size, the product of a power of the rank and of the size of each subject is constant throughout the collection, so that the size decreases with the rank according to the equation

\[ s_i \propto \frac{1}{i^{\alpha}}. \tag{19} \]

That law, initially observed for α = 1 by Zipf in a linguistic context [11], has been found to represent well a number of other phenomena (see [12] for an explanation of its properties and [13], [14] as two examples of its application in other contexts). In particular, it has been found to describe well the distribution of firms' sizes [15], [16], [17], [18] and is therefore a good candidate to describe the distribution of market shares. In [19], the relationship between the HHI and the Zipf law has been investigated, and the HHI has been expressed as a function of the truncated Riemann zeta function $\zeta_N(\alpha) = \sum_{i=1}^{N} i^{-\alpha}$, which can be immediately seen as tightly connected with the generalized Zipf law (19):

\[ \mathrm{HHI} = \frac{\zeta_N(2\alpha)}{\zeta_N^2(\alpha)}. \tag{20} \]

Here we consider that the market shares follow the generalized form of the Zipf law, assuming however that we know just the M largest market shares, and see what the resulting bounds are for the HHI. In [15] the value of the Zipf parameter α has been found to be about 1 for the U.S. market. The wider range 0.7 ÷ 1.2 has been observed in [17] for Japanese companies, while an even wider range 0.44 ÷ 1.4 has been estimated for the slope of the power law for 20 countries in America, Asia and Europe [16]. Here, we report in Figure 1, Figure 2, and Figure 3 respectively the results for three values of the Zipf parameter that encompass these ranges, i.e., α = 0.5, 1, 1.5, for a market with N = 20 players, when the number of known market shares ranges from 6 to 12.

Figure 1. HHI bounds for the Zipf law (α = 0.5). [Plot of HHI max and HHI min versus the number of known market shares, 6 to 12.]

Figure 2. HHI bounds for the Zipf law (α = 1). [Plot of HHI max and HHI min versus the number of known market shares, 6 to 12.]

Figure 3. HHI bounds for the Zipf law (α = 1.5). [Plot of HHI max and HHI min versus the number of known market shares, 6 to 12.]

As expected, the uncertainty cone shrinks as the number of known market shares increases. The uncertainty range is however quite small even at its maximum, when we know just 6 market shares out of 20 (less than 1/3), since the interval half-width relative to the interval center is just ±6.1% when α = 0.5, reduces to ±3.2% when α = 1, and becomes a negligible ±0.58% when α = 1.5. The bounds can then be considered quite tight.

It is to be noted that such tight bounds are achieved even in the presence of high residual values for the unknown market shares. In Table V we report the residual R for the cases considered here. In one case (M = 6 and α = 0.5), the fraction of unknown market shares is even larger than that of known ones.

Table V. RESIDUAL MARKET SHARE R [%]

M     α = 0.5    α = 1    α = 1.5
6     52.1       31.9     15.8
9     38.1       21.4     9.5
12    26.1       13.7     5.7

C. The Pareto distribution

Another model, closely related to the Zipf law, to describe the distribution of firm sizes is the Pareto distribution [20]. While the Zipf law provides deterministic values for ranked quantities, the Pareto distribution offers a probability model where no ranking is involved. However, the two models, which are both power laws, bear quite a direct relationship, which is self-evident by plotting the Pareto complementary cumulative distribution function [21]. In the Pareto distribution, the probability that a company has a size lower than x is (for x ≥ k, and with α > 2 to have at least the first two moments finite [22])

\[ F(x) = 1 - \left(\frac{k}{x}\right)^{\alpha}. \tag{21} \]

The empirical pdf of the resulting HHI is shown in Figure 4, where its skewness and the tail we expect in a power law model are visible.

Figure 4. Empirical distribution of HHI under the Pareto distribution. [Plot of the empirical pdf versus HHI.]

When we come to the estimates of the HHI, in this case we do not find a deterministic relationship as in the Zipf law, but rather a scatterplot. For a sample of 1000 simulation instances (with k = 1 and α = 3) and a market made of 20 players, we find the belts reported in Figure 5 and Figure 6, for M = 6 and M = 12 respectively. The scatterplot is obtained by computing the upper and lower bound for the HHI for each simulated instance (which exhibits a random value of R). The belts obtained in such a way are plotted versus the overall market share owned by the M largest companies (i.e., 1 − R).

Figure 5. HHI bounds for the Pareto distribution (M = 6). [Scatterplot of HHI bounds versus known market share.]

Figure 6. HHI bounds for the Pareto distribution (M = 12). [Scatterplot of HHI bounds versus known market share.]

V. CONCLUSION

When we do not know the whole distribution of the market, the Herfindahl-Hirschman Index cannot be computed exactly. We have derived both a lower and an upper bound on the HHI. These bounds do not require further assumptions and can be used to obtain an interval estimate of the HHI. We have demonstrated their application in three cases, considering respectively a set of real data and two sets of synthetic data, obtained by assuming that the company sales (a proxy for the market shares) follow respectively a Zipf law and a Pareto distribution. In all cases, we have shown that the estimates provide quite a tight interval even when the unknown market shares account for a significant fraction of the overall market.

REFERENCES

[1] S. A. Rhoades, "The Herfindahl-Hirschman index," Federal Reserve Bulletin, no. Mar, pp. 188–189, 1993.

[2] J. Bouckaert, T. Van Dijk, and F. Verboven, "Access regulation, competition, and broadband penetration: An international study," Telecommunications Policy, vol. 34, no. 11, pp. 661–671, 2010.

[3] R. H. Taplin, "Harmony, statistical inference with the Herfindahl H index and C index," Abacus, vol. 39, no. 1, pp. 82–94, 2003.

[4] T. H. Hannan, "Market share inequality, the number of competitors, and the HHI: An examination of bank pricing," Review of Industrial Organization, vol. 12, no. 1, pp. 23–35, 1997.

[5] C. Alegria and K. Schaeck, "On measuring concentration in banking systems," Finance Research Letters, vol. 5, no. 1, pp. 59–67, 2008.

[6] E. Nauenberg, K. Basu, and H. Chand, "Hirschman-Herfindahl index determination under incomplete information," Applied Economics Letters, vol. 4, no. 10, pp. 639–642, 1997.

[7] E. Nauenberg, M. Alkhamisi, and Y. Andrijuk, "Simulation of a Hirschman-Herfindahl index without complete market share information," Health Economics, vol. 13, no. 1, pp. 87–94, 2004.

[8] A. Kanagala, M. Sahni, S. Sharma, B. Gou, and J. Yu, "A probabilistic approach of Hirschman-Herfindahl index (HHI) to determine possibility of market power acquisition," in Power Systems Conference and Exposition, 2004. IEEE PES, Oct 2004, pp. 1277–1282 vol. 3.

[9] AGCOM, "Annual report," 2012, available at www.agcom.it/Default.aspx?message=contenuto&DCId=5.

[10] Gartner, "Gartner says worldwide mobile phone sales declined 1.7 percent in 2012," Press release, February 2013, available at www.gartner.com/newsroom/id/2335616.

[11] G. K. Zipf, Human behavior and the principle of least effort. Addison-Wesley, 1949.

[12] W. J. Reed, "The Pareto, Zipf and other power laws," Economics Letters, vol. 74, no. 1, pp. 15–19, 2001.

[13] X. Gabaix, "Zipf's law for cities: an explanation," The Quarterly Journal of Economics, vol. 114, no. 3, pp. 739–767, 1999.

[14] M. Naldi and C. Salaris, "Rank-size distribution of teletraffic and customers over a wide area network," European Transactions on Telecommunications, vol. 17, no. 4, pp. 415–421, 2006.

[15] R. L. Axtell, "Zipf distribution of US firm sizes," Science, vol. 293, no. 5536, pp. 1818–1820, 2001.

[16] J. Ramsden and G. Kiss-Haypal, "Company size distribution in different countries," Physica A: Statistical Mechanics and its Applications, vol. 277, no. 1-2, pp. 220–227, 2000.

[17] K. Okuyama, M. Takayasu, and H. Takayasu, "Zipf's law in income distribution of companies," Physica A: Statistical Mechanics and its Applications, vol. 269, no. 1, pp. 125–131, 1999.

[18] P. L. Conti, L. D. Giovanni, and M. Naldi, "Blind maximum likelihood estimation of traffic matrices under long-range dependent traffic," Computer Networks, vol. 54, no. 15, pp. 2626–2639, 2010.

[19] M. Naldi, "Concentration indices and Zipf's law," Economics Letters, vol. 78, no. 3, pp. 329–334, 2003.

[20] J. Growiec, F. Pammolli, M. Riccaboni, and H. E. Stanley, "On the size distribution of business firms," Economics Letters, vol. 98, no. 2, pp. 207–212, 2008.

[21] L. A. Adamic, "Zipf, power-laws, and Pareto - a ranking tutorial," available at www.parc.xerox.com/istl/groups/iea/papers/ranking/ranking.html, 2000.

[22] C. Walck, "Handbook on statistical distributions for experimentalists," 2007.
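As a computational companion to the bounds of Table III, the following is a minimal Python sketch, not the authors' implementation. The function names `hhi` and `hhi_bounds` and the parameter `n_total` are hypothetical and introduced only for illustration. The sketch assumes that shares are given as fractions, that the quantity Q is the integer part of R/s_M, and that the rough bound (5) is used when the total number N of companies is unknown. The sample data are the ten known shares of Table IV.

```python
import math


def hhi(shares):
    """Exact HHI as in Eq. (1): sum of squared market shares (fractions)."""
    return sum(s * s for s in shares)


def hhi_bounds(known, n_total=None):
    """Interval estimate of the HHI from the M largest shares (Table III).

    known   : the M largest market shares, as fractions
    n_total : total number N of companies, if known (hypothetical parameter);
              when it is None the rough lower bound of Eq. (5) is returned
    """
    s = sorted(known, reverse=True)
    m = len(s)
    partial = sum(x * x for x in s)      # sum of squares of the known shares
    r = max(0.0, 1.0 - sum(s))           # residual share R, Eq. (10)
    s_m = s[-1]                          # smallest known share s_M

    # Lower bound: residual share spread evenly over the N - M unknown firms.
    if n_total is not None and n_total > m:
        lower = partial + r * r / (n_total - m)   # Eq. (9)
    else:
        lower = partial                           # Eq. (5)

    # Upper bound: residual share packed into as few firms as possible.
    if r <= s_m:
        upper = partial + r * r                   # Eq. (12)
    else:
        # Floating-point rounding could matter if R / s_M is almost an integer.
        q = math.floor(r / s_m)
        upper = partial + q * s_m ** 2 + (r - q * s_m) ** 2   # Eq. (15)
    return lower, upper


# Known shares of Table IV (mobile phones, 2012), as fractions.
table_iv = [0.220, 0.191, 0.075, 0.039, 0.033,
            0.027, 0.021, 0.020, 0.019, 0.018]
low, up = hhi_bounds(table_iv)
print(f"{low:.4f} < HHI < {up:.4f}")
```

Passing `n_total` switches the lower end from the rough bound (5) to the tighter bound (9); for instance, `hhi_bounds([0.371, 0.354], n_total=4)` brackets the exact HHI of the four operators of Table II using only the two largest shares.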
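The Zipf experiment of Section IV can be reproduced along the following lines. This is only a sketch under stated assumptions: the helper names (`truncated_zeta`, `zipf_shares`, `zipf_hhi`, `bounds`) are hypothetical, the market has N = 20 players with α = 0.5, the lower bound is the one of Eq. (9), and the interval width is measured as half-width over interval center. It prints the residual share R and the relative half-width for M = 6, 9, 12 known shares.

```python
import math


def truncated_zeta(n, alpha):
    """Truncated zeta function: sum of i^(-alpha) for i = 1..n."""
    return sum(i ** -alpha for i in range(1, n + 1))


def zipf_shares(n, alpha):
    """Market shares following the generalized Zipf law (19), normalized to 1."""
    z = truncated_zeta(n, alpha)
    return [i ** -alpha / z for i in range(1, n + 1)]


def zipf_hhi(n, alpha):
    """Exact HHI under the Zipf law, Eq. (20)."""
    return truncated_zeta(n, 2 * alpha) / truncated_zeta(n, alpha) ** 2


def bounds(known, n):
    """Lower bound (9) and upper bound (12) or (15) from the M largest shares."""
    partial = sum(s * s for s in known)
    r = 1.0 - sum(known)                 # residual share R, Eq. (10)
    s_m = min(known)                     # smallest known share s_M
    lower = partial + r * r / (n - len(known))
    if r <= s_m:
        upper = partial + r * r
    else:
        q = math.floor(r / s_m)          # assumed integer part of R / s_M
        upper = partial + q * s_m ** 2 + (r - q * s_m) ** 2
    return lower, upper


N, ALPHA = 20, 0.5
shares = zipf_shares(N, ALPHA)
print(f"exact HHI = {zipf_hhi(N, ALPHA):.4f}")
for m in (6, 9, 12):
    lo, up = bounds(shares[:m], N)
    residual = 1.0 - sum(shares[:m])
    rel_half_width = (up - lo) / (up + lo)   # half-width over interval center
    print(f"M={m:2d}  R={100 * residual:.1f}%  "
          f"{lo:.4f} < HHI < {up:.4f}  (+/-{100 * rel_half_width:.1f}%)")
```

Changing `ALPHA` to 1 or 1.5 gives the other two Zipf scenarios; the relative widths obtained that way depend on the width definition assumed here and need not coincide with the figures quoted in the text.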