On the estimation of the heavy–tail exponent in time series using the max–spectrum

Stilian A. Stoev
University of Michigan, Ann Arbor, U.S.A.
JSM, Salt Lake City, 2007

Joint work with George Michailidis and Murad Taqqu

Outline
• Heavy tails are ubiquitous
• An old problem
• Max–spectrum
• The estimator
• Asymptotic properties
• Data examples

Heavy tails
• A random variable X is said to be heavy–tailed if
$$P\{|X| \ge x\} \sim L(x)\,x^{-\alpha}, \quad \text{as } x \to \infty,$$
for some α > 0 and a slowly varying function L.
  ◦ Here we focus on the simpler but important context: X ≥ 0 a.s. and
$$P\{X > x\} \sim C x^{-\alpha}, \quad \text{as } x \to \infty.$$
  ◦ X has infinite moments: for p > 0, $E X^p < \infty$ if and only if p < α. In particular,
    0 < α ≤ 2 ⇒ Var(X) = ∞ and 0 < α ≤ 1 ⇒ E|X| = ∞.
• The estimation of the heavy–tail exponent α is an important problem with a rich history.

Heavy tails everywhere: Traded volumes
[Figure: Traded Volumes, No. Stocks, INTC, Nov 1, 2005 (two panels).]

Heavy tails everywhere: TCP durations
[Figure: TCP Flow Sizes (packets): UNC link 2001 (~ 36 min), plotted against time; second panel: the first minute.]

Heavy tails everywhere: Insurance claims
[Figure: Danish Fire Loss Data: 1980 − 1990. Hill plot against order statistics, $\hat\alpha_H(k) = 1.394$. Max−Spectrum against scales j, H = 0.60422 (0.020897), α = 1.655.]

Tail exponent estimation: an old problem
• Hill (1975) derived the MLE in the Pareto model $P\{X > x\} = x^{-\alpha}$, x ≥ 1, and introduced the Hill plot:
$$\hat\alpha_H(k) := \Big( \frac{1}{k} \sum_{i=1}^{k} \log X_{i,n} - \log X_{k+1,n} \Big)^{-1},$$
where $X_{1,n} \ge X_{2,n} \ge \cdots \ge X_{k,n}$ are the top–k order statistics of the sample.
• A lot of work exists for iid data, less for dependent data:
  ◦ Resnick and Stǎricǎ (1995): consistency of Hill–type estimators.
  ◦ J. Hill (2006): asymptotic normality of Hill–type estimators under NED (near epoch dependence) conditions.
  ◦ ...
• Even for iid data, Hill plots are volatile and hard to interpret: the “Hill horror plot”.

Another approach: max self–similarity
• For iid $(X_k)$ with tail exponent α,
$$\frac{1}{n^{1/\alpha}} \bigvee_{i=1}^{n} X_i \xrightarrow{d} Z, \quad \text{as } n \to \infty, \quad \text{where } P\{Z \le x\} = \exp\{-C x^{-\alpha}\},\ x > 0.$$
  ◦ The above continues to hold for many dependent stationary $(X_k)$!
• Given $X_1, \ldots, X_n$, set
$$D(j,k) := \bigvee_{i=1}^{2^j} X_{2^j(k-1)+i}, \quad 1 \le k \le n_j := [n/2^j], \quad 1 \le j \le \log_2(n),$$
to be the block–maxima of dyadic sizes.
  ◦ Observe that
$$Y_j := \frac{1}{n_j} \sum_{k=1}^{n_j} \log_2 D(j,k) \simeq E \log_2\big(2^{j/\alpha} Z\big) = j/\alpha + E \log_2 Z, \quad \text{as } j \to \infty.$$

The max–spectrum: iid asymptotics
The $Y_j$'s, $1 \le j \le \log_2 n$, are the max–spectrum of the data set $(X_k,\ 1 \le k \le n)$.
• An estimator of α is then derived from the $Y_j$ via regression on the scales j. The weighted sum below estimates the slope H = 1/α (the H reported in the figures), so
$$\hat\alpha = \hat\alpha[j_1, j_2] := \Big( \sum_{j=j_1}^{j_2} w_j Y_j \Big)^{-1}, \quad \text{with } \sum_j w_j = 0, \quad \sum_j j\,w_j = 1.$$
• For iid data: the estimator $\hat\alpha[j_1, j_2]$ is consistent and asymptotically normal, as $j_1, j_2 \to \infty$ such that $n/2^{j_1},\ n/2^{j_2} \to \infty$.

Thm [S., Michailidis & Taqqu (2006)] Consider iid data under second order tail regularity conditions. Let $1 \le r(n) \le \log_2 n$ be such that
$$\sqrt{n}\,/\,2^{r(n)(1/2+\beta/\alpha)} + r(n)\,2^{r(n)/2}/\sqrt{n} \longrightarrow 0, \quad \text{as } n \to \infty.$$
Then
$$\sup_{x \in \mathbb{R}} \Big| P\Big\{ \sqrt{n_{j_2+r(n)}}\,\big( \langle \vec\theta, \vec Y \rangle - \langle \vec\theta, \vec\mu_r \rangle \big) \le x \Big\} - \Phi(x/\sigma_{\vec\theta}) \Big| \longrightarrow 0, \quad n \to \infty.$$

The max–spectrum: iid asymptotics (cont’d)
Here $\vec Y = (Y_{j+r(n)})_{j=j_1}^{j_2}$, $\vec\theta = (\theta_j)_{j=j_1}^{j_2}$,
$$\vec\mu_r = \big( (j + r(n))/\alpha + C,\ j_1 \le j \le j_2 \big), \quad \text{and} \quad \sigma_{\vec\theta}^2 = \alpha^{-2}\, \vec\theta^{\,t} \Sigma_1 \vec\theta.$$
Remarks:
• The β > 0 governs the “second order” tail behavior. Roughly: $P\{X > x\} \sim C x^{-\alpha}(1 + D x^{-\beta})$, as x → ∞.
• The asymptotic covariance matrix $\Sigma_1$ is the same as for 1−Fréchet data.
  ◦ It does not depend on α, and $C = E \log_2 Z$.
• Consistency and asymptotic normality for $\hat\alpha[r(n) + j_1, r(n) + j_2]$ follow.
  ◦ The rates are the same as for the Hill estimator, see Hall (1982).
• The explicit asymptotic covariance $\alpha^{-2}\Sigma_1$ of the max–spectrum $\vec Y$ yields the optimal linear GLS estimators, which is important in practice.
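
The construction above (dyadic block–maxima D(j, k), their log–averages Y_j, and a regression over scales j1..j2) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the sample size, the scale range [5, 10] and the choice of ordinary least squares weights are all assumptions made here. The weighted sum of the Y_j estimates the slope H = 1/α, so the sketch returns its reciprocal.

```python
import numpy as np

def max_spectrum(x):
    """Return scales j = 1..floor(log2 n) and Y_j, the mean of log2 block-maxima."""
    x = np.asarray(x, dtype=float)      # data are assumed strictly positive
    n = x.size
    J = n.bit_length() - 1              # floor(log2 n)
    scales = np.arange(1, J + 1)
    Y = np.empty(J)
    for j in scales:
        m = 2 ** int(j)                 # dyadic block size 2^j
        nj = n // m                     # number of complete blocks n_j
        D = x[: nj * m].reshape(nj, m).max(axis=1)   # block-maxima D(j, k)
        Y[j - 1] = np.mean(np.log2(D))
    return scales, Y

def ols_weights(j1, j2):
    """OLS slope weights: sum_j w_j = 0 and sum_j j * w_j = 1."""
    j = np.arange(j1, j2 + 1, dtype=float)
    c = j - j.mean()
    return c / np.sum(c ** 2)

def alpha_hat(Y, j1, j2):
    """Reciprocal of the regression slope of Y_j on j over scales j1..j2."""
    w = ols_weights(j1, j2)
    H = np.dot(w, Y[j1 - 1 : j2])       # slope estimate, H = 1/alpha
    return 1.0 / H

# Illustration on iid Pareto data with P{X > x} = x^(-1.5), x >= 1 (assumed setup).
rng = np.random.default_rng(0)
x = rng.pareto(1.5, size=2 ** 15) + 1.0   # Lomax draw + 1 gives the classical Pareto
scales, Y = max_spectrum(x)
w = ols_weights(5, 10)
alpha_est = alpha_hat(Y, 5, 10)
print(f"estimated alpha over scales [5, 10]: {alpha_est:.3f}")
```

Ordinary least squares is used only for simplicity. The GLS weights mentioned in the remarks would replace `ols_weights` and need the covariance matrix Σ1, which is not given here.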

The max–spectrum: dependent data
Let $(X_k)_{k \in \mathbb{Z}}$ be stationary, with tail exponent α and extremal index θ > 0.
• Then,
$$\frac{1}{n^{1/\alpha}} \bigvee_{1 \le k \le n} X_k \xrightarrow{d} \theta^{1/\alpha} Z, \quad \text{where} \quad \frac{1}{n^{1/\alpha}} \bigvee_{1 \le k \le n} X_k^* \xrightarrow{d} Z \quad (n \to \infty),$$
and $(X_k^*)$ are iid copies of $X_1$.
  ◦ Since θ > 0, the max–spectrum $(Y_j)$ for time series scales as for iid data:
$$Y_j \simeq j/\alpha + C, \quad \text{as } j \to \infty \text{ and } n_j = n/2^j \to \infty.$$
• The same regression–based estimators $\hat\alpha = \big( \sum_{j=j_1}^{j_2} w_j Y_j \big)^{-1}$ work!
• The asymptotics for $\hat\alpha$ are harder (than for iid data)!
  ◦ Intuition: the block–maxima $D(j,k)$, $1 \le k \le n_j$, are asymptotically iid, as j → ∞.

Max–spectrum illustration: TCP durations
[Figure: TCP Flow Sizes (bytes): Max self−similarity. Max−Spectrum against scales j, H = 0.924 (0.044637), α = 1.0822.]

Two asymptotic regimes
• Intermediate scales: Fix integers $j_1 < j_2$ and let
$$\hat\alpha_n = \hat\alpha[r(n) + j_1, r(n) + j_2], \quad \text{where } r(n) \to \infty \text{ and } 2^{r(n)}/n \to 0, \quad \text{as } n \to \infty.$$
  ◦ We expect to get consistency and asymptotic normality for $\hat\alpha_n$.
• Large scales: Fix ℓ ∈ N and focus on the largest ℓ + 1 scales:
$$\hat\alpha_n = \hat\alpha[\log_2 n - \ell, \log_2 n].$$
  ◦ We can only get “distributional consistency”: $\hat\alpha_n \xrightarrow{d} \alpha_Z$, as n → ∞, with $\alpha_Z$ a random variable.
• Both regimes are useful/interesting in practice.
• More details ...

Intermediate scales asymptotics
The regularity conditions: for $M_n := \max_{1 \le k \le n} X_k$,
$$P\{n^{-1/\alpha} M_n \le x\} = \exp\{-c(n,x)\,x^{-\alpha}\}, \quad x > 0,$$
where
$$|c(n,x) - c_X| \le c_1(x)\,n^{-\beta}, \quad \forall x > 0, \quad \text{with } c_1(x) = O(x^{-R}),\ x \downarrow 0. \qquad (1)$$
(Plus a technicality at x ≈ 0.)
  ◦ Intuition: β controls the second order tail behavior of $M_n$.
  ◦ Caveat: Relation (1) may be hard to verify! We have it for moving maxima.
• We get rates on moments of $f(M_n/n^{1/\alpha})$, in particular:

Thm [S. & Michailidis (2006)] Under the above conditions, for all k ∈ N,
$$\big| E \log^k (M_n/n^{1/\alpha}) - E \log^k (Z) \big| = O(n^{-\beta}), \quad \text{as } n \to \infty,$$
provided $\int_1^\infty c_1(x)\,x^{-\alpha-1+\delta}\,dx < \infty$, for some δ > 0.

Intermediate scales: asymptotic normality
Let $(X_k)$ be stationary with tail exponent α > 0.

Thm [S. & Michailidis (2006)] Under the above conditions, and if $(X_k)$ is m–dependent, we have
$$\sqrt{n_{r(n)}}\,(\hat\alpha_n - \alpha) \xrightarrow{d} N(0, \alpha^2 c_w), \quad \text{where } c_w = \vec w^{\,t} \Sigma_1 \vec w$$
and $\hat\alpha_n = \hat\alpha[r(n) + j_1, r(n) + j_2]$, provided
$$2^{r(n)}/n + n/2^{r(n)(1 + 2\min\{1,\beta\})} \longrightarrow 0, \quad \text{as } n \to \infty.$$
Remarks:
• The same asymptotic variance as in the iid case.
  ◦ Intuition: the block–maxima $D(j,k)$, $1 \le k \le n_j$, are asymptotically iid!
• β captures: second order tails PLUS dependence.
• Asymptotic confidence intervals available!
• Optimal linear GLS estimators available!

Large scales: distributional consistency
The regularity conditions and m–dependence are restrictive.
• As in Davis & Resnick (1985), let
$$X_k = \sum_{i=0}^{\infty} c_i\,\xi_{k-i}, \quad \text{where } \sum_i |c_i|^\delta < \infty, \quad 0 < \delta < \min\{1, \alpha\}.$$
  ◦ Here $(\xi_k)$ are iid and $P\{|\xi_1| > x\} \sim C x^{-\alpha}$, x → ∞, with
$$P\{\xi_1 > x\}/P\{|\xi_1| > x\} \to p \in [0,1], \quad \text{as } x \to \infty.$$

Lemma For $X_k(m) := \max_{1 \le i \le m} X_{m(k-1)+i}$, k = 1, 2, ..., we get
$$\{m^{-1/\alpha} X_k(m)\}_{k \in \mathbb{N}} \xrightarrow{fdd} \{Z_k\}_{k \in \mathbb{N}}, \quad \text{as } m \to \infty,$$
where $(Z_k)$ are iid α−Fréchet, provided $p \max_i c_i > 0$ or $(1-p) \max_i (-c_i) > 0$.
• This justifies the “asymptotic independence phenomenon” for the block–maxima $(D(j,k))_k$ as j → ∞!

Thm [S. & Michailidis (2006)] Under the above conditions, with fixed ℓ,
$$\hat\alpha_n \xrightarrow{d} \hat\alpha_{Z,\ell}, \quad \text{as } n \to \infty,$$
where $\hat\alpha_n = \hat\alpha[\text{top–}\ell\text{ scales}]$ and $\hat\alpha_{Z,\ell}$ is based on iid α−Fréchet data $Z_1, \ldots, Z_{2^{\ell+1}}$.

Distributional consistency: implications
• No consistency but confidence intervals!
• Covers more processes!
• The approximation is often valid for “small” n.

AR(1) with Pareto (α = 1.5) innovations
[Figure: AR(1) with Pareto innovations: φ = 0.9, α = 1.5. Panels: the time series; two Hill plots (α against order statistics k).]

The max–spectrum ...
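
An experiment in the spirit of the AR(1) example can be sketched as follows. Only φ = 0.9 and α = 1.5 come from the slides. The sample size, the seed, the Hill cutoff k = 1000, the scale range [10, 16] and all function names are assumptions chosen here for illustration, so the numbers it prints are not expected to match the figures.

```python
import numpy as np

def simulate_ar1_pareto(n, phi, alpha, rng, burn_in=1000):
    """X_k = phi * X_{k-1} + xi_k with iid Pareto(alpha) innovations xi_k >= 1."""
    xi = (rng.pareto(alpha, size=n + burn_in) + 1.0).tolist()
    x = [xi[0]]
    for e in xi[1:]:
        x.append(phi * x[-1] + e)
    return np.array(x[burn_in:])        # drop the burn-in segment

def hill(x, k):
    """Hill estimate of alpha from the k largest order statistics."""
    top = np.sort(x)[::-1][: k + 1]     # X_{1,n} >= ... >= X_{k+1,n}
    return 1.0 / (np.mean(np.log(top[:k])) - np.log(top[k]))

def max_spectrum_alpha(x, j1, j2):
    """Reciprocal of the OLS slope of Y_j on j over scales j1..j2."""
    js = np.arange(j1, j2 + 1)
    Y = []
    for j in js:
        m = 2 ** int(j)                 # dyadic block size
        nj = x.size // m                # number of complete blocks
        block_max = x[: nj * m].reshape(nj, m).max(axis=1)
        Y.append(np.mean(np.log2(block_max)))
    slope = np.polyfit(js, Y, 1)[0]     # slope estimates H = 1/alpha
    return 1.0 / slope

rng = np.random.default_rng(2007)
x = simulate_ar1_pareto(n=2 ** 20, phi=0.9, alpha=1.5, rng=rng)
alpha_hill = hill(x, k=1000)
alpha_ms = max_spectrum_alpha(x, j1=10, j2=16)
print(f"Hill (k = 1000): {alpha_hill:.3f}")
print(f"max-spectrum over scales [10, 16]: {alpha_ms:.3f}")
```

With these innovations the series has mean 30, so at small scales the block–maxima mostly reflect the bulk level of the series rather than its tail. That is why the sketch fits only the larger scales, which matches the requirement that j → ∞ in the asymptotics.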
[Figure: Max self−similarity: α = 1.4844. Max−Spectrum against scales j.]

Data examples: the advantage of time scales

Google: traded volume
[Figure: Transaction volumes for GOOG in November 2005 (number of shares against day of the month). Second panel: confidence intervals for α per day.]

Google: traded volume (the time series)
[Figure: Transaction volumes for GOOG: Nov 7, 2005 (number of shares). Hill plot (α against order statistics k). Max−Spectrum against scales j, α = 1.0729.]

Intel: traded volume
[Figure: Transaction volumes for INTC in November 2005 (number of shares against day of the month). Second panel: confidence intervals for α per day.]

Intel: strange time series
[Figure: Transaction volumes for INTC: Nov 23, 2005 (number of shares). Hill plot (α against order statistics k). Max−Spectrum against scales j, α(7,11) = 1.0578, α(12,16) = 5.2128.]

Intel: typical time series
[Figure: Transaction volumes for INTC: Nov 21, 2005 (number of shares). Hill plot (α against order statistics k). Max−Spectrum against scales j, α = 1.5564.]

References:
Davis, R. A. and Resnick, S. I. (1985) Limit theory for moving averages of random variables with regularly varying tail probabilities. The Annals of Probability 13 (1), 179–195.
Hall, P. (1982) On some simple estimates of an exponent of regular variation. Journal of the Royal Statistical Society, Series B 44, 37–42.
Hill, B. M. (1975) A simple general approach to inference about the tail of a distribution. The Annals of Statistics 3, 1163–1174.
Resnick, S. and Stǎricǎ, C. (1995) Consistency of Hill’s estimator for dependent data. Journal of Applied Probability 32, 139–167.
Stoev, S. and Michailidis, G. (2006) On the estimation of the heavy–tail exponent in time series using the max–spectrum. Technical Report, University of Michigan.
Stoev, S., Michailidis, G., and Taqqu, M. S. (2006) Estimating heavy–tail exponents through max self–similarity. Technical Report, University of Michigan.
WRDS. Wharton School of Management, University of Pennsylvania. https://wrds.wharton.upenn.edu/