1966

Martingale Transforms
D. L. Burkholder
The Annals of Mathematical Statistics, Vol. 37, No. 6 (Dec., 1966), pp. 1494-1504.

So, what is a martingale transform? Let $f_n$ be a martingale and $d_n$ the corresponding martingale difference: $d_n = f_n - f_{n-1}$. Think of these as prices and changes in prices. Then a martingale transform is

$$g_n = \sum_{i=1}^{n} c_i d_i$$

for a certain sequence $c_i$. You can think of it as the wealth process for a certain predetermined trading strategy.

The following example shows that a martingale transform of an $L_1$-bounded martingale may be unbounded in $L_1$. Let the probability space be the set of all positive integers, with the measure given by the formula

$$\Pr\{k\} = \frac{1}{k} - \frac{1}{k+1}.$$

Let

$$f_n(k) = \begin{cases} n, & \text{if } n < k, \\ -1, & \text{if } n \ge k, \end{cases}$$

where $n = 0, 1, \dots$. Since the probabilities telescope, $\Pr\{k > n\} = 1/(n+1)$. The sequence $f_n$ is an $L_1$-bounded martingale:

$$E f_n = n \Pr\{k > n\} + (-1)\Pr\{k \le n\} = n \cdot \frac{1}{n+1} - \left(1 - \frac{1}{n+1}\right) = 0,$$

$$E|f_n| = n \Pr\{k > n\} + \Pr\{k \le n\} = n \cdot \frac{1}{n+1} + \left(1 - \frac{1}{n+1}\right) < 2.$$

In particular, by one of Doob's theorems, the martingale converges almost surely; here the limit is $-1$. Clearly,

$$d_n(k) = \begin{cases} 1, & \text{if } n < k, \\ -k, & \text{if } n = k, \\ 0, & \text{if } n > k, \end{cases}$$

where $n = 1, 2, \dots$.

Consider the sequence $g_n$ which is the martingale transform of $f_n$ with the multiplier sequence $1, -1, 1, \dots$. Then

$$g_n(k) = \begin{cases} \dfrac{1 + (-1)^{n-1}}{2}, & \text{if } n < k, \\[2ex] \dfrac{1 + (-1)^{k}}{2} + (-1)^{k} k, & \text{if } n \ge k. \end{cases}$$

This is a martingale that is unbounded in $L_1$:

$$E g_n = \frac{1 + (-1)^{n-1}}{2} \Pr\{k > n\} + \sum_{k=1}^{n} \left[\frac{1 + (-1)^{k}}{2} + (-1)^{k} k\right] \Pr\{k\} = 0.$$

However,

$$E|g_n| \sim \sum_{k=1}^{n} k \Pr\{k\} \sim \log n.$$

Of course, this martingale still converges almost surely.

Intuitively, in this example there is a random crash date. Before the crash, the wealth of the gambler grows steadily in an almost non-random fashion, provided that she puts 1 unit of capital in the market every period. At the crash date she loses everything and stays at $-1$ forever. (This is reminiscent of the recent financial history of the US, except for the staying at $-1$ forever.)

The gambler who uses the strategy $1, -1, 1, \dots$ is not very wise from the traditional financial point of view.
He bets alternately for and against the market despite the clear trend. As a consequence, his wealth fluctuates around zero until, at the crash date, he either earns a huge amount of money if he holds a short position, or loses an equally huge amount of money and commits suicide if he is long. (After that, his wealth, or rather the logarithm of his wealth, is constant at a rather large negative number.)

The results of Burkholder show that what is important for the eventual convergence of the wealth process is not its $L_1$-boundedness, which does not always hold in realistic situations, but rather that the wealth is a bounded transform of a bounded martingale process. Intuitively, this should apply to a very wide class of financial markets. (In certain situations the price process itself may become unbounded, and in this case there is little hope for almost sure convergence.)

Theorem 1. Suppose that $g$ is a transform of an $L_1$-bounded martingale $f$. Then $g$ converges almost everywhere on the set where the maximal function $c^* = \sup_n |c_n|$ of the multiplier sequence $c$ is finite.

The proof goes in several steps. First, it is shown that a bounded transform of an $L_2$-bounded martingale is also $L_2$-bounded, so it converges almost surely. Then it is shown that if $g$ is a bounded transform of a uniformly bounded submartingale, then it converges almost surely. Finally, every $L_1$-bounded martingale can be represented as a difference of non-negative martingales, and for each of those we can define

$$\hat{f}_n = -\min(f_n, C).$$

This is a uniformly bounded submartingale, and the theorem follows when $C$ is sent to infinity.

Now, define

$$S(f) = \left(\sum_{n=1}^{\infty} d_n^2\right)^{1/2}.$$

Theorem 2. If $f$ is a martingale such that $E S(f) < \infty$, then $f$ converges almost everywhere.

The proof is based on an ingenious argument that shows that $f$ can be represented as a martingale transform of an $L_1$-bounded martingale. Let $r_k(t)$ be the Rademacher functions on the unit interval.
Then, writing $S_n(f) = \left(\sum_{k=1}^{n} d_k^2\right)^{1/2}$,

$$\int_0^1 E\left|\sum_{k=1}^{n} r_k(t) d_k\right| dt \le E\left[\int_0^1 \left|\sum_{k=1}^{n} r_k(t) d_k\right|^2 dt\right]^{1/2} = E S_n(f).$$

Here the $d_k$ are considered as Fourier coefficients with respect to the orthonormal system $r_k$, and the equality in the last line is the Parseval equality. It follows that

$$\sup_n \int_0^1 E\left|\sum_{k=1}^{n} r_k(t) d_k\right| dt \le E S(f).$$

For each fixed $t$, the sums $\sum_{k=1}^{n} r_k(t) d_k$ form a martingale in $n$, so the expectation of their absolute value is nondecreasing in $n$, and by monotone convergence the supremum can be moved inside the integral. Consequently, for some $t$,

$$\sup_n E\left|\sum_{k=1}^{n} r_k(t) d_k\right| \le E S(f).$$

For this $t$, the sum $\sum_{k=1}^{n} r_k(t) d_k$ defines an $L_1$-bounded martingale $g_n$, and $f_n$ is a martingale transform of this martingale: since $r_k(t)^2 = 1$, we have $d_k = r_k(t) \cdot r_k(t) d_k$, so the multiplier sequence is $r_k(t)$, which is bounded by 1. Therefore the conclusion of the theorem follows from Theorem 1.

The next theorem gives another operational criterion for almost sure convergence.

Theorem 3. Suppose that $f$ and $g$ are martingales relative to the same sequence of sigma fields. If $f$ is $L_1$-bounded and $S_n(g) \le S_n(f)$ for all $n \ge 1$, then $g$ converges almost everywhere.

Radon-Nikodym Derivatives of Gaussian Measures
L. A. Shepp
The Annals of Mathematical Statistics, Vol. 37, No. 2 (Apr., 1966), pp. 321-354.

The paper gives a formula for the Radon-Nikodym derivative (i.e., the likelihood ratio function) of one Wiener-equivalent measure relative to another. First of all, we need a condition for the equivalence of a measure to the Wiener measure. Suppose that $\mu$ is a Gaussian measure with mean $m$ and covariance $R$, and let $\mu_W$ be the standard Wiener measure.

Theorem 4. $\mu \sim \mu_W$ if and only if there exists a kernel $K \in L_2(\mathbb{R}_+ \times \mathbb{R}_+)$ for which

$$R(s, t) = \min(s, t) - \int_0^s \int_0^t K(u, v)\, du\, dv,$$

and whose spectrum does not contain 1, and a function $k \in L_2(\mathbb{R}_+)$ for which

$$m(t) = \int_0^t k(u)\, du.$$

The kernel $K$ is unique and symmetric and is given by

$$K(s, t) = -\frac{\partial}{\partial s} \frac{\partial}{\partial t} R(s, t)$$

for almost every $(s, t)$. The function $k$ is unique and is given by $k(t) = m'(t)$ for almost every $t$.

The condition on the spectrum is essentially the condition of positive definiteness of $R$. Assume now that $\mu$ is equivalent to $\mu_W$. Since 1 does not belong to the spectrum, we can define the resolvent $H$:

$$H = (I - K)^{-1}.$$

The formula for the Radon-Nikodym derivative is given by the following theorem.

Theorem 5.
If $K$ is continuous and of trace class, then

$$\frac{d\mu}{d\mu_W}(X) = \frac{1}{\sqrt{d}} \exp\left[-\frac{1}{2} \int_0^T \int_0^T H(s, t)\, dX(s)\, dX(t) + \int_0^T k(u)\, dX(u) - \frac{1}{2} \int_0^T k^2(u)\, du\right],$$

where

$$d = \prod_j (1 - \lambda_j),$$

and $\lambda_j$ are the eigenvalues of $K$.

This theorem can be thought of as a certain variant of the Ito formula. This version is especially suitable for estimation problems. One example is a shift of the Wiener measure. Let $Z(t) = -\alpha t + W(t)$. Then

$$\frac{d\mu_Z}{d\mu_W}(X) = e^{-\alpha X(T)} e^{-\frac{1}{2} \alpha^2 T}.$$

This formula allows us to calculate probabilities for the shifted measure from the corresponding probabilities for the original Wiener measure.

For another example, consider the time-changed Wiener process:

$$Z(t) = \frac{1}{\sqrt{h'(t)}} W(h(t)).$$

The Radon-Nikodym derivative of the measure defined by this process with respect to the original measure is given in the following theorem.

Theorem 6. $Z \sim W$ if and only if $g = (h')^{-1/2}$ is absolutely continuous and $g' \in L_2$. For smooth $h$, the Radon-Nikodym derivative is given by the formula

$$\frac{d\mu_Z}{d\mu_W}(X) = \left(\frac{h'(T)}{h'(0)}\right)^{1/4} \exp\left\{-\frac{X^2(T)}{4} \frac{h''(T)}{h'(T)} - \frac{1}{2} \int_0^T f(t) X^2(t)\, dt\right\},$$

where

$$f = -\frac{1}{2} \left(\frac{h''}{h'}\right)' + \frac{1}{4} \left(\frac{h''}{h'}\right)^2.$$

With the help of the Radon-Nikodym derivatives, it is possible to calculate certain Wiener integrals of exponentials of quadratic forms. Let

$$A(f) = E_W \exp\left(-\frac{1}{2} \int_0^T f(t) X^2(t)\, dt\right).$$

We want to calculate $A(f)$ explicitly. Define $g$ by the following equation:

$$\frac{d^2 g}{dt^2} = f g, \qquad g'(T) = 0, \qquad g > 0 \text{ on } [0, T).$$

Then

$$A(f) = \sqrt{\frac{g(T)}{g(0)}}.$$

If there is no positive solution, then $A(f) = +\infty$. (This problem can also be approached with the use of the Feynman-Kac formula.)

Pooling Cross Section and Time Series Data in the Estimation of a Dynamic Model: The Demand for Natural Gas
Pietro Balestra; Marc Nerlove
Econometrica, Vol. 34, No. 3 (Jul., 1966), pp. 585-612.

This is the seminal paper about panel data with a dynamic time structure. Dynamics means that lags of the dependent variable are included as explanatory variables.
The paper points out that regular pooled OLS gives inconsistent estimates because of the dynamic structure: the lagged dependent variable is correlated with the individual-specific part of the error. It suggests overcoming the difficulty by a two-stage procedure. In the first stage, a consistent estimate of the parameters is obtained by an instrumental regression that avoids using lagged dependent variables. Next, the structure of the covariance matrix is estimated, including the degree of dynamic correlation. Finally, the regression is estimated by maximum likelihood, GLS, or a similar method. The method is illustrated by estimating consumer demand for natural gas.
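The inconsistency of pooled OLS and the first-stage instrumental regression described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: a toy model $y_{it} = \rho y_{i,t-1} + \beta x_{it} + \mu_i + \varepsilon_{it}$ with independent standard normal $x$, $\mu$, and $\varepsilon$, and the lagged regressor $x_{i,t-1}$ used as the instrument for $y_{i,t-1}$. The function names and all parameter values are hypothetical, and the later covariance-estimation and GLS steps are not shown.

```python
import random


def simulate_panel(n_units=1000, n_periods=10, rho=0.5, beta=1.0,
                   burn_in=30, seed=0):
    """Simulate the assumed toy model y_it = rho*y_{i,t-1} + beta*x_it + mu_i + eps_it."""
    rng = random.Random(seed)
    rows = []  # each row: (y_it, y_{i,t-1}, x_it, x_{i,t-1})
    for _ in range(n_units):
        mu = rng.gauss(0.0, 1.0)  # individual effect, constant over time
        y_prev, x_prev = 0.0, 0.0
        for t in range(burn_in + n_periods):
            x = rng.gauss(0.0, 1.0)  # exogenous regressor
            y = rho * y_prev + beta * x + mu + rng.gauss(0.0, 1.0)
            if t >= burn_in:  # keep only periods after the burn-in
                rows.append((y, y_prev, x, x_prev))
            y_prev, x_prev = y, x
    return rows


def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ coef = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det)


def estimate(rows, use_instrument):
    """Return (rho_hat, beta_hat) solving (Z'X) coef = Z'y, without intercept.

    use_instrument=False: Z = X = (y_{t-1}, x_t), i.e. pooled OLS.
    use_instrument=True:  Z = (x_{t-1}, x_t), so lagged x instruments lagged y.
    """
    zx = [[0.0, 0.0], [0.0, 0.0]]
    zy = [0.0, 0.0]
    for y, y_lag, x, x_lag in rows:
        regressors = (y_lag, x)
        instruments = (x_lag, x) if use_instrument else regressors
        for i in range(2):
            zy[i] += instruments[i] * y
            for j in range(2):
                zx[i][j] += instruments[i] * regressors[j]
    return solve_2x2(zx, zy)


if __name__ == "__main__":
    data = simulate_panel()
    print("pooled OLS (rho, beta):", estimate(data, use_instrument=False))
    print("IV         (rho, beta):", estimate(data, use_instrument=True))
```

In this toy setup the pooled OLS estimate of $\rho$ should come out well above the true value, because $y_{i,t-1}$ contains $\mu_i$, while the instrumental estimate should land close to it. In the procedure described above, such a first-stage estimate would then feed the estimation of the covariance structure and the final GLS or maximum likelihood step.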