Ergodic Theorems for Lower Probabilities S.Cerreia–Vioglio, F. Maccheroni, and M. Marinacci Abstract. We establish an Ergodic Theorem for lower probabilities, a generalization of standard probabilities widely used in applications. As a byproduct, we provide a version for lower probabilities of the Strong Law of Large Numbers. 1. Introduction The purpose of this paper is to state and prove an Ergodic Theorem for lower probabilities: a class of monotone set functions that are not necessarily additive and are widely used in applications where standard additive probabilities turn out to be inadequate (for applications in Economics see Marinacci and Montrucchio [16], for applications in Statistics see Walley [19]). We consider a measurable space ( ; F), endowed with an FnF-measurable transformation : ! , and a (continuous) lower probability : F ! [0; 1]. We study three di¤erent notions of invariance for lower probabilities (De…nitions 1-3).1 They are equivalent in the additive case, and so are genuine generalizations to the nonadditive setting of the usual concept of invariance. The most natural de…nition of invariance for a lower probability (De…nition 1) requires that 1 (A) = (A) 8A 2 F: It is the weakest form of invariance for the nonadditive case. Nevertheless, it is still possible to derive a version of the Ergodic Theorem (Theorem 2). In other words, if is an invariant lower probability, then for each real valued, bounded, and measurable function f : ! R the limit n lim n 1X f n k 1 (!) k=1 exists on a set that has measure 1 with respect to . If, in addition, is ergodic, we are able to provide bounds for such limit in terms of lower and upper Choquet integrals. Corresponding Author: Fabio Maccheroni <[email protected]>, U. Bocconi, via Sarfatti 25, 20136, Milano, ITALY. The authors gratefully acknowledge the …nancial support of MIUR (PRIN grant 20103S5RN3_005). 1 In the arXiv working paper version (arXiv:1508.01329), using the techniques of [4] and [5], we study a fourth notion of invariance arising naturally from the literature on Robust Statistics. 1 2 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI Under the stronger notions of invariance (De…nitions 2 and 3), the previous result can be strengthened in several ways. First, we develop a nonadditive version of Kingman’s super-subadditive ergodic theorem (Theorem 3). Second, when ( ; F) is a standard measurable space we can better characterize the limit of time averages (Corollary 2). As an application of our main result, we establish a nonadditive version of the Strong Law of Large Numbers (Theorem 4) for stationary and ergodic processes. 2. Mathematical Preliminaries 2.1. Set functions. Consider a measurable space (S; ), where S is a nonempty set and is a -algebra of subsets of S. Subsets of S are understood to be in even when not stated explicitly. A set function : ! [0; 1] is (i) a capacity if (;) = 0, (S) = 1, and (A) (B) for all A and B such that A B; (ii) convex if (A [ B) + (A \ B) (A) + (B) for all A and B; (iii) additive if (A [ B) = (A) + (B) for all disjoint A and B; (iv) continuous if limn!1 (An ) = (A) whenever either An # A or An " A; (v) continuous at S if limn!1 (An ) = (S) whenever An " S; (vi) a probability if it is an additive capacity; (vii) a probability measure if it is a probability which is continuous at S. We denote by (S; ) the set of all probabilities on and by (S; ) the set of all probability measures on . We endow both sets with the relative topology induced by the weak* topology.2 A set function : ! [0; 1] is (viii) a lower probability (measure) if there exists a compact set M such that (A) = min P (A) 8A 2 : (S; ) P 2M Given a capacity , its conjugate ! [0; 1] is given by (Ac ) (A) = 1 It is immediate to verify that if : is a lower probability, then (A) = max P (A) P 2M The core of a capacity 8A 2 : 8A 2 : is the weak* compact set de…ned by core ( ) = fP 2 (S; ) : P g; that is, the core is the collection of all probabilities that setwise dominate . A capacity : ! [0; 1] is (ix) exact if core ( ) 6= ; and (A) = minP 2core( ) P (A) for each A. 2 Recall that a net fP g 2I converges to P , in the weak* topology, if and only if P (A) ! P (A) for all A 2 . The weak* topology is thus the restriction to (S; ) of the topology (ba (S; ) ; B (S; )) where B (S; ) is the space of all real valued, bounded, and -measurable functions on S and ba (S; ) is the set of all bounded and …nitely additive set functions on . In the case of S being a Polish space and the Borel -algebra, the above topology should not be confused with the topology generated by real valued, bounded, and continuous functions on S. ERGODIC THEOREM S FOR LOW ER PROBABILITIES 3 If is a convex capacity continuous at S, then is exact and ; 6= core ( ) (S; ) (see [6, Lemma 2 and Theorem 1], [17, Theorem 3.2], and [16, Theorem 4.2 and Theorem 4.7]). In particular, is a lower probability where M = core ( ). Conversely, if is a lower probability, then is exact, continuous at S, and M core ( ) (S; ). Nevertheless, being exact does not automatically imply being convex. An exact capacity continuous at S is continuous. Finally, we say that a statement about a random element holds a:s: if and only if there exists an event A such that (A) = 1 and the statement holds for all s 2 A. 2.2. Integrals. We denote by B (S; ) the set of all bounded and -measurable functions from S to R. A capacity induces a functional on B (S; ) via the Choquet integral, de…ned for all f 2 B (S; ) by: Z Z 1 Z 0 fd = (fs 2 S : f (s) tg) dt + [ (fs 2 S : f (s) tg) (S)] dt S 1 0 where the right hand side integrals are (improper) Riemann integrals. If is additive, then the Choquet integral reduces to the standard additive integral. It is also R R routine to check that f d = f d for all f 2 B (S; ). It is well known S S (see [6, Lemma 2], [18, Proposition 3], and [16, Theorem 4.7]) that if is a convex capacity, then Z Z Z Z f d = min f dP and f d = max f dP 8f 2 B (S; ) : S P 2core( ) S P 2core( ) S S In the rest of the paper, we consider two measurable spaces (S; ). The …rst one is ( ; F) which we interpret as the space where ultimately uncertainty lives. As for the second one, given a real valued and F-measurable stochastic process ffn gn2N on , we consider the space RN ; (C) , which we interpret as the space of observations endowed with the -algebra generated by the algebra of cylinders C. 3. Ergodic Theorems 3.1. Invariant Capacities. In this section, we consider a measurable space ( ; F). We also consider a transformation : ! which is F=F-measurable. Recall that a probability measure P is ( -)invariant if and only if (3.1) P (A) = P 1 (A) 8A 2 F: We denote by I the set of all probability measures that satisfy (3.1) and by G the set of all invariant events of F, that is, A 2 G if and only if A 2 F and 1 (A) = A. An invariant probability measure P is said to be ergodic if and only if P (G) = f0; 1g. Similarly, we say that a capacity is ergodic if and only if (G) = f0; 1g. We denote by S (I) the subset of I such that S (I) = fP 2 I : P (G) = f0; 1gg : If ( ; F) is a standard measurable space, then it can be checked that S (I) is the set of strong extreme points of I (see Dynkin [11]). Finally, following Dunford and Schwartz [10, pp. 723-724] (see also Dowker [8]), we say that a probability measure P is potentially ( -)invariant if and only if there exists a probability measure P^ 2 I such that P (E) = P^ (E) 8E 2 G: 4 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI We denote the set of potentially invariant probability measures by PI. Next, we propose three notions of ( -)invariance for a capacity. Definition 1. A capacity is invariant if and only if for each A 2 F 1 (A) = Definition 2. A capacity An M 1 1 (A) = (A) : is strongly invariant if and only if for each A 2 F 1 (A) nA and Definition 3. A lower probability I. (A) nA = An 1 (A) : is functionally invariant if and only if In what follows, we clarify the connection between these notions of invariance. Theorem 1. Let ( ; F) be a measurable space and a lower probability. Then, (i) =) (ii) =) (iii) where (i) is strongly invariant; (ii) is functionally invariant and core ( ) I; (iii) core ( ) I. Moreover, if is convex and continuous at , then statements (i), (ii), and (iii) are equivalent. As a corollary, we obtain that the three de…nitions coincide with the usual de…nition of invariance when is a probability measure. Finally, if is a functionally invariant lower probability, then it is immediate to see that is invariant.3 In the next section, we show that, if is an invariant lower probability, then its core must be contained in PI. 3.2. Ergodic Theorem. Given the notions of invariance previously discussed, we could then ask ourselves if suitable ergodic theorems can be developed for nonadditive probabilities. Given Subsection 3.1, an immediate dichotomy presents. In fact, the notion of invariance of De…nition 1 stands separate from, and it is actually weaker than, the other notions of strong and functional invariance, even in the convex case. Theorem 2 only assumes the weak form of invariance of De…nition 1. On the other hand, Corollary 2 assumes strong invariance. Strong invariance, paired with the convexity of and ( ; F) being standard, allows us to provide a sharper version of Theorem 2. Theorem 2. Let ( ; F) be a measurable space and a lower probability. If is invariant, then for each f 2 B ( ; F) there exists f ? 2 B ( ; G) such that n lim n 1X f n k 1 (!) = f ? (!) a:s: k=1 Moreover, if is ergodic, then ( Z !2 : f ?d n 1X lim f n n k=1 k 1 (!) Z ? f d )! 3 Since is a functionally invariant lower probability, we have that M 1 (A) = 1 (A) for all A 2 F . minP 2M P (A) = minP 2M P = 1: I and (A) = ERGODIC THEOREM S FOR LOW ER PROBABILITIES 5 As a corollary, we are able to show a necessary property that core ( ) of an invariant lower probability must satisfy (cf. Theorem 1). Clearly, it is not a characterization since it is well known that there are probability measures that are potentially invariant, but not invariant. Corollary 1. If a lower probability is invariant, then core ( ) PI. As a second corollary, we discuss the ergodic theorem for convex and strongly invariant capacities. Compared to Theorem 2, the following corollary assumes convex and a stronger form of invariance that, in turn, yield a limit function f ? which has more properties. These properties naturally generalize the ones found in the Individual Ergodic Theorem of Birkho¤. In this case, convergence of empirical averages is a simple consequence of Birkho¤’s theorem applied to each probability in core ( ). Nevertheless, the relation between f and f ? in terms of Choquet expectations is not immediate at …rst sight. A similar comment applies to Theorem 3. Corollary 2. Let ( ; F) be a standard measurable space and a convex capacity continuous at . If is strongly invariant, then for each f 2 B ( ; F) there exists f ? 2 B ( ; G) such that n 1X f k 1 (!) = f ? (!) a:s: (3.2) lim n n k=1 Moreover, (1) For each P 2 I, f ? is a version of the conditional expectation of f given G. R ? R (2) f d = fd . (3) If is ergodic, then ( )! Z Z n 1X k 1 !2 : fd lim f (!) fd = 1: n n k=1 3.3. Subadditive Ergodic Theorem. Next we turn to a Subadditive Ergodic Theorem for lower probabilities. Definition 4. A sequence fSn gn2N of F-measurable random variables is superadditive (resp., subadditive) if and only if Sn+k n Sn + Sk (resp., ) 8n; k 2 N: The sequence fSn gn2N is additive if and only if it is superadditive and subadditive. Consider an F-measurable function f : (3.3) Sn = n X f ! R. If we de…ne fSn gn2N by k 1 k=1 8n 2 N; then we have that fSn gn2N is an additive sequence. The opposite is also true, that is, if fSn gn2N is additive, then it takes the form (3.3) for some F-measurable real valued function f . On the other hand, if we take fSn gn2N as in (3.3) and we consider fjSn jgn2N we obtain a genuine subadditive sequence. Note that if f 2 B ( ; F), then we also have that there exists 2 R such that (3.4) Similarly, we have that n n Sn (!) jSn j n 8! 2 : n for all n 2 N. 6 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI Theorem 3. Let ( ; F) be a standard measurable space and a lower probability. If fSn gn2N is either a superadditive or a subadditive sequence that satis…es (3.4) and if is functionally invariant, then there exists f ? 2 B ( ; G) such that lim n Sn = f? n a:s: Moreover, (1) If R is convex and strongly invariant and fSn gn2N superadditive, then R Sn f ? d = supn2N n d . R ? (2) If is convex and strongly invariant and fSn gn2N subadditive, then f d = R Sn inf n n d . (3) If is ergodic and fSn gn2N is either subadditive or superadditive, then Z Z Sn (!) f ?d !2 : = 1: f ?d lim n n 4. Strong Law of Large Numbers As an application of Theorem 2, we provide a nonadditive version of the Strong Law of Large Numbers. Before doing so, we need to introduce some notation and terminology. Consider a sequence of real valued, bounded, and measurable random variables f = ffn gn2N B ( ; F). We denote by T the tail -algebra \ (fk ; fk+1 ; :::). k2N Definition 5. Given a capacity , we say that f = ffn gn2N is stationary if and only if for each n 2 N, for each k 2 N0 , and for each Borel subset B of Rk+1 (4.1) (f! 2 : (fn (!) ; :::; fn+k (!)) 2 Bg) = (f! 2 : (fn+1 (!) ; :::; fn+k+1 (!)) 2 Bg) : This notion generalizes the usual notion of stationary stochastic process by allowing for the nonadditivity of the underlying probability measure. Recall that RN ; (C) denotes the space of sequences endowed with the -algebra generated by the algebra of cylinders. We denote a generic element of RN by x. We also consider the shift transformation : RN ! RN de…ned by (x) = (x2 ; x3 ; x4 ; ::::::) 8x 2 RN : The sequence ffn gn2N induces a natural (measurable) map between ( ; F) and RN ; (C) , de…ned by ! 7! f (!) = (f1 (!) ; :::; fk (!) ; :::) De…ne f : (C) ! [0; 1] by f (C) = f 1 (C) 8C 2 8! 2 : (C) : Definition 6. Given a capacity , we say that f = ffn gn2N is ergodic if and only if f is ergodic with respect to the shift transformation. Lemma 1. If is a convex capacity continuous at and f is stationary, then is a convex capacity continuous at RN which is shift invariant. Moreover, f is ergodic if (T ) = f0; 1g. f ERGODIC THEOREM S FOR LOW ER PROBABILITIES 7 This observation is a …rst step to deduce the Strong Law of Large Numbers as a corollary of Theorem 2 applied to f . In a nutshell, the assumption of stationarity yields that the limit n 1X lim fk n n k=1 exists -a.s. In order to obtain also a characterization of the limit in terms of the (Choquet) expected value, we further need f to be ergodic. Theorem 4. Let be a convex capacity continuous at . If f = ffn gn2N is stationary and ergodic, then )! ( Z Z n 1X fk (!) f1 d = 1: !2 : f1 d lim n n k=1 We close by observing that there are few but important di¤erences with the nonadditive Strong Law of Large Numbers of Marinacci [15] and Maccheroni and Marinacci [14]. In terms of hypotheses, we weaken the assumption of total monotonicity of to convexity, while we replace the i.i.d hypothesis of [15] with stationarity and ergodicity. Finally, compared to the main result of [14], we need to assume the continuity of . In turn, we obtain that empirical averages exist -a.s., a property that was not present in previous works. The bounds for these empirical averages are in terms of the lower and the upper Choquet integrals of the random variable f1 , as in [15] and [14]. Appendix A. Dynkin Spaces and Nonadditive Probabilities Consider a standard measurable space ( ; F) and a transformation : ! which is FnF measurable. Recall that we denote by I the set of all invariant probability measures. If I is a nonempty set, then the triple ( ; F; I) forms a Dynkin space. Definition 7 (Dynkin, 1978). Let P be a nonempty subset of ( ; F) where ( ; F) is a separable measurable space. The triple ( ; F; P) is a Dynkin space if and only if there exist a sub- -algebra G F, a set W 2 F, and a function p: F (A; !) ! 7! [0; 1] p (A; !) such that: (a) for each P 2 P and A 2 F, p (A; ) : ! [0; 1] is a version of the conditional probability of A given G; (b) for each ! 2 , p ( ; !) : F ! [0; 1] is a probability measure; (c) P (W ) = 1 for all P 2 P and p ( ; !) 2 P for all ! 2 W . It is not hard to check that, given f 2 B ( ; F), the function f^ : de…ned by Z (A.1) f^ (!) = f dp ( ; !) 8! 2 ; ! R, is a version of the conditional expected value of f given G for all P 2 P, in particular, f^ 2 B ( ; G) (see also [5, Remark 13]). When ( ; F) is a standard measurable space, 8 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI if ( ; F; P) = ( ; F; I), then G is the set of invariant events. In particular, we can consider W = (see Gray [12, Theorem 8.3]). The next lemma is ancillary.4 Lemma 2. Let ( ; F) be a measurable space and G a sub- -algebra of F. If is a lower probability such that (G) = f0; 1g and g 2 B ( ; G), then Z Z !2 : gd g (!) gd = 1: Appendix B. Proofs Proof of Theorem 1. Recall that if (B.1) P is a lower probability, we have that 8P 2 core ( ) ( ; F) : (i) implies (ii). Pick A 2 F. Since is strongly invariant and , we have 1 1 1 (A) nA . (A) nA An 1 (A) = (A) nA = An 1 (A) 1 1 1 1 It follows that An (A) = An (A) = (A) nA = (A) nA = 1 k. By (B.1), we can conclude that P An 1 (A) = k = P (A) nA for all 1 P 2 core ( ). This implies that P (A) = P An (A) + P A \ 1 (A) = 1 1 P (A) nA + P A \ 1 (A) = P (A) for all P 2 core ( ). (ii) implies (iii). It is trivial. Assume that is further convex and continuous at . Then, it is a lower probability. We are left to show: (iii) implies (i). Since is convex and core ( ) I, it follows that Z c 1 An 1 (A) + A[ (A) = 1 + 1A 1 1 (A) d Z = min 1 + 1A 1 1 (A) dP = 1: P 2core( ) Thus, we have that An 1 (A) = 1 A[ 1 (A) c 1 =1 (A) nA c = 1 (A) nA : 1 An analogous argument yields that (A) nA = An 1 (A) , proving the statement. Before proving Theorem 2, we provide an ancillary key result. Theorem 5. Let ( ; F) be a measurable space, a lower probability, and assume that the family I of invariant probability measures is not empty. The following statements are equivalent: (i) There exists P 2 I such that for each E 2 F P (E) = 1 =) lim k k (E) = 1; (ii) There exists P 2 I such that for each E 2 G P (E) = 1 =) (iii) For each E 2 G P (E) = 1 (E) = 1; 8P 2 I =) (E) = 1; 4 The proof is standard and available in the arXiv working paper version of this work (arXiv:1508.01329). ERGODIC THEOREM S FOR LOW ER PROBABILITIES 9 (iv) For each f 2 B ( ; F) there exists f ? 2 B ( ; G) such that n 1X lim f k 1 (!) = f ? (!) a:s:; n n k=1 (v) core( ) PI. k Proof. (i) implies (ii). If E 2 G, then (E) = E for all k 2 N, yielding the statement. (ii) implies (iii). It is trivial. (iii) implies (iv). Consider f 2 B ( ; F). De…ne f ? : ! R by n 1X f k 1 (!) f ? (!) = lim sup 8! 2 : n n k=1 De…ne f? : ! R by considering the lim inf. Since f 2 B ( ; F), it can be shown that f ? ; f? 2 B ( ; G). Consider the event ) ( n 1X k 1 f (!) exists = f! 2 : f ? (!) = f? (!)g E = ! 2 : lim n n k=1 ) ( n 1X k 1 ? f (!) = f? (!) : = ! 2 : f (!) = lim n n k=1 By Birkho¤’s Ergodic Theorem (see [2, Theorem 24.1]), we have that P (E) = 1 for all P 2 I. By assumption, this yields that (E) = 1. Since f was chosen to be generic, the statement follows. (iv) implies (v). Recall that for each P 2 core ( ), P (A) (A) for all A 2 F. By assumption, we can conclude that for each P 2 core ( ), for each f 2 B ( ; F) there exists f ? 2 B ( ; G) such that n 1X lim f k 1 (!) = f ? (!) P a:s: n n k=1 By [13, p. 964] (see also [10, Exercises 31 and 32, pp. 723–724]), it follows that P 2 PI. (v) implies (i). Since is a lower probability, it is continuous at and exact. By [16, Theorem 4.2], it follows that there exists a measure P 2 core ( ) such that for each A 2 F, for each " > 0, there exists > 0 such that (B.2) P (A) < =) Q (A) < " 8Q 2 core ( ) : It is immediate to show that P is such that for each A 2 F (B.3) P (A) = 0 =) Q (A) = 0 8Q 2 core ( ) : Since P 2 core ( ) PI, we have that there exists P 2 I such that P (E) = P (E) for all E 2 G. Consider E 2 F. Assume that P (E) = 1. It follows that P (E c ) = 0. k At the same time, de…ne Fn = [1 (E c ). Note that Fn # F 2 G. Since k=n P1 k P 2 I, it follows that P (F ) = limn P (Fn ) P (F1 ) (E c ) = k=1 P 0. It follows that P (F ) = 0, that is, P (F ) = 0. By (B.3), we have that Q (F ) = 0 for all Q 2 core ( ), that is, (F ) = 0. Since is a lower probability, satis…es the Fatou’s property, that is, 0 lim supk (Ak ) (lim supk Ak ) k for each sequence fAk gk2N F. This implies that 0 lim inf k (E c ) 10 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI k lim supk limk k (E c ) lim supk k (E c ) = (F ) = 0. We can conclude that k (E) = limk 1 (E c ) = 1, proving the statement. The proof of Theorem 2 uses some of the techniques common in Ergodic Theory (see, e.g., [7, Theorem 7]). Also, note that, given a capacity , we have that core ( ) = fP 2 ( ; F) : P g = fP 2 ( ; F) : Pg: Proof of Theorem 2. We …rst prove that, given the assumptions, ; 6=core( ) PI. In particular, this shows that I = 6 ;. Claim: Let be a lower probability. If particular, I = 6 ;. is invariant, then core ( ) PI. In Proof of the Claim. Since is invariant, is invariant. Since is a lower probability, is continuous at and, in particular, ; = 6 core ( ) ( ; F). Fix a Banach-Mazur limit (see [1, pag. 550]) : l1 ! R, that is, a functional from l1 to R such that: (1) (2) (3) (4) is linear; is positive; (x1 ; x2 ; :::) = (x2 ; x3 :::) for all x 2 l1 ; (x1 ; x2 ; :::) = limn xn for all x 2 c. Observe that (A) P (A) (A) for all P 2 core ( ) and all A 2 F. Fix P 2 core ( ), de…ne Pn : F ! [0; 1] by Pn (A) = n 1 1X P n k k=0 (A) 8A 2 F: k k Note that P (A) (A) = (A) for all A 2 F and for all k 2 N0 . Since core ( ) is convex, this implies that fPn gn2N core ( ). For each A 2 F, de…ne xA = (P1 (A) ; P2 (A) ; P3 (A) ; :::). Note that 0 xA 1N , thus, xA 2 l1 for all A 2 F. De…ne P^ : F ! [0; 1] by P^ (A) = (xA ) for all A 2 F. Since is positive, note that P^ is a well de…ned positive set function. Next, consider A; B 2 F such that A \ B = ;. Since fPn gn2N ( ; F), it follows that Pn (A [ B) = Pn (A) + Pn (B) for all n 2 N. Since is linear, this implies that P^ (A [ B) = (xA[B ) = (xA + xB ) = (xA ) + (xB ) = P^ (A) + P^ (B), proving k that P^ is additive. Next, consider A 2 G. Since (A) = A for all k 2 N0 , it follows that Pn (A) = P (A) for all n 2 N. Since maps convergent sequences into their limit, we have that P^ (A) = (xA ) = P (A). In particular, this implies that P^ ( ) = 1 and P^ (;) = 0. Up to now, we have proved that P^ 2 ( ; F) and P^ (A) = P (A) for all A 2 G. Since fPn gn2N core ( ), we have that xA (A) 1N . ^ Since is linear and positive, it follows that P (A) = (xA ) ( (A) 1N ) = (A) for all A 2 F, that is, P^ 2 core ( ). Since core ( ) ( ; F), we can conclude that P^ 2 ( ; F). We next show that P^ is invariant. Note that for each A 2 F and for each n 2 N Pn 1 (A) = n 1 1X P n n k 1 k=0 n+1 = Pn+1 (A) n (A) = 1 X n+1 P n n+1 k=0 1 P (A) : n k (A) 1 P (A) n ERGODIC THEOREM S FOR LOW ER PROBABILITIES 11 De…ne y = (P2 (A) ; P3 (A) ; :::). De…ne z = x 1 (A) y 2 l1 . Note that jzn j = 1 2 1 Pn (A) Pn+1 (A) P (A)j n jPn+1 (A) n for all n 2 N. It follows that limn zn = 0. Since satis…es properties 3, 1, and 4, we have that 1 ^ ^ x 1 (A) (xA ) = x 1 (A) (y) = j (z)j = P (A) P (A) = ^ 0, proving that P is invariant. Given the previous part of the proof, P^ 2 I and P 2 PI. Since P was arbitrarily chosen in core ( ), it follows that I 6= ; and core ( ) PI. By the previous claim and Theorem 5, the main statement follows. Finally, assume that is further ergodic. By Lemma 2 and since f ? 2 B ( ; G) and is an ergodic lower probability, it follows that Z Z ? ? !2 : f d f (!) f ?d = 1: Since ( !2 : f ? (!) = limn ability, this implies that ( Z !2 : f ?d 1 n n X f k 1 (!) k=1 )! n 1X lim f n n k=1 k 1 (!) = 1 and Z ? f d is a lower prob)! = 1; proving the statement. Proof of Corollary 1. It is the proof of the claim contained in the proof of Theorem 2. We next proceed by proving Theorem 3 and obtaining Corollary 2 as a corollary of this former result. It is also possible to provide a proof of Corollary 2 as a consequence of Theorem 2. Lemma 3. Let fSn gn2N be a superadditive (resp., subadditive) sequence that satis…es (3.4) and M a compact subset If fan gn2N R R of invariant probability measures. in R is de…ned by an = minP 2M Sn dP (resp., an = maxP 2M Sn dP ) for all n 2 N, then fan gn2N is subadditive, that is, an+k an + ak for all n; k 2 N. Proof. Since fSn gn2N satis…es (3.4), fSn gn2N B ( ; F). We just prove the statement for the superadditive case, being the subadditive one similarly proven. If fSn gn2N is superadditive and M is a compactRsubset of invariant probability R measures, then we haveR that an+k = minRP 2M Sn+k dP minP 2M Sn + R n n Sk dP min S dP + min S dP = min S dP + P 2M n P 2M k P 2M n R minP 2M Sk dP = an ak for all n; k 2 N, proving the statement. Proof of Theorem 3. Since is a functionally invariant lower probability, we have that M I. De…ne ffn gn2N B ( ; F) by fn = Sn =n for all n 2 N. It follows that f^n 2 B ( ; G) for all n 2 N. Since fSn gn2N satis…es (3.4), it follows that there exists 2 R such that fn ; f^n for all n 2 N. De…ne f ? 2 B ( ; G) by f ? = ? ^ ^ supn2N fn (resp., f = inf n2N fn ). By Kingman’s Subadditive Ergodic Theorem (see Dudley [9, Theorem 10.7.1] n and [12, Theorem 8.4]) and o since W = , we Sn (!) ? ? ^ ! 2 : limn n = f (!) = 1 for all P 2 M. have that f = limn fn and P n o Sn (!) Since is a lower probability, it follows that ! 2 : limn n = f ? (!) = 1, proving the main part of the statement. 12 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI 1. If (B.4) is convex and strongly invariant, then we have that core ( ) Z Z f d = min f dP 8f 2 B ( ; F) : I and P 2core( ) R Consider the sequence fan gn2N de…ned by an = Sn d for all n 2 N. By (B.4) and Lemma 3, we have that fan gn2N is subadditive. It follows that (see [12, Lemma 8.3]) limn ann = inf n2N ann , that is, (B.5) lim n n o Recall that f^n n2N an an = sup : n n2N n is uniformly bounded. By Cerreia-Vioglio, Maccheroni, Mari- nacci, and Montrucchio [3, Theorem 22], (B.5), and the main part of the statement and since core ( ) I, we have that Z Z Z Z f^n dP f^n d = lim min f ?d = lim f^n d = lim n n n P 2core( ) R Z Z Sn d = lim min fn dP = lim fn d = lim n n n n P 2core( ) R Z Sn d an an = lim = sup = sup = sup fn d ; n n n n n2N n n2N proving point 1. 2. If (B.6) is convex and strongly invariant, then we have that core ( ) Z Z f d = max f dP 8f 2 B ( ; F) : I and P 2core( ) R Consider the sequence fan gn2N de…ned by an = Sn d . By (B.6) and Lemma 3, we have that fan gn2N is subadditive. It follows that (see [12, Lemma 8.3]) (B.7) lim n n o Recall that f^n n2N an an = inf : n n n is uniformly bounded. By [3, Theorem 22], (B.7), and the main part of the statement and since core ( ) I, we have that Z Z Z Z f ?d = lim f^n d = lim f^n d = lim max f^n dP n n n P 2core( ) R Z Z Sn d = lim max fn dP = lim fn d = lim n n n n P 2core( ) R Z Sn d an an fn d ; = lim = inf = inf = inf n n n n n n2N n proving point 2. 3. By Lemma 2 and since is ergodic, it follows that Z Z ? ? !2 : f d f (!) f ?d = 1: ERGODIC THEOREM S FOR LOW ER PROBABILITIES 13 n By the initial part of the proof, we have that ! 2 : f ? (!) = limn 1. Since is a lower probability, this implies that Z Z Sn (!) ? !2 : f d lim f ?d = 1; n n Sn (!) n o = proving the statement. Proof of Corollary 2. Pick f 2 B ( ; F). It is immediate to see that fSn gn2N , Pn k 1 de…ned by Sn = for all n 2 N, is an additive sequence which k=1 f satis…es (3.4). Since is convex, continuous at , and strongly invariant, it is a functionally invariant lower probability. De…ne ffn gn2N by fn = Sn =n for all n 2 N. Note that f^n = f^ for all n 2 N. By the proof of Theorem 3, we have that limn Snn = limn f^n = f^, a:s:, proving the main statement and point 1 where f ? = f^. R 2. Since is convexR and strongly invariant, then we have that core ( ) I and f d = minP 2core( ) f dP . By point 1 and since core ( ) I, we have that R R R R f d = minP 2core( ) f dP = minP 2core( ) f^dP = f^d , proving point 2. R R R R Note also that f d = maxP 2core( ) f dP = maxP 2core( ) f^dP = f^d . 3. By point 3 of Theorem 3 and the proof of point 2, the statement follows. Proof of Lemma 1. Consider a convex capacity and a process f . It is immediate to see that f is a convex capacity. Next, consider fCn gn2N (C) such that Cn " RN . It follows that the sequence fAn gn2N , de…ned by An = f 1 (Cn ) for all n 2 N, is such that An " . Since is continuous at , we have that limn f (Cn ) = limn f 1 (Cn ) = limn (An ) = 1, proving that f is continuous at RN . Next, consider C 2 C. Then, there exist k 2 N and E 2 B Rk such that C = x 2 RN : (x1 ; :::; xk ) 2 E . Note that 1 (C) = fx 2 RN : (x1 ; x2 ; :::; xk+1 ) 2 R Eg. Since f is stationary, it follows that f (C) = = = = f 1 (C) = (f! 2 (f! 2 f 1 (f! 2 : (f1 (!) ; :::; fk (!)) 2 Eg) : (f2 (!) ; :::; fk+1 (!)) 2 Eg) : (f1 (!) ; f2 (!) ; :::; fk+1 (!)) 2 R 1 (C) = f 1 Eg) (C) : Since C 2 C was arbitrarily chosen, it follows that C fC 2 (C) : f (C) = 1 (C) g (C). Since f is convex and continuous at RN , we have that fC 2 f 1 (C) : f (C) = f (C) g is a monotone class. By the Monotone Class Theorem 1 (see [2, Theorem 3.4]), it follows that (C) = C 2 (C) : f (C) = f (C) , 1 \ 1 that is, f is shift invariant. De…ne H = Ck+1 \ (C).5 Note that f 1 (H) k=1 T . Thus, f (H) = f0; 1g if (T ) = f0; 1g. Let G be the -algebra of shift invariant events. It is well known that G H. In light of these observations, it is immediate to see that if (T ) = f0; 1g, then f (G) = f0; 1g, that is, f is ergodic. 5C1 k+1 is the class of cylinders such that n C = x 2 RN : (x1 ; :::; xk ; xk+1 ; :::; xk0 ) 2 Rk where k0 > k and E 2 B(Rk 0 k ). E o 14 S.CERREIA–VIOGLIO, F. M ACCHERONI, AND M . M ARINACCI Proof of Theorem 4. By induction and since f is stationary, it follows that for each k 2 N and for each Borel subset B of R (B.8) (f! 2 : f1 (!) 2 Bg) = (f! 2 : fk (!) 2 Bg) : By (B.8), this implies that for each k 2 N and for each Borel subset B of R x 2 RN : x k 2 B f = (f! 2 : fk (!) 2 Bg) = (f! 2 : f1 (!) 2 Bg) : In particular, since ffn gn2N B ( ; F), it follows that there exists m 2 R such that m1 f1 m1 . If we replace B with [ m; m], then we can conclude that (B.9) x 2 RN : xk 2 [ m; m] = (f! 2 : f1 (!) 2 [ m; m]g) = 1 8k 2 N: f : RN ! R by De…ne x1 0 (x) = if x1 2 [ m; m] otherwise 8x 2 RN : It is immediate to see that 2 B RN ; (C) . Note also that (B.10) ) ( n n 1 1 \ \ 1X 1X k 1 N N (x) = xk : x2R : x 2 R : xk 2 [ m; m] n n n=1 k=1 k=1 k=1 By (B.9) and (B.10) and since f is a convex capacity which is further continuous at RN , it follows that ( )! 1 n n \ 1X 1X N k 1 (B.11) x2R : (x) = xk = 1: f n n n=1 k=1 k=1 By Theorem 2 and since f is shift invariant and ergodic, we have that there exists ? 2 B RN ; G such that (B.12)( )! Z Z n 1X N ? k 1 ? ? x2R : d f lim (x) = (x) d f = 1: f n n RN RN k=1 By (B.11) and (B.12) and since ( Z N ? (B.13) x 2 R : d f is convex, we can conclude that )! Z n 1X ? ? xk = (x) d f = 1: lim f n n RN RN k=1 Pn Pn k 1 k 1 Let E = x 2 RN : limn n1 k=1 (x) = ? (x) and n = n1 k=1 for all n 2 N. By (B.12), we have that P (E) = 1 for all P 2 core ( f ). By construction, f1E n gn2N B RN ; (C) is a uniformly bounded sequence which converges pointwise to 1E ? . By [3, Theorem 22] and since f is convex and P (E) = 1 for all P 2 core ( f ), this implies that (B.14) Z Z Z Z Z ? d f = 1E RN ? d f = RN Next, since Z RN f lim 1E RN n nd f = lim n 1E RN nd f = lim n nd f : RN is convex and shift invariant, note that for each n 2 N Z Z n n Z 1X 1X k 1 k 1 d f d f = d f: nd f = n RN n RN RN f k=1 k=1 ERGODIC THEOREM S FOR LOW ER PROBABILITIES 15 R R By itR follows that RN ? d f R RN d f . R A similar argument yields R (B.14), R R that ? d d . Finally, since d = f d and d = f1 d , f f f 1 f RN RN RN RN by (B.13), we can conclude that ( )! Z Z n 1X N 1= f x2R : d f lim xk d f n n RN RN k=1 )! ( Z Z n 1X fk (!) f1 d ; = !2 : f1 d lim n n k=1 proving the statement. References [1] C. D. Aliprantis and K. Border, In…nite Dimensional Analysis, 3rd ed., Springer, New York, 2006. [2] P. Billingsley, Probability and Measure, 3rd ed., John Wiley & Sons, New York, 1995. [3] S. Cerreia-Vioglio, F. Maccheroni, M. Marinacci, and L. Montrucchio, Signed Integral Representations of Comonotonic Additive Functionals, Journal of Mathematical Analysis and Applications, 385, 895-912, 2012. [4] S. Cerreia-Vioglio, F. Maccheroni, M. Marinacci, and L. Montrucchio, Choquet Integration on Riesz Spaces and Dual Comonotonicity, Transactions of the American Mathematical Society, forthcoming. [5] S. Cerreia-Vioglio, F. Maccheroni, M. Marinacci, and L. Montrucchio, Ambiguity and Robust Statistics, Journal of Economic Theory, 974-1049, 2013. [6] F. Delbaen, Convex Games and Extreme Points, Journal of Mathematical Analysis and Applications, 45, 210-233, 1974. [7] Y. N. Dowker, Invariant Measure and the Ergodic Theorems, Duke Mathematical Journal, 4, 1051-1061, 1947. [8] Y. N. Dowker, Finite and -Finite Invariant Measures, Annals of Mathematics, 4, 595-608, 1951. [9] R. M. Dudley, Real Analysis and Probability, 2nd ed., Cambridge University Press, Cambridge, 2002. [10] N. Dunford and J. T. Schwartz, Linear Operators; Part I: General Theory, Wiley, New York, 1958. [11] E. B. Dynkin, Su¢ cient Statistics and Extreme Points, The Annals of Probability, 6, 705-730, 1978. [12] R. M. Gray, Probability, Random Processes, and Ergodic Properties, 2nd ed., Springer, New York, 2009. [13] R. M. Gray and J. C. Kie¤er, Asymptotically Mean Stationary Measures, The Annals of Probability, 8, 962-973, 1980. [14] F. Maccheroni and M. Marinacci, A Strong Law of Large Numbers for Capacities, The Annals of Probability, 33, 1171-1178, 2005. [15] M. Marinacci, Limit Laws for Non-additive Probabilities and Their Frequentist Interpretation, Journal of Economic Theory, 84, 145-195, 1999. [16] M. Marinacci and L. Montrucchio, Introduction to the Mathematics of Ambiguity, in Uncertainty in Economic Theory, Routledge, New York, 2004. [17] D. Schmeidler, Cores of Exact Games, I, Journal of Mathematical Analysis and Applications, 40, 214-225, 1972. [18] D. Schmeidler, Integral Representation without Additivity, Proceedings of the American Mathematical Society, 97, 255-261, 1986. [19] P. Walley, Statistical Reasoning with Imprecise Probabilities, Chapman and Hall, London, 1991. Università Bocconi
© Copyright 2026 Paperzz