
On the estimation of imprecise probabilities
Marco Cattaneo
Department of Statistics, LMU Munich
[email protected]
WPMSIIP 2010, Durham, UK
10 September 2010
the fundamental problem of statistical inference

▶ given the realizations of the independent random variables X₁, …, Xₙ with Xᵢ ∼ Ber(p), estimate p ∈ [0, 1]

▶ the estimation is unproblematic when n is sufficiently large

▶ given the realizations of the independent random variables X₁, …, Xₙ with Xᵢ ∼ Ber(pᵢ) for some pᵢ ∈ [p̲, p̄], estimate [p̲, p̄] ⊆ [0, 1]

▶ for example, the estimation of the transition matrix of an imprecise Markov chain is a closely related problem

▶ note that the IDM (imprecise Dirichlet model) describes some imprecise knowledge about the precise probability p (for instance, in the IDM the lower probability of Xᵢ = Xᵢ₊₁ = 1 is in general not p̲²)

▶ no estimator of [p̲, p̄] can be consistent under all sequences (pᵢ) ∈ [p̲, p̄]^ℕ, and the same holds for estimators of [inf pᵢ, sup pᵢ] or [lim inf pᵢ, lim sup pᵢ], since for example the deterministic model with pᵢ = xᵢ can never be excluded (see the simulation sketch after this slide)
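The last point can be illustrated with a tiny simulation. This is a minimal Python sketch of my own (not part of the talk): an i.i.d. Bernoulli(0.5) sample is an equally valid realization under the envelope [0.3, 0.7] (with every pᵢ = 0.5) and under the degenerate envelope [0.5, 0.5], so the data identify p but say nothing about the width of the generating interval.

  # Minimal simulation sketch (not from the talk): the same i.i.d. Bernoulli(0.5) data
  # are a valid realization under the envelope [0.3, 0.7] (every p_i = 0.5) and under
  # the degenerate envelope [0.5, 0.5]; the sample mean identifies p = 0.5 but carries
  # no information about the width of the generating interval.
  import random

  random.seed(0)
  n = 100_000
  xs = [1 if random.random() < 0.5 else 0 for _ in range(n)]

  p_hat = sum(xs) / n        # precise ML estimate of a single Bernoulli parameter
  print(round(p_hat, 3))     # ~0.5, whichever envelope is taken to have "generated" the data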
maximum likelihood estimation

▶ basic question: given two imprecise probability models P₁, P₂ and the observed event A, which of the two models was better at forecasting A?

▶ in a certain sense, the likelihood function induced by the data A on a set of imprecise probability models is two-dimensional: lik(P) = (P̲(A), P̄(A))

▶ when the set of imprecise probability models is large enough, the "upper likelihood function" lik̄(P) = P̄(A) is maximized by the vacuous model, while the "lower likelihood function" lik̲(P) = P̲(A) is maximized by the precise ML estimate

▶ the weighted geometric mean lik_α(P) = lik̄(P)^α lik̲(P)^(1−α) of the upper and lower likelihood functions is an interesting compromise, where α ∈ [0, 1] can perhaps be interpreted as a degree of optimism; of particular interest is the case with α = 1/2, since

    \frac{\mathrm{lik}_{1/2}(\mathcal{P}_1)}{\mathrm{lik}_{1/2}(\mathcal{P}_2)} = \sqrt{\frac{\overline{P}_1(A)}{\underline{P}_2(A)} \cdot \frac{\underline{P}_1(A)}{\overline{P}_2(A)}}

  is the geometric mean of the likelihood ratio most favorable to P₁ and the one most favorable to P₂ (see the numerical sketch after this slide)
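To make these definitions concrete, here is a small Python sketch of my own (not from the talk) for the Bernoulli setting of the next slide. It assumes that the observed event A is the whole sequence (n₁ ones and n₀ zeros) and that each imprecise model is an interval [l, u] within which every pᵢ may vary freely, so that the upper and lower likelihoods are u^n₁ (1−l)^n₀ and l^n₁ (1−u)^n₀; the candidate models and toy counts are made up for illustration.

  # Small numerical sketch (not from the talk): likelihoods of interval Bernoulli models.
  # Assumption: the event A is the whole observed sequence (n1 ones, n0 zeros) and the
  # model [l, u] lets each p_i vary freely in the interval, so
  #   upper likelihood = u**n1 * (1 - l)**n0,   lower likelihood = l**n1 * (1 - u)**n0.
  import math

  def lik_bounds(l, u, n0, n1):
      """Upper and lower likelihood of the interval model [l, u] for n1 ones and n0 zeros."""
      return u**n1 * (1 - l)**n0, l**n1 * (1 - u)**n0

  def lik_alpha(l, u, n0, n1, alpha):
      """Weighted geometric mean lik_alpha = upper**alpha * lower**(1 - alpha)."""
      upper, lower = lik_bounds(l, u, n0, n1)
      return upper**alpha * lower**(1 - alpha)

  n0, n1 = 4, 6                      # toy data: 6 observed ones, 4 observed zeros
  P1, P2 = (0.4, 0.8), (0.5, 0.6)    # two candidate interval models

  # lik_{1/2} ratio of the two models, and the geometric-mean identity from the slide
  ratio = lik_alpha(*P1, n0, n1, 0.5) / lik_alpha(*P2, n0, n1, 0.5)
  up1, lo1 = lik_bounds(*P1, n0, n1)
  up2, lo2 = lik_bounds(*P2, n0, n1)
  geometric_mean = math.sqrt((up1 / lo2) * (lo1 / up2))
  print(ratio, geometric_mean)       # the two values coincide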
imprecise Bernoulli problem

▶ maximum lik_α estimates, with n₁ and n₀ the numbers of observed ones and zeros (see the cross-check sketch after this slide):

    \alpha \le \tfrac{1}{2} \;\Rightarrow\; \underline{\hat{p}}_\alpha = \overline{\hat{p}}_\alpha = \hat{p} = \frac{n_1}{n_0 + n_1}

    \alpha > \tfrac{1}{2} \;\Rightarrow\; \underline{\hat{p}}_\alpha = \frac{(1-\alpha)\,n_1}{\alpha\,n_0 + (1-\alpha)\,n_1}, \qquad \overline{\hat{p}}_\alpha = \frac{\alpha\,n_1}{(1-\alpha)\,n_0 + \alpha\,n_1}

▶ \alpha \ge \tfrac{1}{2} \;\Rightarrow\; \bigl[\operatorname{logit}\underline{\hat{p}}_\alpha,\ \operatorname{logit}\overline{\hat{p}}_\alpha\bigr] = \operatorname{logit}\hat{p} \pm \operatorname{logit}\alpha

▶ in the case with k categories, the maximum lik_α estimates are precise if and only if α ≤ 1/k
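The closed-form estimates above can be checked numerically. The following Python sketch (my own, not from the talk) computes them from the counts and compares them with a brute-force maximization of lik_α over a grid of intervals [l, u], using the same interval-Bernoulli likelihood bounds assumed in the previous sketch; it also verifies the logit identity for α ≥ 1/2. The counts and α value are toy choices.

  # Cross-check sketch (not from the talk): closed-form maximum lik_alpha estimates
  # versus a brute-force grid search over interval models [l, u].
  import math

  def logit(x):
      return math.log(x / (1 - x))

  def closed_form(n0, n1, alpha):
      """Closed-form maximum lik_alpha estimates (lower, upper) from the slide."""
      if alpha <= 0.5:
          p = n1 / (n0 + n1)
          return p, p
      low = (1 - alpha) * n1 / (alpha * n0 + (1 - alpha) * n1)
      up = alpha * n1 / ((1 - alpha) * n0 + alpha * n1)
      return low, up

  def grid_search(n0, n1, alpha, steps=400):
      """Brute-force maximization of lik_alpha over intervals [l, u] on a grid."""
      def lik_a(l, u):
          upper = u**n1 * (1 - l)**n0      # upper likelihood of the interval model
          lower = l**n1 * (1 - u)**n0      # lower likelihood of the interval model
          return upper**alpha * lower**(1 - alpha)
      grid = [i / steps for i in range(1, steps)]
      return max(((l, u) for l in grid for u in grid if l <= u), key=lambda lu: lik_a(*lu))

  n0, n1, alpha = 4, 6, 0.8
  print(closed_form(n0, n1, alpha))    # (0.2727..., 0.8571...)
  print(grid_search(n0, n1, alpha))    # agrees up to the grid resolution

  # logit identity for alpha >= 1/2: the estimated interval is logit(p_hat) +/- logit(alpha)
  low, up = closed_form(n0, n1, alpha)
  p_hat = n1 / (n0 + n1)
  print(logit(low), logit(p_hat) - logit(alpha))   # equal up to rounding
  print(logit(up), logit(p_hat) + logit(alpha))    # equal up to rounding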