Algorithms in Games Evolving in Time: Winning Strategies Based on Testing Hypotheses
Evgeny Dantsin¹ and Jan-Georg Smaus² and Sergei Soloviev²
¹ Department of Computer Science, Roosevelt University
² IRIT, University of Toulouse
Abstract. We model two-player imperfect-information games evolving
in time, where one player makes and tests “hypotheses” about the opponent’s strategies. We consider algorithms needed for the first player to
compute a winning strategy. The main assumptions about the scenario
are the following: (1) the hypotheses form a “covering”, i.e., each strategy
of the second player satisfies at least one hypothesis; (2) the hypotheses
can be enumerated and tested; (3) for each hypothesis, the first player
has a strategy that “defeats” all of the opponent’s strategies satisfying
this hypothesis.
We have modelled a significant part of the theory presented here in Isabelle/HOL.
1 Introduction
Nowadays, interest in algorithmic games and multi-agent systems is growing
rapidly. The reason is obvious: “Algorithms became the natural environment
and default platform of strategic decision making” (Christos H. Papadimitriou
[3]). This paper is about computability of winning strategies in a broad class of
games that evolve in time.
In 1957, Michael O. Rabin constructed an example of a two-player win-lose
game with decidable rules but no computable winning strategies [4]. This seminal work was the first among many that studied the question of computability in classical game theory. These studies have somewhat “changed the
rules”: even if the existence of a winning strategy is proved, can we compute it
and can we compute it efficiently?
In this algorithmic context, we consider two-player games evolving in time,
where the first player makes hypotheses about the opponent’s strategy and tests
them. The tests are based on observing the “trajectory” of the game during a
period of time: if any inconsistency between the hypothesis and the observations
appears, the hypothesis is refuted. We describe algorithms needed for the first
player to compute a winning strategy.
It is important to emphasize that in our scenario, the first player has “preliminary” knowledge about all possible strategies of the opponent (but not about
¹ 430 South Michigan Ave., Chicago, IL 60605, [email protected]
² 118 Route de Narbonne, 31062 Toulouse CEDEX 9, France, [email protected]
the strategy that is actually used). For example, such knowledge can be given
in the form of a universal function for the class of the opponent’s functions. The
idea that a universal function may be a key to computing a winning strategy
was outlined by one of the authors in [5]; see also [6]. Here this idea is put in a
more general context.
The main contributions of the paper can be summarized as follows:
– a unifying framework for two-player games that evolve in time, where one
player makes and tests hypotheses about the opponent’s strategies;
– “built-in” algorithms needed to compute a winning strategy in such games;
– sufficient conditions that guarantee the computability of a winning strategy
(most importantly, the locality condition defined in Section 3.2).
It is worthwhile to note that the computability of winning strategies in our
framework is far from obvious. Indeed, it may seem that the first player can compute a winning strategy just because the opponent’s strategy has been identified
through making and testing hypotheses. However, this is not the case. Using
our approach, a winning strategy can be computed not only without identifying
the opponent’s strategy but even if the problem of identifying the opponent’s
strategy is undecidable.
The paper is organized as follows. In Section 2 we define our class of games
and give two examples: Nulling Game and Hide-and-Seek Game. In Section 3
we describe algorithmic tools needed for the first player to compute a winning
strategy. The computability of a winning strategy (using the algorithmic tools
as subroutines) is proved in this section. Possible refinements and extensions of
our approach are discussed in Section 4. Directions of future work are sketched
in Section 5.
This work has been presented before by the first and the third author [1].
The second author has contributed the modelling in Isabelle/HOL. In the present
paper, we leave the original presentation essentially untouched, but intersperse
it with paragraphs where we explain for each aspect how it was modelled in
Isabelle/HOL. Occasionally the process of Isabelle/HOL modelling raised questions about the underlying “paper-and-pencil” formalism. We mention these
questions where applicable, but a harmonisation between the “paper-and-pencil”
formalism and the Isabelle modelling is left as a topic for future work.
2 Unifying Framework

2.1 General Scenario and Settings
Alice and Bob jointly control a dynamical system, but they have different goals.
Alice seeks to bring the system into a terminal state, while Bob’s goal is to avoid
terminal states (in which case the system evolves infinitely long). In this section
we describe this general scenario and define notation for underlying dynamical
systems.
Dynamical systems with parameters. Let S be a discrete dynamical system whose
state space is a set M and whose time is the set N = {0, 1, 2, . . .} of natural
numbers. A state of S at time instant t is an element S(t, a, b) ∈ M . The state
is determined by not only t but also two parameters whose values a and b are
set by Alice and Bob respectively. Thus, a state function for S is a function
S : N × A × B → M        (1)
where A is a set of possible values for the parameter controlled by Alice and B
is a set of possible values for Bob’s parameter.
Isabelle: In the Isabelle formalisation, we have three type parameters for
the types of Alice’s strategies, Bob’s strategies and the set of states. The
type of a state function is hence parametrised by these three parameters:
type_synonym ('A,'B,'M) state_function = "nat ⇒ 'A ⇒ 'B ⇒ 'M"
Strategies. The state space M has a subset T of terminal states. Alice tries to
bring S into a terminal state by choosing a value for her parameter, and Bob
chooses a value for his parameter with the hope to avoid falling into T . Possible
values of the parameters, i.e., elements in the sets A and B, are called strategies.
We say that Alice uses a strategy a ∈ A if a is the value of her parameter, and
similarly for Bob.
Isabelle: A system consists of a state function and a set of terminal states:
type_synonym ('A,'B,'M) system = "('A,'B,'M) state_function * 'M set"
Alice and Bob choose their initial strategies at time instant t = 0 and then
they can change them as often as they want. Their choices of strategies are given
by strategy functions
α : N → A        β : N → B
where α(t) (resp. β(t)) is the strategy used by Alice (Bob) at time instant t.
Isabelle: We did define the notion of strategy function faithfully to the
original text:
type_synonym 'AorB strategy_function = "nat ⇒ 'AorB"
To emphasise that a priori both Alice and Bob may apply strategy functions, we call the type parameter 'AorB.
However, the Isabelle modelling revealed that the notion of function is
somewhat misleading because it suggests that the behavior of Alice and
Bob depends on time and nothing else. In reality the game description
given here intends that Alice and Bob may react to the outcomes of
previous rounds of the game (where later on, this freedom is actually
taken away from Bob). The notion of strategy sequence would be more
appropriate.
Trajectories. Alice’s and Bob’s strategy functions α and β uniquely determine
a trajectory of S, i.e., a function
τ :N→M
where τ (t) is the state of S at time instant t. In fact, we have
τ (t) = S(t, α(t), β(t))
where S is state function (1).
Isabelle: Of course, we need the definition of the type of a trajectory:
type_synonym 'M trajectory = "nat ⇒ 'M"
We also formalised the definition of τ :
definition
  trajectory :: "('A,'B,'M) system ⇒ 'A strategy_function ⇒ 'B strategy_function ⇒ 'M trajectory"
  where "trajectory s alpha beta = (λt. (state_function s) t (alpha t) (beta t))"
However, it turns out that while we did prove some possibly interesting
lemmas about the function trajectory, it is never needed for the proof of
the main theorem of this paper. The reason is that the computation of
the trajectory resulting from Alice’s and Bob’s game is considerably more
complicated than just being a function of two given strategy functions.
This again underlines that the notion of strategy function is inadequate.
If there exists t ∈ N such that τ (t) is a terminal state, we say that Alice wins.
Otherwise Bob wins. We also say that α is a winning strategy function for Alice
if for each strategy function β for Bob, there exists a time instant t such that
S(t, α(t), β(t)) is a terminal state.
Isabelle: We have formalised the above definitions in Isabelle but then
never used them in the main argument of this paper, again because the
trajectory resulting from the game is not adequately captured by the
notion of strategy function.
Who wins, Alice or Bob? Of course, it is not possible to answer this question
when the scenario is so general. In the next sections, we describe a restricted scenario
under which the question of the winner can be answered.
2.2 Two Examples
Before we restrict the general scenario we give two examples. They will be used
to illustrate our definitions and constructions.
Example 1 (Nulling Game). The state space M is the set Z of integers. Strategies
of both Alice and Bob belong to the same countable set of functions f : N → N,
for example
A = B = the set of primitive recursive functions.
At time instant t, Alice and Bob choose strategies denoted by a_t and b_t respectively. They choose their strategies simultaneously, not knowing the opponent's
choice. The state of the system at time instant t is defined by:

S(t, a_t, b_t) = a_t(t) − b_t(t)
The set T of terminal states is a one-element set, namely T = {0}. Can Alice
win?
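As an illustration, the Nulling Game dynamics can be transcribed directly (a minimal sketch; the particular strategies `double` and `square` are our own toy choices, not from the paper):

```python
# Toy transcription of the Nulling Game: the state is S(t, a_t, b_t) = a_t(t) - b_t(t),
# and the terminal set is T = {0}.

def state(t, a, b):
    """State function of the Nulling Game; a and b are functions N -> N."""
    return a(t) - b(t)

def double(t):
    return 2 * t

def square(t):
    return t * t

# If Alice's function coincides with Bob's, every state is terminal.
print(state(3, double, double))  # 0 (terminal: Alice wins)
print(state(3, square, double))  # 3 (9 - 6: not terminal)
```

The sketch already shows why a = b is a terminator for b in this game: the state is then identically zero.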
Example 2 (Hide-and-Seek Game). Both the set A of Alice's strategies and the
set B of Bob's strategies are the same metric space. At time instant t, Alice and Bob choose
points a_t and b_t in this space, simultaneously, not knowing the opponent's point.
Their choice determines the state at this time instant:

S(t, a_t, b_t) = the distance between a_t and b_t.

Thus, the state space M is the set of nonnegative real numbers. The subset T
of terminal states is the set of numbers less than or equal to some fixed ε. That
is, Alice tries to choose a_t in the ball of radius ε around b_t. Can she win?
2.3 Restricted Scenario
We impose the following three restrictions on the general scenario.
Limitation for Bob. The first restriction is a limitation for Bob that violates
symmetry between him and Alice. Namely, we assume that Bob chooses his initial
strategy b ∈ B and then he cannot change it, unlike Alice who can change her
strategies as often as she wants. Notice that this limitation is not very restrictive
since strategies (elements of A and B) can be functions of the time as, for
example, in Nulling Game.
Isabelle: We formalised the intuition that strategy functions add no expressiveness. First, we introduce what seems to be a restriction of our
scenario, namely, we define a trajectory based on plain strategies rather
than strategy functions:
definition
  trajectory_ab_restr :: "('A,'B,'M) system ⇒ 'A ⇒ 'B ⇒ 'M trajectory"
  where "trajectory_ab_restr s a b = (λt. (state_function s) t a b)"
Compare this to the definition of the function trajectory above.
We then define a function which converts a state function in the scenario
where strategy functions are “something special” into a state function
in the scenario where strategy functions are just a coincidental instantiation of the strategy parameter:
definition
  flatten_state_function :: "('A,'B,'M) state_function ⇒
    ('A strategy_function,'B strategy_function,'M) state_function"
  where "flatten_state_function S = (λt alpha beta. S t (alpha t) (beta t))"

definition
  flatten_system :: "('A,'B,'M) system ⇒
    ('A strategy_function,'B strategy_function,'M) system"
  where "flatten_system s = (flatten_state_function (state_function s), terminal s)"
We then prove a theorem which states: Starting from the notion of system assuming strategy functions in the definition of trajectories, we can
obtain an isomorphic system where strategy functions are not “something special”.
theorem strategy_functions_are_unnecessary:
  "trajectory s = trajectory_ab_restr (flatten_system s)"
However, the above does not explain why there is a restriction for Bob
and not for Alice. Concerning this point, we ask for the reader’s patience
...
Alice tests hypotheses about Bob’s choice. Alice knows something about the set
B of Bob’s strategies but she does not know what strategy b ∈ B he actually
uses. However, she can make hypotheses about Bob’s actual strategy and then
she can test them.
In the simplest case, Alice’s hypotheses can be identified with Bob’s strategies
themselves. That is, Alice hypothesizes that Bob’s actual strategy is a certain
strategy b ∈ B. For example, in Nulling Game, Alice could hypothesize that
Bob's choice is f_b where b is a certain number. In Hide-and-Seek, Alice could
hypothesize that the point b chosen by Bob is within distance at most ε from a
certain point a in the underlying metric space.
The second restriction can be described in terms of such hypotheses as follows. Alice has a set H of hypotheses. A hypothesis h ∈ H is a predicate on
B, i.e., a Boolean function from B to {true, false}. We assume that Alice's hypotheses “cover” all of Bob's choices, which means that for any b ∈ B, there exists
h ∈ H such that h(b) is true. We also assume that Alice can enumerate all her
hypotheses: the set H is enumerable.
In Nulling Game, this restriction follows from the fact that there is a universal
function for primitive recursive functions, e.g. [2]. In the case of Hide-and-Seek,
our restriction is equivalent to the existence of an enumerable ε-net for the
underlying metric space.
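As an illustrative sketch of the ε-net idea (the unit interval and the dyadic construction are our own toy instance, not from the paper), an enumerable ε-net can be realised as follows:

```python
# In Hide-and-Seek on [0, 1], the dyadic rationals k/2^n form an enumerable
# ε-net: every point of the space is within ε of some enumerated point.

def dyadic_enumeration():
    """Enumerate 0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, ... without repetition."""
    yield 0.0
    yield 1.0
    n = 1
    while True:
        for k in range(1, 2 ** n, 2):  # odd numerators only: no repetitions
            yield k / 2 ** n
        n += 1

def net_point_within(x, eps):
    """Return the first enumerated point within eps of x (eps > 0 on [0,1])."""
    for p in dyadic_enumeration():
        if abs(p - x) <= eps:
            return p
```

For example, `net_point_within(0.3, 0.05)` scans 0, 1, 1/2 and stops at 1/4, which is within 0.05 of 0.3.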
Isabelle: We have defined a type synonym for Alice's hypotheses: a hypothesis is a set of Bob's strategies. It is then easy to define the notion
of “covers”:
type_synonym 'B hypothesis = "'B set"
definition
  covers :: "'B hypothesis set ⇒ 'B set ⇒ bool"
  where "covers H B = (∀b∈B. ∃h∈H. b∈h)"
Concerning the restriction that the hypothesis set should be enumerable,
we again ask for the reader’s patience.
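The covering condition is directly executable for finite toy instances (a sketch with hypothetical finite sets; in the paper B may be infinite and H merely enumerable):

```python
# covers(H, B): every strategy of Bob satisfies at least one hypothesis,
# mirroring the Isabelle definition covers H B = (∀b∈B. ∃h∈H. b∈h).

def covers(hypotheses, bob_strategies):
    return all(any(b in h for h in hypotheses) for b in bob_strategies)

B = {1, 2, 3, 4}
H_good = [{1, 2}, {3}, {4, 5}]   # every b in B lies in some hypothesis
H_bad = [{1, 2}, {4, 5}]         # 3 is covered by no hypothesis

print(covers(H_good, B))  # True
print(covers(H_bad, B))   # False
```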
Terminators. We assume that for any strategy b ∈ B of Bob, Alice has a
“stronger” strategy a ∈ A. We say that a is a terminator for b if there exists a time instant t0 such that S(t, a, b) is a terminal state for all t ≥ t0 . That is,
if Bob’s actual strategy is b, Alice can choose a at any time to bring the system
S into a terminal state.
The third restriction is that for any hypothesis, Alice has the corresponding
terminator. More precisely, for any h ∈ H, there exists a ∈ A such that a is a
terminator for all b ∈ B that satisfy the hypothesis h:
h(b) is true ⇒ a is a terminator for b.
In Nulling Game, a is a terminator for b if a = b (though there can be other
terminators for b). In Hide-and-Seek, any point in the ball of radius ε around
b is a terminator for b.
Isabelle: We first define what it means for a strategy of Alice to be a
terminator for a strategy of Bob:
definition
  is_terminator :: "'A ⇒ 'B ⇒ ('A,'B,'M) system ⇒ bool"
  where "is_terminator a b s = (∃t0. ∀t≥t0. (state_function s) t a b ∈ terminal s)"
Note that the requirement that all states after a certain time point are
terminal states might be overly strong because Alice only needs to reach
a terminal state once to win, but for the time being, we stick to this
definition.
Concerning the restriction that Alice has a terminator, we again ask for
the reader’s patience.
3 Computability of Winning Strategies

3.1 Toolbox for Alice
We consider systems S with the restrictions posed in Section 2.3. What tools
does Alice need to win in this restricted scenario? We describe such tools in this
section.
Tester. Alice tests her hypotheses using a function called a tester. Before we
define it formally, we outline how a tester could be used and how it could be
implemented.
A tester can be thought of as an oracle that answers Alice’s queries. She
issues a query when she wants to test her hypothesis about the actual choices of
Bob. Suppose that, at time instant t ≥ 1, Alice’s hypothesis is h ∈ H. Then she
issues a query that includes h and information about the behavior of the system
on the previous time instants {0, 1, . . . , t − 1}. The oracle’s answer is either 0 or
1, where 0 means “the hypothesis h is not refuted” and 1 means “the hypothesis
h is refuted”.
How could a tester be implemented? It can analyze the “observable” restriction τ|t of the trajectory τ to the time interval {0, 1, . . . , t − 1}:

τ|t = ⟨τ(0), τ(1), . . . , τ(t − 1)⟩.
Also, it can compute “hypothetical” trajectories on {0, 1, . . . , t−1}. More exactly,
the tester can compute an initial part of a hypothetical trajectory determined
by the following:
– the “history” of Alice’s choices of strategies, i.e., the restriction α|t of Alice’s
strategy function α to the time interval {0, 1, . . . , t − 1}, where
α|t = ⟨α(0), α(1), . . . , α(t − 1)⟩;
– the set of Bob’s possible strategies b that satisfy the hypothesis h.
Let T (h, α|t ) denote the set of all such hypothetical trajectories on the time
interval {0, 1, . . . , t − 1}. Then the tester checks whether the observable restriction τ |t of τ can belong to the set T (h, α|t ) of trajectories compatible with the
hypothesis h. The check may result in two possible outcomes:
– the observable restriction τ |t belongs to T (h, α|t ) and, therefore, the hypothesis is not refuted;
– the observable restriction τ |t does not belong to T (h, α|t ) and, therefore, the
hypothesis is refuted.
On the other hand, we can think of a tester as an agent (“demon”) that
is run by Alice at some time instant t ≥ 1 in order to test her hypothesis h.
The agent monitors the “observable” trajectory of S and verifies whether it is
consistent with her hypothesis h and history α|t . As long as it is, the agent is
silent. If at some time instant t0 ≥ t an inconsistency is discovered, the agent
turns on a signal and keeps it on for all further time.
Formally, a tester T is a Boolean function such that
– the domain of T is the set of all 4-tuples (h, t, α|t , τ |t ), where
• h is Alice’s hypothesis;
• t ∈ N \ {0} is a time instant;
• α|t is a function from {0, 1, . . . , t − 1} to A;
• τ |t is a function from {0, 1, . . . , t − 1} to M .
– T is non-decreasing with respect to time t, i.e., if the hypothesis is refuted
at time instant t then it remains refuted for all further time.
Isabelle: The previous paragraph nicely dictates the type signature of a
tester, except that the actual system is a tacit parameter in the “paper-and-pencil” formalism while it has to be explicit in the Isabelle modelling. We define what it means to be a tester:
definition
  is_tester :: "('A,'B,'M) system ⇒ 'B hypothesis ⇒ nat ⇒ 'A strategy_function ⇒
    'M trajectory ⇒ bool"
  where "is_tester s h t alpha tau =
    (∀b∈h. ∃t'<t. (state_function s) t' (alpha t') b ≠ tau t')"
Concerning the restriction that Alice has a tester, we again ask for the
reader’s patience.
Note that the tester has an implicit parameter, not occurring in the input.
This parameter is Bob’s actual strategy b. The value T (h, t, α|t , τ |t ) indeed depends on b because the observable trajectory depends on b.
Computational tools. We assume that Alice has three algorithms:
1. Algorithm enumeration. An algorithm that enumerates all hypotheses
h ∈ H (with or without repetition), i.e., an algorithm that computes an
onto function from N to H. The enumeration is given by indices: H =
{h_0, h_1, . . .}. Given any such enumeration, Alice identifies her hypotheses
with natural numbers.
2. Algorithm tester. An algorithm that computes the tester function T defined above.
3. Algorithm find-terminator for finding terminators. An algorithm that
takes i ∈ N as input and returns a strategy a_i ∈ A such that a_i is a terminator
for any of Bob's strategies that satisfy the hypothesis h_i.
Isabelle: Now is the time to define the trajectory that results from Alice’s
and Bob’s game. Alice has three algorithms available as just described,
poor Bob only has his strategy that he is obliged to stick to.
Note first that we strive for a clean separation of the trajectory that
results from Alice’s algorithms and Bob’s strategy, and the question of
whether Alice’s algorithms actually do what they promise. For example,
Alice might be using a “broken oracle” for testing her hypotheses: the
resulting trajectory is well-defined, just that it will not necessarily be a
winning one for Alice. Only later will we have a statement saying: if the
algorithms work correctly, then Alice will win.
The definition of the trajectory is rather complicated. At each time point
t, there is a system state for t, and there is the number of Alice's hypothesis that is currently in force (meaning that Alice's strategy in force
at t results from using her algorithms for determining the terminating strategy directed at that hypothesis). Alice needs to know this
system state and number for all time points up to t in order to determine
what to do at time point t + 1, because she needs to determine whether the
current hypothesis is refuted and should therefore be changed or whether
she should rather stick to it. If Alice decides that the current hypothesis
should be changed, she increments the hypothesis counter; otherwise she
leaves it unchanged. In any case, she determines her strategy based on
the new hypothesis counter and applies it to compute the state at t + 1.
Summarising, Alice computes a new pair (hypothesis counter, state):
primrec
  testing_trajectory ::
    "nat                       (*the timepoint*)
     ⇒ ('A,'B,'M) system      (*the system*)
     ⇒ (nat ⇒ 'B hypothesis)  (*enumeration of hypotheses*)
     ⇒ ('B hypothesis ⇒ nat ⇒ 'A strategy_function ⇒ 'M trajectory ⇒ bool) (*the tester*)
     ⇒ ('B hypothesis ⇒ 'A)   (*gives the terminator*)
     ⇒ 'B                      (*Bob's contribution*)
     ⇒ (nat ⇒ 'M) * (nat ⇒ nat)" (*state sequence and hypothesis counter sequence*)
where
  "testing_trajectory 0 s h_enum tester find_terminator b =
     ((λt. (state_function s) 0 (find_terminator (h_enum 0)) b), λt. 0)"
| "testing_trajectory (Suc t) s h_enum tester find_terminator b =
     (let (traj, hyp_sec) = (testing_trajectory t s h_enum tester find_terminator b) in
      (let new_hyp = (if (tester (h_enum (hyp_sec t))
                                 (Suc t)
                                 (λt'. find_terminator (h_enum (hyp_sec t')))
                                 traj)
                      then Suc (hyp_sec t) else (hyp_sec t)) in
       ((λt'. if t' ≤ t then (traj t') else (state_function s) (Suc t) (find_terminator (h_enum new_hyp)) b),
        (λt'. if t' ≤ t then (hyp_sec t') else new_hyp))))"
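The recursion above can be transcribed iteratively in ordinary code (a sketch; hypotheses are finite sets of Bob's candidate strategies, and all names are our own):

```python
# Iterative transcription of the testing_trajectory idea. At each instant Alice
# plays the terminator for her current hypothesis; if the observations so far
# refute that hypothesis, she increments the hypothesis counter.

def play(state, terminal, h_enum, find_terminator, b, steps):
    """Return the observed states and the final hypothesis index."""
    hyp = 0                              # current hypothesis counter
    alice_history, states = [], []
    for t in range(steps):
        a = find_terminator(h_enum(hyp))
        s = state(t, a, b)
        alice_history.append(a)
        states.append(s)
        if s in terminal:
            return states, hyp           # Alice has won
        # hypothesis refuted iff every candidate disagrees with observations
        if all(any(state(tp, alice_history[tp], bp) != states[tp]
                   for tp in range(t + 1)) for bp in h_enum(hyp)):
            hyp += 1
    return states, hyp

def const(i):                            # the constant function t -> i
    return lambda t: i

state = lambda t, a, b: a(t) - b(t)      # Nulling-Game state function
h_enum = lambda i: [const(i)]            # hypothesis i: "Bob plays the constant i"
find_term = lambda h: h[0]               # for constant b, the terminator is b itself

states, hyp = play(state, {0}, h_enum, find_term, const(2), 10)
print(states, hyp)  # [-2, -1, 0] 2: hypotheses 0 and 1 are refuted, 2 wins
```

Against Bob's constant strategy 2, hypotheses 0 and 1 are refuted after one observation each, and the terminator for hypothesis 2 reaches the terminal state at t = 2.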
3.2 When Alice Wins
Let S be as above, a system where the restrictions of Section 2.3 hold and where
Alice has three algorithms described in Section 3.1. We define a condition that
guarantees a win for Alice.
Locality condition. Let α be Alice’s strategy function. Let a ∈ A be her strategy
and let h ∈ H be her hypothesis. Let b ∈ B be Bob’s actual strategy. We say
that α stabilizes at the strategy a if α(t) = a for all t greater than some t0 . We
also say that α stabilizes at the strategies a and b under the hypothesis h if α
stabilizes at a and, in addition, the tester T never refutes the hypothesis h when
Alice’s strategy function is α and Bob’s actual strategy is b, i.e.,
T (h, t, α|t , τ |t ) = 0
for all t ≥ 1, where α|t denotes the restriction of α to {0, 1, . . . , t − 1}.
The following locality condition in fact connects all three algorithms in Alice's
toolbox. Let h_0, h_1, . . . be the hypotheses returned by the enumeration algorithm.
Let a_0, a_1, . . . be the “terminating” strategies returned by the find-terminator
algorithm. Suppose that b ∈ B is Bob's actual strategy and that Alice's strategy
function α stabilizes at strategies a_i and b under h_i for some i ∈ N. Then a_i is
a terminator for b (even if b does not satisfy h_i).
We use the term “locality” to emphasize that local information about the
actual strategy b is sufficient to find a terminator for it. The tester analyzes
the behavior of b only on a finite time interval {0, 1, . . . , t − 1} (with respect to
Alice’s actual choices of strategies, not all possible choices) and does not take
into account the “global” behavior of b. To make the notion of locality clearer,
consider Nulling Game where B is the set of all primitive recursive functions. In
this case, the problem of identifying b on the basis of information accessible to
Alice (on a finite time interval) is undecidable. Thus, Alice may not know b, but
the locality condition guarantees that the stabilization is enough to win.
The locality condition can also be described in terms of games represented
by game trees, where nodes correspond to positions. The trajectory used by
the tester is the beginning of a branch of this tree. The complete description
of a and b includes their behavior in the positions that will never appear in
the continuations of this trajectory and the locality condition implies that this
behavior is not used by the terminator (while the “global” behavior can be taken
into account in the hypothesis hi ).
Isabelle: The above presentation is not so clear about what locality is a
property of, but we propose the following formalisation, using a notion
of trajectory according to the restriction for Bob, to be faithful to the
original paper:
definition
  trajectory_b_restr :: "('A,'B,'M) system ⇒ 'A strategy_function ⇒ 'B ⇒ 'M trajectory"
  where "trajectory_b_restr s alpha b = (λt. (state_function s) t (alpha t) b)"

definition
  stabilizes_aux :: "'A strategy_function ⇒ 'A ⇒ bool"
  where "stabilizes_aux alpha a = (∃t0. ∀t>t0. alpha t = a)"

definition
  stabilizes :: "'A strategy_function ⇒ 'A ⇒ 'B ⇒ 'B hypothesis ⇒ ('A,'B,'M) system ⇒ bool"
  where "stabilizes alpha a b h s =
    (stabilizes_aux alpha a ∧
     (∀t≥1. ¬ is_tester s h t alpha (trajectory_b_restr s alpha b)))"

definition
  local :: "('A,'B,'M) system ⇒ 'A set ⇒ 'B set ⇒ bool"
  where "local s A B = (∀b∈B. ∀alpha∈(strategy_functions A). ∀a∈A. ∀h∈hypotheses B.
    (stabilizes alpha a b h s −→ is_terminator a b s))"
Theorem 1. Suppose that Alice has the algorithms enumeration, tester,
and find-terminator defined in Section 3.1. Suppose also that the locality
condition is satisfied. Then Alice has an algorithm that, using these algorithms
as subroutines, computes a winning strategy function for her.
Proof. Alice needs to define her strategy function α at any time instant t. Let b ∈
B be a strategy that Bob actually uses. Alice does not know this actual strategy
b but she has hypotheses h_0, h_1, . . . that “cover” all of Bob's possible choices. Alice
is going to test them one by one.
Consider time instant t = 0. Assuming that Bob's actual strategy b satisfies
h_0, Alice uses find-terminator to find a terminator a_0 for b. She sets α(0)
to a_0. If S(0, a_0, b) is a terminal state³, Alice wins and her winning strategy
function is a_0 for all time instants t ≥ 0. Otherwise, Alice proceeds with the
next time instant t = 1, and she changes neither her current strategy a_0 nor her
hypothesis h_0.
Now consider time instant t > 0. Let a_i = α(t − 1) be Alice's strategy at the
previous time instant and let h_i be her hypothesis about Bob's actual strategy
at time instant t − 1. She runs tester to check whether h_i is refuted at time t
or not. If not, Alice moves on to the next time instant t + 1, changing neither
her current strategy a_i nor her hypothesis h_i. Otherwise, i.e., if tester returns
1 (the hypothesis is refuted), Alice changes both her strategy and hypothesis.
Namely, her new strategy is a_{i+1} and her new hypothesis is h_{i+1} (she finds them
using enumeration and find-terminator). If S(t, a_{i+1}, b) is a terminal state,
Alice wins and α is set to a_{i+1} for all further time instants. If S(t, a_{i+1}, b) is
not a terminal state, she moves on to the next time instant t + 1 with the same
strategy and hypothesis.
Continuing this process, we define Alice’s strategy function α for all time
instants t. Why is α a winning strategy function?
Since all hypotheses h_0, h_1, . . . form a covering of B, there exists j such that
Bob's actual strategy b satisfies h_j. There are two possibilities:
1. All hypotheses h_0, h_1, . . . , h_{j−1} are successively refuted and h_j appears as a
new hypothesis at some time instant t_0.
2. There is a hypothesis h_i with i < j such that h_i appears as a hypothesis at
some step t_0 and h_i is never refuted.
In the former case, Alice wins because a_j is a terminator for b and, therefore, a_j
eventually brings the system into a terminal state. In the latter case, α stabilizes
at a_i and b under h_i. Then, by locality, a_i is a terminator for b. Therefore, Alice
wins in this case as well. ⊓⊔
Isabelle: The Isabelle version of the theorem is as follows:
theorem alice_wins:
  assumes
    covers_restr: "covers H B"
    and enum_alg: "range (h_enum::(nat ⇒ 'B hypothesis)) = H"
    and tester_alg: "tester = is_tester s"
    and terminator_alg: "∀h∈H. ∀b∈h. (is_terminator (find_terminator h) b s)"
    and locality: "local s A B"
  shows "∀b∈B. ∃t. (fst (testing_trajectory t s h_enum tester find_terminator b) t) ∈ terminal s"
Let us discuss the “restrictions” and “availability of algorithms” mentioned above:
– The restriction that Bob must stick to his strategy is implicit in the
definition of testing trajectory, which is parametrised by a plain strategy for Bob. It is also present in the definition of locality, although
one could envisage other formulations.
³ It is really good luck for Alice not only to guess the right hypothesis but also to
get a terminator that brings the system into a terminal state at t = 0!
– The restriction that Alice's hypotheses cover Bob's strategies is made explicit by covers_restr.
– The restriction that Alice has a terminator is subsumed by the existence of a correct algorithm for computing this terminator, i.e.,
terminator_alg.
– The restriction that the hypothesis set should be enumerable is given
by the fact that h_enum is indeed an enumeration (a function from
the natural numbers).
– The restriction that Alice has a tester is made explicit by tester_alg.
– Locality is made explicit by the assumption locality.
We have proven “one half” of the theorem in Isabelle; that is to say, we
have not yet proven case 2 above, in which the good hypothesis is never
enumerated, “sorry” about that! The following outline gives some cornerstones of the proof:

proof
  (* ... identify the hypothesis h and its number j ... *)
  hence bh: "b∈h" by (rule conjunct2)
  (...)
  then obtain j where j_correct_hypothesis: "h = h_enum j" by (rule rangeE)
  (* ... either the correct hypothesis is eventually reached or not ... *)
  have twocases: "¬(∃t0. snd (testing_trajectory t0 s h_enum tester find_terminator b) t0 = j) ∨
      (∃t0. snd (testing_trajectory t0 s h_enum tester find_terminator b) t0 = j)" (is "¬?reach ∨ ?reach")
    by (rule excluded_middle)
  { assume "¬?reach"
    have "∃t. fst (testing_trajectory t s h_enum tester find_terminator b) t ∈ terminal s" sorry }
  moreover
  { assume "?reach"
    (* identify the time point at which the correct hypothesis is reached *)
    then obtain t_stable where
      "snd (testing_trajectory t_stable s h_enum tester find_terminator b) t_stable = j" by (rule exE)
    (* ... show by induction that testing_trajectory is constant from now on:
       several 100s of lines! ... *)
    (* identify the first time point t_win at which the "good" strategy would win if
       it had been played from the beginning: some dozen lines ... *)
    (* show that at max(t_stable, t_win), Alice indeed wins *) }
  ultimately show
    "∃t. fst (testing_trajectory t s h_enum tester find_terminator b) t ∈ terminal s"
    by (rule disjE[OF twocases])
qed
The entire Isar proof script we developed is around 1700 lines long and
we estimate that it could be reduced to a few hundred lines by removing
all unnecessary lemmas and organising the proofs more intelligently.
As a preliminary summary, it can be said that modelling the existing
formalism in Isabelle went smoothly at first but turned out to be rather
mind-boggling later on. Even if none of the proofs had been completed,
the experience of Isabelle modelling would have been worthwhile, because
it helped the second author to understand in depth the work written by
the two other authors; in the future, the “paper-and-pencil” version will
certainly benefit from the insights gained. As stated above, the Isabelle
proof of the main theorem currently has a clear-cut hole, but we are
optimistic that this hole can eventually be filled.
3.3 Application to the Examples
Corollary 1 (Application of Theorem 1 to Nulling Game). Suppose that
the set of functions in this game is the set of primitive recursive functions.
Then, if Alice has an algorithm that computes a universal function for primitive
recursive functions, she can use it as a subroutine to compute a winning strategy
for her.
Proof. Let universal be Alice’s algorithm for computing a certain universal
function for primitive recursive functions. We show that
– the algorithms enumeration, tester, and find-terminator can be implemented using universal as a subroutine;
– the locality condition is satisfied.
Let f0 , f1 , . . . be the enumeration of primitive recursive functions given by
universal. Alice’s hypotheses h0 , h1 , . . . are as follows: hi is the hypothesis that
Bob actually uses fi . Thus, hypotheses can be identified with natural numbers,
and enumeration is trivial: return i on input i.
Consider tester on input (hi, t, α|t, τ|t), where hi is a hypothesis, t is a
time instant, α|t is the sequence a0, . . . , at−1 of indexes of functions chosen by
Alice before t (the “history”), and τ|t is the sequence s0, . . . , st−1 of states before t
(the “observable trajectory”). It is not difficult to see that, using universal and
knowing this information, tester can check whether Bob's actual strategy b coincides with fi on
the time interval {0, 1, . . . , t − 1}. In the case of coincidence, hi is not refuted;
otherwise hi is refuted.
The algorithm find-terminator is trivial: given a number i, return fi . That
is, a terminator for any function fi is this function itself.
It remains to make sure that the locality condition is satisfied. Indeed, suppose that for some hypothesis hi and for all time instants t, the hypothesis hi
is not refuted. This means that the function fi agrees on N with Bob’s actual
strategy fj . Therefore, fi is a terminator not only for fi itself but also for fj , as
required in the locality condition.
⊓⊔
Corollary 2 (Application of Theorem 1 to Hide-and-Seek Game). Suppose that the metric space in this game has a finite or countably infinite ε-net.
Then, if Alice has an algorithm that computes such a net, she can use it as a
subroutine to compute a winning strategy for her.
Proof. Suppose Alice has an algorithm that enumerates points c0, c1, . . . that
form an ε-net for the underlying metric space. Then the algorithm enumeration
for enumerating hypotheses h0, h1, . . . is basically the same algorithm: hi is the
hypothesis that Bob's point b is in the ε-ball around ci. The algorithm find-terminator is also the same algorithm: ci is a terminator for any point b in the
ε-ball around ci. The algorithm tester compares ε with the distance between
b and Alice's point chosen at time instant t (by definition of the game, this
distance is the state at t). The hypothesis is refuted if the distance is larger than
ε, and it is not refuted otherwise. The locality condition is trivially satisfied. ⊓⊔
4 Refinements
The algorithm that constructs a winning strategy in the proof of Theorem 1 admits
many refinements. For example, we assumed that find-terminator outputs a
terminator taking as input only the number of a hypothesis. We could extend
the applicability of Theorem 1 by assuming that find-terminator takes more
information as input. The most natural extension is to search for a terminator
using information about all previous states known to Alice.
Modified definition of find-terminator. It is now an algorithm that takes as
input not only i ∈ N but also
– time instant t;
– the “history” α|t of Alice’s choices before t;
– the “observable” part τ |t of the trajectory.
Let b be Bob’s actual strategy. We write αi,b to denote Alice’s strategy function
computed by find-terminator recursively, when hi and b are fixed. That is,
find-terminator computes Alice’s strategy αi,b (0) for time instant t = 0, and
then for any time instant t > 0, find-terminator computes αi,b (t) using itself
as a subroutine. We assume further that if b satisfies hi then find-terminator
outputs Alice’s strategy ai such that
– αi,b stabilizes at ai and b under hi ;
– ai is a terminator for b.
Modified locality condition. In the notation as above, let
α0,b , α1,b , . . . , αi,b , . . .
be all strategy functions for Alice returned by find-terminator. Recall that,
as we assume in the modified definition of find-terminator, every strategy
function αi,b stabilizes at ai and b under hi . Then ai is a terminator for b (even
if b does not satisfy hi ).
Theorem 2. Suppose that Alice has the algorithms enumeration, tester,
and the modified version of find-terminator. Suppose also that the modified
locality condition is satisfied. Then Alice has an algorithm that, using these algorithms as subroutines, computes a winning strategy function for her.
Proof. The proof of Theorem 1 is easily generalized to these modified settings. ⊓⊔
Another possible refinement is connected with the ordering of hypotheses
returned by the algorithm enumeration. Obviously, this order plays a very
important role, and it should be optimized whenever possible.
Example 3 (Nulling Game with Periodic Functions). This game is a special case
of Example 1 where Bob’s strategies are periodic functions. That is, for any
strategy b ∈ B, there exists a period p ∈ N such that b(t + p) = b(t) for all t ∈ N.
Suppose that in this game, Alice seeks not only to win but also to minimize
the winning time, i.e., the time needed to reach a terminal state.
This time depends on the order in which Alice enumerates hypotheses about the
actual strategy of Bob. What order could she use to shorten the winning time?
Any strategy b of Bob can be identified with a p-dimensional vector of natural
numbers, where p is the period of b. A straightforward approach for Alice is to
apply Theorem 1 with an enumerator that enumerates all finite vectors of natural
numbers. In this approach, Alice makes no use of the known fact that Bob's
strategies are only periodic functions.
A more efficient approach is to make hypotheses about the period p of Bob's
actual function b. Namely, Alice makes hypotheses h0, h1, . . . where hi is
the hypothesis that p = i. An efficient winning strategy for her can be obtained by
applying Theorem 2 as follows.
Corollary 3 (Application of Theorem 2). In Nulling Game with Periodic
Functions, Alice can win not later than at time instant p+1 where p is the period
of the function b chosen by Bob.
Proof. How can Alice test the hypothesis hi that p = i? At any time instant
t > 0, Alice can learn the values of b at all previous time instants (as was
done for Nulling Game in Corollary 1). In particular, she learns b(t − i) and
chooses her strategy
α(t) = b(t − i)
hoping that p = i. If the resulting state τ(t) is 0, she wins; otherwise she moves
on to the next hypothesis hi+1. Clearly, this strategy function guarantees her a win
at time instant at most p + 1. ⊓⊔
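A minimal Python simulation of this strategy (our own sketch; representing the state at time t as b(t) − α(t) is an assumption, and all names are ours):

```python
def play_periodic_nulling(b, horizon=50):
    """Corollary 3 strategy sketch: under hypothesis h_i ("Bob's period is i"),
    Alice plays alpha(t) = b(t - i), which she knows from past states.
    The state is assumed to be b(t) - alpha(t); state 0 is terminal.
    Returns the time instant at which Alice wins, or None."""
    i = 1                              # current hypothesis: period p = i
    for t in range(1, horizon):
        if t - i < 0:                  # not enough history yet for this hypothesis
            continue
        alpha = b(t - i)               # Alice learned b's past values from the states
        state = b(t) - alpha
        if state == 0:
            return t                   # terminal state reached: Alice wins
        i += 1                         # hypothesis p = i refuted; try p = i + 1
    return None
```

For a periodic function with period 3, such as b(t) = t mod 3, the simulation wins at time instant 3 ≤ p + 1, in line with the corollary.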
There are other possible refinements. For example, analogues of Theorems 1
and 2 can be proved for the case when states may depend directly not only on
the time but also on previous states. Another interesting refinement is to use
randomness in making and testing hypotheses. For example, we could modify
Example 2 as follows. Let the underlying metric space A = B be a bounded open
subset of Rn with the Euclidean metric. Suppose that Alice has two additional
algorithms:
– an algorithm that checks whether the distance between two given points in
A is greater than ε or not;
– a random point generator for the uniform distribution on A.
Alice generates “random” hypotheses one by one. In addition, she checks for each
hypothesis hi whether the distance from hi to each of the previously generated
hypotheses h0, . . . , hi−1 is greater than ε. If so, she tests hi. If not, she replaces
hi by a new random hypothesis. Continuing in this way, Alice will with
probability 1 obtain a hypothesis (point) hj such that hj is at distance greater
than ε from all previous hypotheses and within distance at most ε from the
point b chosen by Bob.
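The rejection step can be sketched in Python as follows (a hypothetical instance with A the open unit square in R² and ε = 0.25; all names are ours):

```python
import math
import random

EPS = 0.25

def dist(p, q):
    """Euclidean distance in R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def random_hypotheses(n, rng=random.Random(0)):
    """Generate n random hypothesis points in the unit square, rejecting any
    candidate within distance EPS of a previously kept hypothesis
    (the rejection step described above)."""
    kept = []
    while len(kept) < n:
        candidate = (rng.uniform(0, 1), rng.uniform(0, 1))
        if all(dist(candidate, h) > EPS for h in kept):
            kept.append(candidate)
    return kept
```

By construction, every kept pair of hypotheses is more than ε apart; with probability 1 some kept hypothesis eventually lands within ε of Bob's point.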
5 Conclusions and Future Work
In this paper we developed a general framework for two-player games evolving
in time, where one player computes a winning strategy function via making and
testing hypotheses about the opponent’s strategy. A key point is that a winning
strategy function can be computed without identifying the opponent’s strategy
b and even if the problem of identifying b is undecidable. We give sufficient
conditions that guarantee the computability of a winning strategy function. In
particular, these conditions include the following:
– the set of hypotheses “covers” all possible strategies of the opponent;
– the set of hypotheses is enumerable;
– the locality condition, where “locality” refers to “local” information accessible to the first player on a finite time interval {0, 1, . . . , t − 1}: only this
“observable” part of the trajectory is used to compute a winning strategy
function.
Our approach admits refinements and extensions. Some of them are described
in Section 4; others are work in progress. Some extensions can be obtained by
developing the technique used in this paper; others require completely different
technical tools. In particular, a different type of technique is required to analyze
complexity aspects of the approach. Here are examples of questions arising in this
analysis:
– Given a specific algorithm for computing a winning strategy function α, what
time and space are needed to compute α(t) for any time instant t? Can we
establish upper or lower complexity bounds? Given a specific game, can we find
an algorithm with feasible computational complexity?
– Suppose that a winning strategy function is computable, but Alice cannot
compute it since her computational power is limited and there are time or
space constraints on computation. What would be the best feasible strategy
function for Alice in this case?
– Each winning strategy function has its own winning time, i.e., the time
needed to reach a terminal state. How can Alice compute a winning strategy
function with the best (or just acceptable) winning time?
Acknowledgment
We are grateful to the anonymous referees for their careful analysis and valuable
suggestions. Most of them have been implemented in the final version.
References
1. Evgeny Dantsin and Sergei Soloviev. Algorithms in Games Evolving in Time: Winning Strategies Based on Testing Hypotheses. In Proceedings of the 2nd Workshop on Games for Design, Verification and Synthesis (CONCUR'10 workshop), 2010.
2. Stephen Cole Kleene. Introduction to Metamathematics. D. Van Nostrand Co., Inc., 1952.
3. Noam Nisan, Tim Roughgarden, Éva Tardos, and Vijay V. Vazirani, editors. Algorithmic Game Theory. Cambridge University Press, 2007.
4. Michael O. Rabin. Effective computability of winning strategies. Annals of Mathematics Studies, 39:147–157, 1957.
5. Sergei Soloviev. Logical and mathematical aspects of algorithmics and randomness in strategies of choice. In Proceedings of the 8th National Conference on Logic and Methodology of Science (Palanga), pages 4–5, 1982. In Russian.
6. Sergei Soloviev. Asymmetric games and game semantics: Some philosophical consequences. In Proceedings of the International Conference on Philosophy, Mathematics, Linguistics: Aspects of Interaction (St. Petersburg), pages 188–191, 2009.