Glizer, V. Y. and V. Turetsky. (2008) "Complete Solution of a Differential Game," Applied Mathematics Research eXpress, Vol. 2007, Article ID abm012, 49 pages. doi:10.1093/amrx/abm012

Complete Solution of a Differential Game with Linear Dynamics and Bounded Controls

Valery Y. Glizer1,2 and Vladimir Turetsky1
1 Faculty of Aerospace Engineering, Technion—Israel Institute of Technology, Haifa 32000, Israel
2 Department of Mathematics, Ort Braude Academic College of Engineering, P.O.B. 78, Karmiel 21982, Israel

Correspondence to be sent to: Vladimir Turetsky, Faculty of Aerospace Engineering, Technion—Israel Institute of Technology, Haifa 32000, Israel. e-mail: [email protected]

A zero-sum finite-horizon differential game with linear dynamics and bounded controls is considered. The target set is a given hyperplane in the state space. The cost function is the distance between the terminal state and this hyperplane. The complete game solution is obtained in two classes of controls: open-loop and feedback controls.

1 Introduction

A differential game is an appropriate mathematical model for real-life control problems that either involve several decision makers or contain a high degree of uncertainty. There is a rich literature devoted to the theory of differential games (see e.g. [1–5]). A zero-sum finite-horizon differential game with linear dynamics and bounded controls has been studied extensively in the literature because of its considerable importance both in theory and in applications (see e.g. [6–10] and the references therein). Important applications of this game include the pursuit-evasion problem (see e.g. [11–14]), the problem of airplane landing under windshear conditions (see [15] and the references therein), and others. Different versions of this game have been analyzed in the literature. A simple example with ideal dynamics of the players was considered in [6].
Received November 29, 2006; Accepted November 28, 2007. Communicated by Bud Mishra. © The Author 2008. Published by Oxford University Press. All rights reserved.

The game with a first-order dynamics of the first player (pursuer) and an ideal dynamics of the second player (evader) was studied in [7]. The time-invariant game with first-order dynamics of both players was solved in [8]. This result was extended to the time-varying case in [10]. In all of these works, the feedback solution was obtained by using necessary conditions based on the Maximin Principle [6]. In [9], the method of stochastic program synthesis was applied to obtain a feedback solution of the game with rather general linear dynamics and cost functional, and a detailed analysis was carried out for a two-dimensional example.

The structure of the solutions obtained in [6, 8, 10] and in the example considered in [9] is determined by a scalar function of time (the determining function), constructed from the system dynamics and the cost functional. In these works, only the cases where the determining function either is of constant sign or changes sign once during the game were treated. Although cases with more than one sign change of the determining function occur in practice [16], the general case involving an arbitrary number of sign changes has, to the best of the authors' knowledge, not yet been studied properly.

In this paper, a game with n-dimensional time-varying linear dynamics and bounded scalar controls is solved. The cost function is the distance between the terminal system state and a given hyperplane. Two types of solution are obtained, namely, the open-loop and the feedback solution.
For the feedback solution, the case where the determining function changes its sign an arbitrary (but finite) number of times is investigated in detail.

The paper is organized as follows. In Section 2, the problem statement is presented, including the scalarized version of the original game. In Section 3, the open-loop solution is obtained and its analysis is carried out. The feedback solution is derived in Section 4. Concluding remarks are given in Section 5. In the Appendix, the proofs of some theorems are presented.

2 Problem Statement

2.1 Original game formulation

Consider the differential game with the dynamics described by the equation

ẋ = A(t)x + b(t)u + c(t)v + f(t), t ∈ [t0, tf],   (1)

x(t0) = x0,   (2)

where x ∈ Rn is the state vector; u and v are scalar controls of the first and second players, respectively; t0 and tf (t0 < tf) are fixed time instants; the matrix function A(t) and the vector functions b(t), c(t), f(t) are given and assumed continuous for t ∈ [t0, tf]; x0 ∈ Rn is a given initial state. The controls of the players are assumed to be measurable on [t0, tf] and to satisfy the constraints

|u(t)| ≤ 1, |v(t)| ≤ 1, t ∈ [t0, tf].   (3)

The set of all such functions is denoted by C.

Now, we introduce the hyperplane

D = {x ∈ Rn | dᵀx + d0 = 0},   (4)

where d = (d1, d2, ..., dn)ᵀ ∈ Rn is a prescribed nonzero vector and d0 is a prescribed scalar. The objective of the first player (minimizer) is to minimize the distance between the terminal state x(tf) and the hyperplane D, while the second player (maximizer) tends to maximize it. This distance is given by |dᵀx(tf) + d0|/‖d‖, where ‖d‖ = (d1² + d2² + ··· + dn²)^{1/2} is the Euclidean norm of the vector d. Since ‖d‖ > 0 is constant, the cost function, evaluating the performance of the minimizer and the maximizer, can be chosen as

Jx = Jx(u, v) = |dᵀx(tf) + d0|.   (5)

Below, the feasibility of this formally defined cost function is justified.
The dynamics equation (1) with the initial condition (2), the control constraints (3), and the cost function (5), along with the objectives of the players, constitute the differential game. We call this differential game the Game with Bounded Controls (GBC). Below, we present two types of the GBC solution: (a) the open-loop solution (the controls of the players are functions of the time only) and (b) the feedback solution (the controls of the players are functions of the current game position in the (t, x)-space).

2.2 Reduced game

By the transformation of the state variable in (1),

z = z(t, x) = dᵀΦ(tf, t)x + dᵀ∫_t^{tf} Φ(tf, τ)f(τ)dτ + d0,   (6)

where Φ(tf, t) is the transition matrix of the equation ẋ = A(t)x, this vector differential equation and the respective initial condition can be reduced to the scalar initial value problem

ż = h1(t)u + h2(t)v,   (7)

z(t0) = z0,   (8)

with

h1(t) = dᵀΦ(tf, t)b(t), h2(t) = dᵀΦ(tf, t)c(t),   (9)

z0 = dᵀΦ(tf, t0)x0 + dᵀ∫_{t0}^{tf} Φ(tf, τ)f(τ)dτ + d0.   (10)

Note that Φ(tf, t) satisfies

dΦ(tf, t)/dt = −Φ(tf, t)A(t), Φ(tf, tf) = In.   (11)

In the sequel, it is assumed that the functions h1(t) and h2(t) have no more than a finite number of distinct zeros in the interval [t0, tf]. Due to (6), the cost function (5) is reduced to

Jz = |z(tf)|.   (12)

The dynamics equation (7) with the initial condition (8), the control constraints (3), and the cost function (12) constitute the reduced game with bounded controls (RGBC), in which the first player minimizes (12), while the second player maximizes it.

3 Open-Loop Solution

3.1 Main definitions

For any functions u(·) ∈ C and v(·) ∈ C, the initial value problem (1)–(2) has a unique absolutely continuous solution x(t) for t ∈ [t0, tf]. Moreover, there exists the finite limit

lim_{t→tf−0} x(t) = x(tf),   (13)

which justifies the feasibility of the cost function (5).

Definition 1.
For a fixed initial state x0, the control u∗(·) ∈ C is called the optimal open-loop minimizer control in the GBC if, for any u(·) ∈ C,

sup_{v(·)∈C} Jx(u∗(·), v(·)) ≤ sup_{v(·)∈C} Jx(u(·), v(·)).   (14)

The value

Jxu = sup_{v(·)∈C} Jx(u∗(·), v(·))   (15)

is called the upper value in open-loop controls of the GBC.

Definition 2. For a fixed initial state x0, the control v∗(·) ∈ C is called the optimal open-loop maximizer control in the GBC if, for any v(·) ∈ C,

inf_{u(·)∈C} Jx(u(·), v∗(·)) ≥ inf_{u(·)∈C} Jx(u(·), v(·)).   (16)

The value

Jxl = inf_{u(·)∈C} Jx(u(·), v∗(·))   (17)

is called the lower value in open-loop controls of the GBC. It is well known [17] that Jxu ≥ Jxl.

Definition 3. If

Jxu = Jxl = Jx∗,   (18)

then Jx∗ is called the value in open-loop controls of the GBC, and the pair {u∗(·), v∗(·)} is called the saddle point in open-loop controls of the GBC. By the well-known Saddle Point Inequality [5], for any u(·), v(·) ∈ C,

Jx(u∗(·), v(·)) ≤ Jx∗ = Jx(u∗(·), v∗(·)) ≤ Jx(u(·), v∗(·)).   (19)

Note that the RGBC can be considered as a particular case of the GBC. Hence, all these notions (the existence and uniqueness of the solution of the differential equation, the existence of the solution limit for t → tf − 0, the game values and saddle point definitions, and the saddle point inequality) can be directly reformulated for the RGBC.

3.2 Equivalence of GBC and RGBC

Lemma 1. If u∗(·) ∈ C (v∗(·) ∈ C) is an optimal open-loop minimizer (maximizer) control in the GBC, then it is an optimal open-loop minimizer (maximizer) control in the RGBC, and vice versa.

Proof. We prove the lemma for the minimizer control; for the maximizer control, the proof is similar. For any pair (u(·), v(·)), let x(t) and z(t) be the unique solutions of (1)–(2) and (7)–(10), respectively. Then, for z0 = z(t0, x0),

Jx(u(·), v(·)) = Jz(u(·), v(·)).   (20)

Let u∗(·) ∈ C be an optimal open-loop minimizer control in the GBC, i.e. (14) is valid for all u(·) ∈ C.
The latter, along with (20), means that u∗(·) ∈ C is an optimal open-loop minimizer control in the RGBC.

Conversely, let u∗(·) ∈ C be an optimal open-loop minimizer control in the RGBC, i.e. for any u(·) ∈ C,

sup_{v(·)∈C} Jz(u∗(·), v(·)) ≤ sup_{v(·)∈C} Jz(u(·), v(·)).   (21)

Assume that u∗(·) is not optimal in the GBC. This means that there exists u∗∗(·) ∈ C such that

sup_{v(·)∈C} Jx(u∗(·), v(·)) > sup_{v(·)∈C} Jx(u∗∗(·), v(·)).   (22)

Hence, due to (20), the same inequality holds for Jz, contradicting the optimality of u∗(·). This contradiction proves that u∗(·) is optimal in the GBC, completing the proof of the lemma.

The following two propositions are direct consequences of Lemma 1.

Corollary 1. The upper (lower) values of the GBC and the RGBC coincide with each other: Jxu = Jzu (Jxl = Jzl).

Corollary 2. If (u∗(·), v∗(·)) is a saddle point in the GBC, then it is a saddle point in the RGBC, and vice versa. Moreover, the values of these games are equal: Jx∗ = Jz∗.

3.3 Solution of RGBC

The following theorem establishes conditions for the existence of the RGBC saddle point in open-loop controls.

Theorem 1. If

|z0| ≥ ∫_{t0}^{tf} |h1(t)|dt,   (23)

then the saddle point (u∗(t), v∗(t)) (in open-loop controls) of the RGBC exists and its components are given by

u∗(t) = −(sign z0)(sign h1(t)),   (24)

v∗(t) = (sign z0)(sign h2(t)).   (25)

The game value is

Jz∗ = |z0| − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.   (26)

The proof of the theorem is presented in the Appendix. The following corollary is a direct consequence of Theorem 1.

Corollary 3. If (23) holds, then the optimal trajectory z∗(t), generated by the saddle point controls (24)–(25), preserves its sign, i.e.

sign(z∗(t)) = sign z0, t ∈ [t0, tf].   (27)

Now, consider the case

|z0| < ∫_{t0}^{tf} |h1(t)|dt.   (28)

The following two theorems are proved in the Appendix.

Theorem 2.
In the case (28), the optimal open-loop minimizer control u∗(·) is any function from C satisfying the integral equation

z0 + ∫_{t0}^{tf} h1(t)u∗(t)dt = 0,   (29)

and the upper value of the RGBC is

Jzu = ∫_{t0}^{tf} |h2(t)|dt.   (30)

Theorem 3. Let (28) be valid. If

|z0| − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt ≤ 0,   (31)

then the optimal open-loop maximizer control v∗(·) is an arbitrary function from C, and the lower value of the RGBC is

Jzl = 0.   (32)

If

|z0| − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt > 0,   (33)

then the optimal open-loop maximizer control v∗(·) is

v∗(t) = (sign z0)(sign h2(t)) for z0 ≠ 0, v∗(t) = ±sign h2(t) for z0 = 0,   (34)

and the lower value of the RGBC is

Jzl = |z0| − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.   (35)

Corollary 4. In the case (28), the RGBC has no saddle point.

Proof. The statement of the corollary follows directly from a comparison of the expression (30) for Jzu with the expressions (32) and (35) for Jzl. Indeed, in the case (31),

Jzu = ∫_{t0}^{tf} |h2(t)|dt > 0 = Jzl.   (36)

In the case (33), due to (28),

Jzu = ∫_{t0}^{tf} |h2(t)|dt > ∫_{t0}^{tf} |h2(t)|dt + |z0| − ∫_{t0}^{tf} |h1(t)|dt = Jzl.   (37)

Summarizing Theorems 1–3 and Corollary 4 yields the following proposition.

Theorem 4. The RGBC has a saddle point in open-loop controls if and only if the condition (23) is valid.

Thus, we have obtained the complete open-loop solution of the RGBC. Due to Lemma 1 and Corollaries 1–2, the obtained solution is also an open-loop solution of the GBC.

Remark 1. Note that in the proofs of Theorems 1–3, an optimal control problem was solved. This problem consists of minimizing Jz by the minimizer open-loop control u(·) ∈ C, while the minimizer knows the maximizer control v(·) ∈ C. Such a minimizer control, based on the knowledge of the maximizer control, is called a countercontrol. The optimal value of Jz and the respective optimal minimizer countercontrol are given as follows.
(I) If

|z0 + ∫_{t0}^{tf} h2(t)v(t)dt| ≤ ∫_{t0}^{tf} |h1(t)|dt,   (38)

then

Jzc(v(·)) = 0,   (39)

and in the class Cu, given by (A.12), the optimal control is

uc(t) = −[(z0 + ∫_{t0}^{tf} h2(t)v(t)dt) / ∫_{t0}^{tf} |h1(t)|dt] sign h1(t).   (40)

(II) If

z0 + ∫_{t0}^{tf} h2(t)v(t)dt < −∫_{t0}^{tf} |h1(t)|dt,   (41)

then

Jzc(v(·)) = −z0 − ∫_{t0}^{tf} h2(t)v(t)dt − ∫_{t0}^{tf} |h1(t)|dt,   (42)

and in the class Cu, the optimal control is

uc(t) = sign h1(t).   (43)

(III) If

z0 + ∫_{t0}^{tf} h2(t)v(t)dt > ∫_{t0}^{tf} |h1(t)|dt,   (44)

then

Jzc(v(·)) = z0 + ∫_{t0}^{tf} h2(t)v(t)dt − ∫_{t0}^{tf} |h1(t)|dt,   (45)

and in the class Cu, the optimal control is

uc(t) = −sign h1(t).   (46)

Example 1. In this example, h1(t) = t² − 4t + 3, h2(t) = t²/2 − 3t + 4, tf = 5.

Fig. 1 Decomposition of the (t0, z0)-plane in Example 1.

Fig. 2 Saddle point trajectory in Example 1.

In Figure 1, the decomposition of the (t0, z0)-plane for the RGBC open-loop solution is shown. The solid line depicts the boundary of the region of all initial positions (t0, z0) satisfying the inequality (23). Due to Theorem 1, this region consists of all initial positions for which there exists a saddle point of the RGBC in open-loop controls. It is called the saddle point region (SPR). All the initial positions between the two dashed lines in this figure satisfy the inequality (31). For each of these positions, due to Theorem 3, there exists a countercontrol guaranteeing the zero outcome of the game. The set of these positions is called the successful countercontrol region (SCR).

Fig. 3 Successful countercontrol trajectory in Example 1.

Fig. 4 Upper value and lower value trajectories in Example 1.

In Figure 2, the optimal trajectory of the game, starting from the SPR, is depicted. It is generated from the initial position (t0, z0) = (0, 10) by the pair of optimal controls (24)–(25), constituting the saddle point of the RGBC in open-loop controls.
In this example, the outcome of the game coincides with the game value Jz∗ = 5.33.

Figure 3 shows the game trajectory generated by the maximizer control v(t) ≡ 1 and the minimizer countercontrol (40). This trajectory starts from the initial position (0, 4), falling in the SCR. It is seen that in this case, by using the countercontrol, the minimizer provides the zero game outcome.

In Figure 4, two game trajectories, starting from the initial position (0, 6.5), are depicted. This initial position belongs neither to the SPR nor to the SCR. This means that for this point, the RGBC has no saddle point, and the minimizer cannot provide the zero game outcome against every maximizer control, even by using a countercontrol. The first trajectory (the dotted line) is generated by the minimizer optimal control

u∗(t) ≡ −z0 / ∫_{t0}^{tf} h1(t)dt = −0.975,   (47)

and the maximizer optimal control v∗(t) = sign h2(t). Note that (47) is a solution of the integral equation (29), while v∗(t) is obtained by (34). In this case, the outcome of the game coincides with the upper game value Jzu = 4.67. The second trajectory (dash-dotted line) is generated by the same (optimal) maximizer control and the minimizer countercontrol uc(t) = −sign h1(t) [see (46)]. In this case, the outcome of the game coincides with the lower game value Jzl = 1.83.

Example 2. In this example, h1(t) = 0.366(t² − 4t + 3), h2(t) = t² − 6t + 8, tf = 5.

Fig. 5 Decomposition of the (t0, z0)-plane in Example 2.

The decomposition of the (t0, z0)-plane is shown in Figure 5. It is seen that, in contrast to Example 1, the SCR is a disconnected set in the (t0, z0)-plane. In Figure 6, two successful countercontrol trajectories for v(t) ≡ 1, starting from the two parts of the SCR, are depicted.
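The values quoted in Example 1 can be reproduced numerically. The following sketch (our illustration, not the authors' code; the names `integral`, `I1`, `I2`, etc. are ours) checks the formulas (26), (30), (35), (40), and (47) against the figures reported above, using a plain midpoint quadrature rule:

```python
# Numerical check (a sketch, not the authors' code) of the open-loop
# values of Theorems 1-3 and of the countercontrol outcome, for the
# Example 1 data: h1(t) = t^2 - 4t + 3, h2(t) = t^2/2 - 3t + 4 on [0, 5].

t0, tf = 0.0, 5.0
h1 = lambda t: t * t - 4 * t + 3
h2 = lambda t: t * t / 2 - 3 * t + 4

def integral(f, a, b, n=100_000):
    # composite midpoint rule
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

I1 = integral(lambda t: abs(h1(t)), t0, tf)  # int |h1| over [t0, tf]
I2 = integral(lambda t: abs(h2(t)), t0, tf)  # int |h2| over [t0, tf]

# Theorem 1: z0 = 10 satisfies (23) (10 >= I1), and (26) gives the value.
J_star = 10.0 - I1 + I2
print(round(J_star, 2))  # 5.33, as quoted for Figure 2

# z0 = 6.5 satisfies (28): upper value (30); since (33) holds, lower value (35).
J_up = I2                # 4.67, as quoted for Figure 4
J_low = 6.5 - I1 + I2    # 1.83, as quoted for Figure 4

# The constant minimizer control (47) solving the integral equation (29):
u_star = -6.5 / integral(h1, t0, tf)  # -0.975

# Countercontrol, case (I): for z0 = 4 and v(t) = 1, condition (38)
# holds, so (39) gives the zero outcome seen in Figure 3.
assert abs(4.0 + integral(h2, t0, tf)) <= I1
```

The closed-form values here are |z0| − 28/3 + 14/3 for (26) and (35), and 14/3 for (30), so the quadrature only needs modest accuracy to match the two-decimal figures of the paper.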
4 Feedback Solution

4.1 Main definitions

In this section, in contrast to Section 3, we assume that the minimizer and maximizer controls are functions not only of the time t but also of the current state x, i.e. the players use the feedback strategies u(t, x), v(t, x). This assumption implies that perfect state information is available to both players.

Fig. 6 Successful countercontrol trajectories in Example 2.

Let Gx be the set of functions g(t, x), t ∈ [t0, tf], x ∈ Rn, satisfying the inequality

|g(t, x)| ≤ 1, t ∈ [t0, tf], x ∈ Rn.   (48)

Definition 4. A pair of functions (u(t, x), v(t, x)) ∈ Gx × Gx = Fx is called admissible.

Remark 2. Since at least one of the functions u(t, x), v(t, x) may be discontinuous with respect to x, the classical Carathéodory notion of a differential equation solution may not be applicable to (1) for u = u(t, x), v = v(t, x). Thus, in the sequel, the solution of (1) is understood in the sense of Krasovskii's constructive motion [3, Section 6]. This means that the solution is defined as a limit of a convergent subsequence of piecewise-linear Euler functions associated with (1). Due to the results of [3], such a solution exists for any initial position (t0, x0) and any admissible pair (u(t, x), v(t, x)), although it is not in general unique. The graphs of such solutions are called the trajectories of (1).

Definition 5. The minimizer feedback strategy u0(·) ∈ Gx is called optimal in the GBC if, for any initial game position (t̄, x̄), t̄ ∈ [t0, tf), x̄ ∈ Rn, and for any u(·) ∈ Gx,

sup_{v(·)∈Gx} Jx(u0(·), v(·)) ≤ sup_{v(·)∈Gx} Jx(u(·), v(·)).   (49)

The value

Jxu = sup_{v(·)∈Gx} Jx(u0(·), v(·))   (50)

is called the upper value in feedback strategies of the GBC.

Definition 6.
The maximizer feedback strategy v0(·) ∈ Gx is called optimal in the GBC if, for any initial game position (t̄, x̄), t̄ ∈ [t0, tf), x̄ ∈ Rn, and for any v(·) ∈ Gx,

inf_{u(·)∈Gx} Jx(u(·), v0(·)) ≥ inf_{u(·)∈Gx} Jx(u(·), v(·)).   (51)

The value

Jxl = inf_{u(·)∈Gx} Jx(u(·), v0(·))   (52)

is called the lower value in feedback strategies of the GBC.

Definition 7. If (u0(·), v0(·)) ∈ Fx and

Jxu = Jxl = Jx0,   (53)

then Jx0 is called the value in feedback strategies of the GBC, and the pair {u0(·), v0(·)} is called the saddle point in feedback strategies of the GBC.

Note that Definitions 4–7 and Remark 2 can be directly reformulated for the RGBC by introducing the respective sets Gz, Fz and using Jz instead of Jx.

4.2 Connection between GBC and RGBC

In this section, we present preliminary arguments showing that, similarly to the open-loop solution, the feedback solution of the GBC can be obtained based on the feedback solution of the RGBC.

First, we convert the original system (1) to a simpler one by the transformation

y = y(t, x) = Φ(tf, t)x + ∫_t^{tf} Φ(tf, τ)f(τ)dτ,   (54)

leading to the differential equation

ẏ = Φ(tf, t)b(t)u + Φ(tf, t)c(t)v.   (55)

Note that y(tf) = lim_{t→tf−0} y(t) = x(tf), and the cost function (5) can be rewritten as

Jy = Jy(u, v) = |dᵀy(tf) + d0|.   (56)

Thus, the GBC is transformed to the auxiliary differential game with bounded controls (AGBC) having the dynamics (55), the control constraints (3), and the cost function (56). Note that for all t ∈ [t0, tf], the transformation (54) is invertible by virtue of Φ⁻¹(tf, t) = Φ(t, tf):

x = x(t, y) = Φ(t, tf)[y − ∫_t^{tf} Φ(tf, τ)f(τ)dτ].   (57)

The latter means that there exists a one-to-one mapping between the sets of admissible pairs of strategies in the GBC and the AGBC: to each admissible pair (u(t, x), v(t, x)) in the GBC, there corresponds the unique admissible pair (u(t, y), v(t, y)) = (u(t, x(t, y)), v(t, x(t, y))) in the AGBC, where x(t, y) is given by (57). Similarly, to each admissible pair (u(t, y), v(t, y)) in the AGBC, there corresponds the unique admissible pair (u(t, x), v(t, x)) = (u(t, y(t, x)), v(t, y(t, x))) in the GBC, where y(t, x) is given by (54). Moreover, the optimal feedback strategies in these games are transformed into each other by this one-to-one mapping. In this sense, the GBC and the AGBC are equivalent.

Consider the AGBC. The target hyperplane in this game is

Dy = {y ∈ Rn | dᵀy + d0 = 0}.   (58)

Let y(t) be the current state of this game. We can construct the hyperplane containing y(t) and parallel to (58):

Y(t) = {y ∈ Rn | dᵀ(y − y(t)) = 0}.   (59)

The distance between the hyperplanes (58) and (59) is

l(t) = |dᵀy(t) + d0| / ‖d‖,   (60)

and, due to (6) and (54),

l(t) = |z(t)| / ‖d‖.   (61)

Note that for all t ∈ [t0, tf], the distance between the hyperplanes Y(t) and Dy coincides with the distance between the hyperplane Dy and the point y(t), regardless of its position on Y(t). Thus, the cost function (56) in the AGBC (the distance between y(tf) and the hyperplane Dy) can be replaced by l(tf) (the distance between the hyperplanes Y(tf) and Dy), regardless of the position of y(tf) on Y(tf). Therefore, instead of the motion of the point y(t), we can consider the motion of the hyperplane Y(t). Since the orientation of Y(t) is determined by the vector d, its motion is fully described by the behavior of the scalar function dᵀy(t) + d0, indicating the current position of Y(t) with respect to Dy. By virtue of (6) and (54), this scalar function is z(t).
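The equivalence argument rests on the invertibility (57) of the transformation (54) via Φ⁻¹(tf, t) = Φ(t, tf). A minimal sketch checking the round trip x → y → x, for a hypothetical double integrator with f ≡ 0 (the matrices A, Φ below are our illustrative choice, not data from the paper):

```python
# Sketch (ours, not the authors' code): transformation (54) and its
# inverse (57) for a hypothetical double integrator A = [[0, 1], [0, 0]],
# f = 0, for which the transition matrix is Phi(ta, tb) = [[1, ta - tb], [0, 1]].

tf = 5.0

def Phi(ta, tb):
    # transition matrix of x' = A x for the double integrator
    return [[1.0, ta - tb], [0.0, 1.0]]

def matvec(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def y_of(t, x):
    # (54) with f = 0: y = Phi(tf, t) x
    return matvec(Phi(tf, t), x)

def x_of(t, y):
    # (57) with f = 0: x = Phi(t, tf) y, using Phi^{-1}(tf, t) = Phi(t, tf)
    return matvec(Phi(t, tf), y)

x, t = [2.0, -1.0], 1.5
assert x_of(t, y_of(t, x)) == x  # round trip recovers the state exactly here
```

The exact recovery reflects the semigroup identity Φ(tf, t)Φ(t, tf) = I, which is what makes the one-to-one mapping between admissible strategy pairs of the GBC and the AGBC well defined.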
Based on these arguments, we can replace the search for the optimal feedback strategies in the AGBC by a search for the optimal feedback strategies in the RGBC.

4.3 Solution of RGBC

In this section, we construct the saddle point of the RGBC in feedback strategies. This construction is based on the saddle point (24)–(25) in open-loop controls. It is important to note that this pair constitutes the saddle point in open-loop controls if and only if the condition (23) is valid. However, in this section, based on (24)–(25), it is shown that the saddle point in feedback strategies exists for any initial position. Due to (27), the pair (24)–(25) can be rewritten as

u∗(t) = −(sign z(tf))(sign h1(t)),   (62)

v∗(t) = (sign z(tf))(sign h2(t)).   (63)

Recall that this saddle point exists only subject to the condition (23). In this subsection, by using (62)–(63), the saddle point of the RGBC in feedback strategies is constructed formally and justified in the entire game space.

4.3.1 Preliminary discussion

Substituting (62) and (63) into (7) yields

ż = (|h2(t)| − |h1(t)|) sign z(tf).   (64)

For any z(tf) ≠ 0, Equation (64) generates the candidate optimal trajectory z∗(t) = z∗(t; u∗(·), v∗(·), z(tf)), t0 ≤ t ≤ tf, which is obtained by integrating this equation from t = tf backward in time. Thus, we have the family of all such trajectories. Note that this family is symmetric with respect to the t-axis. This means that if z(tf) = −zf, where zf > 0, then the respective candidate optimal trajectory is the mirror image, with respect to the t-axis, of the trajectory with z(tf) = zf. All these trajectories cover some set Rr of the (t, z)-plane, which is called in the sequel the regular region.

Let (t̄, z̄) ∈ Rr. Then, as a rule, there exists a unique trajectory z∗(t; u∗(·), v∗(·), z(tf)) such that z∗(t̄; u∗(·), v∗(·), z(tf)) = z̄.
For this position, we define u0(t̄, z̄) = u∗(t̄), v0(t̄, z̄) = v∗(t̄). If (t̄, z̄) lies on a trajectory emanating from z(tf) > 0, then u∗(·), v∗(·) are given by (62)–(63) with sign z(tf) = 1. For z(tf) < 0, they are given by (62)–(63) with sign z(tf) = −1. By varying the position (t̄, z̄) ∈ Rr, we construct the feedback strategies u0(t, z) and v0(t, z), uniquely defined on Rr. Since these feedback strategies generate, by construction, the candidate optimal trajectories, they constitute the candidate optimal feedback strategies in Rr. Note that the points of Rr that lie on more than one candidate optimal trajectory constitute a set of zero Lebesgue measure. For these points, u0(t̄, z̄) and v0(t̄, z̄) can be defined in several different ways. This issue is clarified in the following paragraphs.

If Rr coincides with the entire strip S[t0,tf) = {(t, z) : t0 ≤ t < tf, z ∈ R}, then u0(·), v0(·) are completely defined. If S[t0,tf)\Rr = Rs ≠ ∅, we should also define u0(·) and v0(·) for (t, z) ∈ Rs (in the sequel, Rs is called the singular region). Note that in this case, the boundary of Rr is formed by two curves z = z+(t) and z = z−(t) = −z+(t), symmetric with respect to the t-axis. These curves can be of two types. The curves of the first type are arcs of "extreme" candidate optimal trajectories and, by definition, they belong to Rr. The second type is obtained as limits of the trajectories z∗(t; u∗(·), v∗(·), z(tf)) for z(tf) → +0 and z(tf) → −0, respectively. For the positions on these curves, we define u0(·) and v0(·) as follows. If (t̄, z̄) lies on the curve z = z+(t), then u0(t̄, z̄) = −sign h1(t̄), v0(t̄, z̄) = sign h2(t̄). For (t̄, z̄) on the curve z = z−(t), u0(t̄, z̄) = sign h1(t̄), v0(t̄, z̄) = −sign h2(t̄). Thus, u0(t, z) and v0(t, z) are uniquely defined on these curves.
This argument allows the inclusion of all the points of the second-type curves z = z+(t) and z = z−(t) into the regular region Rr. This makes Rr a closed set in the (t, z)-plane, while Rs becomes an open one. As to the values of u0(t, z), v0(t, z) in the singular region Rs, it will be shown that they can be chosen arbitrarily subject to Definition 4.

Since the regular region Rr consists of the trajectories of the differential equation (64), its structure is completely defined by the determining function

R(t) = |h2(t)| − |h1(t)|.   (65)

Different cases of the behavior of the determining function and the respective structure of Rr are analyzed in the following subsection. Based on this structure, the feedback strategies u0(·) and v0(·) are formally derived.

4.3.2 Formal feedback construction

We assume that the determining function R(t) has a finite number of distinct zeros on (t0, tf). Moreover, it changes its sign NR times on (t0, tf). Let ts1 < ts2 < ··· < tsNR (ts1 > t0, tsNR < tf) be all the zeros of R(t) at which its sign changes. Consider the following cases of the behavior of the function R(t).

Case 0. NR = 0.

Case 0.1. R(t) ≥ 0 for all t ∈ (t0, tf). If z(tf) > 0, the right-hand side of (64) is nonnegative. Hence, the candidate optimal trajectory slopes down monotonically from z(tf) to zero as t varies from tf backward. Similarly, if z(tf) < 0, this trajectory slopes up monotonically from z(tf) to zero. Note that the candidate optimal trajectory cannot be extended beyond z = 0, because on its continuation the respective maximizer control is no longer a candidate optimal one [see Theorem 1 and Theorem 3, case (33)]. This phenomenon is illustrated by Figure 7, in which a trajectory AC of (64) for h1(t) = t, h2(t) = 2t, t0 = 0.5, tf = 5 is depicted. It starts from the point (tf = 5, z(tf) = 10) in backward time.
This trajectory is generated by the pair of open-loop controls u∗(t) = −sign h1(t) ≡ −1, v∗(t) = sign h2(t) ≡ 1. Since R(t) > 0, the condition (31) is not valid. Hence, due to Theorems 1 and 3, the maximizer control v∗(t) = sign h2(t) cannot be a candidate optimal one on the arc BC of this trajectory, because along this arc, z < 0.

Fig. 7 Candidate optimal trajectory.

Fig. 8 Family of candidate optimal trajectories in Case 0.1.

Thus, the family of candidate optimal trajectories (FCOT) has the form presented in Figure 8. It is seen that the FCOT completely covers the strip S[t0,tf), i.e. Rr = S[t0,tf) and Rs = ∅. Any point (t̄, z̄) ∈ Rr for which z̄ ≠ 0 lies on the unique candidate optimal trajectory z∗(t; u∗(·), v∗(·), z(tf)). Moreover, the sign of z̄ coincides with the sign of z(tf). This observation allows the following definition of the candidate optimal feedback strategies for t ∈ [t0, tf), z ≠ 0:

u0(t, z) = −sign z sign h1(t),   (66)

v0(t, z) = sign z sign h2(t).   (67)

Remark 3. The pair (66)–(67) of candidate optimal feedbacks generates two symmetric trajectories (Krasovskii's constructive motions) of (7), starting from the points (t̄, 0), independently of the values of u0 and v0 for z = 0. In the sequel, for the sake of definiteness, it is assumed that

u0(t, z) = −sign h1(t), z = 0,   (68)

v0(t, z) = sign h2(t), z = 0.   (69)

Thus, the candidate optimal feedback strategies are formally designed in this case.

Case 0.2. R(t) ≤ 0 for all t ∈ (t0, tf). In this case, in contrast with Case 0.1, the candidate optimal trajectories slope up monotonically for z(tf) > 0 and slope down monotonically for z(tf) < 0.

Fig. 9 Family of candidate optimal trajectories in Case 0.2.

The family of these trajectories is depicted in Figure 9.
In this figure, the curves z = z+(t) and z = z−(t) are the limits of the candidate optimal trajectories for z(tf) → +0 and z(tf) → −0, respectively:

z+(t) = ∫_{tf}^{t} R(ξ)dξ, t ∈ [t0, tf],   (70)

z−(t) = −z+(t), t ∈ [t0, tf].   (71)

Note that the pair (70)–(71) constitutes the Krasovskii's constructive motions of (7) generated by (66)–(69) from (tf, 0) in backward time.

The FCOT covers the part of S[t0,tf) above the curve z = z+(t) and below the curve z = z−(t), while between these curves there are no candidate optimal trajectories. Thus,

Rr = {(t, z) : z ≥ z+(t) or z ≤ z−(t), t ∈ [t0, tf)},   (72)

Rs = {(t, z) : z−(t) < z < z+(t), t ∈ [t0, tf)}.   (73)

Based on the preliminary discussion, the candidate optimal feedback strategies in the regular region Rr are defined by (66)–(67). We next proceed to defining the candidate optimal feedback strategies in the singular region Rs. For any trajectory starting in Rs and generated by an admissible pair (u(·), v(·)), there are only two possibilities for its behavior. First, it can remain in Rs until t = tf. This possibility yields z(tf) = 0 and, consequently, the zero game outcome. Second, it can reach the boundary of Rs (z = z+(t) or z = z−(t)) at some t = t̄ < tf. For the sake of definiteness, assume that the trajectory reaches the upper boundary z = z+(t). If, from the point (t̄, z+(t̄)), the game is governed by the pair (66)–(67), then, due to (70), the trajectory coincides with the arc z = z+(t), t̄ ≤ t ≤ tf. This possibility also yields z(tf) = 0 and the zero game outcome. Therefore, the candidate optimal feedback strategies in Rs can be chosen arbitrarily, subject to Definition 4.

Case 1. NR = 1.

Fig. 10 Family of candidate optimal trajectories in Case 1.1.

Case 1.1. R(t) ≥ 0 for all t ∈ (ts1, tf). In this case, there are two types of candidate optimal trajectories (see Figure 10).
The first type is the trajectories existing on [t̄, tf] for t̄ > ts1. These trajectories behave similarly to Case 0.1, i.e. they slope down monotonically for z(tf) > 0 as t varies from tf backwards. The trajectories of the second type exist on the entire interval [t0, tf]. For z(tf) > 0, these trajectories slope down monotonically on [ts1, tf] and slope up monotonically on [t0, ts1] as t varies backwards. In Figure 10, the curves z = z+(t) and z = z−(t) are the arcs of the candidate optimal trajectories tangent to the t-axis at the point (ts1, 0):

z+(t) = ∫_{ts1}^{t} R(ξ)dξ, t ∈ [t0, ts1],  (74)
z−(t) = −z+(t), t ∈ [t0, ts1].

The family of the candidate optimal trajectories covers the strip S[t0,tf), except for the singular region

Rs = {(t, z) : z−(t) < z < z+(t), t ∈ [t0, ts1)}.  (75)

Hence, the regular region is

Rr = Rr1 ∪ Rr2,  (76)

where

Rr1 = S[ts1,tf) = {(t, z) : t ∈ [ts1, tf), z ∈ ℝ},  (77)
Rr2 = {(t, z) : z ≥ z+(t) or z ≤ z−(t), t ∈ [t0, ts1)}.  (78)

Remark 4. In this case, the strip S[t0,tf) can be decomposed into two strips S[ts1,tf) and S[t0,ts1). If we formally set t0 = ts1, the function R(t) becomes non-negative on the new interval (t0, tf) = (ts1, tf), and we are in the conditions of Case 0.1. Therefore, in the new strip S[t0,tf) = S[ts1,tf) the FCOT is constructed according to Case 0.1 (see Figure 11a). Note that the end points of these trajectories fill the entire line t = ts1. Furthermore, if we formally set tf = ts1, the function R(t) becomes non-positive on the new interval (t0, tf) = (t0, ts1), and we are in the conditions of Case 0.2. Therefore, in the new strip S[t0,tf) = S[t0,ts1) the FCOT is constructed according to Case 0.2 (see Figure 11b). By joining Figures 11a and 11b along the line t = ts1, we obtain the FCOT in the entire strip S[t0,tf) (compare with Figure 10).
Due to Remark 4, the candidate optimal feedback strategies for (t, z) ∈ S[ts1,tf) are constructed as in Case 0.1, while for (t, z) ∈ S[t0,ts1) they are constructed as in Case 0.2. This results in the following. For (t, z) ∈ Rr, except on the segment [ts1, tf) of the t-axis, they are given by (66)–(67), while on this segment they are given by (68)–(69). For (t, z) ∈ Rs, the candidate optimal feedback strategies can be chosen arbitrarily, subject to Definition 4.

Fig. 11. Decomposition of the FCOT in Case 1.1.
Fig. 12. Family of candidate optimal trajectories in Case 1.2.1.

Case 1.2. R(t) ≤ 0 for all t ∈ (ts1, tf). In this case, for z(tf) > 0, the candidate optimal trajectories slope up monotonically on [ts1, tf] and slope down monotonically on [t0, ts1] as t varies backwards. The FCOT is either of the form shown in Figure 12 or of the form shown in Figure 13. In both figures, the curves z = z+(t) and z = z−(t) are the limits of the candidate optimal trajectories for z(tf) → +0 and z(tf) → −0, respectively. Figure 12 shows the case (Case 1.2.1) in which there exists tin ∈ [t0, ts1) satisfying

∫_{tin}^{tf} R(ξ)dξ = 0.  (79)

The case of nonexistence of such tin (Case 1.2.2) is shown in Figure 13. Let t̃ = tin in Case 1.2.1, and t̃ = t0 in Case 1.2.2. Then

z+(t) = ∫_{tf}^{t} R(ξ)dξ, t ∈ [t̃, tf],  (80)
z−(t) = −z+(t).

Fig. 13. Family of candidate optimal trajectories in Case 1.2.2.

Due to Figures 12 and 13, the singular region Rs is

Rs = {(t, z) : z−(t) < z < z+(t), t ∈ [t̃, tf)}.  (81)

In Case 1.2.1, the regular region Rr is

Rr = S[t0,tin) ∪ {(t, z) : z ≥ z+(t) or z ≤ z−(t), t ∈ [tin, tf)},  (82)

while in Case 1.2.2 the regular region Rr is given by (72).

Remark 5. Similarly to Case 1.1, the FCOT can be constructed by a decomposition of the strip S[t0,tf).
Namely, if we formally set t0 = ts1, the function R(t) becomes non-positive on the new interval (t0, tf) = (ts1, tf), and we are in the conditions of Case 0.2. Therefore, in the new strip S[t0,tf) = S[ts1,tf) the FCOT is constructed according to Case 0.2 (see Figures 14a and 15a). However, in contrast to Case 1.1, the end points of these trajectories fill only two half-lines {t = ts1, z ≥ z+(ts1)} and {t = ts1, z ≤ z−(ts1)}. Furthermore, if we formally set tf = ts1, the function R(t) becomes non-negative on the new interval (t0, tf) = (t0, ts1), i.e. we fall into the conditions of Case 0.1. In order to provide a proper matching of the FCOT in the strips S[t0,ts1) and S[ts1,tf), we should use this case in a reduced form, i.e. only for initial values z(t)|t=tf=ts1 in the two half-lines {t = ts1, z ≥ z+(ts1)} and {t = ts1, z ≤ z−(ts1)}. Such a structure of the set of initial positions yields the singular region Rs (see Figures 14b and 15b), although the "pure" Case 0.1 lacks a singular region. The trajectory emanating from (ts1, z+(ts1)) can either reach the t-axis (tin satisfying (79) exists, see Figure 14b) or not (tin does not exist, see Figure 15b). By joining the two parts of Figure 14, as well as the two parts of Figure 15, along the line t = ts1, we obtain the FCOT in the entire strip S[t0,tf) for Cases 1.2.1 and 1.2.2, respectively (compare with Figures 12 and 13).

Fig. 14. Decomposition of the FCOT in Case 1.2.1.
Fig. 15. Decomposition of the FCOT in Case 1.2.2.
Fig. 16. Family of candidate optimal trajectories in Case 2.1.1.
Fig. 17. Family of candidate optimal trajectories in Case 2.1.2.

Based on Remark 5, for (t, z) ∈ Rs, the candidate optimal feedback strategies can be chosen arbitrarily, subject to Definition 4.
For (t, z) ∈ Rr, except on the segment [t0, tin) of the t-axis in Case 1.2.1, the candidate optimal feedback strategies are given by (66)–(67), while on this segment they are given by (68)–(69).

Case 2. NR = 2.

Case 2.1. R(t) ≥ 0 for all t ∈ (ts2, tf). In this case, similarly to Case 1.2, the family of candidate optimal trajectories can be of two types, depending on the existence of tin ∈ [t0, ts1) satisfying (79). If such tin exists, the family has the form shown in Figure 16 (Case 2.1.1). If such tin does not exist, the family of candidate optimal trajectories has the form shown in Figure 17 (Case 2.1.2).

Fig. 18. Decomposition of the FCOT in Case 2.1.1.
Fig. 19. Decomposition of the FCOT in Case 2.1.2.

The curves z = z+(t) and z = z−(t) are the arcs of the candidate optimal trajectories tangent to the t-axis at the point (ts2, 0). The curve z = z+(t) is given by (80), where t̃ = tin in Case 2.1.1 and t̃ = t0 in Case 2.1.2, while tf is replaced by ts2; z−(t) = −z+(t).

Remark 6. In this case (two sign changes of R(t)), the FCOT can also be constructed similarly to Case 1, by the decomposition of the strip S[t0,tf) presented in Figures 18 and 19. It is seen that in the strip S[ts1,tf) we have Case 1.1 (see Figures 18a and 19a). In the strip S[t0,ts1) we have Case 0.1, with initial positions z(t)|t=tf=ts1 in the two half-lines {t = ts1, z ≥ z+(ts1)} and {t = ts1, z ≤ z−(ts1)} (see Figures 18b and 19b). Joining the two parts of Figure 18, as well as the two parts of Figure 19, along the line t = ts1 yields the FCOT in the entire strip S[t0,tf) for Cases 2.1.1 and 2.1.2, respectively (compare with Figures 16 and 17).

Fig. 20. Family of candidate optimal trajectories in Case 2.2.1.

By using Remark 6, the candidate optimal feedback strategies for (t, z) ∈ S[ts1,tf) are constructed as in Case 1.1, while for (t, z) ∈ S[t0,ts1) they are constructed as in Case 0.1.
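The switch time tin defined by (79) can be found by bisection on F(s) = ∫_s^{tf} R(ξ)dξ. The sketch below is an illustration only: it uses a hypothetical determining function R(t) = 1 − t on [0, 1.5] (Case 1.2 shape, with R ≥ 0 before ts1 = 1 and R ≤ 0 after), for which (79) gives tin = 0.5 in closed form.

```python
def integral(R, a, b, n=4000):
    """Trapezoid-rule integral of R over [a, b]."""
    h = (b - a) / n
    s = 0.5 * (R(a) + R(b)) + sum(R(a + k * h) for k in range(1, n))
    return s * h

def find_tin(R, t0, ts1, tf, tol=1e-8):
    """Solve ∫_{tin}^{tf} R(ξ)dξ = 0 for tin in [t0, ts1] by bisection (Eq. (79)).

    Assumes F(s) = ∫_s^{tf} R changes sign on [t0, ts1]; returns None otherwise
    (the nonexistence situation of Case 1.2.2).
    """
    F = lambda s: integral(R, s, tf)
    lo, hi = t0, ts1
    if F(lo) * F(hi) > 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(lo) * F(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical data with a known answer: tin = 0.5.
R = lambda t: 1.0 - t
tin = find_tin(R, 0.0, 1.0, 1.5)
```

When F does not change sign on [t0, ts1], the function returns None, which corresponds to the case where tin does not exist and t̃ = t0 is taken instead.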
Case 2.2. R(t) ≤ 0 for all t ∈ (ts2, tf). In this case, the FCOT can be of three types, depending on the existence and placement of tin ∈ [t0, ts2) satisfying (79). If such tin ∈ (ts1, ts2) exists, the family has the form shown in Figure 20 (Case 2.2.1). It is seen that the singular region Rs is a disconnected set in the (t, z)-plane:

Rs = Rs1 ∪ Rs2, R̄s1 ∩ R̄s2 = ∅,  (83)

where R̄si denotes the closure of Rsi, i = 1, 2, and

Rs1 = {(t, z) : z1−(t) < z < z1+(t), t ∈ [tin, tf)},  (84)
Rs2 = {(t, z) : z2−(t) < z < z2+(t), t ∈ [t0, ts1)},  (85)
z1+(t) = ∫_{tf}^{t} R(ξ)dξ, t ∈ [tin, tf],  (86)
z2+(t) = ∫_{ts1}^{t} R(ξ)dξ, t ∈ [t0, ts1],  (87)

z1−(t) = −z1+(t), z2−(t) = −z2+(t). Note that the boundary of the singular region consists of curves of two types. Namely, the curves z = z1+(t) and z = z1−(t) are the limits of the candidate optimal trajectories for z(tf) → +0 and z(tf) → −0, respectively. The curves z = z2+(t) and z = z2−(t) are the arcs of the candidate optimal trajectories tangent to the t-axis at the point (ts1, 0).

Fig. 21. Family of candidate optimal trajectories in Case 2.2.2.
Fig. 22. Family of candidate optimal trajectories in Case 2.2.3.

If such tin exists and tin = ts1, the family of candidate optimal trajectories has the form shown in Figure 21 (Case 2.2.2). In this case, similarly to Case 2.2.1, Rs = Rs1 ∪ Rs2. However, in contrast to Case 2.2.1, the intersection of their closures is R̄s1 ∩ R̄s2 = {(ts1, 0)} ≠ ∅. Moreover, the curves z = z+(t) and z = z−(t) are the limits of the candidate optimal trajectories for z(tf) → +0 and z(tf) → −0, respectively, and simultaneously they are tangent to the t-axis at the point (ts1, 0). In the last subcase (Case 2.2.3, see Figure 22), tin does not exist.
In this case, similarly to Case 2.2.2, the curves z = z+(t) and z = z−(t) are the limits of the candidate optimal trajectories for z(tf) → +0 and z(tf) → −0, respectively, but they are not tangent to the t-axis. In contrast to Cases 2.2.1 and 2.2.2, the singular region Rs is simply connected.

Remark 7. Figures 20–22 clearly show that, similarly to Case 2.1, the FCOT can be constructed by the decomposition of the strip S[t0,tf) into the two strips S[t0,ts1) and S[ts1,tf). Then, in the strip S[t0,ts1) it is constructed by Case 0.2, while in the strip S[ts1,tf) it is constructed by Case 1.2. More precisely, in Cases 2.2.1 and 2.2.2, the FCOT in S[ts1,tf) is constructed according to Case 1.2.1, while in Case 2.2.3 it is according to Case 1.2.2. In the strip S[t0,ts1), the pure Case 0.2 is realized for Cases 2.2.1 and 2.2.2. However, for Case 2.2.3, the initial positions of Case 0.2 are chosen from the two half-lines {t = ts1, z ≥ z+(ts1)} and {t = ts1, z ≤ z−(ts1)}. By using Remark 7, the candidate optimal feedback strategies for (t, z) ∈ S[ts1,tf) are constructed as in Case 1.2, while for (t, z) ∈ S[t0,ts1) they are constructed as in Case 0.2.

Case k. NR = k (k ≥ 1).

Case k.1. R(t) ≥ 0 for all t ∈ (tsk, tf). Based on the previous cases and Remarks 4–7, the FCOT can be constructed by the decomposition of the strip S[t0,tf) into the two strips S[t0,ts1) and S[ts1,tf). In the strip S[ts1,tf), the FCOT is constructed by Case (k − 1).1. Let E1 be the set of all end points of the trajectories of this FCOT. Note that E1 can be either the entire line t = ts1, or two half-lines belonging to t = ts1 and symmetric with respect to the t-axis. In the strip S[t0,ts1), the FCOT is constructed by Case 0.1 (for even k) or Case 0.2 (for odd k), with the initial positions in E1.

Case k.2. R(t) ≤ 0 for all t ∈ (tsk, tf). In the strip S[ts1,tf), the FCOT is constructed by Case (k − 1).2.
Let E2 be the set of all end points of the trajectories of this FCOT. The set E2 has a structure similar to that of E1 in Case k.1. In the strip S[t0,ts1), the FCOT is constructed by Case 0.2 (for even k) or Case 0.1 (for odd k), with the initial positions in E2. The candidate optimal feedback strategies in Cases k.1 and k.2 are obtained in accordance with the construction of the FCOT described in the preceding paragraphs. We illustrate Case k by the following examples.

Example 3. Let h1(t) = 0.1t + 22, h2(t) = 10(t³/3 − 3t² + 8t − 3.3) in the interval [t0, tf] = [0.5, 5]. In Figure 23, the graph of the determining function R(t), given by (65), is shown. It is seen that in the interval [0.5, 5] this function changes its sign three times, i.e. NR = 3. In this example, ts1 = 1.0616, ts2 = 3.5088, ts3 = 4.4296, and R(t) > 0 for t ∈ (ts3, tf). Thus, we are in Case 3.1.

Fig. 23. Graph of R(t) in Example 3.
Fig. 24. Decomposition of the FCOT in Example 3 (Case 3.1).

In Figure 24a, the construction of the FCOT in the strip S[ts1,tf) by Case 2.1 is presented (compare with Figure 16). Here, E1 = {(t, z) : t = 1.0616, z ∈ ℝ}. In Figure 24b, the construction of the FCOT in the strip S[t0,ts1) by the "pure" Case 0.2 is shown (compare with Figure 9). In Figure 25, the FCOT in the entire strip S[t0,tf), obtained by joining Figures 24a and 24b along the line t = ts1, is depicted.

Fig. 25. Family of candidate optimal trajectories in Example 3 (Case 3.1).

It is seen that the singular region Rs
consists of two disconnected sets Rs1 and Rs2, where

Rs1 = {(t, z) : z1−(t) < z < z1+(t), t ∈ [tin, ts3)},
Rs2 = {(t, z) : z2−(t) < z < z2+(t), t ∈ [t0, ts1)},
z1+(t) = ∫_{ts3}^{t} R(ξ)dξ = 0.8333t⁴ − 10t³ + 39.95t² − 55t + 8.0711, t ∈ [tin, ts3],
z2+(t) = ∫_{ts1}^{t} R(ξ)dξ = 0.8333t⁴ − 10t³ + 39.95t² − 55t + 24.2703, t ∈ [t0, ts1],
z1−(t) = −z1+(t), z2−(t) = −z2+(t),

and tin ∈ (ts1, ts2) satisfies Equation (79). Based on this structure of the FCOT, the candidate optimal feedback strategies are designed as follows. For (t, z) ∈ Rs1 ∪ Rs2, the candidate optimal feedback strategies can be chosen arbitrarily, subject to Definition 4. For (t, z) ∈ Rr, except on the segments [ts1, tin) and [ts3, tf) of the t-axis, the candidate optimal feedback strategies are given by (66)–(67), while on these segments they are given by (68)–(69).

Example 4. In this example, the functions h1(t) and h2(t) of the previous example switch their roles, i.e. h1(t) = 10(t³/3 − 3t² + 8t − 3.3), h2(t) = 0.1t + 22 in the interval [t0, tf] = [0.5, 5]. In Figure 26, the graph of the determining function R(t) is shown. The number of sign changes (NR = 3) and the respective zeros of R(t) are the same as in the previous example. However, in contrast with Example 3, R(t) < 0 for t ∈ (ts3, tf). Thus, we are in Case 3.2.

Fig. 26. Graph of R(t) in Example 4.
Fig. 27. Decomposition of the FCOT in Example 4 (Case 3.2).

In Figure 27a, the construction of the FCOT in the strip S[ts1,tf) by Case 2.2 is presented (compare with Figure 22). Here, E2 = {(t, z) : t = 1.0616, z ∈ [18.8515, ∞) ∪ (−∞, −18.8515]}. In Figure 27b, the construction of the FCOT in the strip S[t0,ts1) by Case 0.2 with the initial positions in E2 is shown. In Figure 28, the FCOT in the entire strip S[t0,tf), obtained by joining Figures 27a and 27b along the line t = ts1, is depicted.
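The sign pattern of the determining function in Examples 3 and 4 can be checked numerically. The sketch below assumes, consistently with the examples, that R(t) = |h2(t)| − |h1(t)|; since Example 4 merely swaps h1 and h2, its determining function is the negative of Example 3's, so both share the zeros ts1, ts2, ts3.

```python
def make_R(h1, h2):
    # Determining function, under the assumption R = |h2| - |h1| (cf. (65)).
    return lambda t: abs(h2(t)) - abs(h1(t))

def sign_changes(R, t0, tf, n=5000):
    """Zeros of R located by a sign scan over a grid plus bisection refinement."""
    zeros = []
    h = (tf - t0) / n
    prev_t, prev_v = t0, R(t0)
    for k in range(1, n + 1):
        t, v = t0 + k * h, R(t0 + k * h)
        if prev_v * v < 0:          # sign change detected in (prev_t, t)
            lo, hi = prev_t, t
            for _ in range(60):     # bisection refinement
                mid = 0.5 * (lo + hi)
                if R(lo) * R(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
        prev_t, prev_v = t, v
    return zeros

# Example 3 data from the text.
h1 = lambda t: 0.1 * t + 22.0
h2 = lambda t: 10.0 * (t**3 / 3 - 3 * t**2 + 8 * t - 3.3)
zeros = sign_changes(make_R(h1, h2), 0.5, 5.0)
```

The scan recovers NR = 3 sign changes, at points agreeing with the values ts1 = 1.0616, ts2 = 3.5088, ts3 = 4.4296 quoted in the examples.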
It is seen that the singular region Rs is simply connected and given by (73), where

z+(t) = −0.8333t⁴ + 10t³ − 39.95t² + 55t − 5.4167, z−(t) = −z+(t).

Fig. 28. Family of candidate optimal trajectories in Example 4 (Case 3.2).

By using this structure of the FCOT, we derive the candidate optimal feedback strategies. For (t, z) ∈ Rs, the candidate optimal feedback strategies can be chosen arbitrarily, subject to Definition 4. For (t, z) ∈ Rr, the candidate optimal feedback strategies are given by (66)–(67).

Remark 8. Let T⁻ = {ts,i1 < ts,i2 < · · · < ts,iK}, where ts,ij, j = 1, …, K, are the zeros of R(t) at which it changes sign from negative to positive in forward time. Let

T = T⁻ ∪ Tf,  (88)

where

Tf = {tf}, if R(ts,NR − 0) > 0 and R(ts,NR + 0) < 0; Tf = ∅, otherwise,  (89)

and let K̄ be the number of elements of the set T. It is clear that K̄ = K + 1 if Tf ≠ ∅, and K̄ = K if Tf = ∅. We redenote the elements of T as T = {t̂1 > t̂2 > · · · > t̂K̄}. From the formal recursive construction of the FCOT presented in the preceding paragraphs, we can derive the following algorithm for constructing the singular region Rs.

Step 1. Construct the set

Rs1 = {(t, z) : z1−(t) < z < z1+(t), t ∈ (t̃1, t̂1)},  (90)

where z1+(t) = ∫_{t̂1}^{t} R(ξ)dξ, z1−(t) = −z1+(t), t ∈ (t̃1, t̂1), and either t̃1 = t0, if ∫_{t̂1}^{t} R(ξ)dξ > 0 for all t ∈ (t0, t̂1), or t̃1 is the maximal zero of ∫_{t̂1}^{t} R(ξ)dξ in the interval (t0, t̂1). If t̃1 = t0, then Rs = Rs1.

Step 2. Let t̃1 > t0. Choose

t̂l2 = max{t̂i ∈ T : t̂i < t̃1},  (91)

and construct the set Rs2 similarly to (90), with t̂1 replaced by t̂l2. This construction yields the interval (t̃2, t̂l2) and the respective functions z2+(t) and z2−(t) on this interval. If t̃2 = t0, then Rs = Rs1 ∪ Rs2.

Step M + 1. Let the sets Rs1, Rs2, …, RsM (M < K̄) be constructed, and let t̃M > t0.
Then choose t̂l(M+1) = max{t̂i ∈ T : t̂i < t̃M} and construct the set Rs,M+1 similarly to the set RsM.

Since the set T is finite, this algorithm consists of a finite number of steps Ks ≤ K̄. Finally,

Rs = ∪_{i=1}^{Ks} Rsi,  (92)

and

Rsi ∩ Rsj = ∅, i, j ∈ {1, …, Ks}, i ≠ j.  (93)

We call the points t̂l1 = t̂1, t̂l2, …, t̂lKs the active elements of the set T. Examples 3 and 4 provide a good illustration of this algorithm. In Example 3, T = {t̂1, t̂2}, where t̂1 = ts3, t̂2 = ts1, i.e. K̄ = 2. It is seen from Figure 25 that both t̂1 and t̂2 are active elements of T, yielding Ks = 2 and Rs = Rs1 ∪ Rs2. In Example 4, T = {t̂1, t̂2}, where t̂1 = tf, t̂2 = ts2, i.e. K̄ = 2. It is seen from Figure 28 that only t̂1 is an active element of T, yielding Ks = 1 and Rs = Rs1.

4.3.3 Justification of the RGBC solution

Theorem 5. The RGBC has the saddle point (u0(·), v0(·)) in the feedback strategies. For (t, z) ∈ Rs, the pair (u0(·), v0(·)) is arbitrary admissible. For (t, z) ∈ Rr, z ≠ 0, u0(·) and v0(·) are given by (66) and (67), respectively. For (t, z) ∈ Rr, z = 0, u0(·) and v0(·) are given by (68) and (69), respectively. The game value in feedback strategies is

Jz0 = Jz0(t̄, z̄) = |z̄| + ∫_{t̄}^{tf} R(ξ)dξ, (t̄, z̄) ∈ Rr;  Jz0 = ∫_{t̂li}^{tf} R(ξ)dξ, (t̄, z̄) ∈ Rsi, i = 1, …, Ks.  (94)

The proof of the theorem is presented in the Appendix.

4.4 GBC solution

The strategies (66) and (67) generate the following feedback strategies in the GBC:

u0(t, x) = −sign h1(t) sign z(t, x),  (95)
v0(t, x) = sign h2(t) sign z(t, x),  (96)

where z(t, x) is given by (6).

Theorem 6. The GBC has the saddle point (u0(·), v0(·)) in the feedback strategies. For all (t, x) such that (t, z(t, x)) ∈ Rs, the pair (u0(·), v0(·)) is arbitrary admissible. For all (t, x) such that (t, z(t, x)) ∈ Rr and z(t, x) ≠ 0, u0(·) and v0(·) are given by (95) and (96), respectively.
For all (t, x) such that (t, z(t, x)) ∈ Rr and z(t, x) = 0, u0(·) and v0(·) are given by (68) and (69), respectively. The game value in feedback strategies is

Jx0 = Jx0(t̄, x̄) = |z(t̄, x̄)| + ∫_{t̄}^{tf} R(ξ)dξ, (t̄, z(t̄, x̄)) ∈ Rr;  Jx0 = ∫_{t̂li}^{tf} R(ξ)dξ, (t̄, z(t̄, x̄)) ∈ Rsi, i = 1, …, Ks.  (97)

The proof of the theorem is presented in the Appendix.

Remark 9. In Theorem 6, the sets Rsi (the components of the singular region Rs) are constructed according to the iterative algorithm given above. The boundary of each set Rsi is a semipermeable curve [1]. Note that the boundary of the resulting set Rs is, in general, not a single trajectory of (64), but the union of separated segments of such trajectories.

Remark 10. Theorems 5 and 6 imply that the GBC and the RGBC are equivalent with respect to the saddle point and the game value, despite the fact that the scalarizing transformation (6) is not a bijection.

5 Concluding Remarks

1. In this paper, the zero-sum differential game with n-dimensional time-varying linear dynamics was considered. The duration of the game is prescribed. The scalar controls are subject to geometric constraints. The cost function is the distance between the terminal system state and a given hyperplane.

2. Two types of solution of this game were obtained: in the classes of open-loop and feedback strategies of the players. Both types of solution are based on the scalarizing transformation of the original differential game, yielding a new scalar game.

3. For the open-loop solution, the optimal controls of the players were obtained. The lower and upper values of the game were derived. A necessary and sufficient condition for the existence of a saddle point was established.

4. For the feedback solution, it was proved that a saddle point always exists. The solution yields the decomposition of the scalar game space into two regions, the regular one and the singular one.
In the regular region, the optimal strategies have a bang-bang structure and the value of the game is nonzero, depending on the initial conditions. In the singular region, which consists of a finite number of nonintersecting parts, the optimal strategies are arbitrary admissible and the value of the game is constant in each part. The decomposition structure is based on the behavior of the determining function R(t). A general recursive (in the number of sign changes of R(t)) decomposition procedure has been proposed.

Acknowledgment

The authors express their gratitude to the anonymous reviewer for the very helpful remarks on the paper. This research was partially supported by the Israel Scientific Foundation, grant No. 2005241.

Appendix: Proofs

A.1 Proof of Theorem 1

Begin with the case

z0 ≥ ∫_{t0}^{tf} |h1(t)|dt,  (A.1)

i.e. sign z0 = 1. First, we show that u∗(t) = −sign h1(t) and v∗(t) = sign h2(t) satisfy the definitions of the optimal strategies in the RGBC, i.e.

sup_{v(·)∈C} Jz(u∗(·), v(·)) ≤ sup_{v(·)∈C} Jz(u(·), v(·)) ∀ u(·) ∈ C,  (A.2)

and

inf_{u(·)∈C} Jz(u(·), v∗(·)) ≥ inf_{u(·)∈C} Jz(u(·), v(·)) ∀ v(·) ∈ C.  (A.3)

Begin with u∗(t) = −sign h1(t). By solving the initial value problem (7)–(8) with u = u∗(t) and any v(·) ∈ C, we obtain

Jz(u∗(·), v(·)) = |z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} h2(t)v(t)dt|.  (A.4)

Due to (A.1), the supremum of (A.4) with respect to v(·) ∈ C is attained at v(t) = v∗(t), and

sup_{v(·)∈C} Jz(u∗(·), v(·)) = z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.5)

Now, for any u(·) ∈ C,

Jz(u(·), v(·)) = |z0 + ∫_{t0}^{tf} h1(t)u(t)dt + ∫_{t0}^{tf} h2(t)v(t)dt|.  (A.6)

Note that, since u(·) ∈ C,

z0 + ∫_{t0}^{tf} h1(t)u(t)dt ≥ z0 − ∫_{t0}^{tf} |h1(t)|dt ≥ 0.  (A.7)

Hence, the supremum of (A.6) with respect to v(·) ∈ C also is attained at v(t) = v∗(t), and

sup_{v(·)∈C} Jz(u(·), v(·)) = z0 + ∫_{t0}^{tf} h1(t)u(t)dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.8)
By virtue of (A.7), Equations (A.5) and (A.8) yield the inequality (A.2), meaning that u∗(·) is an optimal open-loop minimizer control of the RGBC in the case (A.1). Moreover, in this case, the upper value of the RGBC is

Jzu = Jz(u∗(·), v∗(·)) = z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.9)

Proceed to v∗(t) = sign h2(t). Similarly to (A.4),

Jz(u(·), v∗(·)) = |z0 + ∫_{t0}^{tf} h1(t)u(t)dt + ∫_{t0}^{tf} |h2(t)|dt|.  (A.10)

Due to (A.7), the infimum of (A.10) with respect to u(·) ∈ C is attained at u(t) = u∗(t), and

inf_{u(·)∈C} Jz(u(·), v∗(·)) = z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.11)

In order to calculate inf_{u(·)∈C} Jz(u(·), v(·)) for arbitrary v(·) ∈ C, we introduce the subclass Cu ⊂ C:

Cu = {u(·) ∈ C | u(t) = κ sign h1(t), t ∈ [t0, tf], κ ∈ [−1, 1]}.  (A.12)

Now, we replace the minimization of Jz(u(·), v(·)) in the class C by its minimization in the subclass Cu. Let an arbitrary v(·) ∈ C be fixed. Then for u(·) ∈ Cu, the functional Jz(u(·), v(·)) becomes a function of κ:

Jz(u(·), v(·)) = |a + bκ|,  (A.13)

where

a = z0 + ∫_{t0}^{tf} h2(t)v(t)dt,  (A.14)
b = ∫_{t0}^{tf} |h1(t)|dt > 0.  (A.15)

Fig. A1. Minimization of Jz over Cu.

Thus, the minimization of Jz(u(·), v(·)) over Cu is transformed into its minimization over κ ∈ [−1, 1]. For the latter, three cases can be distinguished (see Figure A1 for a graphical illustration).

(i) If |a/b| ≤ 1, then the infimum of (A.13) with respect to κ ∈ [−1, 1] is attained at κ∗ = −a/b, and

inf_{u(·)∈Cu} Jz(u(·), v(·)) = 0.  (A.16)

(ii) If a/b < −1, then the infimum of (A.13) with respect to κ ∈ [−1, 1] is attained at κ∗ = 1, and

inf_{u(·)∈Cu} Jz(u(·), v(·)) = |a + b| = −(a + b).  (A.17)

(iii) If a/b > 1, then the infimum of (A.13) with respect to κ ∈ [−1, 1] is attained at κ∗ = −1, and

inf_{u(·)∈Cu} Jz(u(·), v(·)) = |a − b| = a − b.  (A.18)

Since Jz(u(·), v(·)) ≥ 0 for all u(·), v(·) ∈ C, the equality (A.16) implies that in case (i),

inf_{u(·)∈C} Jz(u(·), v(·)) = inf_{u(·)∈Cu} Jz(u(·), v(·)) = 0.  (A.19)

In cases (ii) and (iii), by using the inequality

∫_{t0}^{tf} h1(t)u(t)dt ≤ ∫_{t0}^{tf} |h1(t)|dt,  (A.20)

we directly obtain

inf_{u(·)∈C} Jz(u(·), v(·)) = inf_{u(·)∈Cu} Jz(u(·), v(·)) = −(a + b), if a < −b;  = a − b, if a > b.  (A.21)

In order to complete the proof of the optimality of v∗(t) = sign h2(t), we should show that the inequality (A.3) is valid in all three cases (i)–(iii). In case (i), due to (A.1) and (A.19), this inequality is obvious. In the two other cases, (ii) and (iii), it is a direct consequence of (A.1) and the inequality

∫_{t0}^{tf} h2(t)v(t)dt ≤ ∫_{t0}^{tf} |h2(t)|dt.  (A.22)

Namely, in case (ii) we have

inf_{u(·)∈C} Jz(u(·), v∗(·)) − inf_{u(·)∈C} Jz(u(·), v(·))
= z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt − (−z0 − ∫_{t0}^{tf} |h1(t)|dt − ∫_{t0}^{tf} h2(t)v(t)dt)
= 2z0 + ∫_{t0}^{tf} |h2(t)|dt + ∫_{t0}^{tf} h2(t)v(t)dt ≥ 0.  (A.23)

In case (iii),

inf_{u(·)∈C} Jz(u(·), v∗(·)) − inf_{u(·)∈C} Jz(u(·), v(·))
= z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt − (z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} h2(t)v(t)dt)
= ∫_{t0}^{tf} |h2(t)|dt − ∫_{t0}^{tf} h2(t)v(t)dt ≥ 0.  (A.24)

Thus, v∗(t) = sign h2(t) is an optimal open-loop maximizer control of the RGBC in the case (A.1). Moreover, in this case, the lower value of the RGBC is

Jzl = Jz(u∗(·), v∗(·)) = z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.25)

Now we show that in the case (A.1), the pair (u∗(·), v∗(·)) = (−sign h1(t), sign h2(t)) is a saddle point (in the open-loop controls) of the RGBC, i.e. Jzu = Jzl. This follows directly from comparing Equations (A.9) and (A.25). Moreover, the game value is

Jz∗ = Jzu = Jzl = Jz(u∗(·), v∗(·)) = z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.26)

The case

z0 ≤ −∫_{t0}^{tf} |h1(t)|dt  (A.27)

is treated similarly.
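The three-case minimization of (A.13) over κ ∈ [−1, 1] can be written down directly, and a brute-force grid search confirms each branch. This is only an illustration of the case analysis (i)–(iii), with arbitrary sample values of a and b > 0.

```python
def min_abs_affine(a, b):
    """Minimize |a + b*kappa| over kappa in [-1, 1], with b > 0 (cf. (A.13)).

    Returns (kappa_star, minimum), following cases (i)-(iii) of the proof.
    """
    r = a / b
    if abs(r) <= 1:              # case (i): the zero of a + b*kappa is interior
        return -r, 0.0
    if r < -1:                   # case (ii): minimum clipped at kappa = 1
        return 1.0, -(a + b)
    return -1.0, a - b           # case (iii): minimum clipped at kappa = -1

def brute(a, b, n=200001):
    """Brute-force check: minimum of |a + b*kappa| over a fine grid in [-1, 1]."""
    return min(abs(a + b * (-1 + 2 * k / (n - 1))) for k in range(n))
```

The closed-form branches agree with the grid search, which mirrors the graphical argument of Figure A1.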
In this case, the pair (u∗(·), v∗(·)) = (sign h1(t), −sign h2(t)) is the saddle point (in the open-loop controls) of the RGBC, and the game value is

Jz∗ = −z0 − ∫_{t0}^{tf} |h1(t)|dt + ∫_{t0}^{tf} |h2(t)|dt.  (A.28)

Finally, Equations (A.26) and (A.28) yield (26), which completes the proof of the theorem.

A.2 Proof of Theorem 2

First of all, note that, due to the condition (28), there exists an infinite set of controls u∗(·) ∈ C satisfying the integral equation (29). Let u∗(·) be any such control. Then

sup_{v(·)∈C} Jz(u∗(·), v(·)) = ∫_{t0}^{tf} |h2(t)|dt.  (A.29)

For an arbitrary u(·) ∈ C,

sup_{v(·)∈C} Jz(u(·), v(·)) = |z0 + ∫_{t0}^{tf} h1(t)u(t)dt| + ∫_{t0}^{tf} |h2(t)|dt.  (A.30)

In (A.29) and (A.30), the supremum is attained at v(t) = sign h2(t). Comparing (A.29) and (A.30) yields the inequality (A.2), proving the optimality of u∗(·). The latter, along with (15), leads to (30).

A.3 Proof of Theorem 3

First, consider the case (31). Similarly to case (i) in the proof of Theorem 1, we obtain that for any v(·) ∈ C, Equation (A.19) is valid, and the infimum is attained at u(t) = −(a/b) sign h1(t), where a and b are given by (A.14) and (A.15), respectively. The latter proves that in this case, any maximizer control v(·) ∈ C is optimal and (32) is valid. In the case (33), the optimality of v∗(·), given by (34), is shown by the same arguments as the optimality of v∗(t) = sign h2(t) in the proof of Theorem 1. Moreover, the expression (35) for the lower value of the RGBC in this case is obtained in the same way as (A.25). This completes the proof of the theorem.

A.4 Proof of Theorem 5

For the sake of transparency, the proof is carried out in the particular Case 1.1 (NR = 1, R(t) ≥ 0 for t ∈ (ts1, tf); see Figure 10).

1. Optimality of u0(·). By reformulating Definition 5, it is sufficient to prove that for any u(·) ∈ Gz,

sup_{v(·)∈Gz} Jz(u0(·), v(·)) ≤ sup_{v(·)∈Gz} Jz(u(·), v(·)).  (A.31)

Case I. The initial game position (t̄, z̄) ∈ Rr.
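When |z0| ≥ ∫|h1|dt, the open-loop game value reduces, by (A.26) and (A.28), to Jz∗ = |z0| − ∫_{t0}^{tf}|h1(t)|dt + ∫_{t0}^{tf}|h2(t)|dt. A quick numeric sketch, with hypothetical constant coefficient functions, illustrates this formula:

```python
def quad(f, a, b, n=2000):
    """Trapezoid-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

def open_loop_value(z0, h1, h2, t0, tf):
    """Game value |z0| - ∫|h1| + ∫|h2|, valid when |z0| >= ∫|h1| (cf. (A.26), (A.28))."""
    int_h1 = quad(lambda t: abs(h1(t)), t0, tf)
    int_h2 = quad(lambda t: abs(h2(t)), t0, tf)
    assert abs(z0) >= int_h1, "formula valid only when |z0| >= integral of |h1|"
    return abs(z0) - int_h1 + int_h2

# Hypothetical constant coefficients: ∫|h1| = 1 and ∫|h2| = 2 on [0, 1].
value = open_loop_value(5.0, lambda t: 1.0, lambda t: -2.0, 0.0, 1.0)
```

The two symmetric cases (A.1) and (A.27) are covered at once by the absolute value of z0; outside them, by Theorems 2 and 3, no open-loop saddle point need exist.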
For the sake of definiteness, assume that z̄ > 0. Begin with the left-hand side of (A.31).

Subcase I.1.

z̄ ≥ ∫_{ts1}^{t̄} R(ξ)dξ.  (A.32)

The set Gz can be partitioned into two subsets: Gz = G1z(u0(·)) ∪ G2z(u0(·)). The subset G1z(u0(·)) consists of all strategies v(·) such that any trajectory z = z(t) of (7), starting at (t̄, z̄) and generated by the pair (u0(·), v(·)), does not intersect the curve z = ∫_{ts1}^{t} R(ξ)dξ; G2z(u0(·)) = Gz \ G1z(u0(·)). Let v(·) ∈ G1z(u0(·)). Then, for any trajectory z(t) mentioned above,

Jz(u0(·), v(·)) = z(tf) > 0.  (A.33)

Let z0(t) be the trajectory generated from the same initial position by the pair (u0(·), v0(·)), where v0(·) is given by (67). Then

Jz(u0(·), v0(·)) = z0(tf) = z̄ − ∫_{t̄}^{tf} |h1(t)|dt + ∫_{t̄}^{tf} |h2(t)|dt > 0.  (A.34)

Due to (67) and the definition of the Krasovskii constructive motion, z0(tf) ≥ z(tf) for all v(·) ∈ G1z(u0(·)), yielding, by using (A.33)–(A.34),

sup_{v(·)∈G1z(u0(·))} Jz(u0(·), v(·)) = Jz(u0(·), v0(·)) = z̄ + ∫_{t̄}^{tf} R(t)dt.  (A.35)

Let v(·) ∈ G2z(u0(·)), i.e. there exists at least one trajectory z = z(t) of (7), starting at (t̄, z̄) and generated by the pair (u0(·), v(·)), which intersects the curve z = ∫_{ts1}^{t} R(ξ)dξ. Let the set Z(u0(·), v(·)) of all trajectories starting at (t̄, z̄) and generated by the pair (u0(·), v(·)) be partitioned into two subsets, Z(u0(·), v(·)) = Z1(u0(·), v(·)) ∪ Z2(u0(·), v(·)), of the trajectories not intersecting and intersecting this curve, respectively. For any z(·) ∈ Z1(u0(·), v(·)), Equation (A.33) holds. Thus, due to (A.35), for v(·) ∈ G2z(u0(·)), z(·) ∈ Z1(u0(·), v(·)),

Jz(u0(·), v(·)) ≤ sup_{v(·)∈G1z(u0(·))} Jz(u0(·), v(·)).  (A.36)

Now, let z(·) ∈ Z2(u0(·), v(·)). If the first intersection of the trajectory with the curve z = ∫_{ts1}^{t} R(ξ)dξ occurs for t < ts1 (i.e.
the trajectory penetrates into the singular region Rs), then, due to the inequalities

−|h1(t)| sign z + h2(t)v(t, z) ≤ R(t), along z = ∫_{ts1}^{t} R(ξ)dξ,  (A.37)
−|h1(t)| sign z + h2(t)v(t, z) ≥ −R(t), along z = −∫_{ts1}^{t} R(ξ)dξ,  (A.38)

the trajectory cannot leave Rs before t = ts1 and, moreover, it remains in the domain AB⁺B⁻ (see Figure A2). Similarly, if the first intersection of the trajectory with the curve z = ∫_{ts1}^{t} R(ξ)dξ occurs for t ≥ ts1, then, due to the inequalities (A.37)–(A.38), it remains in the domain AB⁺B⁻. Thus, for v(·) ∈ G2z(u0(·)), z(·) ∈ Z2(u0(·), v(·)),

Jz(u0(·), v(·)) ≤ ∫_{ts1}^{tf} R(t)dt.  (A.39)

Fig. A2. Geometrical illustration of the proof of Theorem 5.

Due to (A.36) and (A.39),

sup_{v(·)∈G2z(u0(·))} Jz(u0(·), v(·)) ≤ sup_{v(·)∈G1z(u0(·))} Jz(u0(·), v(·)),  (A.40)

and

sup_{v(·)∈Svz(u0(·))} Jz(u0(·), v(·)) = Jz(u0(·), v0(·)) = z̄ + ∫_{t̄}^{tf} R(t)dt.  (A.41)

Subcase I.2.

t̄ ≥ ts1, z̄ < ∫_{ts1}^{t̄} R(ξ)dξ.  (A.42)

This subcase is treated similarly to Subcase I.1, yielding (A.41).

Proceed to the right-hand side of (A.31). In both subcases, for any trajectory starting at (t̄, z̄) and generated by the pair (u(·), v0(·)), u(·) ∈ Gz:

Jz(u(·), v0(·)) = z(tf) ≥ z̄ + ∫_{t̄}^{tf} R(t)dt.  (A.43)

The inequality (A.43) is a direct consequence of the inequality

h1(t)u(t, z) + |h2(t)| sign z ≥ R(t) sign z, z > 0.  (A.44)

Equation (A.41) and the inequality (A.43) yield (A.31) for (t̄, z̄) ∈ Rr, z̄ > 0. The case z̄ ≤ 0 is treated similarly.

Case II. The initial game position (t̄, z̄) ∈ Rs. By the same arguments as in Subcase I.1, in this case,

sup_{v(·)∈Svz(u0(·))} Jz(u0(·), v(·)) = ∫_{ts1}^{tf} R(t)dt.  (A.45)

Moreover, as in Case I,

Jz(u(·), v0(·)) = |z(tf)| ≥ ∫_{ts1}^{tf} R(t)dt.  (A.46)

The equality (A.45) and the inequality (A.46) yield (A.31) for (t̄, z̄) ∈ Rs. This completes the proof of the optimality of u0(·).
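The identity z0(tf) = z̄ + ∫_{t̄}^{tf} R(t)dt behind (A.34)–(A.35) (with R = |h2| − |h1|) can be seen in a forward Euler simulation of (7) under the feedback pair (66)–(67). The sketch below is illustrative only: it uses hypothetical constants h1 ≡ 1, h2 ≡ 2, so that R ≡ 1 on [0, 1], and a start z̄ = 0.5 > 0.

```python
def simulate(z_bar, t_bar, tf, h1, h2, steps=10000):
    """Forward Euler for dz/dt = h1(t)u + h2(t)v under the feedbacks (66)-(67)."""
    sgn = lambda x: (x > 0) - (x < 0)
    dt = (tf - t_bar) / steps
    t, z = t_bar, z_bar
    for _ in range(steps):
        u = -sgn(z) * sgn(h1(t))   # minimizer feedback (66)
        v = sgn(z) * sgn(h2(t))    # maximizer feedback (67)
        z += dt * (h1(t) * u + h2(t) * v)
        t += dt
    return z

# Hypothetical data: h1 = 1, h2 = 2, so R = |h2| - |h1| = 1 on [0, 1].
zf = simulate(0.5, 0.0, 1.0, lambda t: 1.0, lambda t: 2.0)
```

Since z stays positive, the drift equals −|h1| + |h2| = R throughout, and the terminal state equals the starting value plus the integral of R, as (A.34) predicts.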
Note that, due to (A.41) and (A.45), the upper value in the feedback strategies of the RGBC coincides with (94).

2. Optimality of $v^0(\cdot)$. By reformulating Definition 6, it is sufficient to prove that for any $v(\cdot) \in G_z$,
$$\inf_{u(\cdot) \in G_z} J_z(u(\cdot), v^0(\cdot)) \ge \inf_{u(\cdot) \in G_z} J_z(u(\cdot), v(\cdot)). \tag{A.47}$$
The proof of this inequality is symmetric to the proof of (A.31). Moreover, the lower value in the feedback strategies of the RGBC coincides with (94).

Thus, in Case 1.1, $u^0(\cdot)$ and $v^0(\cdot)$ are optimal feedback strategies of the minimizer and maximizer, respectively, in the RGBC; the pair $(u^0(\cdot), v^0(\cdot))$ is the saddle point, and the game value is given by (94). The other cases are treated similarly. This completes the proof of the theorem.

A.5 Proof of Theorem 6

The proof of this theorem is also carried out for Case 1.1 ($N_R = 1$). Let $(u(t, x), v(t, x)) \in F_x$ be an admissible pair of feedback strategies in the GBC. Then, for any initial position $(\bar t, \bar x)$, $\bar t \in [t_0, t_f)$, $\bar x \in \mathbb{R}^n$, the system (1) has a solution $x(t)$ for $t \in [\bar t, t_f)$. Thus, the pair of functions $(x(t), z(t, x(t)))$, where $z(t, x)$ is given by (6), is the solution of the system consisting of Equation (1) and the scalar equation
$$\dot z = h_1(t)u(t, x) + h_2(t)v(t, x), \tag{A.48}$$
with the initial condition
$$z(\bar t) = z(\bar t, \bar x). \tag{A.49}$$
Note that $|z(t_f)| = J_x(u(\cdot), v(\cdot))$.

Equation (A.48) looks like (7), with a single difference: in (7) the feedback controls depend on $(t, z)$, while in (A.48) they depend on $(t, x)$. Nevertheless, for (A.48), the following inequalities, similar to the inequalities (A.37), (A.38), and (A.44), are valid:
$$-|h_1(t)|\,\mathrm{sign}\,z(t, x) + h_2(t)v(t, x) \le R(t), \quad \forall\,(t, x) \in [t_0, t_f) \times \mathbb{R}^n:\ z(t, x) = \int_{t_{s1}}^{t} R(\xi)\,d\xi, \tag{A.50}$$
$$-|h_1(t)|\,\mathrm{sign}\,z(t, x) + h_2(t)v(t, x) \ge -R(t), \quad \forall\,(t, x) \in [t_0, t_f) \times \mathbb{R}^n:\ z(t, x) = -\int_{t_{s1}}^{t} R(\xi)\,d\xi, \tag{A.51}$$
$$h_1(t)u(t, x) + |h_2(t)|\,\mathrm{sign}\,z(t, x) \ge R(t)\,\mathrm{sign}\,z(t, x), \quad \forall\,(t, x) \in [t_0, t_f) \times \mathbb{R}^n:\ z(t, x) > 0. \tag{A.52}$$

By using the inequalities (A.50)–(A.52), the rest of the proof may be carried out in a manner quite similar to the proof of Theorem 5, after replacing the trajectory of (7) by the trajectory of (A.48).

References

[1] Isaacs, R. Differential Games. New York: John Wiley, 1965.
[2] Blaquiere, A., F. Gerard, and G. Leitmann. Quantitative and Qualitative Games. New York: Academic Press, 1969.
[3] Krasovskii, N., and A. Subbotin. Game-Theoretical Control Problems. New York: Springer, 1988.
[4] Lewin, J. Differential Games. London: Springer, 1994.
[5] Basar, T., and P. Bernhard. H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Boston, MA: Birkhauser, 1995.
[6] Bryson, A., and Y. Ho. Applied Optimal Control. New York: Hemisphere, 1975.
[7] Gutman, S. "On optimal guidance for homing missiles." Journal of Guidance and Control 2 (1979): 296–300.
[8] Shinar, J. "Solution techniques for realistic pursuit-evasion games." In Advances in Control and Dynamic Systems, edited by C. Leondes, 63–124. Vol. 17. New York: Academic Press, 1981.
[9] Krasovskii, N. Control of a Dynamic System [in Russian]. Moscow: Nauka, 1985.
[10] Shima, T., and J. Shinar. "Time-varying linear pursuit-evasion game models with bounded controls." Journal of Guidance, Control and Dynamics 25 (2002): 425–32.
[11] Ho, Y. C., A. E. Bryson, and S. Baron. "Differential games and optimal pursuit-evasion strategies." IEEE Transactions on Automatic Control AC-10 (1965): 385–9.
[12] Petrosjan, L. Differential Games of Pursuit. Singapore: World Scientific, 1993.
[13] Turetsky, V., and J. Shinar. "Missile guidance laws based on pursuit-evasion game formulations." Automatica 39 (2003): 607–18.
[14] Turetsky, V., and V. Glizer. "Continuous feedback control strategy with maximal capture zone in a class of pursuit games." International Game Theory Review 7 (2005): 1–24.
[15] Patsko, V.
"Switching surfaces in linear differential games with fixed termination time." Journal of Applied Mathematics and Mechanics 68 (2004): 583–95.
[16] Kumkov, S., V. Patsko, and J. Shinar. "On level sets with 'Narrow Throats' in linear differential games." International Game Theory Review 7 (2005): 285–311.
[17] Karlin, S. Mathematical Methods and Theory in Games, Programming and Economics. Reading, MA: Addison-Wesley, 1959.