On the guaranteed convergence of the square-root
iteration method
M. S. Petković*, L. Rančić
Faculty of Electronic Engineering, University of Niš, P. O. Box 73
18 000 Niš, Serbia and Montenegro
Abstract. The construction of initial conditions which guarantee the convergence of the applied iterative method is one of the most important problems in solving nonlinear equations, and it has attracted great attention for many years. In this paper we give a precise convergence analysis of the fourth-order Ostrowski-like method for the simultaneous determination of polynomial zeros. Using a procedure based on Smale’s point estimation theory and some recent results related to the localization of complex polynomial zeros, we state initial conditions which enable both the guaranteed and fast convergence of this method. These conditions are computationally verifiable since they depend only on the polynomial coefficients, the polynomial degree and the initial approximations, which is of practical importance.
MSC: 65H05
Keywords: Zeros of polynomials, point estimation, Ostrowski-like method, guaranteed convergence.
1. Introduction
One of the crucial problems in solving nonlinear equations of the form $f(x) = 0$ is the choice of initial approximations which guarantee both safe and fast convergence of the applied iterative method. Most of the initial convergence conditions considered in the literature depend on unknown data (for instance, on some “suitably chosen constants” or even on the desired zeros), which makes them of little practical use. In recent years special attention has been paid to the construction of computationally verifiable initial conditions for the guaranteed convergence of iterative methods, see [3]–[5], [7]–[10], [12]–[15]. In particular, in finding real or complex zeros of a monic polynomial $P(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$, initial conditions should be some functions of the polynomial coefficients $a^{(0)} = (a_0, \ldots, a_{n-1})$, its degree $n$ and the initial approximations $z^{(0)} = (z_1^{(0)}, \ldots, z_n^{(0)})$.
Let $z_1^{(m)}, \ldots, z_n^{(m)}$ be the approximations to the zeros $\zeta_1, \ldots, \zeta_n$ of $P$, obtained by some iterative method for solving polynomial equations at the $m$th iteration, $m = 0, 1, \ldots$. Let us define the minimal distance $d^{(m)} := \min_{i \ne j} |z_i^{(m)} - z_j^{(m)}|$ and the quantities
$$W(z_i^{(m)}) := \frac{P(z_i^{(m)})}{\prod_{j=1,\, j \ne i}^{n} \bigl(z_i^{(m)} - z_j^{(m)}\bigr)}, \qquad w^{(m)} := \max_{1 \le i \le n} |W(z_i^{(m)})|.$$
As shown in recent papers [7]–[10], [13]–[15], convenient initial conditions providing the guaranteed convergence of a wide class of iterative methods for the simultaneous approximation of polynomial zeros can be expressed in the form of the inequality
$$w^{(0)} < c_n d^{(0)}, \tag{1}$$
where $c_n$ is a suitable quantity that depends only on the polynomial degree $n$. A discussion presented in [10] shows that $c_n$ should be chosen as large as possible. In this manner the requirements concerning the closeness of initial approximations to the zeros are weakened.
*Correspondence address: e-mail: [email protected]
Let us note that the condition (1) is computationally verifiable, which is of great practical
importance.
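Since condition (1) involves only the polynomial coefficients and the current approximations, it can be checked directly. The following Python sketch illustrates one possible way to do so; it is not taken from the paper. The function names and the coefficient ordering (leading coefficient first) are assumptions made for this illustration, and the constant $c_n$ is supplied by the caller.

```python
from itertools import combinations


def horner(coeffs, z):
    """Evaluate a polynomial at z; coeffs run from the leading term down."""
    value = 0j
    for a in coeffs:
        value = value * z + a
    return value


def weierstrass_corrections(coeffs, z):
    """Return W(z_i) = P(z_i) / prod_{j != i} (z_i - z_j) for a monic P."""
    corrections = []
    for i, zi in enumerate(z):
        denom = 1 + 0j
        for j, zj in enumerate(z):
            if j != i:
                denom *= zi - zj
        corrections.append(horner(coeffs, zi) / denom)
    return corrections


def min_distance(z):
    """Return d = min_{i != j} |z_i - z_j|."""
    return min(abs(a - b) for a, b in combinations(z, 2))


def satisfies_condition(coeffs, z, c_n):
    """Check the inequality w < c_n * d for the approximations z."""
    w = max(abs(W) for W in weierstrass_corrections(coeffs, z))
    return w < c_n * min_distance(z)
```

For instance, for the illustrative polynomial $z^3 - 1$ one may call `satisfies_condition([1, 0, 0, -1], [1.05 + 0.02j, -0.45 + 0.9j, -0.52 - 0.85j], 5 / 39)`; both the polynomial and the starting points are hypothetical test data.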
The aim of this paper is to state initial conditions of the form (1) for the fourth-order simultaneous method of Ostrowski’s type which will be described in what follows. Let $P$ be a monic polynomial with simple zeros $\zeta_1, \ldots, \zeta_n$ and let $I_n := \{1, \ldots, n\}$ be the index set. From the factorization $P(z) = \prod_{j=1}^{n} (z - \zeta_j)$ we obtain
$$u(z) := \frac{P'(z)}{P(z)} = \frac{d}{dz}\bigl(\log P(z)\bigr) = \sum_{j=1}^{n} \frac{1}{z - \zeta_j}. \tag{2}$$
Hence
$$\delta(z) := -\frac{d}{dz}\Bigl(\frac{P'(z)}{P(z)}\Bigr) = \frac{P'(z)^2 - P(z)P''(z)}{P(z)^2} = \sum_{j=1}^{n} \frac{1}{(z - \zeta_j)^2}. \tag{3}$$
We single out the term $z - \zeta_i$ from (3) and find
$$\zeta_i = z - \Bigl[\delta(z) - \sum_{j=1,\, j \ne i}^{n} \frac{1}{(z - \zeta_j)^2}\Bigr]^{-1/2} \quad (i \in I_n). \tag{4}$$
The fixed point relation (4) suggests an algorithm for the simultaneous approximation of all simple zeros of a given polynomial $P$. Let $\delta_i^{(m)} = \delta(z_i^{(m)})$ be the quantity defined by (3) and evaluated at $z_i^{(m)}$, the current approximation to the zero $\zeta_i$ $(i \in I_n)$. Then from (4) we can construct the following iterative formula
$$z_i^{(m+1)} = z_i^{(m)} - \frac{1}{\sqrt{\delta_i^{(m)} - \displaystyle\sum_{j=1,\, j \ne i}^{n} \frac{1}{\bigl(z_i^{(m)} - z_j^{(m)}\bigr)^2}}} \quad (i \in I_n;\ m = 0, 1, \ldots), \tag{5}$$
which defines an iterative method of the order four for the simultaneous determination of all simple zeros $\zeta_1, \ldots, \zeta_n$ of the polynomial $P$.
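A single sweep of the iteration (5) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the rule for selecting the value of the complex square root (the one closer in direction to $P'(z_i)/P(z_i)$) is a heuristic assumed here, since formula (5) by itself does not fix the branch.

```python
import cmath


def derivatives(coeffs, z):
    """Return P(z), P'(z), P''(z) by Horner's scheme (leading term first)."""
    p = dp = ddp = 0j
    for a in coeffs:
        ddp = ddp * z + 2 * dp
        dp = dp * z + p
        p = p * z + a
    return p, dp, ddp


def ostrowski_like_step(coeffs, z):
    """Apply formula (5) once to all approximations z (total-step mode)."""
    new_z = []
    for i, zi in enumerate(z):
        p, dp, ddp = derivatives(coeffs, zi)
        if p == 0:                      # z_i is already a zero: keep it
            new_z.append(zi)
            continue
        u = dp / p                      # logarithmic derivative, see (2)
        delta = u * u - ddp / p         # delta(z_i), see (3)
        s = sum(1 / (zi - zj) ** 2 for j, zj in enumerate(z) if j != i)
        root = cmath.sqrt(delta - s)
        # Assumed branch rule: take the square root value whose direction
        # agrees with u, because both approximate 1 / (z_i - zeta_i).
        if (root * u.conjugate()).real < 0:
            root = -root
        new_z.append(zi - 1 / root)
    return new_z
```

Applied repeatedly to starting points close to the zeros, for example to $z^3 - 1$ with `[1.05 + 0.02j, -0.45 + 0.9j, -0.52 - 0.85j]` (hypothetical test data), a few sweeps are typically enough to reach machine precision.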
Remark. Omitting the sum in (5) one obtains
$$z_i^{(m+1)} = z_i^{(m)} - \frac{1}{\sqrt{\delta_i^{(m)}}} = z_i^{(m)} - \frac{P\bigl(z_i^{(m)}\bigr)}{\sqrt{P'\bigl(z_i^{(m)}\bigr)^2 - P\bigl(z_i^{(m)}\bigr)P''\bigl(z_i^{(m)}\bigr)}}, \tag{6}$$
which is, actually, the well-known square-root method of the order three. This method was extensively studied by Ostrowski (see chapters 14 and 15 of his book [6]) so that it is often referred to as Ostrowski’s method. Because of the similarity with the iterative method (6), we will call (5) the Ostrowski-like method.
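For comparison, the square-root method (6) for a single zero can be sketched as below. The function name and the stopping rule are assumptions of this illustration; the sign of the square root is chosen to agree in direction with $P'(z)$, a heuristic assumed here because (6) alone does not fix the branch.

```python
import cmath


def square_root_method(coeffs, z, steps=5):
    """Iterate formula (6) from the starting point z (leading term first)."""
    for _ in range(steps):
        p = dp = ddp = 0j
        for a in coeffs:                # Horner scheme for P, P', P''
            ddp = ddp * z + 2 * dp
            dp = dp * z + p
            p = p * z + a
        if p == 0:                      # exact zero reached
            break
        root = cmath.sqrt(dp * dp - p * ddp)
        # Assumed branch rule: keep the value closer in direction to P'(z).
        if (root * dp.conjugate()).real < 0:
            root = -root
        z = z - p / root
    return z
```

With the hypothetical test polynomial $z^2 - 2$ and the starting point 1.5, the call `square_root_method([1, 0, -2], 1.5)` returns an approximation to $\sqrt{2}$.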
To estimate the moduli of some complex-valued quantities which appear in the convergence analysis, we use circular complex interval arithmetic, whose basic properties are listed below. For more details see the book [11].
A disk $Z$ with center $c := \operatorname{mid} Z$ and radius $r := \operatorname{rad} Z$, that is, $Z = \{z : |z - c| \le r\}$, will be denoted by the parametric notation $Z = \{c; r\}$. The basic circular arithmetic operations are defined as follows:
$$\begin{aligned}
\{c_1; r_1\} \pm \{c_2; r_2\} &= \{c_1 \pm c_2;\ r_1 + r_2\},\\
\{c_1; r_1\} \cdot \{c_2; r_2\} &= \{c_1 c_2;\ |c_1| r_2 + |c_2| r_1 + r_1 r_2\},\\
Z^I = \{c; r\}^I &= \Bigl\{\frac{1}{c};\ \frac{r}{|c|(|c| - r)}\Bigr\} \quad (0 \notin Z, \text{ i.e. } |c| > r) \quad \text{(centered inversion)},\\
Z_1 : Z_2 &:= Z_1 \cdot Z_2^I \quad (0 \notin Z_2).
\end{aligned} \tag{7}$$
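It is easy to prove that, if $z \in Z$, then
$$|\operatorname{mid} Z| - \operatorname{rad} Z \le |z| \le |\operatorname{mid} Z| + \operatorname{rad} Z. \tag{8}$$
The square root of a disk $\{c; r\}$ in the centered form, where $c = |c|e^{i\theta}$ and $|c| > r$, is defined as the union of two disks (see [11]):
$$\sqrt{\{c; r\}} := \Bigl\{\sqrt{|c|}\,e^{i\theta/2};\ \frac{r}{\sqrt{|c|} + \sqrt{|c| - r}}\Bigr\} \cup \Bigl\{-\sqrt{|c|}\,e^{i\theta/2};\ \frac{r}{\sqrt{|c|} + \sqrt{|c| - r}}\Bigr\}. \tag{9}$$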
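The operations (7)–(9) translate almost literally into code. The small Python class below is a sketch written for illustration; the class and method names are assumptions and do not come from [11]. The method `sqrt_principal` returns only the first of the two disks in (9), with $\theta$ taken as the principal argument.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    c: complex   # center, mid Z
    r: float     # radius, rad Z

    def __add__(self, other):
        return Disk(self.c + other.c, self.r + other.r)

    def __sub__(self, other):
        return Disk(self.c - other.c, self.r + other.r)

    def __mul__(self, other):
        radius = (abs(self.c) * other.r + abs(other.c) * self.r
                  + self.r * other.r)
        return Disk(self.c * other.c, radius)

    def inverse(self):
        """Centered inversion from (7); requires 0 outside the disk."""
        m = abs(self.c)
        if m <= self.r:
            raise ZeroDivisionError("disk contains zero")
        return Disk(1 / self.c, self.r / (m * (m - self.r)))

    def __truediv__(self, other):
        return self * other.inverse()

    def abs_bounds(self):
        """Lower and upper bounds for |z|, z in the disk, as in (8)."""
        return abs(self.c) - self.r, abs(self.c) + self.r

    def sqrt_principal(self):
        """First disk of the union (9), principal argument of the center."""
        m = abs(self.c)
        if m <= self.r:
            raise ValueError("disk contains zero")
        radius = self.r / (math.sqrt(m) + math.sqrt(m - self.r))
        return Disk(cmath.sqrt(self.c), radius)
```

For example, `Disk(1, 0.052).sqrt_principal()` gives a disk centered at 1 whose radius is smaller than $0.51 \cdot 0.052$.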
2. Some preliminary results
To state the convergence theorem for the iteration method (5), we first give some necessary auxiliary results. For simplicity, in this section we will often omit the iteration index $m$ and denote quantities at the subsequent $(m + 1)$-th iteration by the hat symbol $\hat{\ }$.
In our convergence analysis we will use two identities given in the following lemmas.
Lemma 1. If $z_1, \ldots, z_n$ are distinct complex numbers, then the following identities are valid:
$$P(z) = \Bigl(\sum_{j=1}^{n} \frac{W_j}{z - z_j} + 1\Bigr) \prod_{j=1}^{n} (z - z_j), \tag{10}$$
$$u_i = \frac{P'(z_i)}{P(z_i)} = \sum_{j \ne i} \frac{1}{z_i - z_j} + \frac{1}{W_i}\Bigl(\sum_{j \ne i} \frac{W_j}{z_i - z_j} + 1\Bigr). \tag{11}$$
Proof. The identity (10) is, in fact, the Lagrange (interpolation) form of a monic polynomial $P$, expressed in terms of the $W_j$’s at the points $z_1, \ldots, z_n$. To prove (11), we apply the logarithmic derivative to (10) and obtain [2]
$$\frac{P'(z)}{P(z)} = \sum_{j \ne i} \frac{1}{z - z_j} + \frac{\displaystyle\sum_{j \ne i} \frac{W_j}{z - z_j} + 1 - (z - z_i)\sum_{j \ne i} \frac{W_j}{(z - z_j)^2}}{W_i + (z - z_i)\Bigl[\displaystyle\sum_{j \ne i} \frac{W_j}{z - z_j} + 1\Bigr]}.$$
Putting $z = z_i$ in this formula we get (11). □
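The identities (10) and (11) are easy to confirm numerically. The Python sketch below does this for arbitrary distinct points; the function names are hypothetical and the code serves only as an illustration of the two formulas, not as part of the proof.

```python
def poly_and_derivative(coeffs, z):
    """Return P(z) and P'(z) by Horner's scheme (leading term first)."""
    p = dp = 0j
    for a in coeffs:
        dp = dp * z + p
        p = p * z + a
    return p, dp


def weierstrass_corrections(coeffs, pts):
    """Return W_j = P(z_j) / prod_{k != j} (z_j - z_k)."""
    W = []
    for i, zi in enumerate(pts):
        denom = 1 + 0j
        for j, zj in enumerate(pts):
            if j != i:
                denom *= zi - zj
        W.append(poly_and_derivative(coeffs, zi)[0] / denom)
    return W


def lagrange_form(coeffs, pts, z):
    """Right-hand side of identity (10) evaluated at z."""
    W = weierstrass_corrections(coeffs, pts)
    total = 1 + sum(Wj / (z - zj) for Wj, zj in zip(W, pts))
    product = 1 + 0j
    for zj in pts:
        product *= z - zj
    return total * product


def log_derivative_via_11(coeffs, pts, i):
    """Right-hand side of identity (11) for the index i."""
    W = weierstrass_corrections(coeffs, pts)
    zi = pts[i]
    others = [j for j in range(len(pts)) if j != i]
    first = sum(1 / (zi - pts[j]) for j in others)
    second = 1 + sum(W[j] / (zi - pts[j]) for j in others)
    return first + second / W[i]
```

Comparing `lagrange_form` with a direct evaluation of $P(z)$, and `log_derivative_via_11` with $P'(z_i)/P(z_i)$, should give agreement up to rounding errors.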
Using the theory presented by Carstensen in [1] and Corollary 1.1 from [9], we may state the following assertion concerning the localization of polynomial zeros.
Lemma 2. Assume that $z_1, \ldots, z_n$ are distinct points and that the inequality
$$w < c_n d$$
holds, where $c_n < 1/(2n)$. Then for $n \ge 3$ the disks
$$D_1 := \Bigl\{z_1;\ \frac{1}{1 - nc_n}|W_1|\Bigr\},\ \ldots,\ D_n := \Bigl\{z_n;\ \frac{1}{1 - nc_n}|W_n|\Bigr\}$$
are mutually disjoint and each of them contains one and only one zero of $P$.
In this paper we have chosen the constant $c_n$ appearing in (1) to be $c_n = 5/(13n)$, that is, we will deal with the inequality
$$w < \frac{5}{13n}\, d. \tag{12}$$
This value of $c_n$ has been found by an extensive estimating-and-fitting procedure using the programming package Mathematica 4.1. In this concrete case, from Lemma 2 we have
$$D_1 := \Bigl\{z_1;\ \frac{13}{8}|W_1|\Bigr\},\ \ldots,\ D_n := \Bigl\{z_n;\ \frac{13}{8}|W_n|\Bigr\}. \tag{13}$$
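The inclusion disks (13) can be computed directly from the approximations. The following Python sketch (a hypothetical helper written for illustration) returns the pairs (center, radius); it assumes that the inequality (12) has already been verified for the given points, since only then does Lemma 2 guarantee that each disk isolates exactly one zero.

```python
def inclusion_disks(coeffs, z):
    """Return the disks {z_i; |W_i| / (1 - n c_n)} with c_n = 5 / (13 n)."""
    n = len(z)
    c_n = 5 / (13 * n)
    disks = []
    for i, zi in enumerate(z):
        p = 0j
        for a in coeffs:                # Horner evaluation of P(z_i)
            p = p * zi + a
        denom = 1 + 0j
        for j, zj in enumerate(z):
            if j != i:
                denom *= zi - zj
        # radius = |W_i| / (1 - n c_n), which equals (13 / 8) |W_i|
        disks.append((zi, abs(p / denom) / (1 - n * c_n)))
    return disks
```

For the illustrative data $P(z) = z^3 - 1$ and the points `[1.05 + 0.02j, -0.45 + 0.9j, -0.52 - 0.85j]`, each returned disk contains the corresponding cube root of unity.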
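Let
$$\varepsilon_i = z_i - \zeta_i \quad\text{and}\quad x_i = \varepsilon_i^2 \sum_{j \ne i} \frac{\varepsilon_j}{(z_i - \zeta_j)(z_i - z_j)}\Bigl(\frac{1}{z_i - \zeta_j} + \frac{1}{z_i - z_j}\Bigr). \tag{14}$$
After some elementary manipulations and having in mind (4), we rewrite the iterative formula (5) in the form
$$\hat{z}_i = z_i - \frac{\varepsilon_i}{\sqrt{1 - x_i}} \quad (i \in I_n). \tag{15}$$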
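The passage from (5) to (15) rests on the identity $\delta(z_i) - \sum_{j \ne i} (z_i - z_j)^{-2} = (1 - x_i)/\varepsilon_i^2$, which follows from (3) and (14). A short Python sketch (with a hypothetical function name, and assuming the exact zeros are known for test purposes) evaluates both sides so that they can be compared:

```python
def radicand_two_ways(zeros, z, i):
    """Return the radicand of (5), its form (1 - x_i) / eps_i^2, and x_i."""
    eps = [zk - zeta for zk, zeta in zip(z, zeros)]
    others = [j for j in range(len(z)) if j != i]
    # delta(z_i) from the partial-fraction form (3)
    delta = sum(1 / (z[i] - zeta) ** 2 for zeta in zeros)
    direct = delta - sum(1 / (z[i] - z[j]) ** 2 for j in others)
    # x_i as defined in (14)
    x = eps[i] ** 2 * sum(
        eps[j] / ((z[i] - zeros[j]) * (z[i] - z[j]))
        * (1 / (z[i] - zeros[j]) + 1 / (z[i] - z[j]))
        for j in others)
    return direct, (1 - x) / eps[i] ** 2, x
```

The first two returned values should coincide up to rounding errors, and $|x_i|$ should be small when the approximations are close to the zeros.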
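Let us introduce the following abbreviations:
$$\alpha_n = \frac{(1 - nc_n)(2 - (2n + 1)c_n)}{(1 - (n + 1)c_n)^2} = \frac{8n(16n - 5)}{(8n - 5)^2}, \qquad \beta_n = 1.7c_n = \frac{17}{26n},$$
$$h_{n,i} = \frac{\alpha_n}{d^3}\,|\varepsilon_i|^2 \sum_{j \ne i} |\varepsilon_j|, \qquad \gamma(n, d) = \frac{4.24n(16n - 5)}{(8n - 5)^2 d^3}.$$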
Lemma 3. If the inequality (12) holds, then
(i) $\sqrt{1 - x_i} \in \{1;\ 0.51h_{n,i}\}$; $\quad$ (16)
(ii) $|\hat{z}_i - z_i| < 1.7w < 1.7c_n d = \beta_n d$. $\quad$ (17)
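Proof. From Lemma 2 and (13) we have
$$|\varepsilon_i| = |z_i - \zeta_i| \le \frac{1}{1 - nc_n}|W_i| \le \frac{1}{1 - nc_n}\,w < \frac{c_n}{1 - nc_n}\,d = \frac{5}{8n}\,d. \tag{18}$$
According to this we find
$$|z_i - \zeta_j| \ge |z_i - z_j| - |z_j - \zeta_j| > d - \frac{c_n}{1 - nc_n}\,d = \frac{1 - (n + 1)c_n}{1 - nc_n}\,d = \frac{8n - 5}{8n}\,d. \tag{19}$$
Using (12) and (19), and taking into account the definition of the minimal distance $d$, from (14) we obtain
$$\begin{aligned}
|x_i| &\le |\varepsilon_i|^2 \sum_{j \ne i} \frac{(1 - nc_n)\,|\varepsilon_j|}{(1 - (n + 1)c_n)d \cdot d}\Bigl(\frac{1 - nc_n}{(1 - (n + 1)c_n)d} + \frac{1}{d}\Bigr)\\
&= \frac{(1 - nc_n)(2 - (2n + 1)c_n)}{(1 - (n + 1)c_n)^2 d^3}\,|\varepsilon_i|^2 \sum_{j \ne i} |\varepsilon_j| = \frac{\alpha_n}{d^3}\,|\varepsilon_i|^2 \sum_{j \ne i} |\varepsilon_j| = h_{n,i}.
\end{aligned}$$
Therefore, $x_i \in X_i := \{0; h_{n,i}\}$, where $X_i$ is the disk centered at 0. Further, by (18) we bound
$$h_{n,i} \le \alpha_n (n - 1)\Bigl(\frac{c_n}{1 - nc_n}\Bigr)^3,$$
wherefrom, with $\alpha_n$ and $c_n$ given above, we estimate
$$h_{n,i} < 0.052 \quad \text{for all } n \ge 3. \tag{20}$$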
Using (9) (taking the principal branch of the square root) and the inclusion isotonicity property, we find
$$\sqrt{1 - x_i} \in \sqrt{1 - X_i} = \sqrt{\{1; h_{n,i}\}} = \Bigl\{1;\ \frac{h_{n,i}}{1 + \sqrt{1 - h_{n,i}}}\Bigr\}. \tag{21}$$
The use of the bound (20) yields
$$\frac{1}{1 + \sqrt{1 - h_{n,i}}} < \frac{51}{100} \quad \text{for all } n \ge 3,$$
so that the assertion (i) follows from (21).
Using the centered inversion (7), (16) and (20), from (15) we find
$$\hat{z}_i - z_i = -\frac{\varepsilon_i}{\sqrt{1 - x_i}} \in -\frac{\varepsilon_i}{\{1;\ 0.51h_{n,i}\}} = -\varepsilon_i\Bigl\{1;\ \frac{0.51h_{n,i}}{1 - 0.51h_{n,i}}\Bigr\} \subset -\varepsilon_i\{1;\ 0.53h_{n,i}\}. \tag{22}$$
Hence, by (8), (18) and (20),
$$|\hat{z}_i - z_i| \le |\varepsilon_i|(1 + 0.53h_{n,i}) < \frac{1}{1 - nc_n}|W_i|(1 + 0.53 \cdot 0.052) < 1.7|W_i| \le 1.7w < 1.7c_n d = \beta_n d.$$
This proves (ii) of Lemma 3. □
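The numerical constants used in the proof of Lemma 3 (the bound 0.052 in (20), the factors 0.51 and 0.53, and the factor 1.7 in (17)) can be rechecked with a few lines of Python. The function name is hypothetical and the finite range of $n$ tested is an assumption of this sketch; it illustrates the arithmetic and does not replace the proof.

```python
import math


def lemma3_constants(n):
    """Return alpha_n and the derived bounds used in the proof of Lemma 3."""
    c = 5 / (13 * n)                                   # c_n from (12)
    alpha = (1 - n * c) * (2 - (2 * n + 1) * c) / (1 - (n + 1) * c) ** 2
    h_bound = alpha * (n - 1) * (c / (1 - n * c)) ** 3  # bound before (20)
    sqrt_factor = 1 / (1 + math.sqrt(1 - h_bound))     # compared with 0.51
    inv_factor = 0.51 / (1 - 0.51 * h_bound)           # compared with 0.53
    step_factor = (1 + 0.53 * h_bound) / (1 - n * c)   # compared with 1.7
    return alpha, h_bound, sqrt_factor, inv_factor, step_factor
```

Evaluating `lemma3_constants(n)` for $n = 3, 4, \ldots$ shows that the worst case occurs at $n = 3$, where the bound on $h_{n,i}$ is only slightly below 0.052.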
Before stating the main convergence theorem we give some necessary estimates.
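Lemma 4. Let $z_1, \ldots, z_n$ be approximations produced by the iterative method (5) and let $\hat{\varepsilon}_i = \hat{z}_i - \zeta_i$, $\hat{d} = \min_{i \ne j} |\hat{z}_i - \hat{z}_j|$, $\hat{w} = \max_{1 \le i \le n} |\widehat{W}_i|$. If $n \ge 3$ and the inequality (12) holds, then
(i) $|\hat{\varepsilon}_i| \le \gamma(n, d)\,|\varepsilon_i|^3 \sum_{j \ne i} |\varepsilon_j|$;
(ii) $d < \dfrac{\hat{d}}{1 - 2\beta_n}$;
(iii) $\hat{w} < 0.54w$;
(iv) $\hat{w} < c_n \hat{d}$.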
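Proof. From (15) and (22) we obtain
$$\hat{\varepsilon}_i = \hat{z}_i - \zeta_i \in \varepsilon_i - \varepsilon_i\{1;\ 0.53h_{n,i}\} = \{0;\ 0.53|\varepsilon_i|h_{n,i}\}.$$
Hence, by (8), it follows
$$|\hat{\varepsilon}_i| < 0.53|\varepsilon_i|h_{n,i} = 0.53\,\frac{\alpha_n}{d^3}\,|\varepsilon_i|^3 \sum_{j \ne i} |\varepsilon_j|. \tag{23}$$
Taking into account the expression for $\alpha_n$, from (23) we obtain
$$|\hat{\varepsilon}_i| \le \gamma(n, d)\,|\varepsilon_i|^3 \sum_{j \ne i} |\varepsilon_j|, \tag{24}$$
which means that (i) is proved.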
Using (ii) of Lemma 3, we find
$$|\hat{z}_i - z_j| \ge |z_i - z_j| - |\hat{z}_i - z_i| > d - \beta_n d = (1 - \beta_n)d, \tag{25}$$
$$|\hat{z}_i - \hat{z}_j| \ge |z_i - z_j| - |\hat{z}_i - z_i| - |\hat{z}_j - z_j| > d - 2\beta_n d = (1 - 2\beta_n)d. \tag{26}$$
The inequality (26) gives $\hat{d} > (1 - 2\beta_n)d$, that is,
$$\frac{d}{\hat{d}} < \frac{1}{1 - 2\beta_n}, \tag{27}$$
which proves (ii) of the lemma.
Using the iterative formula (5) we obtain by the inclusion (16)
$$\frac{W_i}{\hat{z}_i - z_i} = -\frac{W_i\sqrt{1 - x_i}}{\varepsilon_i} \in -\frac{W_i}{\varepsilon_i}\{1;\ 0.51h_{n,i}\}. \tag{28}$$
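We use the identities (2) and (11) to find
$$u_i = \frac{P'(z_i)}{P(z_i)} = \sum_{j=1}^{n} \frac{1}{z_i - \zeta_j} = \frac{1}{\varepsilon_i} + \sum_{j \ne i} \frac{1}{z_i - \zeta_j} = \sum_{j \ne i} \frac{1}{z_i - z_j} + \frac{1}{W_i}\Bigl(\sum_{j \ne i} \frac{W_j}{z_i - z_j} + 1\Bigr),$$
wherefrom
$$\frac{1}{\varepsilon_i} = \sum_{j \ne i} \frac{1}{z_i - z_j} + \frac{1}{W_i}\Bigl(\sum_{j \ne i} \frac{W_j}{z_i - z_j} + 1\Bigr) - \sum_{j \ne i} \frac{1}{z_i - \zeta_j}. \tag{29}$$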
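Using (28) and (29) we get
$$\begin{aligned}
\sum_{j=1}^{n} \frac{W_j}{\hat{z}_i - z_j} + 1 &= \frac{W_i}{\hat{z}_i - z_i} + \sum_{j \ne i} \frac{W_j}{\hat{z}_i - z_j} + 1 \in -\frac{W_i}{\varepsilon_i}\{1;\ 0.51h_{n,i}\} + \sum_{j \ne i} \frac{W_j}{\hat{z}_i - z_j} + 1\\
&= \Bigl\{-\frac{W_i}{\varepsilon_i} + \sum_{j \ne i} \frac{W_j}{\hat{z}_i - z_j} + 1;\ \frac{|W_i|}{|\varepsilon_i|} \cdot 0.51h_{n,i}\Bigr\}\\
&= \Bigl\{-W_i \sum_{j \ne i} \frac{1}{z_i - z_j} - \sum_{j \ne i} \frac{W_j}{z_i - z_j} - 1 + W_i \sum_{j \ne i} \frac{1}{z_i - \zeta_j} + \sum_{j \ne i} \frac{W_j}{\hat{z}_i - z_j} + 1;\ 0.51\,\frac{|W_i|}{|\varepsilon_i|}\,h_{n,i}\Bigr\} = \{\Theta_i; R_i\},
\end{aligned}$$
where
$$\Theta_i = -W_i \sum_{j \ne i} \frac{\varepsilon_j}{(z_i - \zeta_j)(z_i - z_j)} - (\hat{z}_i - z_i) \sum_{j \ne i} \frac{W_j}{(\hat{z}_i - z_j)(z_i - z_j)} \tag{30}$$
and
$$R_i = \frac{0.51|W_i|h_{n,i}}{|\varepsilon_i|}. \tag{31}$$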
Let us estimate the moduli of $\Theta_i$ and $R_i$. Starting from (30) and using (12), (17), (18), (19) and (25), we find
$$\begin{aligned}
|\Theta_i| &\le |W_i| \sum_{j \ne i} \frac{|\varepsilon_j|}{|z_i - \zeta_j||z_i - z_j|} + |\hat{z}_i - z_i| \sum_{j \ne i} \frac{|W_j|}{|\hat{z}_i - z_j||z_i - z_j|}\\
&< \frac{n - 1}{1 - (n + 1)c_n}\Bigl(\frac{w}{d}\Bigr)^2 + 1.7 \cdot \frac{n - 1}{1 - \beta_n}\Bigl(\frac{w}{d}\Bigr)^2\\
&< (n - 1)c_n^2\Bigl(\frac{1}{1 - (n + 1)c_n} + \frac{1.7}{1 - 1.7c_n}\Bigr) = \frac{25(n - 1)(53.2n - 34)}{13n(8n - 5)(26n - 17)} =: \nu_n.
\end{aligned}$$
The sequence $\{\nu_n\}$ is monotonically decreasing so that $\nu_n \le \nu_3 = 0.1389\ldots < 0.14$. Using this bound, we obtain $|\Theta_i| < \nu_n < 0.14$ for all $n \ge 3$.
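By virtue of (12), (13) and (18), from (31) we find
$$R_i = \frac{0.51|W_i|h_{n,i}}{|\varepsilon_i|} = \frac{0.51\alpha_n|W_i|}{d^3}\,|\varepsilon_i| \sum_{j \ne i} |\varepsilon_j| \le \frac{0.51(n - 1)\alpha_n}{(1 - nc_n)^2}\Bigl(\frac{w}{d}\Bigr)^3 < \frac{0.51(n - 1)\alpha_n c_n^3}{(1 - nc_n)^2} < 0.017$$
for all $n \ge 3$.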
According to (8), and using the upper bounds for $|\Theta_i|$ and $R_i$, we estimate
$$\Bigl|\sum_{j=1}^{n} \frac{W_j}{\hat{z}_i - z_j} + 1\Bigr| < |\Theta_i| + R_i < 0.157. \tag{32}$$
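Using the bounds (17) and (26) we find
$$\Bigl|\prod_{j \ne i} \frac{\hat{z}_i - z_j}{\hat{z}_i - \hat{z}_j}\Bigr| \le \prod_{j \ne i}\Bigl(1 + \frac{|\hat{z}_j - z_j|}{|\hat{z}_i - \hat{z}_j|}\Bigr) < \Bigl(1 + \frac{\beta_n d}{(1 - 2\beta_n)d}\Bigr)^{n-1} = \Bigl(1 + \frac{17}{26n - 34}\Bigr)^{n-1} < 2.$$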
Taking into account the last inequality, (17) and (32), we start from (10) for $z = \hat{z}_i$ and find
$$\begin{aligned}
|\widehat{W}_i| = \Bigl|\frac{P(\hat{z}_i)}{\prod_{j \ne i}(\hat{z}_i - \hat{z}_j)}\Bigr| &\le |\hat{z}_i - z_i|\,\Bigl|\sum_{j=1}^{n} \frac{W_j}{\hat{z}_i - z_j} + 1\Bigr|\,\Bigl|\prod_{j \ne i} \frac{\hat{z}_i - z_j}{\hat{z}_i - \hat{z}_j}\Bigr|\\
&< 1.7|W_i|(|\Theta_i| + R_i) \cdot 2 < 1.7|W_i| \cdot 0.157 \cdot 2 < 0.54|W_i|,
\end{aligned}$$
which proves the assertion (iii).
According to (12), and (ii) and (iii) of Lemma 4, we find
$$\hat{w} < 0.54w < 0.54c_n d < 0.54c_n \cdot \frac{1}{1 - 2\beta_n}\,\hat{d} < c_n \hat{d},$$
since
$$\frac{0.54}{1 - 2\beta_n} \le 0.957\ldots < 1 \quad \text{for all } n \ge 3.$$
Therefore, we have proved (iv) of the lemma. □
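The constants appearing in the proof of Lemma 4 can be reproduced numerically. The Python sketch below (a hypothetical helper, tested here only on a finite range of $n$) evaluates $\nu_n$, the bound for $R_i$, the product bound, the resulting contraction factor for $w$, and the ratio $0.54/(1 - 2\beta_n)$; it is meant only as a check of the arithmetic.

```python
def lemma4_constants(n):
    """Return the bounds used in the proof of Lemma 4 for c_n = 5 / (13 n)."""
    c = 5 / (13 * n)
    beta = 1.7 * c                                      # beta_n
    alpha = 8 * n * (16 * n - 5) / (8 * n - 5) ** 2     # alpha_n
    nu = (n - 1) * c ** 2 * (1 / (1 - (n + 1) * c) + 1.7 / (1 - beta))
    r_bound = 0.51 * (n - 1) * alpha * c ** 3 / (1 - n * c) ** 2
    product_bound = (1 + beta / (1 - 2 * beta)) ** (n - 1)
    contraction = 1.7 * (nu + r_bound) * product_bound  # compared with 0.54
    ratio = 0.54 / (1 - 2 * beta)                       # compared with 1
    return nu, r_bound, product_bound, contraction, ratio
```

For $n = 3$ the call `lemma4_constants(3)` gives values consistent with $\nu_3 = 0.1389\ldots$ and $0.54/(1 - 2\beta_3) = 0.957\ldots$ quoted in the proof.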
3. Convergence theorem
Using results of Lemma 4 we state in this section initial conditions which guarantee the
safe convergence of the Ostrowski-like method (5).
Theorem 1. Let $P$ be a polynomial of the degree $n \ge 3$ with simple zeros. If the initial condition
$$w^{(0)} < c_n d^{(0)}, \qquad c_n = \frac{5}{13n}, \tag{33}$$
holds, then the Ostrowski-like simultaneous method (5) is convergent with the order of convergence four.
Proof. Using a technique similar to that in the proof of Lemma 4, we derive the proof by induction. Since (12) and (33) are of the same form, all estimates given in Lemma 4 are valid for the index $m = 1$, which is the part of the proof with respect to $m = 1$. Furthermore, the inequality (iv) in Lemma 4 coincides with (12), so that the assertions (i)–(iv) of Lemma 4 are valid for the subsequent index, etc. Hence, by induction, we obtain the implication
$$w^{(m)} < c_n d^{(m)} \implies w^{(m+1)} < c_n d^{(m+1)},$$
which plays an important role in the convergence analysis of the Ostrowski-like method (5); it involves the initial condition (33) under which all inequalities given in Lemma 4 are valid for all $m = 0, 1, \ldots$. In particular, following (27) and (24), we have
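$$\frac{d^{(m)}}{d^{(m+1)}} < \frac{1}{1 - 2\beta_n} \tag{34}$$
and
$$|\varepsilon_i^{(m+1)}| \le \gamma(n, d^{(m)})\,|\varepsilon_i^{(m)}|^3 \sum_{j \ne i} |\varepsilon_j^{(m)}| \quad (i \in I_n) \tag{35}$$
for each iteration index $m = 0, 1, \ldots$, where
$$\gamma(n, d^{(m)}) = \frac{4.24n(16n - 5)}{(8n - 5)^2\bigl[d^{(m)}\bigr]^3}.$$
Let us substitute
$$t_i^{(m)} = \Bigl[\frac{n - 1}{1 - 2\beta_n}\,\gamma(n, d^{(m)})\Bigr]^{1/3} |\varepsilon_i^{(m)}|$$
in (35); then
$$t_i^{(m+1)} \le \frac{1 - 2\beta_n}{n - 1}\Bigl[\frac{\gamma(n, d^{(m+1)})}{\gamma(n, d^{(m)})}\Bigr]^{1/3} \bigl[t_i^{(m)}\bigr]^3 \sum_{j \ne i} t_j^{(m)} = \frac{1 - 2\beta_n}{n - 1}\,\frac{d^{(m)}}{d^{(m+1)}}\,\bigl[t_i^{(m)}\bigr]^3 \sum_{j \ne i} t_j^{(m)}.$$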
Hence, by virtue of (34),
$$t_i^{(m+1)} < \frac{1}{n - 1}\bigl[t_i^{(m)}\bigr]^3 \sum_{j \ne i} t_j^{(m)} \quad (i \in I_n;\ m = 0, 1, \ldots). \tag{36}$$
In regard to (18) we find
$$t_i^{(0)} = \Bigl[\frac{n - 1}{1 - 2\beta_n}\,\gamma(n, d^{(0)})\Bigr]^{1/3} |\varepsilon_i^{(0)}| < \Bigl[\frac{n - 1}{1 - 2\beta_n}\,\gamma(n, d^{(0)})\Bigr]^{1/3} \frac{c_n}{1 - nc_n}\,d^{(0)}.$$
For $n \ge 3$ one obtains $t_i^{(0)} < 0.365 < 1$.
Put $t = \max_i t_i^{(0)}$; then obviously $t_i^{(0)} \le t < 1$ for all $i = 1, \ldots, n$ and $n \ge 3$. Hence, we conclude from (36) that the sequences $\{t_i^{(m)}\}$ (and, consequently, $\{|\varepsilon_i^{(m)}|\}$) tend to 0 for all $i = 1, \ldots, n$. Therefore, $z_i^{(m)} \to \zeta_i$ and the method (5) is convergent.
Starting from the inequality (26), by (iii) of Lemma 4, (17) and (33) we successively obtain
$$\begin{aligned}
d^{(m)} &> d^{(m-1)} - 3.4w^{(m-1)} > d^{(m-2)} - 3.4w^{(m-2)} - 3.4w^{(m-1)}\\
&\ \ \vdots\\
&> d^{(0)} - 3.4\bigl(w^{(0)} + w^{(1)} + \cdots + w^{(m-1)}\bigr)\\
&> d^{(0)} - 3.4w^{(0)}\bigl(1 + 0.54 + 0.54^2 + \cdots + 0.54^{m-1}\bigr)\\
&> d^{(0)} - 7.4w^{(0)} > d^{(0)} - 7.4c_n d^{(0)} = \Bigl(1 - \frac{37}{13n}\Bigr)d^{(0)}.
\end{aligned}$$
According to this and (36) we have
$$\gamma(n, d^{(m)}) < \frac{\eta_n}{\bigl[d^{(0)}\bigr]^3}, \quad \text{where } \eta_n = \frac{9315.28n^4(16n - 5)}{(8n - 5)^2(13n - 37)^3}.$$
Now, taking into account (35) and (36), we find
$$|\varepsilon_i^{(m+1)}| < \frac{\eta_n}{\bigl[d^{(0)}\bigr]^3}\,|\varepsilon_i^{(m)}|^3 \sum_{j \ne i} |\varepsilon_j^{(m)}| < \frac{(n - 1)\eta_n}{\bigl[d^{(0)}\bigr]^3}\,|\varepsilon_i^{(m)}|^3 \max_{1 \le j \le n,\ j \ne i} |\varepsilon_j^{(m)}|.$$
Therefore, the order of convergence of the Ostrowski-like method (5) is at least four, which completes the proof of Theorem 1. □
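The remaining numerical constants of the proof (the bound 0.365 for $t_i^{(0)}$, the factor 7.4 obtained from the geometric series, and the closed form of $\eta_n$) can be rechecked as follows. The function name is hypothetical, and the quantity `gamma_d3` denotes $\gamma(n, d)\,d^3$, which depends on $n$ only; the sketch illustrates the arithmetic for a finite range of $n$ and is not part of the proof.

```python
def theorem1_constants(n):
    """Return the bound for t_i^(0), the shrink factor of d, and eta_n."""
    c = 5 / (13 * n)
    beta = 1.7 * c
    gamma_d3 = 4.24 * n * (16 * n - 5) / (8 * n - 5) ** 2   # gamma(n, d) d^3
    # Upper bound for t_i^(0); the factor d^(0) cancels out.
    t0_bound = ((n - 1) * gamma_d3 / (1 - 2 * beta)) ** (1 / 3) * c / (1 - n * c)
    # d^(m) / d^(0) > 1 - 3.4 c_n (1 + 0.54 + 0.54^2 + ...)
    shrink = 1 - 3.4 * c / (1 - 0.54)
    # eta_n follows from d^(m) > (1 - 37 / (13 n)) d^(0)
    eta = gamma_d3 / (1 - 37 / (13 * n)) ** 3
    return t0_bound, shrink, eta
```

For $n = 3$ the bound on $t_i^{(0)}$ is about 0.3649, in agreement with the estimate $t_i^{(0)} < 0.365$.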
References
[1] C. Carstensen, Anwendungen von Begleitmatrizen, Z. Angew. Math. Mech. 71 (1991) 809–812.
[2] C. Carstensen, On quadratic-like convergence of the means for two methods for simultaneous
rootfinding of polynomials, BIT 33 (1993) 64–73.
[3] P. Chen, Approximate zeros of quadratically convergent algorithms, Math. Comput. 63 (1994)
247–270.
[4] J. H. Curry, On zero finding methods of higher order from data at one point, J. Complexity 5
(1989) 219–237.
[5] M. Kim, On approximate zeros and rootfinding algorithms for a complex polynomial, Math. Comput. 51 (1988) 707–719.
[6] A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic Press, New York,
1970.
[7] M. S. Petković, C. Carstensen, M. Trajković, Weierstrass’ formula and zero-finding methods,
Numer. Math. 69 (1995) 353–372.
[8] M. S. Petković, Đ. Herceg, Point estimation of simultaneous methods for solving polynomial equations: a survey, J. Comput. Appl. Math. 136 (2001) 283–307.
[9] M. S. Petković, Đ. Herceg, S. Ilić, Point Estimation Theory and its Applications, Institute of Mathematics, University of Novi Sad, Novi Sad, 1997.
[10] M. S. Petković, Đ. Herceg, S. Ilić, Point estimation and some application to iterative methods, BIT 38 (1998) 111–126.
[11] M.S. Petković, Lj. D. Petković, Complex Interval Arithmetic and its Applications, Wiley-VCH,
Berlin 1998.
[12] S. Smale, Newton’s method estimates from data at one point, in: The Merging Disciplines: New
Directions in Pure, Applied and Computational Mathematics, Springer-Verlag, New York 1986,
pp. 185–196.
[13] D. Wang, F. Zhao, On the determination of the safe initial approximation for the Durand-Kerner
algorithm, J. Comput. Appl. Math. 38 (1991) 447–456.
[14] D. Wang, F. Zhao, The theory of Smale’s point estimation and its application, J. Comput. Appl.
Math. 60 (1995) 253–269.
[15] S. Zheng, F. Sun, Some simultaneous iterations for finding all zeros of a polynomial with high
order of convergence, Appl. Math. Comput. 99 (1999) 233–240.