Almost sure convergence to the Quicksort process
Uwe Roesler
Mathematisches Seminar
Christian-Albrecht-Universität zu Kiel
Ludewig-Meyn-Strasse 4
24098 Kiel
Germany
February 3, 2016
Abstract
The algorithm Partial Quicksort, introduced by Conrado Martínez,
sorts the l smallest reals within a set of n different ones. It uses a splitting
like Quicksort, always continuing with the leftmost list. The normalized
running time $Y_n(t)$ converges, as $\frac{l}{n} \to t$, in distribution to a non degenerate limit. The finite dimensional distributions of the process $Y_n$ converge
to a limit [21], called the Quicksort process. In this paper we present
the algorithm Quicksort on the fly, a version of Partial Quicksort, and show
the almost sure convergence of $Y_n$ to the Quicksort process in the space
of cadlag functions endowed with the Skorokhod topology.
1 Introduction
Quicksort [9] and Quickselect [8] are among the most thoroughly studied algorithms [12, 17, 18, 20, 6, 7, 11, 10]. Both are divide-and-conquer algorithms:
the input list is split by a pivot into two sublists and the algorithm is called recursively.
Optimal versions for choosing a pivot element are well known [15, 14].
Partial sorting asks for sorting only the l smallest reals out of n different reals.
An obvious algorithm is to combine Quicksort and Quickselect, for example to
use a Quickselect variant first in order to find the l smallest and then sort them
by a Quicksort version. The algorithm partial sort (see www.cplusplus.com),
implemented in C++, does this in practice. Martínez and Roesler [14] gave optimal versions for given (n, l) in the sense of an asymptotically smallest expectation
of comparisons.
The algorithm Partial Quicksort for partial sorting was introduced by Martínez
[13]. Let the input S be a list of different reals. Choose with a uniform distribution an element of S, called the pivot, and split S into the list $S_<$ of strictly smaller elements,
the pivot and the list $S_>$ of strictly larger elements. Arrange the sublists in the order
$(S_<, \text{pivot}, S_>)$ and call the algorithm recursively for the leftmost sublist containing at
least 2 elements.
Let X(S, l) be the number of comparisons in order to find the sorted $0 \le l \le |S|$ smallest elements in S. The rvs satisfy the recursions
$$(X(S,l))_{l=1}^{|S|} = \big(|S| - 1 + X(S_<,\, l \wedge |S_<|) + X(S_>,\, 0 \vee (l - |S_<| - 1))\big)_{l=1}^{|S|} \qquad (1)$$
for $|S| \ge 2$. We use X(S, 0) = 0 and X(S, l) = 0 for $|S| \le 1$. It is tacitly
assumed that for every S our choice of the pivot is independent of everything
else. The distribution of $(X(S,l))_l$ depends only on $|S| = n$, which follows easily
by induction on $n = |S|$ using (1). This is due to the internal randomness
(within the algorithm) and the uniform distribution on S. We will use a rv
$(X(|S|,l))_l \stackrel{D}{=} (X(S,l))_l$ with that distribution.
Equation (1) determines recursively the distribution of $X(n, \cdot)$:
$$(X(n,l))_l \stackrel{D}{=} \big(n - 1 + X_1(I-1,\, l \wedge (I-1)) + X_2(n-I,\, 0 \vee (l-I))\big)_l \qquad (2)$$
for $n \ge 2$. For given n the rvs $I, X_i(j, \cdot)$, $i = 1, 2$, $j < n$, are independent. The
rv $I = I_n$ is uniformly distributed on $\{1, 2, \ldots, n\}$ and $X_i(j, \cdot)$ has the same
distribution as $X(j, \cdot)$. We use the same starting values $X(n, 0) = 0 = X(1, 1)$
as above. The rv $I = |S_<| + 1$ has the interpretation of the rank of the pivot
within S.
The expectation $a(n, l) = EX(n, l)$ satisfies the recursion
$$a(n,l) = n - 1 + \frac{1}{n}\sum_{j=1}^{n}\big(a(j-1,\, l \wedge (j-1)) + a(n-j,\, 0 \vee (l-j))\big) \qquad (3)$$
for $n \ge 2$, $1 \le l \le n$. The solution is explicitly known [13]:
$$a(n,l) = 2n + 2(n+1)H_n - 2(n+3-l)H_{n+1-l} - 6l + 6 \qquad (4)$$
for $n \ge 1$, $1 \le l \le n$, and the initial conditions $a(\cdot, 0) = 0$. $H_n$ denotes the n-th
harmonic number $\sum_{i=1}^{n}\frac{1}{i} = \ln n + \gamma + o(1)$, $\gamma$ the Euler constant. The term
$a(n, n) = -4n + 2(n+1)H_n$ is the expectation of the comparisons sorting n
numbers by Quicksort [12].
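The recursion (3) and the closed form (4) can be cross-checked mechanically. The following Python sketch (our code, not from the paper) evaluates both in exact rational arithmetic:

```python
from fractions import Fraction
from functools import lru_cache

def H(n):
    # n-th harmonic number, exact
    return sum(Fraction(1, i) for i in range(1, n + 1))

def a_closed(n, l):
    # closed form (4): a(n,l) = 2n + 2(n+1)H_n - 2(n+3-l)H_{n+1-l} - 6l + 6
    return 2*n + 2*(n + 1)*H(n) - 2*(n + 3 - l)*H(n + 1 - l) - 6*l + 6

@lru_cache(maxsize=None)
def a_rec(n, l):
    # recursion (3) with initial conditions a(., 0) = 0 and a(n, l) = 0 for n <= 1
    if l == 0 or n <= 1:
        return Fraction(0)
    s = sum(a_rec(j - 1, min(l, j - 1)) + a_rec(n - j, max(0, l - j))
            for j in range(1, n + 1))
    return n - 1 + s / n

assert all(a_rec(n, l) == a_closed(n, l)
           for n in range(1, 15) for l in range(1, n + 1))
# a(n, n) = -4n + 2(n+1)H_n, the Quicksort expectation
assert all(a_rec(n, n) == 2*(n + 1)*H(n) - 4*n for n in range(1, 15))
```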
The distributions of
$$\Big(Y\big(n, \tfrac{l}{n}\big)\Big)_l := \Big(\frac{X(n,l) - a(n,l)}{n+1}\Big)_l \qquad (5)$$
satisfy the recursions
$$\Big(Y\big(n, \tfrac{l}{n}\big)\Big)_l \stackrel{D}{=} \Big(\frac{I}{n+1}\, Y_1\big(I-1,\, 1 \wedge \tfrac{l}{I-1}\big) + \frac{n-I+1}{n+1}\, Y_2\big(n-I,\, 0 \vee \tfrac{l-I}{n-I}\big) + \frac{C(n,I,l)}{n+1}\Big)_l \qquad (6)$$
for $n \ge 2$, where $I = I_n$ is as above and
$$C(n,i,l) = n - 1 - a(n,l) + a(i-1,\, l \wedge (i-1)) + a(n-i,\, 0 \vee (l-i)) \qquad (7)$$
The distribution of Y(j, 1) is called the discrete Quicksort distribution for sorting
j elements. (The normalization by n + 1 is chosen in view of some martingale
property [16].)
In a joint paper Martínez and Roesler [14] showed the weak convergence of
the distribution of $Y_{n,t_n}$ as $n \to \infty$, $t_n \to_n t$, to a probability measure $\mu_t$ for all
t in the unit interval (0, 1]. The map $(0,1] \ni t \mapsto \mu_t$ is continuous with respect
to weak convergence (and the Wasserstein $l_p$-distance). They replaced in (6) the
term $Y_1(\cdot, 1)$ by an independent rv with the same distribution. That defines a
process $\tilde Y(n, \cdot)$ with the same one dimensional distributions as $Y(n, \cdot)$. The proof
then makes use of an embedding of $\tilde Y$ into a weighted branching process,
extending $(\tilde Y(n, \frac{l}{n}))_l$ pathwise to a process with values in the set D of cadlag
functions on [0, 1]. Subsequent papers use the embedding approach, but for Y
itself.
Ragab and Roesler [21] considered l as a time index and studied process convergence after normalizing. Extend $Y(n, \cdot)$ to a cadlag process on D([0, 1]), left
continuous at 1. (Actually they used a right continuous version with existing
left limits, which does not matter here since $P(Y(t-) = Y(t+)) = 1$ for all t.)
The one dimensional distributions converge by the above. They showed $Y(n, \cdot)$
converges in n to some limiting process Y for all finite dimensional distributions.
The D-valued process Y is called the Quicksort process and is characterized in
distribution by the stochastic fixed point equation in D
$$(Y(t))_{t\in[0,1]} \stackrel{D}{=} \Big(U\, Y_1\big(1 \wedge \tfrac{t}{U}\big) + 1\!\!1_{t\ge U}\,(1-U)\, Y_2\big(\tfrac{t-U}{1-U}\big) + C(t,U)\Big)_{t\in[0,1]} \qquad (8)$$
Here we consider only solutions with a uniformly bounded expectation and
variance of all Y(t). Further we require continuity at t = 1 a.e. Here $Y_1, Y_2, U$
are independent, $Y_1, Y_2$ have the same distribution as Y and U has a uniform
distribution. The function $C : [0,1] \times [0,1] \to \mathbb{R}$ is given by
$$C(t,s) := 1\!\!1_{t\ge s}\, C(s) + 1\!\!1_{t<s}\big(-1 + 2s + 2(1-t)\ln(1-t) - 2(s-t)\ln(s-t) + 2s\ln s\big) \qquad (9)$$
$$C(s) := 1 + 2s\ln s + 2(1-s)\ln(1-s) \qquad (10)$$
using continuous extension.
Their approach can be described in two steps. First consider the fixed point
equation for t = 1, which is the Quicksort fixed point equation, and embed the rvs
$Y(n, \cdot)$ into a weighted branching process, obtaining a recursion of rvs. Second,
embed the $Y(n, \cdot)$ processes appropriately into a weighted branching process, replacing the troublesome term $Y_1(I-1, 1)$ in equation (6) by a specific rv using
the first step. (Details in the section on the discrete Quicksort process.) This
provides a nice affine recursion and a representation of $Y(n, \cdot)$ as a uniformly
absolutely convergent countable sum where all summands converge nicely in n.
By the uniformity argument in time $t \in [0, 1]$ they obtained convergence
of processes in some normed space, which implies convergence of finite dimensional distributions. There exist nice versions of $Y(n, \cdot)$, Y such that $Y(n, \cdot)$
converges uniformly to Y in the $L_p(P \times \lambda)$-sense, p > 1; more specifically
$\big(E\big(\int |Y_n(t) - Y(t)|\,dt\big)^p\big)^{1/p} \to_n 0$. Another consequence of their uniformity argument is the
existence of versions of the processes $Y(n, \cdot)$ and Y on D such that $Y(n, \cdot)$ converges to Y in the Skorokhod metric almost surely. The aim of this paper is to
present such a version arising from an algorithm.
We consider the deterministic algorithm Quicksort on the fly with random
input list $U_1, U_2, \ldots, U_n$. Quicksort on the fly, the version of Martínez' [13] Partial
Quicksort with external randomness, works as follows:
– always choose the first element in the list as pivot
– compare every other element sequentially with the pivot
– form the list of strictly smaller elements, the list containing only the pivot and
the list of strictly larger reals, in this order, preserving the sequential order of
comparisons within a list
– always continue with the leftmost list containing at least 2 elements
This algorithm terminates with the sorted input as output.
If we stop the algorithm Quicksort on the fly when the l-th smallest element is
identified, then we have performed (a version of) the algorithm Partial Quicksort
for (n, l) [13]: find the ordered l smallest numbers out of n. And so far we did only the
necessary comparisons, no more.
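The splitting and stopping rules can be made concrete. The following Python sketch (our code; an explicit stack of sublists replaces the recursion) counts comparisons and checks the count against the recursion (11) for X(u, l):

```python
import random

def quicksort_on_the_fly(u, l):
    """Sort the l smallest elements of the distinct reals in u.
    The first list element is the pivot; only the leftmost sublist with
    at least 2 elements is processed, until l elements are in place."""
    comparisons, sorted_prefix = 0, []
    stack = [list(u)]                       # top of stack = leftmost sublist
    while stack and len(sorted_prefix) < l:
        s = stack.pop()
        if len(s) <= 1:
            sorted_prefix.extend(s)
            continue
        pivot, rest = s[0], s[1:]
        comparisons += len(rest)            # |s| - 1 comparisons for the split
        stack.append([x for x in rest if x > pivot])   # processed last
        stack.append([pivot])
        stack.append([x for x in rest if x < pivot])   # leftmost, processed first
    return sorted_prefix[:l], comparisons

def X(u, l):
    # number of comparisons by the recursion (11)
    n = len(u)
    if n <= 1 or l == 0:
        return 0
    u1 = [x for x in u[1:] if x < u[0]]
    u2 = [x for x in u[1:] if x > u[0]]
    return X(u1, min(l, len(u1))) + X(u2, max(0, l - len(u1) - 1)) + n - 1

u = random.Random(1).sample(range(100), 12)
for l in range(len(u) + 1):
    prefix, comps = quicksort_on_the_fly(u, l)
    assert prefix == sorted(u)[:l]
    assert comps == X(u, l)
```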
Let X(u, l) denote the number of comparisons necessary to find and sort
the $0 \le l \le n \in \mathbb{N}_0$ smallest numbers out of the finite input $u \in (0,1)^*_{\neq}$. The
Quicksort on the fly recursion is, with $n = |u|$,
$$(X(u,l))_{l=1}^{n} = \big(X(u^1,\, l \wedge |u^1|) + X(u^2,\, 0 \vee (l - |u^1| - 1)) + n - 1\big)_{l=1}^{n} \qquad (11)$$
for $n \ge 2$, according to the above splitting procedure. Here $u^1, u^2$ denote the
lists of all $u_i < u_1$ respectively $u_i > u_1$ for $i \le n$.
Let $U = (U_i)_{i\in\mathbb{N}}$ be a vector of iid random variables with a uniform distribution on the open unit interval and let $U_{|n} := (U_1, U_2, \ldots, U_n)$. The expectation
$E(X(U_{|n}, l))$ satisfies the same recursion (3) as the a(n, l). Since the initial values are the same we obtain $a(n,l) = E(X(U_{|n}, l))$.
Define
$$Y\big(u, \tfrac{l}{n}\big) = \frac{X(u,l) - a(|u|,l)}{|u|+1} \qquad (12)$$
as before. Now extend Y.
We prefer to work on the space D = D([0, 1]) of left continuous functions
with existing right limits instead of the common space D([0, 1]) of cadlag functions
[3]. The existing literature prefers the cadlag version for historic reasons and
its importance as a standard setting for time continuous Markov processes.
However the space of left continuous functions with existing right limits, as
used before in [21], is more natural in view of the recursive combinatorial
structure and also the distributional fixed point equations. It provides a more
elegant normalization and easier notation of the equations. Both views are
equivalent, since for our processes Z we have $P(Z(t-) = Z(t+)) = 1$ for all t.
(The points 1 for cadlag functions and 0 for caglad functions play a special role.)
The theoretic foundation [3] for both is the same. A quick argument is the time
reversion $t \mapsto 1 - t$.
Extend $Y(u, \cdot)$ to a left continuous step function in D:
$$Y(u,t) = \frac{X(u,l) - a(|u|,l)}{|u|+1} \qquad (13)$$
for $t \in (0,1]$, $|u| \ge 2$, where $1 \le l \le |u|$ is determined by $l - 1 < |u|t \le l$.
Here is the major result.
Theorem 1 Let $U_i$, $i \in \mathbb{N}$, be iid rvs with a uniform distribution. Then the
D-valued process $Y_n = Y(U_{|n}, \cdot)$ as defined above for deterministic Quicksort on
the fly with random input $U_{|n}$ converges pathwise almost surely to the Quicksort
process $R_V(\infty)$ as given in section 5 in the Skorokhod $J_1$-topology.
Equivalent is the statement for a right continuous extension of the process
$Y(U_{|n}, \cdot)$, taking special care of the points t = 0 and t = 1.
Notice our version of $Y(U_{|n}, \cdot)$ arises from a deterministic algorithm with random input. Quicksort itself is often programmed as a deterministic algorithm,
where the randomness comes exclusively from the input. For the analysis of
the asymptotic distributional behavior of Quicksort [17] it is somewhat easier
to work with a random version of the algorithm, picking the pivot
element always independently at random. The result for the random version is the
same asymptotic distributional behavior of the running time for every input.
This is of course not true for the deterministic version.
The main result was obtained by a rigorous embedding of the processes
into weighted branching processes (WBP) [19][20]; see the following sections for
explanation and notation. The existence of the Quicksort process itself is shown
as the limit of the sum
$$R_W = \sum_{v\in W} L_v *_r C_v \qquad (14)$$
for $W \subset V = \{1,2\}^*$ as $W \to V$ in the norm $\|\,\|\cdot\|_\infty\|_p$ for p > 1 on the space
D. The limit exists since $R_{V_{<m}}$ is a Cauchy sequence and the space D with the
norm $\|\,\|\cdot\|_\infty\|_p$ for p > 1 is complete.
Using our specific deterministic Quicksort version with random input, the process $Y(U_{|n})$ has a natural embedding into a WBP, and the corresponding $R_W(n)$ as above, indexed by the length n of the input $U_{|n}$, will converge
in D to $R_V$. We use a random time shift $\tau_n$ such that the shifted process $Y_n(\tau_n)$
has jumps only at $U_1, U_2, \ldots, U_n$, the same as the Quicksort process Y. Then
the shifted $Y_n(\tau_n)$ is close to Y in the supremum metric pathwise almost surely.
(The supremum metric on D is complete.) It remains to show the time shift is
small in the sense $\sup_t |\tau_n(t) - t| \to_n 0$ almost surely. This amounts to the well
known fact that the empirical distribution function of the $U_i$, $i \le n$, converges
a.s. uniformly to the continuous distribution function of $U_1$.
2 Definitions and setting
Binary Ulam-Harris trees: Let $V = \{1,2\}^* = \cup_{n=0}^{\infty}\{1,2\}^n$ be the set of all
finite 1-2 sequences, the empty sequence denoted by $\emptyset$. We consider V as a
rooted tree, a graph with (directed) edges (v, vi), $v \in V$, $i \in \{1,2\}$, and root $\emptyset$.
The vertex $v = (v_1, v_2, \ldots, v_n) = v_1 v_2 \ldots v_n$ has length $|v| = n \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$.
We use $V_n, V_{<n}, \ldots$ in the natural sense for all vertices of length n, or of length strictly
smaller than n, and so on.
The genealogical order $\preceq_g$ on the infinite binary tree V is defined by $v \preceq_g w$
if and only if v is a prefix of w,
$$v = (v_1, v_2, \ldots, v_m) \preceq_g w = (w_1, w_2, \ldots, w_n) \Leftrightarrow m \le n \text{ and } \forall i \le m : v_i = w_i$$
We call v an ancestor of w iff there exists a path from v to $w \neq v$.
The traverse order $\preceq_t$ on the infinite binary tree V is defined by traversing
from left to right. Formally define $h(v) = \sum_{i=1}^{|v|} 3^{-i}\, 2(v_i - 1)$ and put
$$v \preceq_t w \Leftrightarrow h(v) \le h(w)$$
Notice the traverse order is a total order on V.
The shift operator on functions f with domain V is defined by $(S_v(f))(w) = f(vw)$. We use throughout the notation $f^v = S_v(f)$ with an upper index. In
analogy, for any subset W of V we use $W^v := S_v(1\!\!1_W) = \{w \in V \mid vw \in W\}$.
A binary Ulam-Harris subtree is a subset W of the infinite binary tree V
including the induced graph structure and closed below under the genealogical
order
w ∈ W ⇒ ∀v ≼g w : v ∈ W.
We call these objects just trees. The smallest tree is the empty set. Let W be
the set of all trees and Wf be the subset of finite trees. On trees we use the
induced genealogical and traverse order.
For a finite tree $W \in W_f$ and $0 \le l \le |W|$ define
$$X(W,l) := \sum_{\{w\in W \mid w \preceq_t v \text{ or } w \preceq_g v\}} (|W^w| - 1) \qquad (15)$$
with v the l-th smallest element of W relative to the traverse order. By (15)
the function X satisfies the recurrences
$$X(W,l) = X(W^1,\, l \wedge |W^1|) + X(W^2,\, 0 \vee (l - |W^1| - 1)) + |W| - 1 \qquad (16)$$
for finite non empty trees W and $1 \le l \le |W|$. The symbol $\wedge$ denotes the
infimum. (The proof is omitted due to its simplicity.) Notice X(W, 0) = 0.
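Definition (15) and the recurrence (16) can be checked against each other on small trees. A Python sketch (our code), with vertices as tuples over {1, 2} and the traverse order realized by an explicit in-order key:

```python
from itertools import product

def traverse_key(v):
    # in-order position of vertex v: map 1 -> 0, 2 -> 2, end-of-word -> 1,
    # then compare lexicographically (left subtree < root < right subtree)
    return tuple(2 * (c - 1) for c in v) + (1,)

def subtree(W, w):
    # W^w = {x | wx in W}
    return {x[len(w):] for x in W if x[:len(w)] == w}

def X_def(W, l):
    # definition (15): sum of |W^w| - 1 over all w preceding v in traverse
    # or genealogical order, v the l-th smallest vertex in traverse order
    if l == 0:
        return 0
    v = sorted(W, key=traverse_key)[l - 1]
    pre = {w for w in W if traverse_key(w) <= traverse_key(v) or w == v[:len(w)]}
    return sum(len(subtree(W, w)) - 1 for w in pre)

def X_rec(W, l):
    # recurrence (16)
    if len(W) <= 1 or l == 0:
        return 0
    W1, W2 = subtree(W, (1,)), subtree(W, (2,))
    return (X_rec(W1, min(l, len(W1))) + X_rec(W2, max(0, l - len(W1) - 1))
            + len(W) - 1)

# check on the complete binary tree of depth 3 (15 vertices)
W = {v for d in range(4) for v in product((1, 2), repeat=d)}
assert all(X_def(W, l) == X_rec(W, l) for l in range(len(W) + 1))
```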
For a finite non empty tree W and $0 \le l \le |W|$ define
$$Y\Big(W, \frac{l}{|W|}\Big) := \frac{X(W,l) - a(|W|,l)}{|W|+1} \qquad (17)$$
The a(n, l) are explicitly given in [14]; see (4).
The recurrences (16) imply for $|W| \ge 2$, $1 \le l \le |W|$
$$Y\Big(W, \frac{l}{|W|}\Big) = \frac{|W^1|+1}{|W|+1}\, Y\Big(W^1,\, 1 \wedge \frac{l}{|W^1|}\Big) + \frac{|W^2|+1}{|W|+1}\, Y\Big(W^2,\, 0 \vee \frac{l-|W^1|-1}{|W^2|}\Big) \qquad (18)$$
$$\qquad + \frac{C(|W|,\, |W^1|+1,\, l)}{|W|+1} \qquad (19)$$
where C is defined in (7). The boundary conditions are Y(W, 0) = 0 for all W
and additionally Y(W, 1) = 0 for $|W| = 1$. Our interest centers on the D-valued
objects $Y(W,t) = Y\big(W, \frac{\lceil |W|t\rceil}{|W|}\big)$ for random input of search trees.
Search trees: A binary search tree is a tuple (W, ψ) consisting of a tree
W and an injective map $\psi : W \to \mathbb{R}$ which is strictly isotone (= strictly order
preserving) with respect to the traverse order.
The set of all binary search trees is ordered by inclusion: $(W, \psi) \le (W', \psi')$
iff $W \subset W'$ and $\psi'$ restricted to W is ψ, i.e. $\psi = \psi'_{|W}$. Any increasing sequence
$(W_n, \psi_n)$ of binary search trees forms a projective system and has a projective
limit $(\cup_n W_n,\, \psi = \lim_n \psi_n)$.
Vector input: Let $u = (u_i)_i \in \mathbb{R}^*_{\neq} \cup \mathbb{R}^{\mathbb{N}}_{\neq}$ be a finite or infinite sequence
of different reals. We use $u_{|n} = (u_1, u_2, \ldots, u_n)$ and the length |u| as the number
of coordinates. Define $u^1 = (u_{i_1}, u_{i_2}, \ldots)$ where $i_1 < i_2 < \ldots$ are the ordered
elements of $\{2 \le i \le |u| \mid u_i < u_1\}$. Analogously define $u^2 = (u_{i_1}, u_{i_2}, \ldots)$ where
$i_1 < i_2 < \ldots$ are the ordered elements of $\{2 \le i \le |u| \mid u_i > u_1\}$. The vector
$u^v$, $v \in V$, is defined by $u^\emptyset = u$ and recursively
$$u^{vi} = (u^v)^i$$
Proposition 2 There is an injective map between finite or infinite sequences
u of different reals and binary search trees (W, ψ). There is a one-to-one
map between u and increasing sequences $(W_{u_{|n}}, \psi_{u_{|n}})$, $\mathbb{N}_0 \ni n \le |u|$, of binary
search trees, such that $\psi_{u_{|n}}(W_{u_{|n}}) = \{u_1, u_2, \ldots, u_n\}$.
Proof: In algorithmic wording, start with u at the root. Then put $u_1$ to the root,
$u^1$ to the vertex 1 and $u^2$ to the vertex 2. Recall the procedure recursively,
now for vertex v and $u^v$.
Alternatively, put $u_1$ to the root and take $W_{u_{|1}}$ consisting of the root with
$\psi_{u_{|1}}(\emptyset) = u_1$. Next put $u_2$ either to the vertex 1 in case $u_2 < u_1$ and otherwise
to the vertex 2. We obtain a search tree containing the first one and having one
more vertex. By recursion, compare the next coordinate value of u with the value
at the root, go left or right, and continue going left or right until a free vertex is
found. Then put the coordinate value down at that vertex. Then $u^v$ are
the coordinate values of u which reached or passed the vertex v, ordered in time
of appearance.
This way we obtain a one-to-one map from real valued sequences to increasing sequences of search trees.
q.e.d.
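The insertion procedure of the proof is easy to make concrete. A Python sketch (our code) builds $(W_u, \psi_u)$ as a map from vertices (tuples over {1, 2}) to values and checks that traversing from left to right yields the values in natural order:

```python
def search_tree(u):
    # insert the coordinates of u one by one, going left/right at each
    # occupied vertex until a free vertex is found
    psi = {}
    for value in u:
        v = ()
        while v in psi:
            v = v + (1,) if value < psi[v] else v + (2,)
        psi[v] = value
    return psi

def traverse_key(v):
    # in-order position: 1 -> 0, 2 -> 2, end-of-word -> 1
    return tuple(2 * (c - 1) for c in v) + (1,)

u = [0.6, 0.3, 0.8, 0.1, 0.4, 0.7, 0.9]
psi = search_tree(u)
# traversing W_u in traverse order yields psi(W_u) in natural order
in_order = [psi[v] for v in sorted(psi, key=traverse_key)]
assert in_order == sorted(u)
```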
The proposition remains true if we take sequences with values in a subset A
of the reals instead of the reals. We use the notation $A^*_{\neq} = \cup_{n\in\mathbb{N}_0} A^n_{\neq}$ where
$A^n_{\neq} = \{u \in A^n \mid u_i \neq u_j \text{ for } i \neq j\}$.
We use the notation $(W_u, \psi_u)$ for the binary search tree generated by u.
Traversing the finite search tree $(W_u, \psi_u)$ from left to right provides the elements
of $\psi_u(W_u) = \{u_1, u_2, \ldots, u_{|u|}\}$ in natural order. The number of comparisons in
order to find the l smallest coordinates of finite u by constructing the binary
search tree only as far as necessary is $X(W_u, l)$.
For given finite input u we will freely use
$$X(u, \cdot) := X(W_u, \cdot) \qquad Y(u, \cdot) := Y(W_u, \cdot)$$
where u generates the tree $W_u$ as search tree. (For this part ψ is not necessary.)
Notice that this implies for $v \in V$
$$Y(u^v, \cdot) = Y(W_{u^v}, \cdot) \qquad X(u^v, \cdot) = X(W_{u^v}, \cdot)$$
with $W_{u^v} := (W_u)^v$, also written $W_u^v$. We neglect the symbol $\emptyset$ whenever possible. Our interest
centers on these objects for random input u.
Random Input: Let $U_i$, $i \in \mathbb{N}$, be independent rvs with a uniform distribution on (0, 1) on some abstract probability space (Ω, A, P), sufficiently rich.
Without loss of generality we may assume (and do so for notational reasons)
that for every fixed ω in the realization space Ω all values $U_n(\omega)$, $n \in \mathbb{N}$, are
different. We will use the same notation as above, replacing the input u by
$U_{|n}(\omega)$ for $n \in \mathbb{N}_0$ and ω ∈ Ω. However we suppress the ω whenever possible, as
is common in probability theory. The following properties are elementary and
can be found in standard textbooks.
Proposition 3 Let $U = (U_n)_{n\in\mathbb{N}}$ be a sequence of iid rvs with a uniform distribution on (0, 1).
• The rvs $\frac{(U^1)_i}{U_1}$, $i \ge 2$, are iid with a uniform distribution on (0, 1).
• The rvs $\frac{(U^2)_j - U_1}{1 - U_1}$, $j \ge 2$, are iid with a uniform distribution on (0, 1).
• The vectors $U^1_{|n}$, $U^2_{|n}$, conditioned on the cardinality of $U^1_{|n}$, both consist of iid
rvs for every $n \in \mathbb{N}$.
• $\frac{U^1}{U_1}$, $\frac{U^2 - U_1}{1 - U_1}$, $(R_{1:n})_n$ are independent, where $R_{1:n}$ is the rank of $U_1$ under
$U_1, U_2, \ldots, U_n$.
• $E(U_1^k (1 - U_1)^j) = \frac{k!\,j!}{(k+j+1)!}$, using $k, j \in \mathbb{N}_0$ for the power function and not
as an index.
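The last moment identity is a Beta integral and can be verified exactly by expanding $(1-u)^j$ with the binomial theorem. A Python sketch (our code):

```python
from fractions import Fraction
from math import comb, factorial

def moment(k, j):
    # E(U^k (1-U)^j) = integral of u^k (1-u)^j du over (0,1),
    # computed exactly via the binomial theorem
    return sum(Fraction(comb(j, i) * (-1) ** i, k + i + 1) for i in range(j + 1))

assert all(
    moment(k, j) == Fraction(factorial(k) * factorial(j), factorial(k + j + 1))
    for k in range(8) for j in range(8)
)
```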
Proposition 4 Let $U_n$, $n \in \mathbb{N}$, be independent rvs with a uniform distribution,
$U = (U_n)_n$. The distribution of $X(U_{|n}, \cdot)$ depends only on n. The expectation
of $X(U_{|n}, l)$ is a(n, l) as given in (3) for $0 \le l \le n$.
Proof: The statement is true for n = 0 and n = 1. We now use induction from
n − 1 to n. Let $\mu_m$ be the distribution of $X(U_{|m}, \cdot)$ for $m \le n - 1$. Since $n \ge 2$
we can use the recurrence (16). If we condition on $|U^1_{|n}| = k$ it suffices to show that
the distribution of
$$(X(U_{|n}, l))_{l=1}^n = \big(X(U^1_{|n},\, l \wedge k) + X(U^2_{|n},\, 0 \vee (l - k - 1)) + n - 1\big)_{l=1}^n \qquad (20)$$
does not depend on the actual realization of $U^1_{|n}$, $U^2_{|n}$.
Conditioned on k, the vector $\frac{U^1_{|n}}{U_1}$ consists of k independent rvs with a uniform distribution and the vector $\frac{U^2_{|n} - U_1}{1 - U_1}$ of $n - k - 1$ independent rvs with a uniform
distribution. As a consequence $X(U^1_{|n}, \cdot)$ and $X(U^2_{|n}, \cdot)$ are independent and
have the distributions $\mu_k$, $\mu_{n-k-1}$ by the induction hypothesis. Therefore the distribution of (20) is determined by the probability measures $\mu_k$ and $\mu_{n-k-1}$. This
proves the induction step.
The expectations $E(X(U_{|n}, l)) = b(n, l)$ satisfy the recurrences
$$b(n,l) = \sum_{k=0}^{n-1} P(|U^1_{|n}| = k)\Big(E\big(X(U^1_{|n},\, l \wedge k) \,\big|\, |U^1_{|n}| = k\big) + 1\!\!1_{l>k+1}\, E\big(X(U^2_{|n},\, l-k-1) \,\big|\, |U^1_{|n}| = k\big)\Big) + n - 1$$
$$= \frac{1}{n}\sum_{k=0}^{n-1}\big(b(k,\, l \wedge k) + 1\!\!1_{l>k+1}\, b(n-k-1,\, l-k-1)\big) + n - 1 \qquad (21)$$
for $n \ge 2$. These are the same recurrences as in Martínez-Roesler [14] for a(n, l),
with the same starting values $a(1, 1) = 0 = a(n, 0)$, $n \in \mathbb{N}_0$. The explicit solution
is given in (4). The uniqueness of the solution for given starting values implies
b = a.
q.e.d.
3 The weighted branching process
For the analysis of Y (17) and the Quicksort process we need the weighted
branching process (WBP) [17, 18, 20, 19]. The WBP is similar to a tree indexed stochastic process like a discrete branching dynamical system or an iterated
function system. The paths of the tree obtain some weights, represented by
transformations, forward or backward.
Let V be the set $\mathbb{N}^*$ of finite sequences of natural numbers. V is a directed
graph with edges (v, vi), $v \in V$, $i \in \mathbb{N}$. A WBP is a tuple (V, L, G, ∗). Here
$L = (L_{v,vw})_{v,w\in V}$ is a stochastic process, where $L_{v,vw}$ has values in G. (G, ∗) is
a measurable semigroup with identity and a grave. L preserves the structure
of path concatenation for the directed graph and the composition in G via
$$L_{v,vwx} = L_{v,vw} * L_{vw,vwx}$$
Further, $(L_{v,vi})_{i\in\mathbb{N}}$, $v \in V$, are iid rvs. Notice the independence of families; we
do not require independence within a family.
The $L_{v,vi}$, $v \in V$, $i \in \mathbb{N}$, determine $L_{v,vw}$ completely for any path. If the
rvs are degenerate then we face a discrete dynamical system indexed by a tree
structure. We use $L_{\emptyset,v} = L_v$.
Without loss of generality, we take G as a subset of $H^H$ for some measurable
space H. H = G will always do, but other spaces may be more suitable.
Formally G acts right on H via $*_l : H \times G \to H$ or acts left via $*_r : G \times H \to H$.
In the first case we have $h *_l g * g' = g'(g(h))$ and in the second case $g * g' *_r h = g(g'(h))$.
In many cases, and without loss of generality, we assume H has a grave called
△ (once in a grave you stay in the grave). In this manner we include branching
processes [1], considering only vertices v such that $L_v(h) \neq \triangle$ when starting with
h at the root.
For simplicity, and with a special application in mind, we restrict ourselves to $V = \{1,2\}^*$,
the set of finite binary sequences. Here are some examples and objects used in
the sequel for analyzing Quicksort with the forward dynamic (= G acts right).
The input for Quicksort is an element $x \in H_1 := (0,1)^*_{\neq}$, a finite sequence of
different reals in the open unit interval. Define $H_2 := \{(c,d] \mid 0 \le c \le d \le 1\}$
and $H_3 := \{x = (x_v)_{v\in W} \mid x_v \in (0,1),\, W \in W_f\}$. If necessary extend x by
$x_v = \triangle$ for $v \notin W$.
Sequences: Take $H = H_1$ or the set of infinite sequences $(0,1)^{\mathbb{N}}_{\neq}$ and let G
be generated by the $g_i(u) = u^i$ (and the identity and a grave) acting right on
H. Putting $L_{v,vi} = g_i$ we obtain $L_{v,vw}(u) = u^w$. (See example vector input.)
The binary search tree $(W_u, \psi_u)$ is recovered from $u^v$, $v \in V$, as the set of all w
such that $u^w$ is not empty, with $\psi_u(w)$ the first element of $u^w$.
Trees: Let $H = W_f$ or $\mathcal{W}$ and let G be generated by the $g_i(W) = W^i$
acting right on H. Putting $L_{v,vi} = g_i$ for $v \in V$ we obtain $L_{v,vw}(W) = W^w$. For
$u \in H_1$ holds $L_{v,vw}(W_u) = W_{u^w}$, where $(W_u, \psi_u)$ is the binary search tree for u.
Intervals: Let $H = H_1 \times H_2$ with the grave $\triangle = (\emptyset, \emptyset)$. Let G be generated
by the $g_i : H \to H$ with
$$g_1(u, (c,d]) = \Big(\frac{u^1}{u_1},\; (c,\, c + (d-c)u_1]\Big) \qquad g_2(u, (c,d]) = \Big(\frac{u^2 - u_1}{1 - u_1},\; (c + (d-c)u_1,\, d]\Big)$$
(the sublists renormalized coordinatewise to (0, 1)) for $u \neq \emptyset$, $c \neq d$, and otherwise the grave △. Put $L_{v,vi} = g_i$ for all $v \in V$. We will
use in the sequel the notation $I_v(u, I) := (L_v(u, I))_2$ and $J_v(u) = I_v(u, (0,1]) = (c_v(u), d_v(u)]$.
Notice for $\emptyset \neq u \in H_1$ both maps $c_v(u)$, $d_v(u)$ on outer leafs v of $W_u$ ($v \notin W_u$
and $\forall w \prec_g v : w \in W_u$) are strictly isotone in traverse order. If we traverse the
outer leafs of $W_u$ observing the c-values (d-values), then we obtain the coordinates of u
in increasing order and the additional value 0 for the c-sequence respectively 1
for the d-sequence. The intervals $J_v(u)$ for outer leafs v of $W_u$ form a partition
of (0, 1].
Interval length: The interval length of the interval $I_v(u, I)$ from above is
an example of a WBP.
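The partition property can be illustrated directly. After the renormalization in $g_1, g_2$ the interval at a vertex is split exactly at the (global) pivot value, which the following Python sketch (our code) uses; it checks that the outer leaf intervals tile (0, 1] and that their left endpoints run through the coordinates of u in increasing order:

```python
def outer_leaf_intervals(u, interval=(0.0, 1.0)):
    # J_v(u) for the outer leafs v of W_u, in traverse order; by the
    # renormalization in g_1, g_2 the interval (c, d] is split at the
    # pivot value u_1 itself
    if not u:
        return [interval]
    c, d = interval
    u1 = [x for x in u[1:] if x < u[0]]
    u2 = [x for x in u[1:] if x > u[0]]
    return (outer_leaf_intervals(u1, (c, u[0]))
            + outer_leaf_intervals(u2, (u[0], d)))

u = [0.5, 0.25, 0.75, 0.125, 0.9]
J = outer_leaf_intervals(u)
# the intervals tile (0, 1] ...
assert J[0][0] == 0.0 and J[-1][1] == 1.0
assert all(J[i][1] == J[i + 1][0] for i in range(len(J) - 1))
# ... and the c-values are 0 followed by the coordinates in increasing order
assert [c for c, _ in J] == [0.0] + sorted(u)
```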
Intervals Revisited: Let $\tilde H = H_3 \times H_2$, extended by the grave. $\tilde G$ is
generated by
$$\tilde g_1(x, (c,d]) = (S_1(x),\, (c,\, c + (d-c)x_\emptyset]) \qquad \tilde g_2(x, (c,d]) = (S_2(x),\, (c + (d-c)x_\emptyset,\, d])$$
whenever x has non empty domain, and △ otherwise. $S_v$ denotes the shift by v.
Put $\tilde L_{v,vi} = \tilde g_i$.
There is a connection between the last two examples. Define the map $H_1 \ni u \mapsto x(u) = (x_v(u))_{v\in W_u} \in H_3$ by $x_v(u) := \frac{|J_{v1}(u)|}{|J_v(u)|}$ if $J_{v1}(u)$ is non empty. Then
we claim x(u) is well defined and $J_v(u) = \tilde J_v(x(u))$ whenever non empty, where
the right side is defined by $\tilde L_v(x, (0,1]) =: (S_v(x), \tilde J_v(x))$. The proof is easy by
induction on the length of u and therefore skipped.
In the next two examples G acts left on H.
Sum: Let $H = \mathbb{R}^{H_3}$ and $G \subset H^H$ be generated by $g_1, g_2$,
$$(g_1(h))(x) := x_\emptyset\, h(S_1(x)) \qquad (g_2(h))(x) := (1 - x_\emptyset)\, h(S_2(x))$$
The operations are $g * g' *_r h = g(g'(h))$. Put $L_{v,vi} = g_i$. For given $C_v \in H$, $v \in V$,
with $C_v(\triangle) = 0$ consider the map $R_{v,vw} \in H$,
$$R_{v,vw}(x) := (L_{v,vw} *_r C_{vw})(x) = |\tilde J_w(x)|\, C_{vw}(S_w(x))$$
with $\tilde J$ as above. Then $R_{v,vW} := \sum_{w\in W} R_{v,vw}$ converges pointwise to $R_{v,vV}$ as
$W \to V$, and the limit satisfies
$$R_{v,vV} = \sum_{i=1}^{2} L_i *_r R_{vi,viV} + C_v$$
for all $v \in V$.
Now an example of a non deterministic WBP.
Quicksort [17]: Let H = G be the unit interval [0, 1] with multiplication
and let $U_v$, $v \in V$, be iid rvs with a uniform distribution on (0, 1). Define
$$L_{v,v1} = U_v \qquad L_{v,v2} = 1 - U_v$$
The 0 plays the role of the grave. Let $C : [0,1] \to \mathbb{R}$ be the continuously
extended map
$$C(x) := 1 + 2x\ln x + 2(1-x)\ln(1-x) \qquad (22)$$
and put $C_v = C(U_v)$. Then
$$R_{v,vW} := \sum_{w\in W} L_{v,vw}\, C_{vw}$$
converges as $W \to V$ almost surely and in any $L_p$, p > 1, to a limit called $Q_v$
[20]. $Q_\cdot$ has the Quicksort distribution and satisfies, with obvious notation,
$$Q_v = U_v Q_{v1} + (1 - U_v) Q_{v2} + C(U_v)$$
almost surely simultaneously for all $v \in V$.
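Two closed form checks on C are available here: EC(U) = 0, and $EC^2(U) = \sigma^2/3$ with $\sigma^2 = 7 - \frac{2}{3}\pi^2$ the known variance of the Quicksort distribution (a standard fact, not derived in this paper); the latter follows by taking second moments in the fixed point equation for Q, where all cross terms vanish because EQ = 0 and $E(U^2) + E((1-U)^2) = \frac{2}{3}$. A numerical Python sketch (our code, midpoint rule):

```python
import math

def C(x):
    # C(x) = 1 + 2x ln x + 2(1-x) ln(1-x), continuously extended, C(0) = C(1) = 1
    if x <= 0.0 or x >= 1.0:
        return 1.0
    return 1.0 + 2.0*x*math.log(x) + 2.0*(1.0 - x)*math.log(1.0 - x)

N = 200_000                      # midpoint rule on (0, 1)
mean   = sum(C((i + 0.5) / N) for i in range(N)) / N
second = sum(C((i + 0.5) / N)**2 for i in range(N)) / N

assert abs(mean) < 1e-6                                   # E C(U) = 0
assert abs(second - (7/3 - 2*math.pi**2/9)) < 1e-6        # E C(U)^2 = sigma^2 / 3
```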
................................................
For later reference define a map φ from finite binary search trees (W, ψ) with
values of ψ in (0, 1) to elements of $H_3$,
$$\varphi(W, \psi)(v) = \frac{\psi(v) - c_v}{d_v - c_v}$$
for $v \in W$, where
$$c_v := \sup\{0,\, \psi(w) \mid W \ni w \prec_t v\} \qquad d_v := \inf\{1,\, \psi(w) \mid v \prec_t w \in W\}$$
The interval $(c_v, d_v]$ is $J_v(u)$ as above. The map $u \mapsto x(u)$ from example
'Intervals Revisited' satisfies $x(u) = \varphi(W_u, \psi_u)$.
Extend the map φ to infinite binary search trees (W, ψ) via
$$\varphi(W, \psi)(v) = \lim_{n\to\infty} \varphi(W_n, \psi_n)(v)$$
for any increasing sequence $(W_n, \psi_n)_n$ of finite binary search trees to (W, ψ).
Proposition 5 The map φ from finite search trees (W, ψ) with values of ψ in
(0, 1) to elements of $H_3$ is well defined and is a bijection. The extension to
infinite search trees is well defined. It holds $x(u) = \varphi(W_u, \psi_u)$ for finite or
infinite sequences u with values in (0, 1).
The proof is standard and omitted.
For input u we obtain via $(W_u, \psi_u)$ an $x(u) = \varphi(W_u, \psi_u)$. For given $x \in H_3$ we can only recover the underlying binary search tree (W, ψ). We obtain
the set of values of the coordinates of u, but not the vector u itself and not the
sequence $(W_{u_{|n}}, \psi_{u_{|n}})_n$, $n \le |u|$.
Proposition 6 If U is a finite sequence of uniformly distributed rvs, then for
every tree W the rvs $\varphi(W_U, \psi_U)(v)$, $v \in W$, conditioned on $W \subset W_U$, are
independent and have a uniform distribution.
If the input is an infinite sequence U of independent rvs uniformly distributed
on the unit interval (0, 1), then $\varphi(W_U, \psi_U)(v)$, $v \in V$, are independent rvs
with a uniform distribution. For an infinite sequence,
$$\lim_{n\to\infty} \frac{|W^{v1}_{U_{|n}}|}{|W^v_{U_{|n}}|} = \varphi(W_U, \psi_U)(v) \qquad \lim_{n\to\infty} \frac{|W^{v2}_{U_{|n}}|}{|W^v_{U_{|n}}|} = 1 - \varphi(W_U, \psi_U)(v)$$
almost surely.
Proof: The distributional statement is shown by induction on the length of the
input U using Proposition 3. For an infinite sequence U we have $W_U = V$ and
the strong law of large numbers implies the last statement.
q.e.d.
With our problems in mind we are only interested in functions of $(W_U, \psi_U)$
for random input U of iid rvs with a uniform distribution. Sometimes it is
easier to work with the corresponding random variables $\varphi(W_U, \psi_U)(v)$ as input
instead of U. We use this in the next section.
4 The Quicksort process
We present the limiting Quicksort process [21] as another example of a WBP.
The proof given here follows basically the original one [21] and is essential for the
sequel. Here we take the Quicksort process for a random algorithm and deterministic
input.
Let D = D([0, 1]) [3] be the set of all caglad (continu à gauche, limite
à droite) functions $f : [0,1] \to \mathbb{R}$ (left continuous functions with existing right
limits). The value 0 plays a special role, since there is no limit from the left. We
consider only functions with f(0) = 0. D is a metric space with the Skorokhod
$J_1$-metric
$$d(f,g) = \inf_\lambda\, \|\lambda - \mathrm{id}\|_\infty \vee \|f - g \circ \lambda\|_\infty \qquad (23)$$
The infimum is over all bijective increasing functions $\lambda : [0,1] \to [0,1]$ and
$\|\cdot\|_\infty$ is the usual supremum norm. The σ-field on D is the Borel σ-field via
the Skorokhod metric.
On the space D consider the space-time transformations $T_{a,b} : D \to D$, $a \in D$, $b \in D^\uparrow$,
$$T_{a,b}(f) = a\,(f \circ b)$$
on D. The set $D^\uparrow$ consists of increasing caglad functions $[0,1] \to [0,1]$. The
composition of such space-time transformations is
$$T_{a,b} \circ T_{c,d} = T_{a\,(c\circ b),\; d\circ b}$$
Let $U_v$, $v \in V$, be independent rvs with a uniform distribution on (0, 1).
(We assume without loss of generality $U_i(\omega) \neq U_j(\omega)$ for all $i \neq j$ and ω ∈ Ω.)
Define the random edge values $L_{v,vi} = T_{A_{v,vi}, B_{v,vi}}$ with values in the maps D → D,
$$(A_{v,v1}(t), B_{v,v1}(t)) = \Big(1\!\!1_{(0,U_v]}(t)\, U_v,\; \frac{t}{U_v} \wedge 1\Big)$$
$$(A_{v,v2}(t), B_{v,v2}(t)) = \Big(1\!\!1_{(U_v,1]}(t)\, (1 - U_v),\; \frac{t - U_v}{1 - U_v} \vee 0\Big)$$
$t \in [0,1]$. Let $L_{v,vw} = T_{A_{v,vw}, B_{v,vw}}$ be the random path weight for the path
(v, vw), $w \neq \emptyset$, and $L_{v,v} = T_{1,\mathrm{id}}$ the identity. Notice $A_{v,vw}(t) = 1\!\!1_{I_{v,vw}}(t)\, |I_{v,vw}|$
and $B_{v,vw}(t) = \frac{t - c_{v,vw}}{|I_{v,vw}|}$, where $I_{v,vw} = [c_{v,vw}, d_{v,vw})$ is from example 'Intervals
Revisited', now for the infinite input $U_v(\omega)$, $v \in V$, for an ω ∈ Ω. We face a
backward weighted branching process $(V, L, D^D, \circ)$.
Define rvs $R_{v,vw} := L_{v,vw}(C_{vw})$, $v, w \in V$, with values in D, where $C_v = C(U_v, \cdot) + 1\!\!1_{(U_v,1]}\, Q_{v1}$. The function $C : [0,1] \times [0,1] \to \mathbb{R}$ is given in (9) and
Q are the Quicksort rvs from the example Quicksort above.
Our aim is to show that $R_{v,vW} = \sum_{w\in W} R_{v,vw}$ converges pointwise to a
process $R_{v,vV} =: R_v$ as $W \to V$. The limit $R = R_{\emptyset,V}$, respectively $R_v$, is called
the Quicksort process.
We work on an abstract probability space (Ω, A, P) rich enough for all our
purposes. We use the supremum norm $\|f\|_\infty = \sup_{0\le t\le 1}|f(t)|$ on D and the
$L_p$-norm $\|\cdot\|_p$ for $1 \le p < \infty$ for real valued rvs. Let F be the space of all
measurable functions $X : \Omega \to D$. For $1 \le p < \infty$ let $F_p$ be the subspace of all
rvs $X \in F$ with finite value
$$\|X\|_{\infty,p} := \|\,\|X\|_\infty\|_p$$
The map $\|\cdot\|_{\infty,p}$ is a pseudometric on $F_p$. By the usual procedure, using the
equivalence relation ∼,
$$X \sim Y \Leftrightarrow P(X \neq Y) = 0,$$
we obtain a metric on the set $F_p$ of equivalence classes $[X] = \{Y \in F_p \mid X \sim Y\}$
for $X \in F_p$. $(F_p, \|\cdot\|_{\infty,p})$ is a Banach space with the usual addition and multiplication. We will not distinguish between random variables and equivalence
classes in the sequel.
Lemma 7 For any $1 < p < \infty$, $n \in \mathbb{N}$, holds
$$\Big\|\sum_{w\in V_n} |R_{v,vw}|\Big\|_{\infty,p} \le (c + \|Q\|_p)\Big(\frac{2}{1+p}\Big)^{n/p} < \infty$$
$$\Big\|\sum_{w\in V} |R_{v,vw}|\Big\|_{\infty,p} \le \frac{c + \|Q\|_p}{1 - \big(\frac{2}{1+p}\big)^{1/p}} < \infty$$
for all $v \in V$, with $c = \sup_{x,t}|C(x,t)| < \infty$.
Proof: We skip the argument c < ∞. Notice the distribution of (R_{v,vw})_{w∈V} is the same as that of (R_w)_{w∈V}. Define e_m = E ∑_{v∈V_m} |I_v|^p for input U.
• e_m ≤ (2/(1+p)) e_{m−1} for m ∈ N
e_m = E(∑_{i=1}^{2} ∑_{v∈V_{m−1}} |I_i|^p |I_{i,iv}|^p)
= ∑_{i=1}^{2} E(|I_i|^p (∑_{v∈V_{m−1}} |I_{i,iv}|^p))
= ∑_{i=1}^{2} E(|I_i|^p) E(∑_{v∈V_{m−1}} |I_{i,iv}|^p)
= (2/(1+p)) e_{m−1}
• ∥∑_{v∈V_m} |R_v|∥_∞ ≤ sup_{v∈V_m} |I_v| ∥C_v∥_∞
Notice R_v = 1_{I_v} |I_v| (C_v ◦ B_v) and the I_v, v ∈ V_m, are a partition of [0, 1). Therefore the left side is ≤ sup_{v∈V_m} |I_v| ∥C_v∥_∞.
• E∥∑_{v∈V_m} |R_v|∥_∞^p ≤ (c + ∥Q∥_p)^p e_m
The L_p-inequality and the independence of the C_v from the σ-field F_m generated by U_v, v ∈ V_{<m}, provide
E∥∑_{v∈V_m} |R_v|∥_∞^p ≤ E(sup_{v∈V_m} |I_v|^p ∥C_v∥_∞^p)
≤ ∑_{v∈V_m} E(|I_v|^p ∥C_v∥_∞^p) = ∑_{v∈V_m} E(|I_v|^p E(∥C_v∥_∞^p | F_m))
= ∑_{v∈V_m} E(|I_v|^p) E(∥C_v∥_∞^p) = E(∥C_∅∥_∞^p) e_m ≤ (c + ∥Q∥_p)^p e_m
• ∥∑_{v∈V} |R_v|∥_{∞,p} ≤ (c + ∥Q∥_p)/(1 − (2/(1+p))^{1/p})
∥∑_{v∈V} |R_v|∥_{∞,p} = ∥∑_{m∈N_0} ∑_{v∈V_m} |R_v|∥_{∞,p}
≤ ∑_{m∈N_0} ∥ ∥∑_{v∈V_m} |R_v|∥_∞ ∥_p
≤ ∑_{m∈N_0} (c + ∥Q∥_p) e_m^{1/p} ≤ (c + ∥Q∥_p) ∑_{m∈N_0} (2/(1+p))^{m/p}
= (c + ∥Q∥_p)/(1 − (2/(1+p))^{1/p})
q.e.d.
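The key quantity e_m = E ∑_{v∈V_m} |I_v|^p satisfies e_m = (2/(1+p))^m by the recursion above, and this is easy to check by simulation. The following sketch is illustrative only (function names, trial counts and the choice p = 2 are not from the paper):

```python
import random

def interval_moment(m, p, trials=20000, rng=random.Random(1)):
    """Monte Carlo estimate of e_m = E sum_{v in V_m} |I_v|^p:
    split [0,1) recursively m times, each interval by an independent
    uniform into pieces of relative lengths U and 1 - U."""
    total = 0.0
    for _ in range(trials):
        lengths = [1.0]
        for _ in range(m):
            nxt = []
            for length in lengths:
                u = rng.random()
                nxt.extend((length * u, length * (1.0 - u)))
            lengths = nxt
        total += sum(x ** p for x in lengths)
    return total / trials

p, m = 2.0, 6
estimate = interval_moment(m, p)
exact = (2.0 / (1.0 + p)) ** m   # e_m from the recursion of the lemma
```

For p = 2 and m = 6 the estimate lies close to (2/3)^6 ≈ 0.088, in line with the geometric decay used in the lemma.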
As a consequence R_W converges absolutely and uniformly in t ∈ [0, 1] for W → V almost surely.
Theorem 8 Let 1 < p < ∞. Then almost surely holds for all v ∈ V and all W ⊂ V:
– The sum R_{v,vW} is absolutely convergent in F_p.
– For any increasing sequence W_n to W the sequence R_{v,vW_n} converges absolutely in supremum metric almost surely to a limit, called R_{v,vV} for W = V, with values in D.
Proof: Q has even finite exponential moments [17]. The theorem is standard by the above lemma and monotonicity.
q.e.d.
Notice the set of all ω ∈ Ω such that ∥∑_{w∈V} |R_{v,vw}|∥_∞ is infinite for some v ∈ V has probability 0. The convergence in supremum metric also implies convergence for every t ∈ [0, 1] and convergence in Skorokhod metric on D.
Consider the distributional fixed point equation
(X(t))_t =^D (U X^1(t/U ∧ 1) + (1 − U) X^2(0 ∨ (t − U)/(1 − U)) + C(U, t))_t   (24)
on D. The rvs X^1, X^2, U are independent. The rvs X, X^1, X^2 have values in D and all have the same distribution. The rv U has a uniform distribution on the unit interval, and the function C : [0, 1] → D is given in (9). The symbol =^D denotes equality in distribution in D. The right side is well defined.
We may choose versions of the processes X^1, X^2 and of U defining X that way. We obtain that way pointwise equality in (24). For t = 1 we obtain
X(1) = U X^1(1) + (1 − U) X^2(1) + C(U)   (25)
where C is given in (10). If X(1) is integrable [5] then X(1) has the Quicksort distribution Q.
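Equation (25) can be approximated by iterating the map on an empirical population. The sketch below assumes the cost function of (10) is C(u) = 1 + 2u ln u + 2(1 − u) ln(1 − u), which matches the limits computed in Proposition 12; the population size and iteration count are arbitrary choices, not from the paper:

```python
import math
import random

def cost(u):
    """Assumed one-dimensional cost C(u) = 1 + 2u ln u + 2(1-u) ln(1-u),
    with the convention x ln x := 0 at x = 0."""
    xlogx = lambda x: x * math.log(x) if x > 0.0 else 0.0
    return 1.0 + 2.0 * xlogx(u) + 2.0 * xlogx(1.0 - u)

def iterate(pop_size=20000, steps=25, rng=random.Random(7)):
    """Population dynamics for X(1) =d U X^1(1) + (1-U) X^2(1) + C(U),
    started from the zero distribution; the contraction property of the
    map drives the empirical law towards the Quicksort distribution Q."""
    pop = [0.0] * pop_size
    for _ in range(steps):
        pop = [u * pop[rng.randrange(pop_size)]
               + (1.0 - u) * pop[rng.randrange(pop_size)]
               + cost(u)
               for u in (rng.random() for _ in range(pop_size))]
    return pop

pop = iterate()
mean = sum(pop) / len(pop)                 # E Q = 0
var = sum(x * x for x in pop) / len(pop)   # Var Q = 7 - 2 pi^2 / 3
```

After a few dozen iterations the empirical mean is close to 0 and the empirical second moment close to 7 − 2π²/3 ≈ 0.42, the known mean and variance of the Quicksort distribution.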
We write the stochastic fixed point equation (24) in the form
(X(t))_t =^D 1_{t≤U} U X^1(t/U) + 1_{t>U}((1 − U) X^2((t − U)/(1 − U)) + U X^1(1)) + C(U, t)   (26)
treating the term 1_{t>U} U X^1(1) like an additional cost function.
Theorem 9 The family R_v = R_{v,vV}, v ∈ V, satisfies
R_v = L_{v,v1} ∗_r R_{v1} + L_{v,v2} ∗_r R_{v2} + C_v   (27)
for all v ∈ V almost surely. The distribution of the rv R is the unique solution of the distributional fixed point equation (24) in F_p for every p > 1.
Proof: Establish
R_{v,vW} = ∑_{w∈W} R_{v,vw} = 1_{∅∈W} C_v + ∑_{i=1}^{2} ∑_{w∈W^i} L_{v,vi} ∗ L_{vi,viw} ∗_r C_{viw}
= 1_{∅∈W} C_v + ∑_{i=1}^{2} L_{v,vi} ∗_r R_{vi,viW^i}
and then let W converge to V. This proves equation (27). R is a solution of the stochastic fixed point equation.
For the uniqueness consider the map K on the set M_p of probability measures µ on D satisfying ∫ ∥f∥_∞^p µ(df) < ∞, defined by
K(µ) = L(∑_{i=1}^{2} T_i(X_i) + C)
Here L denotes the distribution. X_1, X_2, U are independent rvs. The distribution of X_1 and X_2 is µ and the distribution of U is uniform. The random variables T_1, T_2 are D^D-valued, C is D-valued:
T_1(f)(t) := U f(t/U ∧ 1),  T_2(f)(t) := 1_{t≥U} (1 − U) f((t − U)/(1 − U)),  C(t) := C(U, t)
The fixed points of K are the solutions of (24) and vice versa.
K is a strict contraction on the space M_p for the metric
d_p(µ, ν) := inf ∥X − Y∥_{∞,p}
The infimum is over all D-valued rvs (X, Y) on any probability space with marginal distributions µ and ν. The space (M_p, d_p) is a complete metric space [2].
• The contraction constant is ≤ 2/(p+1) < 1.
For this let (X_1, Y_1) and (X_2, Y_2) be independent copies of each other with values in D, marginals µ, ν and ∥X_i − Y_i∥_{∞,p} ≤ d_p(µ, ν) + ϵ. Choose the versions ∑_{i=1}^{2} T_i(X_i) + C respectively ∑_{i=1}^{2} T_i(Y_i) + C with the same T_i, C for K(µ) and K(ν). Then
d_p(K(µ), K(ν)) ≤ ∥T_1(X_1) + T_2(X_2) + C − T_1(Y_1) − T_2(Y_2) − C∥_{∞,p}
≤ ∥T_1(X_1 − Y_1) + T_2(X_2 − Y_2)∥_{∞,p}
≤ ∥T_1(X_1 − Y_1)∥_{∞,p} + ∥T_2(X_2 − Y_2)∥_{∞,p}
≤ 2∥U∥_p^p (d_p(µ, ν) + ϵ) →_{ϵ→0} (2/(p+1)) d_p(µ, ν)
The rest is easy using the Banach fixed point theorem [4].
q.e.d.
Remark: Here we took the view of a weighted branching process with H = D, G ⊂ H^H. The transformations L_{v,vi} and the cost functions C_v depend on the values U_v, v ∈ V, at the nodes. This corresponds to the Quicksort process with internal randomness of the algorithm, giving the same distributional result for any deterministic input. In the next section we introduce another version of the Quicksort process, corresponding to a deterministic algorithm and a random input by iid rvs.
5 Convergence in Skorokhod metric
This section is on the almost sure convergence in Skorokhod metric of the discrete Quicksort process Y(U|_n, ·) to the Quicksort process for iid input (U_i)_{i∈N}. We copy the Quicksort process approach, now modeling by a deterministic WBP with random input. The term Y(U|_n, ·) reduces to a sum of finitely many terms. A random time change of the summands and a uniformity result will be the key to almost sure convergence in Skorokhod metric.
Throughout this section U is a sequence of iid rvs with a uniform distribution. The set (H_2, ∗_2) of half open intervals is a monoid with the operation
(c, d] ∗_2 (c′, d′] = (c + c′(d − c), c + d′(d − c)]   (28)
Consider the deterministic WBP (V, L, G, ∗) on the infinite binary tree V = {1, 2}*. Let H = H_1 × H_2 and define g_1, g_2 ∈ H^H
g_1(u, I) := (u^1, I ∗_2 [0, |u^1|/(|u| + 1)))
g_2(u, I) := (u^2, I ∗_2 [(|u| − |u^1| − 1)/(|u| + 1), 1))
if |u| ≥ 1 and I is non empty, and the grave △ = (∅, ∅) otherwise. Let G ⊂ H^H be generated by the functions g_1, g_2, adding the identity as neutral element and the function identically △ acting as a grave. The operation on G is the composition g ∗ g′ = g′ ◦ g, and G operates on H from the right via h ∗_l g = g ◦ h. Put L_{v,vi} = g_i for all v ∈ V.
Define Ψ : H_2 → D^D by
Ψ(I)(f) = Ψ_I(f) = a_I f ◦ b_I,  a_I := 1_I |I|,  b_{(c,d]}(t) := 1_{d>c} (0 ∨ ((t − c)/(d − c)) ∧ 1)
Define J_v : H_1 → H_2 by
L_v(u, [0, 1)) =: (u^v, J_v(u))
and R_{v,vw} : H_1 → D by
R_{v,vw}(u) := Ψ_{J_w(u)}(C_w(u^w))
C_w(u) := C(|u^w|, |u^{w1}|, ·) + 1_{(|u^1|/(|u|+1), 1]} (|u^1|/(|u| + 1)) Y(u^1, 1)
for |u| ≥ 1, I non empty, and the grave otherwise. The function C is given in (7), with the extension C(n, k, t) := C(n, k, ⌈nt⌉/n).
Proposition 10 The map H_2 ∋ I ↦ b_I is injective. For I, J ∈ H_2 and v, w ∈ V holds
• |I||J| = |I ∗_2 J| = |J ∗_2 I|
• b_I ◦ b_J = b_{J ∗_2 I}
• a_I · a_J ◦ b_I = a_{I ∗_2 J}
• Ψ_I ◦ Ψ_J = Ψ_{I ∗_2 J}
• L_{v,vw}(u, I) = (u^w, I ∗_2 J_w(u))
• J_{vw}(u) = J_v(u) ∗_2 J_w(u^v)
• R_{v,vw} = R_w
• R_{v,vW} = ∑_{w∈W} R_{v,vw} →_{W→V} R_{v,vV} pointwise
• R_V(u) = ∑_{i=1}^{2} Ψ_{J_i(u)}(R_{i,iV}(u^i)) + C_∅(u)
Proof: The first statement is easy. We obtain for I = [c, d), J = [c′, d′) and I, J ̸= ∅
|I ∗_2 J| = (d − c)(d′ − c′) = |I||J| = |J ∗_2 I|
(b_I ◦ b_J)(t) = b_I(0 ∨ ((t − c′)/(d′ − c′)) ∧ 1) = 0 ∨ (((t − c′)/(d′ − c′) − c)/(d − c)) ∧ 1 = b_{J ∗_2 I}(t)
a_I(t) a_J(b_I(t)) = 1_I(t)|I| 1_J(0 ∨ ((t − c)/(d − c)) ∧ 1)|J| = 1_I(t)|I||J| 1_{c′ < (t−c)/(d−c) ≤ d′} = a_{I ∗_2 J}(t)
(Ψ_I ◦ Ψ_J)(f) = Ψ_I(a_J f ◦ b_J) = a_I (a_J ◦ b_I) f ◦ b_J ◦ b_I = Ψ_{I ∗_2 J}(f)
L_{v,vw}(u, I) = L_w(u, I) = (u^w, I ∗_2 L_w(u, (0, 1])) = (u^w, I ∗_2 J_w(u))
L_{v,vw}(L_v(u, (0, 1])) = L_{v,vw}(u^v, J_v(u)) = (u^{vw}, J_{vw}(u)) = (u^{vw}, J_v(u) ∗_2 J_w(u^v))
Next show the result for I or J the empty set.
The first statement on R is easy. For the second notice R_{v,vW} has an increasing number of non zero summands for increasing W, and R_{v,vV}(u) has only finitely many non zero summands for every u ∈ H_1. The third statement is also easy. The last follows by induction on the length of the input u. We skip this.
q.e.d.
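The algebraic identities above are easy to check numerically. A sketch, coding a half open interval as a pair (c, d) and using the operation (c, d] ∗₂ (c′, d′] = (c + c′(d − c), c + d′(d − c)]; the helper names are illustrative only:

```python
import random

def star2(i, j):
    """Monoid operation: rescale j into i; lengths multiply."""
    (c, d), (cp, dp) = i, j
    return (c + cp * (d - c), c + dp * (d - c))

def b(i, t):
    """Time change b_I: affine map of I onto [0,1], clipped outside I."""
    c, d = i
    return max(0.0, min(1.0, (t - c) / (d - c)))

def length(i):
    return i[1] - i[0]

rng = random.Random(3)

def rand_interval():
    """Random non degenerate subinterval of (0, 1)."""
    c = rng.uniform(0.0, 0.5)
    return (c, rng.uniform(c + 0.1, 1.0))

I, J, K = rand_interval(), rand_interval(), rand_interval()
```

Associativity, the neutral element (0, 1], the length identity |I ∗₂ J| = |I||J| and the composition rule b_I ◦ b_J = b_{J ∗₂ I} then hold up to floating point error.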
Now let U = (U_i)_{i∈N} be a sequence of independent rvs with a uniform distribution on (0, 1). The input is now U|_n(ω) at the root ∅, of length n. From that we obtain (U|_n)^v as starting value at any vertex v. Our objects of interest are R_{v,vw}((U|_n)^v) =: R_{v,vw}(n) (a slight abuse of notation). Notice the n is for fixed input U|_n at the root and not relative to the starting vertex v of the path as used before. Further R_{v,vV}(n) = Y((U|_n)^v) with the recursion
R_{v,vV}(n) = ∑_i Ψ_{J_i((U|_n)^v)} R_{vi,viV}(n) + C_v(U|_n)
We now introduce the limiting Quicksort process R_·(∞) of R_·(n). We consider a weighted branching process (V, L̃, G̃, ∗) with H̃ = (0, 1)^N_≠ × H_2 and G̃ generated by g̃_1, g̃_2, a grave and the identity
g̃_1(u, I) := (u^1/u_1, I ∗_2 [0, u_1))
g̃_2(u, I) := ((u^2 − u_1)/(1 − u_1), I ∗_2 [u_1, 1))
Put L̃_{v,vi} = g̃_i and define ũ^v ∈ (0, 1)^N_≠ and J̃_v : (0, 1)^N_≠ → H_2 by
L̃_v(u, [0, 1)) =: (ũ^v, J̃_v(u))
Notice ũ^v = (u^v − c)/(d − c) for J̃_v(u) = [c, d) ̸= ∅, and appropriate otherwise. Define
R̃_{v,vw}(u) := Ψ_{J̃_w(u)}(C̃_{vw}(ũ^w))
where C̃_v : (0, 1)^N_≠ → D, v ∈ V, is given by
C̃_v(u) := C(u_1, ·) + 1_{(u_1,1]} u_1 Q(u^{v1})
Here Q is the Quicksort limit for input u (existence assumed) and C is given in (9). The Quicksort limit exists almost surely for input U. (Régnier [16] gave a martingale argument for the existence of the Quicksort limit.) We replace u by the random sequence U. Our object of interest is
R_{v,vw}(∞) := R̃_{v,vw}(Ũ^v)
Notice the (Ũ^v)_1, v ∈ V, are independent rvs with a uniform distribution. They play the same role as U_v, v ∈ V, in Section 4 on the Quicksort process. The rv R_{v,vw}(∞) is a.s. well defined since it is the Quicksort process R_{v,vw} as defined in Section 4.
Let c = sup_{t∈[0,1]} sup_{0≤k≤n∈N_0} |C(n, k, t)| < ∞ and Q*(U) = sup_n |Y(U|_n, 1)| from Quicksort. Define
C̄_v(U|_n) := c + Q*(U^{v1})
and R̄ analogously to R, using (C̄_v)_v instead of (C_v)_v. Here is an analogue of Lemma 7, now for the discrete Quicksort process.
Lemma 11 Let U = (U_n)_{n∈N} be a sequence of independent rvs with a uniform distribution on (0, 1). Then R̄_v(n) is a majorant of R_v(n) for every n. For every p > 1 and n ∈ N holds
∥R̄_{V_m}(n)∥_{∞,p} ≤ (c + ∥Q*∥_p)(2/(1+p))^{m/p} < ∞
∥R̄_V(n)∥_{∞,p} ≤ (c + ∥Q*∥_p)/(1 − (2/(1+p))^{1/p}) < ∞
Moreover sup_{n∈N} R̄_V(n) < ∞ and sup_{n∈N} R̄_{V_{≥m}}(n) →_{m→∞} 0 a.e.
Proof: We show c < ∞ in Proposition 12 at the end of this paper. Since all moments of the Quicksort distribution Q = Q(U) are finite [17] and Y(U|_n, 1) is a martingale [16] converging to Q, we obtain finite moments of all orders for the supremum Q*.
Notice the independence of Ũ^i, i = 1, 2, from the ranks (R_{1:n})_{n∈N}, where R_{1:n} = ∑_{j=1}^{n} 1_{U_j ≤ U_1} denotes the rank of U_1 under U_1, U_2, . . . , U_n. This implies Q*(U^{vi}) is independent of J_v(U|_n) for every n ∈ N, v ∈ V. This is due to the fact that Q*(U^{vi}) = Q*(Ũ^{vi}). Further C̄(U) is independent of J_i(U).
• R̄ is a majorant of R
This follows from |C_v(U|_n)| ≤ C̄_v(U) for all n by the martingale argument.
• ∥∥R̄_{V_m}(n)∥_∞∥_p ≤ (c + ∥Q*(U)∥_p)(∑_{v∈V_m} E(|J_v(n)|^p))^{1/p}
∥R̄_{V_m}(n)∥_∞ = ∥∑_{v∈V_m} 1_{J_v(U|_n)} |J_v(U|_n)| C̄_v(U)∥_∞ = sup_{v∈V_m} |J_v(U|_n)| C̄_v(U)
E(∥R̄_{V_m}(n)∥_∞^p) = E(sup_{v∈V_m} |J_v(U|_n)|^p C̄_v^p(U))
≤ ∑_{v∈V_m} E(|J_v(U|_n)|^p C̄_v^p(U)) = ∑_{v∈V_m} E(|J_v(U|_n)|^p) E(C̄_v^p(U))
≤ (c + ∥Q*(U)∥_p)^p ∑_{v∈V_m} E(|J_v(U|_n)|^p)
• E(|(U|_n)^1|^p) ≤ n^p/(1+p)
|(U|_n)^1| + 1 is the rank of U_1 under U_1, U_2, . . . , U_n. The rank has a uniform distribution on {1, . . . , n}. We have
E(|(U|_n)^1|^p) = ∑_{k=0}^{n−1} (1/n) k^p ≤ (1/n) ∫_0^n x^p dx = n^p/(1+p)
• E(|J_v(U|_n)|^p) ≤ (1/(1+p))^{|v|} for all n, v
For v ∈ V_m
E(|J_1(U|_n)|^p) ≤ E(U_1^p) = ∫_0^1 x^p dx = 1/(p+1),  E(|J_2(U|_n)|^p) = E(|J_1(U|_n)|^p)
E(|J_v(U|_n)|^p) = E(|J_{v|m−1}(U|_n) ∗_2 J_{v_m}((U|_n)^{v|m−1})|^p)
= E(|J_{v|m−1}(U|_n)|^p |J_{v_m}((U|_n)^{v|m−1})|^p)
= E(|J_{v|m−1}(U|_n)|^p) E(|J_{v_m}((U|_n)^{v|m−1})|^p)
≤ (1/(1+p)) E(|J_{v|m−1}(U|_n)|^p) ≤ · · · ≤ (1/(1+p))^m
• ∑_{v∈V_m} E(|J_v(U|_n)|^p) ≤ (2/(1+p))^m for m ∈ N
Easy.
• ∥R̄_V(n)∥_{∞,p} ≤ (c + ∥Q*∥_p)/(1 − (2/(1+p))^{1/p})
∥R̄_V(n)∥_{∞,p} ≤ ∑_{m=0}^{∞} ∥R̄_{V_m}(n)∥_{∞,p} ≤ ∑_{m=0}^{∞} (c + ∥Q*∥_p)(2/(1+p))^{m/p} = (c + ∥Q*∥_p)/(1 − (2/(1+p))^{1/p})
For the last statement we need estimates uniformly in n.
• There exists a p > 1 with d := E(∑_i sup_n |J_i(U|_n)|^p) < 1.
It suffices to show sup_{n∈N} |J_i(U|_n)| < 1 a.e.
|J_1(U|_n)| = (R_{1:n} − 1)/(n + 1) < 1 for n ≥ 1, where R_{1:n} is the rank of U_1 under U_1, . . . , U_n. Since R_{1:n}/n converges a.e. to U_1 ∈ (0, 1) we obtain the partial result for i = 1. The argument for i = 2 is analogous.
• E(sup_n |J_v(U|_n)|^p) ≤ (d/2)^{|v|} for all v ∈ V.
For v ∈ V_m
E(sup_n |J_v(U|_n)|^p) = E(sup_n |J_{v|m−1}(U|_n) ∗_2 J_{v_m}((U|_n)^{v|m−1})|^p)
≤ E(sup_n |J_{v|m−1}(U|_n)|^p sup_l |J_{v_m}((U|_l)^{v|m−1})|^p)
= E(sup_n |J_{v|m−1}(U|_n)|^p) E(sup_l |J_{v_m}((U|_l)^{v|m−1})|^p)
≤ (d/2) E(sup_n |J_{v|m−1}(U|_n)|^p) ≤ · · · ≤ (d/2)^m
• ∑_{v∈V_m} E(sup_n |J_v(n)|^p) ≤ d^m for m ∈ N
Easy.
• ∥sup_{n∈N} ∥R̄_{V_m}(n)∥_∞∥_p ≤ (c + ∥Q*(U)∥_p) d^{m/p}
sup_{n∈N} ∥R̄_{V_m}(n)∥_∞ = sup_{n∈N} ∥∑_{v∈V_m} 1_{J_v(U|_n)} |J_v(U|_n)| C̄_v(U)∥_∞ = sup_{n∈N} sup_{v∈V_m} |J_v(U|_n)| C̄_v(U)
E(sup_n ∥R̄_{V_m}(n)∥_∞^p) = E(sup_{n∈N} sup_{v∈V_m} |J_v(U|_n)|^p C̄_v^p(U))
≤ E(∑_{v∈V_m} sup_{n∈N} |J_v(U|_n)|^p C̄_v^p(U)) = ∑_{v∈V_m} E(sup_{n∈N} |J_v(U|_n)|^p) E(C̄_v^p(U))
≤ (c + ∥Q*(U)∥_p)^p d^m
• ∥sup_n ∥R̄_V(n)∥_∞∥_p ≤ (c + ∥Q*∥_p)/(1 − d^{1/p}) < ∞
∥sup_n ∥R̄_V(n)∥_∞∥_p ≤ ∑_{m=0}^{∞} ∥sup_n ∥R̄_{V_m}(n)∥_∞∥_p ≤ ∑_{m=0}^{∞} (c + ∥Q*∥_p) d^{m/p} = (c + ∥Q*∥_p)/(1 − d^{1/p}) < ∞
The last statement follows from ∥sup_n ∥R̄_V(n)∥_∞∥_1 ≤ ∥sup_n ∥R̄_V(n)∥_∞∥_p < ∞.
q.e.d.
Proof of Theorem 1: Estimate pathwise for some m ∈ N
d(R_V(n), R_V(∞)) ≤ d(R_V(n), R_{V_{≤m}}(n)) + d(R_{V_{≤m}}(n), R_{V_{≤m}}(∞)) + d(R_{V_{≤m}}(∞), R_V(∞)) = I + II + III
sup_n I ≤ sup_n ∥R_{V_{>m}}(n)∥_∞ →_{m→∞} 0
sup_n III ≤ ∥R_{V_{>m}}(∞)∥_∞ →_{m→∞} 0
lim_{n→∞} II ≤ ∑_{v∈V_{≤m}} lim_n d(R_v(n), R_v(∞))
Therefore it suffices to show lim_n d(R_v(n), R_v(∞)) = 0 for every v ∈ V.
Define for n ∈ N and v ∈ V the function φ = φ_{v,n} : [0, 1] → [0, 1] by
φ(0) = 0, φ(A_v(∞)) = A_v(n), φ(B_v(∞)) = B_v(n), φ(1) = 1
and linear in between, where (A_v(n), B_v(n)] = J_v(U|_n) =: J_v(n) ̸= ∅. Then
d(R_v(n), R_v(∞)) ≤ d(1_{J_v(n)} |J_v(n)| C_v(n), (1_{J_v(n)} ◦ φ) |J_v(n)| (C_v(n) ◦ φ))
+ d((1_{J_v(n)} ◦ φ) |J_v(n)| (C_v(n) ◦ φ), 1_{J_v(∞)} |J_v(n)| C_v(∞, ·))
+ d(1_{J_v(∞)} |J_v(n)| C_v(∞, ·), 1_{J_v(∞)} |J_v(∞)| C_v(∞, ·))
= IV + V + VI
lim sup_n IV ≤ lim sup_n sup_t |φ(t) − t| ≤ lim sup_n |A_v(n) − A_v(∞)| + lim sup_n |B_v(n) − B_v(∞)| = 0
lim sup_n V ≤ lim sup_n sup_t |C_v(n, φ(t)) − C_v(∞, t)| = 0
lim sup_n VI ≤ lim sup_n |A_v(n) − A_v(∞)| + lim sup_n |B_v(n) − B_v(∞)| = 0
q.e.d.
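The piecewise linear time change φ_{v,n} is elementary to construct. A sketch with hypothetical endpoint values (0.3, 0.7 and 0.28, 0.73 are made up for illustration), showing that sup_t |φ(t) − t| is bounded by the maximal endpoint distance:

```python
def time_change(a_inf, b_inf, a_n, b_n):
    """Piecewise linear phi on [0,1] with phi(0) = 0, phi(a_inf) = a_n,
    phi(b_inf) = b_n, phi(1) = 1, linear in between; assumes
    0 < a_inf < b_inf < 1 and 0 < a_n < b_n < 1."""
    xs, ys = [0.0, a_inf, b_inf, 1.0], [0.0, a_n, b_n, 1.0]
    def phi(t):
        for k in range(1, len(xs)):
            if t <= xs[k]:
                x0, x1, y0, y1 = xs[k - 1], xs[k], ys[k - 1], ys[k]
                return y0 + (y1 - y0) * (t - x0) / (x1 - x0)
        return 1.0
    return phi

# hypothetical endpoints: J_v(infinity) = (0.3, 0.7], J_v(n) = (0.28, 0.73]
phi = time_change(0.3, 0.7, 0.28, 0.73)
sup_dev = max(abs(phi(k / 1000.0) - k / 1000.0) for k in range(1001))
```

Since φ − id is piecewise linear and vanishes at 0 and 1, the supremum is attained at a moved endpoint; here sup_t |φ(t) − t| = 0.03, and it vanishes as the endpoints of J_v(n) converge to those of J_v(∞).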
Proposition 12
sup_{n∈N, 1≤l≤n, 0≤k≤n−1} |C(n, k, l)|/n ≤ c < ∞.
If l_n/n →_n t ∈ [0, 1], i_n/n →_n s ∈ [0, 1] and t ̸= s, then
C(n, l_n, i_n)/n →_n C(t, s)
where the C-functions are given in (7) and (9).
Proof: The proof is straightforward. Notice a(0, ·) = 0.
The asymptotics of the harmonic numbers are
H_n = ln n + γ + 1/(2n) − 1/(12n²) + 1/(120n⁴) + O(n⁻⁶)
with γ = 0.577215 . . . the Euler constant. We will use H_n = ln n + γ + b_n with b_n = O(1/n).
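A quick numerical check of this expansion (a sketch; the hard coded γ and the test values n = 100, 1000 are arbitrary):

```python
import math

GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def harmonic(n):
    """Exact harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def harmonic_expansion(n):
    """H_n = ln n + gamma + 1/(2n) - 1/(12 n^2) + 1/(120 n^4) + O(n^-6)."""
    return math.log(n) + GAMMA + 1.0 / (2 * n) - 1.0 / (12 * n ** 2) + 1.0 / (120 * n ** 4)

err_100 = abs(harmonic(100) - harmonic_expansion(100))
err_1000 = abs(harmonic(1000) - harmonic_expansion(1000))
```

Both errors are at the level of floating point round-off, far smaller than the O(1/n) remainder b_n used in the proof.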
For n ≥ 2 and 1 ≤ l = l_n ≤ n, 1 < i = i_n < n − 1
a(n, l)/n = 2 + 2(l/n)γ − 2(1 − l/n) ln(1 − l/n) + 2(l/n) ln n − 6(l/n) + O((ln n)/n)
= O(1) + 2(l/n) ln n
C(n, l, i)/n = 1_{l<i−1} I + 1_{i−1≤l≤i} II + 1_{l>i} III
I := (n − 1 − a(n, l) + a(i − 1, l))/n
II := (n − 1 − a(n, l) + a(i − 1, i − 1))/n
III := (n − 1 − a(n, l) + a(i − 1, i − 1) + a(n − i, l − i))/n
|1_{l<i−1} I| ≤ 1_{l<i−1}(O(1) + |2(l/n) ln n − 2(l/n) ln(i − 1)|)
≤ 1_{l<i−1}(O(1) + 2(l/n) |ln((i − 1)/n)|)
≤ O(1) + 2((i − 1)/n) |ln((i − 1)/n)|
≤ O(1) + 2 sup_{0≤x≤1} |x ln x| = O(1)
|1_{i−1≤l≤i} II| ≤ 1_{i−1≤l≤i}(O(1) + |2(l/n) ln n − 2((i − 1)/n) ln(i − 1)|)
≤ O(1) + 2|(i/n) ln n − ((i − 1)/n) ln(i − 1)|
≤ O(1)
|1_{l>i} III| ≤ O(1) + |2(l/n) ln n − 2((i − 1)/n) ln(i − 1) − 2((l − i)/n) ln(n − i)|
≤ O(1) + 2|((i − 1)/n) ln((i − 1)/n) + ((l − i − 1)/n) ln((n − i − 1)/n)|
≤ O(1) + 2|((l − n)/n) ln((n − i − 1)/n)|
≤ O(1) + 2|(1 − (n − i − 1)/n) ln(1 − (n − i − 1)/n)| = O(1)
The cases i = 1 and i = n − 1 have to be done separately, but provide no difficulty. (Otherwise ln 0 would appear in the above argument.)
Now assume lim_{n→∞} l_n/n = t ∈ (0, 1) and lim_{n→∞} i_n/n = s ∈ (0, 1). We obtain
lim_{n→∞} 1_{l<i−1} I = 1_{t<s} lim_n(−1 − 2tγ − 2t ln n + 2(1 − t) ln(1 − t) + 6t + s(2 + 2(t/s)γ + 2(t/s) ln(ns) − 2(1 − t/s) ln(1 − t/s) − 6(t/s)))
= 1_{t<s}(−1 + 2s + 2(1 − t) ln(1 − t) − 2(s − t) ln(s − t) + 2s ln s)
and
lim_{n→∞} 1_{l>i} III = 1_{t>s} lim_n(1 − (2 + 2tγ + 2t ln n − 2(1 − t) ln(1 − t) − 6t) + s(2 + 2γ + 2 ln(ns) − 6) + (1 − s)(2 + 2((t − s)/(1 − s))γ + 2((t − s)/(1 − s)) ln(n(1 − s)) − 2(1 − (t − s)/(1 − s)) ln(1 − (t − s)/(1 − s)) − 6((t − s)/(1 − s))))
= 1_{t>s}(1 + 2s ln s + 2(1 − s) ln(1 − s))
This implies the statement.
q.e.d.
Notice for t = s there might be no convergence of C(n, l_n, i_n)/n. Conditioned on l_n < i_n − 1 the above converges to −1 + 2s + 2s ln s + 2(1 − s) ln(1 − s), but conditioned on l_n > i_n it converges to 1 + 2s ln s + 2(1 − s) ln(1 − s). In our setting the probability of being in that case is 0, and the boundedness of C is sufficient for our purpose.
References
[1] Krishna B. Athreya and Peter E. Ney, Branching Processes. Die Grundlehren der mathematischen Wissenschaften, Vol. 196, Springer, 1972, ISBN 978-3-642-65373-5 (print), 978-3-642-65371-1 (online).
[2] Peter J. Bickel and David A. Freedman, Some asymptotic theory for the bootstrap. Ann. Stat. 9, 1196-1217, 1981.
[3] Patrick Billingsley, Convergence of Probability Measures. Wiley, New York, 1968, ISBN 978-0-471-19745-4.
[4] Nelson Dunford and Jacob T. Schwartz, Linear Operators. Part I: General Theory; Part II: Spectral Theory, Self Adjoint Operators in Hilbert Space; Part III: Spectral Operators. Wiley, 1988.
[5] James A. Fill and Svante Janson, A characterization of the set of fixed points of the Quicksort transformation. Electron. Comm. Probab. 5, pp. 77-84 (electronic), 2000.
[6] Rudolf Grübel and Uwe Roesler, Asymptotic distribution theory for Hoare's selection algorithm. Adv. Appl. Probab. 28, 252-269, 1996.
[7] Rudolf Grübel, Hoare's Selection Algorithm: A Markov Chain Approach. Journal of Applied Probability, Vol. 35, pp. 36-45, 1998.
[8] Charles A.R. Hoare, Algorithm 64: Quicksort. Communications of the Association for Computing Machinery 4, 321, 1961.
[9] Charles A.R. Hoare, Quicksort. Computer Journal 5, 10-15, 1962.
[10] Peter Kirschenhofer, Helmut Prodinger and Conrado Martínez, Analysis of Hoare's FIND algorithm with median-of-three partition. Random Structures and Algorithms 10, 143-156, 1997.
[11] Diether Knof and Uwe Roesler, The Analysis of Find and Versions of it. Discrete Mathematics & Theoretical Computer Science, Vol. 14, No. 2, 2012.
[12] Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching. 2nd Edition, Addison-Wesley Professional, 1998, ISBN 0-201-89685-0.
[13] Conrado Martínez, Partial quicksort. Alcom-FT Technical Report Series, ALCOM-FT-RR-03-50, 2004.
[14] Conrado Martínez and Uwe Roesler, Partial Quicksort and Quickpartitionsort. Discrete Mathematics and Theoretical Computer Science, Proceedings, 21st International Meeting on Probabilistic, Combinatorial, and Asymptotic Methods in the Analysis of Algorithms (AofA'10), 336-342, 2010.
[15] Conrado Martínez and Salvador Roura, Optimal sampling strategies in Quicksort and Quickselect. SIAM J. Comput., Vol. 31, No. 3, pp. 683-705, 2001.
[16] M. Régnier, A limiting distribution for quicksort. RAIRO - Theoretical Informatics and Applications / Informatique Théorique et Applications 23, 335-343, 1989.
[17] Uwe Roesler, A limit theorem for "Quicksort". Informatique Théorique et Applications / Theoretical Informatics and Applications 25, 85-100, 1991.
[18] Uwe Roesler, A fixed point theorem for distributions. Stochastic Processes and their Applications 42, 195-214, 1992.
[19] Uwe Roesler, The weighted branching process. Dynamics of complex and irregular systems (Bielefeld, 1991), 154-165, Bielefeld Encount. Math. Phys. VIII, World Sci. Publishing, River Edge, NJ, 1993.
[20] Uwe Roesler and Ludger Rüschendorf, The contraction method for recursive algorithms. Algorithmica 29, pp. 3-33, 2001.
[21] Mahmoud Ragab and Uwe Roesler, The Quicksort Process. arXiv:1302.3770, 2013, and Stochastic Processes and their Applications 124, 1036-1054, 2014.