
VAPNIK-CHERVONENKIS DENSITY IN SOME THEORIES
WITHOUT THE INDEPENDENCE PROPERTY, I
MATTHIAS ASCHENBRENNER, ALF DOLICH, DEIRDRE HASKELL,
DUGALD MACPHERSON, AND SERGEI STARCHENKO
For Lou van den Dries, on his 60th birthday.
Abstract. We recast the problem of calculating Vapnik-Chervonenkis (VC) density into one of counting types, and thereby calculate bounds (often optimal) on the
VC density for some weakly o-minimal, weakly quasi-o-minimal, and P-minimal
theories.
Contents
1. Introduction
2. VC Density
3. The Model-Theoretic Context
4. Some VC Density Calculations
5. Theories with the VC d Property
6. Examples of VC d: Weakly O-minimal Theories and Variants
7. A Strengthening of VC d, and P-adic Examples
References
1. Introduction
The notion of VC dimension, which arose in probability theory in the work of Vapnik and
Chervonenkis [98], was first drawn to the attention of model-theorists by Laskowski [55],
who observed that a complete first-order theory does not have the independence property
(as introduced by Shelah [86]) if and only if, in each model, each definable family of
sets has finite VC dimension. With this observation, Laskowski easily gave several
examples of classes of sets with finite VC dimension, by noting well-known examples
of theories without the independence property. This line of thought was pursued by
Karpinski and Macintyre [49], who calculated explicit bounds on the VC dimension of
definable families of sets in some o-minimal structures (with an eye towards applications
to neural networks), which were polynomial in the number of parameter variables. In a
further paper [50], they observe that their arguments also lead to a linear bound on the
VC density of definable families of sets in some o-minimal structures. They ask whether
similar (linear) bounds hold for the p-adic numbers (whose theory also does not have
the independence property). The bound in the o-minimal case in [50] was established
independently, using a more combinatorial approach, by Wilkie (unpublished), and more
recently, also by Johnson and Laskowski [47].
Date: September 2011.
ASCHENBRENNER, DOLICH, HASKELL, MACPHERSON, AND STARCHENKO
In this paper we give a sufficient criterion (Theorem 5.7) on a first-order theory for
the VC density of a definable family of sets to be bounded by a linear function in
the number of parameter variables, and show that the criterion is satisfied by several
theories of general interest, including the theory of the p-adics and all weakly o-minimal
theories. In a sequel to this paper [6] we give different arguments to get similar bounds
in a variety of other examples where our criterion does not apply. Before we state our
main results, we introduce our setup and review some definitions and basic facts. We
hope that the present paper (unlike its sequel [6]) can be read with only little technical
knowledge of model theory beyond basic first order logic. The first few chapters of [42]
or [63] or similar texts should provide sufficient background for a prospective reader.
1.1. VC dimension and VC density. Let $X$ be an infinite set and $\mathcal{S}$ be a non-empty
collection of subsets of $X$. Given $A \subseteq X$, we say that a subset $B$ of $A$ is cut out by $\mathcal{S}$ if
$B = S \cap A$ for some $S \in \mathcal{S}$; we let $\mathcal{S} \cap A := \{S \cap A : S \in \mathcal{S}\}$ be the collection of subsets
of $A$ cut out by $\mathcal{S}$. We say that $A$ is shattered by $\mathcal{S}$ if every subset of $A$ is cut out by
some element of $\mathcal{S}$. The collection $\mathcal{S}$ is said to be a VC class if there is a non-negative
integer $n$ such that no subset of $X$ of size $n$ can be shattered by $\mathcal{S}$. In this case, the VC
dimension of $\mathcal{S}$ is the largest $d \ge 0$ such that some set of size $d$ is shattered by $\mathcal{S}$. We
denote by $\pi_{\mathcal{S}}(n)$ the maximum, as $A$ varies over subsets of $X$ of size $n$, of the numbers
of subsets of $A$ that can be cut out by $\mathcal{S}$; that is,
$$\pi_{\mathcal{S}}(n) := \max\left\{ |\mathcal{S} \cap A| : A \in \binom{X}{n} \right\}.$$
(Here and below, $\binom{X}{n}$ denotes the set of $n$-element subsets of $X$.) The function $n \mapsto \pi_{\mathcal{S}}(n)$
is called the shatter function of $\mathcal{S}$. Clearly $0 \le \pi_{\mathcal{S}}(n) \le 2^n$ for every $n$, and
if $\mathcal{S}$ is not a VC class, then $\pi_{\mathcal{S}}(n) = 2^n$ for every $n$. However, if $\mathcal{S}$ is a VC class, of
VC dimension $d$ say, then by a fundamental observation of Sauer [83] (independently
made in [87] and, implicitly, in [98]), the function $n \mapsto \pi_{\mathcal{S}}(n)$ is bounded above by a
polynomial in $n$ of degree $d$. (In fact, for $n \ge d \ge 1$ one has $\pi_{\mathcal{S}}(n) \le (en/d)^d$, where $e$
is the base of the natural logarithm.) Hence it makes sense to define the VC density
of a VC class $\mathcal{S}$ as the infimum of all reals $r \ge 0$ such that $\pi_{\mathcal{S}}(n)/n^r$ is bounded
for all positive $n$. It turns out that in many cases, the VC density (rather than the
VC dimension) is the decisive measure for the combinatorial complexity of a family of
sets. For example, the VC density of S governs the size of packings in S with respect
to the Hamming metric ([41], see also [64, Lemma 2.1]), and is intimately related to the
notions of entropic dimension [7] and discrepancy [68]. We refer to the surveys [65, 33]
for uses of VC density in combinatorics.
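The definitions above can be made concrete on a small finite set system. The following is only a brute-force sketch: the paper works over an infinite base set, whereas here a hypothetical 6-element chain with its initial segments stands in for it, and the function names are ours, not the paper's.

```python
from itertools import combinations

def trace(family, A):
    """Subsets of A cut out by the family, i.e. the set system induced on A."""
    A = frozenset(A)
    return {frozenset(S) & A for S in family}

def shatter_function(family, X, n):
    """pi_S(n): the largest number of subsets cut out of an n-element subset of X."""
    return max(len(trace(family, A)) for A in combinations(X, n))

def vc_dimension(family, X):
    """Largest d such that some d-element subset of X is shattered.

    Subsets of shattered sets are shattered, so we may stop at the first n
    for which pi_S(n) < 2^n.
    """
    d = 0
    for n in range(1, len(X) + 1):
        if shatter_function(family, X, n) < 2 ** n:
            break
        d = n
    return d

# Hypothetical toy data: the initial segments of a 6-element chain.
X = list(range(6))
initial_segments = [frozenset(range(k)) for k in range(len(X) + 1)]
shatter_values = [shatter_function(initial_segments, X, n) for n in range(len(X) + 1)]
```

For this toy family the shatter function grows linearly (an n-element set has n + 1 traces), so both the VC dimension and the growth exponent equal 1.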
1.2. VC dimension and VC density of formulas. Let L be a first-order language.
In an L-structure $M$, a natural way to generate a collection of subsets of $M^m$ is to
take the family of sets defined by a formula, as the parameters vary. Given a tuple
x = (x1 , . . . , xm ) of pairwise distinct variables we denote by |x| := m the length of x. We
often need to deal with L-formulas whose free variables have been separated into object
and parameter variables. We use the notation ϕ(x; y) to indicate that the free variables
of the L-formula ϕ are contained among the components of the tuples x = (x1 , . . . , xm )
and y = (y1 , . . . , yn ) of pairwise distinct variables (which we also assume to be disjoint).
Here the xi are thought of as the object variables and the yj as the parameter variables.
We refer to ϕ(x; y) as a partitioned L-formula.
VC DENSITY IN SOME NIP THEORIES, I
In the rest of this introduction we let $M$ be an infinite $L$-structure. Let $\varphi(x; y)$ be a
partitioned $L$-formula, $m = |x|$, $n = |y|$, and denote by
$$\mathcal{S}_\varphi = \{\varphi^M(M^m; b) : b \in M^n\}$$
the family of subsets of $M^m$ defined by $\varphi$ in $M$ using parameters ranging over $M^n$. We
call $\mathcal{S}_\varphi$ a definable family of sets (in $M$). We say that $\varphi$ defines a VC class in $M$ if
$\mathcal{S}_\varphi$ is a VC class; in this case the VC dimension of $\varphi$ in $M$ is the VC dimension of the
collection $\mathcal{S}_\varphi$ of subsets of $M^m$, and similarly one defines the VC density of $\varphi$ in $M$.
Since the shatter function $\pi_\varphi = \pi_{\mathcal{S}_\varphi}$ of $\mathcal{S}_\varphi$ only depends on the elementary theory of $M$
(see Lemma 3.2 below), given a complete L-theory T with no finite models, we may also
speak of the shatter function of ϕ in T as well as VC dimension of ϕ in T and the VC
density of ϕ in T .
1.3. NIP theories. A partitioned $L$-formula $\varphi(x; y)$ as above is said to have the independence
property for $M$ if for every $t \in \mathbb{N}$ there are $b_1, \dots, b_t \in M^n$ such that for every
$S \subseteq \{1, \dots, t\}$ there is $a_S \in M^m$ such that for all $i \in \{1, \dots, t\}$, $M \models \varphi(a_S; b_i) \iff i \in S$.
The structure $M$ is said to have the independence property if some $L$-formula has
the independence property for $M$, and not to have the independence property (or to
be NIP or dependent) otherwise. By a classical result of Shelah [86] (with other proofs
in [52, 55, 80]), for $M$ to be NIP it is actually sufficient that no formula $\varphi(x; y)$ with
$|x| = 1$ has the independence property for $M$. NIP is implied by (but not equivalent to)
another prominent tameness condition on first-order structures called stability: An $L$-formula
$\varphi(x; y)$ is said to be unstable for $M$ if for every $t \in \mathbb{N}$ there are $a_1, \dots, a_t \in M^m$
and $b_1, \dots, b_t \in M^n$ such that $M \models \varphi(a_i; b_j) \iff i \le j$, for all $i, j \in \{1, \dots, t\}$. The
L-structure M is called unstable if some L-formula ϕ is unstable for M ; and “stable”
(for formulas and structures) is synonymous with “not unstable.”
Laskowski’s observation [55] is that an $L$-formula defines a VC class in $M$ if and
only if it does not have the independence property for $M$. In fact, given a collection $\mathcal{S}$
of subsets of a set $X$, define the dual shatter function of $\mathcal{S}$ as the function $n \mapsto \pi^*_{\mathcal{S}}(n)$
whose value at $n$ is the maximum number of equivalence classes defined by an $n$-element
subfamily $\mathcal{T}$ of $\mathcal{S}$, where two elements of $X$ are said to be equivalent with respect to $\mathcal{T}$
if they belong to the same sets of $\mathcal{T}$. Then a given partitioned $L$-formula $\varphi(x; y)$ has the
independence property precisely if $\pi^*_{\mathcal{S}_\varphi}(n) = 2^n$ for every $n$. The dual shatter function
of $\mathcal{S}_\varphi$ is really a shatter function in disguise: it agrees with the shatter function of $\mathcal{S}_{\varphi^*}$
where $\varphi^*(y; x) := \varphi(x; y)$ is the dual of the partitioned formula $\varphi$. (See Section 3.)
A complete L-theory T is said to have the independence property if some model of it
does, and is said not to have the independence property (or to be NIP) otherwise. Thus
a complete L-theory T is NIP if and only if every L-formula defines a VC class in every
model of T . Many theories arising in mathematical practice turn out to be NIP: By [86],
all stable theories (i.e., complete theories all of whose models are stable) are NIP; so,
for example, algebraically closed (more generally, separably closed) fields, differentially
closed fields, modules, or free groups furnish examples of NIP structures. Furthermore,
o-minimal (or more generally, weakly o-minimal) theories are NIP [55, 61]. By [36] any
ordered abelian group has NIP theory. Certain important theories of henselian valued
fields are NIP, for example, the completions of the theory of algebraically closed valued
fields and the theory of the field of p-adic numbers (and also their rigid analytic and
p-adic subanalytic expansions, respectively). In fact, in the language of rings with a
predicate for the valuation ring, an unramified henselian valued field of characteristic
(0, p) is NIP if and only if its residue field is NIP [12]. Similarly, henselian valued fields
of characteristic (0, 0) and algebraically maximal Kaplansky fields of characteristic (p, p)
are NIP iff their residue fields are NIP [13, 12].
On the other hand, each pseudofinite field (infinite model of the theory of all finite
fields) is not NIP [29], since it defines the (Rado) random graph.
1.4. Uniform bounds on VC density. This paper is motivated by the following
question: Given a NIP theory T , can one find an upper bound, in terms of n only, on
the VC densities (in T ) of all L-formulas ϕ(x; y) with |y| = n? The intuition behind
this question is, of course, that the complexity of a family Sϕ of sets defined by a first-order formula ϕ(x; y) in a NIP structure should be governed by the number n of freely
choosable parameters. Note that the minimum possible bound is |y| = n: for if ϕ(x; y),
where x is a single variable, is the formula x = y1 ∨ · · · ∨ x = yn , then the subsets of M
cut out by Sϕ are exactly the non-empty subsets of M of cardinality at most n, so ϕ(x; y)
has VC density n (in any complete theory). We note here in passing that the VC density
of a formula ϕ in a NIP theory may take fractional values, and that the shatter function
of Sϕ , though not growing faster than polynomially, is not asymptotic to a real power
function in general. See Section 4 below, where we explicitly compute the VC density
of certain incidence structures (related to the Szemerédi-Trotter Theorem) and of the
edge relation in Spencer-Shelah random graphs, and investigate the asymptotics of a
shatter function in the infinitary hypercube.
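The lower-bound example with the formula x = y1 ∨ · · · ∨ x = yn can be checked by brute force. The sketch below is illustrative only: it assumes a hypothetical 8-element universe as a finite stand-in for an infinite model (large enough that parameters can avoid the sample set), and the helper name is ours.

```python
from itertools import product
from math import comb

def traces_of_finite_sets(universe, n, A):
    """Subsets of A cut out by the sets {b_1, ..., b_n}, with b ranging over universe^n."""
    A = frozenset(A)
    return {frozenset(b) & A for b in product(universe, repeat=n)}

universe = range(8)   # hypothetical finite stand-in for an infinite model M
n = 2                 # number of parameter variables
counts = {t: len(traces_of_finite_sets(universe, n, range(t))) for t in range(1, 6)}
expected = {t: sum(comb(t, k) for k in range(n + 1)) for t in range(1, 6)}
```

On a t-element sample set the traces are exactly the subsets of size at most n (the empty trace arises when all parameters lie outside the sample), so the count is a polynomial in t of degree n, in line with the VC density n stated in the text.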
In this paper we employ VC duality to translate the problem of bounding the VC density of a formula ϕ into the task of counting ϕ∗-types over finite parameter sets, which
then can be treated by model-theoretic machinery. Viewing VC density as a bound on
a number of types also illuminates the connection with a strengthening of the NIP concept, which is that of dp-minimality. (See Section 5.3 below for a definition.) Dolich,
Goodrick and Lippel [24] have observed that, if, in a theory, the dual VC density of
any L-formula in a single object variable is less than 2, then the theory in question is
dp-minimal. (No counterexample to the converse of this implication seems to be known.)
We now state our main results. First, an optimal bound on density is obtained for
weakly o-minimal theories (see Theorem 6.1 below). Recall that a complete theory T
in a language containing a binary relation symbol “<” which expands the theory of
linearly ordered sets is called weakly o-minimal if in every model of T , each partitioned
L-formula ϕ(x; y) with |x| = 1 defines a finite union of convex sets. (See [61] for more
on this notion, which generalizes the probably more familiar concept of an o-minimal
theory, cf. [25].)
Theorem 1.1. Suppose L contains a binary relation symbol “<”, interpreted in M as
a linear ordering. If T = Th(M ) is weakly o-minimal, then every L-formula ϕ(x; y)
has VC density at most $n = |y|$ in $T$ (in fact, $\pi_\varphi(t) = O(t^n)$).
This bound is the same as that obtained by Karpinski-Macintyre [49] for o-minimal
expansions of the reals, or by Wilkie and by Johnson-Laskowski [47] for all o-minimal
structures. The motivating example of a theory which is weakly o-minimal but not o-minimal is the theory of real closed valued fields, that is, real closed fields equipped with
a predicate for a proper convex valuation ring. In fact, the methods of Karpinski and
Macintyre can also be adapted to give the correct density bounds for this and certain
other weakly o-minimal expansions of real closed fields [40]. Some interesting weakly
o-minimal theories to which these methods do not readily adapt may be found in [5, 54].
Our approach to Theorem 1.1, via definable types, was partly inspired by the use of
Puiseux series in [11, 81].
Let ACVF denote the theory of (non-trivially) valued algebraically closed fields,
in the ring language expanded by a predicate for the valuation divisibility. This has
completions ACVF(0,0) (for residue characteristic 0), ACVF(0,p) (field characteristic 0,
residue characteristic p), and ACVF(p,p) (field characteristic p). Because ACVF(0,0) is
interpretable in RCVF, our methods give (non-optimal) density bounds for ACVF(0,0)
(Corollary 6.3). However, they give no information on density in the theories ACVF(0,p)
and ACVF(p,p) . The problems arise essentially because a definable set in 1-space in
ACVF is a finite union of ‘Swiss cheeses’ but we have no way of choosing a particular
Swiss cheese. This means that the definable types technique in our main tool (Theorem 5.7) breaks down. On the other hand, our methods do yield:
Theorem 1.2. Suppose $M = \mathbb{Q}_p$ is the field of p-adic numbers, construed as a first-order structure in Macintyre’s language $L_p$. Then the VC density of every $L_p$-formula
$\varphi(x; y)$ is at most $2|y| - 1$.
The same result holds for the subanalytic expansions of Qp considered by Denef and
van den Dries [22]. (Theorem 7.2 and Remark 7.9.) Key tools available here, but
not in the case of ACVF, are cell decomposition and the existence of definable Skolem
functions. We do not know whether the bound in Theorem 1.2 is optimal.
The investigation of the fine structure of type spaces over finite parameter sets in
NIP theories is only just beginning, and the present paper can be seen as a first step in
studying one particular measure (VC density) for their complexity. Applications of the
results in this paper to transversals of definable families in NIP theories will appear in
a separate manuscript, under preparation by the first- and last-named authors.
As remarked above, all stable theories are NIP, so it also makes sense to investigate
VC density in stable theories. In a sequel of the present paper [6] we obtain bounds
on VC density in certain finite U-rank theories (including all complete theories of finite
Morley rank expansions of infinite groups).
We close off this introduction by pointing out that besides being of intrinsic interest,
uniform bounds on VC density of first-order formulas (as obtained in this paper) often
also help to explain why certain well-known effective bounds on the complexity of geometric arrangements, used in computational geometry, are polynomial in the number of
objects involved. For example, the bound on the number of semialgebraically connected
components of realizable sign conditions on polynomials over real closed fields from
[11, 81] breaks up into a topological and a combinatorial part, where the polynomial
nature of the latter may be seen as a consequence of Theorem 1.1:
Example. Let $R$ be a real closed field, $P = (P_1, \dots, P_s)$ be a tuple of polynomials from
$R[X] = R[X_1, \dots, X_k]$, each of degree at most $d$. A sign condition for $P$ is an $s$-tuple
$\sigma \in \{-1, 0, +1\}^s$, and we say that $\sigma$ is realized in a subset $V$ of $R^k$ if
$$\sigma_V := \{a \in V : (\operatorname{sign} P_1(a), \dots, \operatorname{sign} P_s(a)) = \sigma\}$$
is non-empty. Theorem 1.1 in the semialgebraic case yields: if $V$ is an algebraic set
defined by polynomials of degree at most $d$, then the number of sign conditions for $P$
realized in $V$ is at most $C s^m$, where $m = \dim(V)$ and the constant $C = C(d, k)$ only
depends on $d$ and $k$.
To see this recall that by cell decomposition, $V$ is a finite union of semialgebraic
subsets of $R^k$ each of which is semialgebraically homeomorphic to some $R^n$; moreover,
this decomposition (and the resulting homeomorphism) can be chosen uniformly in the
parameters: Every zero set of polynomials from $R[X]$ of degree at most $d$ is the zero set
of $M$ such polynomials, where $M = \binom{k+d}{d}$ is the dimension of the $R$-linear subspace of
$R[X]$ consisting of the polynomials of degree at most $d$; thus we may take a semialgebraic
(in fact, algebraic) family $(V_b)_{b \in R^N}$, where $N = M^2$, whose fibers $V_b$ are the algebraic
subsets of $R^k$ defined by polynomials of degree at most $d$. Then there are finitely many
semialgebraic families $(V_b^{(i)})_{b \in R^N}$ of subsets of $R^k$ and for each $i$ there is a semialgebraic
family $(F_b^{(i)})_{b \in R^N}$ of maps such that for each $b \in R^N$ we have $V_b = \bigcup_i V_b^{(i)}$, and $F_b^{(i)}$ is
a homeomorphism $R^{m^{(i)}} \to V_b^{(i)}$, for some $m^{(i)}$.
Fix some $i$ and write $m = m^{(i)}$. Let $\nu = (\nu_1, \dots, \nu_k)$ range over $\mathbb{N}^k$, with $|\nu| =
\nu_1 + \cdots + \nu_k$, and suppose $y = (y_\nu)_{|\nu| \le d}$, so $y$ has length $M$. Let $P(X; y)$ be the general
polynomial in the indeterminates $X$ of degree at most $d$ with coefficient sequence $y$; so
every $P_j$ is of the form $P_j = P(X; b_j)$ with $b_j \in R^M$. Suppose also $x = (x_1, \dots, x_m)$,
and let $z$ be a tuple of new variables of length $N$, let $z'$ be a single new variable, and
let $\varphi^{(i)}(x; y, z, z')$ be a formula in the language of ordered rings which expresses that
$P(F_z^{(i)}(x); y)$ and $z'$ have the same sign. So, e.g., for $a \in R^m$, $b \in R^N$ we have
$R \models \varphi^{(i)}(a; b_j, b, 1)$ iff $P_j(F_b^{(i)}(a)) > 0$. In this way we see that the number of sign conditions
for $P$ realized in $V_b^{(i)}$ is bounded by $\pi^*_{\varphi^{(i)}}(3s)$ and thus is $O(s^m)$ by Theorem 1.1, where
the implicit constant only depends on $\varphi^{(i)}$ and hence on $d$ and $k$. This yields the claim
highlighted above. (Of course we have been very nonchalant with the constants. Indeed,
[11] shows the more precise result that the sum of the number of semialgebraically
connected components of the sets $\sigma_V$, where $\sigma$ ranges over all sign conditions for $P$
realized in $V$, is bounded by $(O(d))^k \binom{s}{m}$.)
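As a toy illustration of the polynomial growth in $s$, consider the simplest case $k = m = 1$ with $V$ the whole line and linear polynomials $P_j = X - c_j$. This is a sketch under our own assumptions (rational roots, exact arithmetic via `Fraction`, hypothetical function names), not the argument of the paper: since the sign vector can only change at a root, it suffices to sample each root, one point in each gap between consecutive roots, and one point beyond each end.

```python
from fractions import Fraction

def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def realized_sign_conditions(roots):
    """Sign vectors of the polynomials X - c (c in roots) realized on the line."""
    cs = sorted(set(roots))
    if not cs:
        return {()}
    samples = list(cs)                                     # the roots themselves
    samples += [(a + b) / 2 for a, b in zip(cs, cs[1:])]   # one point in each bounded gap
    samples += [cs[0] - 1, cs[-1] + 1]                     # one point beyond each end
    return {tuple(sign(x - c) for c in roots) for x in samples}

# Hypothetical data: s polynomials with roots 0, 1, ..., s - 1.
counts = [len(realized_sign_conditions([Fraction(j) for j in range(s)])) for s in range(1, 8)]
```

Of the $3^s$ conceivable sign conditions only $2s + 1$ are realized here, which is linear in $s$, matching the shape $C s^m$ with $m = 1$.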
A simpler example is the number of non-empty sets definable by equalities and inequalities of a finite collection of polynomials over an algebraically closed field:
Example. Here we let $\nu = (\nu_1, \dots, \nu_m)$ range over $\mathbb{N}^m$, and suppose $y = (y_\nu)_{|\nu| \le d}$. Let
$\varphi(x; y)$ be the partitioned formula
$$\sum_{|\nu| \le d} y_\nu x^\nu = 0$$
in the language $L$ of rings, and fix an algebraically closed field $K$. Then $\mathcal{S}_\varphi = \mathcal{S}_\varphi^K$ is the
collection of all zero sets (in $K^m$) of polynomials in $m$ indeterminates with coefficients in
$K$ having degree at most $d$. Hence $\pi^*_{\mathcal{S}_\varphi}(t)$ is the maximum number of non-empty Boolean
combinations of $t$ such hypersurfaces. In the sequel of our paper (see [6, Theorem 1.1])
we will show that the shatter function of any partitioned $L$-formula with $m$ parameter
variables (such as $\varphi^*$) is $O(t^m)$ in $\mathrm{Th}(K)$; hence $\pi^*_\varphi(t) = \pi_{\varphi^*}(t) = O(t^m)$. (In fact, [46]
proves that $\pi^*_\varphi(t) \le \sum_{k=0}^{m} \binom{t}{k} d^k$ for every $t$, and this bound is asymptotically optimal.)
1.5. Organization of the paper. In the preliminary Section 2 we set the scene by
recalling the definitions and basic facts concerning VC dimension and VC density in
a general combinatorial setting. In Section 3 we then move to the model-theoretic
context; in particular we introduce the VC density function of a complete theory without
finite models, and the (dual) VC density of a finite set of formulas. In Section 4 we
give some interesting examples of formulas in NIP theories for which we can explicitly
compute their VC density or determine the asymptotic behavior of their shatter function.
In Section 5 we introduce the VC d property (a refinement of Guingona’s notion of
uniform definability of types over finite sets) and get our main tool for counting types
(Theorem 5.7) in place, which is then employed, in Section 6, to prove Theorem 1.1
from above. A strengthening of the VC d property is defined and established for the
p-adics in Section 7, thus proving Theorem 1.2. We refer to the introductions of each
section for a more detailed description of their contents.
1.6. Notations and conventions. In this paper, $d, k, m, n$ range over the set $\mathbb{N} :=
\{0, 1, 2, \dots\}$ of natural numbers. We set $[n] := \{1, \dots, n\}$. Given a set $X$, we write $2^X$
for the power set of $X$, and we let $\binom{X}{n}$ denote the set of $n$-element subsets of $X$ and
$\binom{X}{\le n} := \binom{X}{0} \cup \binom{X}{1} \cup \cdots \cup \binom{X}{n}$ the collection of subsets of $X$ of cardinality at most $n$.
1.7. Acknowledgments. Part of the work on this paper was done while some of the
authors were participating in the thematic program on O-minimal Structures and Real
Analytic Geometry at the Fields Institute in Toronto (Spring 2009), and in the Durham
Symposium on New Directions in the Model Theory of Fields (July 2009), organized by
the London Mathematical Society and funded by EPSRC grant EP/F068751/1. The
support of these institutions is gratefully acknowledged. Aschenbrenner was partly
supported by NSF grant DMS-0556197. He would also like to express his gratitude to Gerhard
Wöginger for suggesting the example in Section 4.4.1, and to Andreas Baudisch and
Humboldt-Universität Berlin for their hospitality during Fall 2010. Haskell’s research
was supported by NSERC grant 238875. Macpherson acknowledges support by EPSRC
grant EP/F009712/1. Starchenko was partly supported by NSF grant DMS-0701364.
2. VC Density
In this section we introduce various numerical parameters associated to abstract families
of sets: VC dimension, VC density, and independence dimension, and we recall the
well-known phenomenon of “VC duality” hinted at already in the introduction (which,
in particular, allows us to relate VC dimension with independence dimension). An
important role in later sections is played by a new parameter associated to a set system
defined here, which we call breadth, and which is the focus of the last part of this section.
2.1. VC dimension and VC density. A set system is a pair (X, S) consisting of a
set X and a collection S of subsets of X. We call X the base set of the set system (X, S),
and we sometimes also speak of a set system S on X. Given a set system (X, S) and a
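set $A \subseteq X$, we let $\mathcal{S} \cap A := \{S \cap A : S \in \mathcal{S}\}$ and call $(A, \mathcal{S} \cap A)$ the set system on $A$
induced by $\mathcal{S}$. Let now $\mathcal{S}$ be a set system on an infinite set $X$. The function $\pi_{\mathcal{S}} : \mathbb{N} \to \mathbb{N}$
given by
$$\pi_{\mathcal{S}}(n) := \max\left\{ |\mathcal{S} \cap A| : A \in \binom{X}{n} \right\}$$
is called the shatter function of $\mathcal{S}$. We have $0 \le \pi_{\mathcal{S}}(n) \le 2^n$ and $\pi_{\mathcal{S}}(n) \le \pi_{\mathcal{S}}(n + 1)$ for
all $n$. Note that if $Y \supseteq X$ then $\pi_{\mathcal{S}}$ does not change if $\mathcal{S}$ is considered as a set system
on $Y$. (This justifies our choice of notation for the shatter function, suppressing the
base set $X$ of our set system.)
One says that $A \subseteq X$ is shattered by $\mathcal{S}$ if $\mathcal{S} \cap A = 2^A$. If $\mathcal{S}$ is non-empty, then we
define the VC dimension of $\mathcal{S}$, denoted by $\mathrm{VC}(\mathcal{S})$, as the supremum (in $\mathbb{N} \cup \{\infty\}$) of
the sizes of all finite subsets of $X$ shattered by $\mathcal{S}$; so $\mathrm{VC}(\mathcal{S}) = \infty$ means that arbitrarily
large finite subsets of $X$ can be shattered by $\mathcal{S}$. Equivalently,
$$\mathrm{VC}(\mathcal{S}) = \sup\{n : \pi_{\mathcal{S}}(n) = 2^n\}. \tag{2.1}$$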
One says that S is a VC class if VC(S) < ∞. Note that some sources (e.g., [55])
alternatively define the VC dimension of S to be the minimum n such that no set of
size n is shattered by S (i.e., VC(S) + 1, with VC(S) as given by (2.1)).
We have the following fundamental fact about set systems:
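Lemma 2.1 (Sauer-Shelah). If $\mathcal{S}$ has finite VC dimension $d$ (so $\pi_{\mathcal{S}}(n) < 2^n$ for $n > d$),
then
$$\pi_{\mathcal{S}}(n) \le \binom{n}{\le d} := \binom{n}{0} + \cdots + \binom{n}{d} \qquad \text{for every } n.$$
If $n \ge d$, then $\binom{n}{\le d}$ is bounded above by $(en/d)^d$ (where $e$ is the base of the natural
logarithm). In particular, either $\pi_{\mathcal{S}}(n) = 2^n$ for every $n$ (if $\mathcal{S}$ is not a VC class), or
$\pi_{\mathcal{S}}(n) = O(n^d)$. One may now define the VC density $\mathrm{vc}(\mathcal{S})$ of $\mathcal{S}$ as the infimum of all
real numbers $r > 0$ such that $\pi_{\mathcal{S}}(n) = O(n^r)$, if there is such an $r$, and $\mathrm{vc}(\mathcal{S}) := \infty$
otherwise. That is,
$$\mathrm{vc}(\mathcal{S}) = \limsup_{n \to \infty} \frac{\log \pi_{\mathcal{S}}(n)}{\log n}.$$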
We also define VC(∅) := vc(∅) := −1. Then vc(S) ≤ VC(S) by Lemma 2.1, and
vc(S) < ∞ iff VC(S) < ∞. The VC density of S is also known as the real density [7] or
the VC exponent [17] of S. It is related to the combinatorial dimension of S introduced
by Blei [8] and to compression schemes for S [47].
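The quantities in Lemma 2.1 and in the limsup formula are easy to tabulate. The following numerical sketch (function names are ours) checks the bound $(en/d)^d$ on a finite range, shows that it can fail when $n < d$, and evaluates the ratio $\log \pi / \log n$ for the extremal growth $\binom{n}{\le d}$, which approaches $d$ only slowly.

```python
from math import comb, e, log

def binom_le(n, d):
    """The Sauer-Shelah bound: C(n, 0) + C(n, 1) + ... + C(n, d)."""
    return sum(comb(n, i) for i in range(d + 1))

def within_exponential_bound(n, d):
    """Check C(n, <=d) <= (e*n/d)^d; only meaningful for d >= 1."""
    return binom_le(n, d) <= (e * n / d) ** d

# The bound holds on this finite test range with n >= d >= 1 ...
bound_ok = all(within_exponential_bound(n, d) for d in range(1, 6) for n in range(d, 200))
# ... but not for n < d: here C(1, <=2) = 2 exceeds (e/2)^2.
fails_below_d = not within_exponential_bound(1, 2)

def log_ratio(n, d):
    """log C(n, <=d) / log n, the quantity whose limsup defines the density."""
    return log(binom_le(n, d)) / log(n)
```

For instance, `log_ratio(10**6, 2)` is still visibly below 2, a reminder that the density is an asymptotic quantity and is not read off reliably from small samples.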
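Example. Suppose $\mathcal{S} = \binom{X}{\le d}$. Then the inequality in the statement of Lemma 2.1 is an
equality, and $\mathrm{VC}(\mathcal{S}) = \mathrm{vc}(\mathcal{S}) = d$.
Example. Suppose $X = \mathbb{R}^d$, and $\mathcal{S}$ is the collection of all closed affine half-spaces in $\mathbb{R}^d$,
i.e., sets of the form $\{x \in \mathbb{R}^d : \langle x, a \rangle \ge \beta\}$ where $a \in \mathbb{R}^d$, $\beta \in \mathbb{R}$, and $\langle\ ,\ \rangle$ denotes the
usual inner product on $\mathbb{R}^d$. Then $\mathrm{VC}(\mathcal{S}) = d + 1$. (The proof of this fact is based on
Radon’s Theorem on convex sets; see [7, Corollaire 3.5].) Moreover, $\mathrm{vc}(\mathcal{S}) = d$; in fact,
$\pi_{\mathcal{S}}(n) = 2 \sum_{i=0}^{d} (-1)^{d-i} \binom{n}{\le i}$ for every $n$; see [30, Theorem 3.1].
Example. Suppose $X = \mathbb{R}$, $k \ge 1$, and let $\mathcal{S}$ be the collection whose members are
the unions of $k$ disjoint (open) intervals in $\mathbb{R}$. Then $\mathrm{VC}(\mathcal{S}) = \mathrm{vc}(\mathcal{S}) = 2k$, in fact,
$\pi_{\mathcal{S}}(n) = \binom{n}{\le 2k}$ for each $n$. (See [28, Exercise 11, Chapter 4].)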
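The interval example can be verified by brute force for small $n$. The sketch rests on a reduction that is our own reading, not spelled out in the text: a subset of $n$ points on the line is cut out by a union of $k$ disjoint open intervals exactly when it consists of at most $k$ runs of consecutive points (intervals not needed to cover a run can be placed away from all sample points). Subsets are encoded as 0/1 tuples.

```python
from itertools import product
from math import comb

def runs_of_ones(bits):
    """Number of maximal blocks of consecutive 1s in a 0/1 tuple."""
    return sum(1 for i, b in enumerate(bits) if b and (i == 0 or not bits[i - 1]))

def count_traces(n, k):
    """Number of subsets of n collinear points made up of at most k runs."""
    return sum(1 for bits in product((0, 1), repeat=n) if runs_of_ones(bits) <= k)

def binom_le(n, d):
    """C(n, 0) + C(n, 1) + ... + C(n, d)."""
    return sum(comb(n, i) for i in range(d + 1))

matches = all(
    count_traces(n, k) == binom_le(n, 2 * k)
    for k in range(1, 4)
    for n in range(10)
)
```

The agreement reflects the identity $\binom{n+1}{2r} = \binom{n}{2r} + \binom{n}{2r-1}$, where $\binom{n+1}{2r}$ counts the 0/1 strings of length $n$ with exactly $r$ runs of 1s.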
In all three examples, πS is actually given by a polynomial of degree d = vc(S). It
is worth pointing out that for a VC class S, in general πS is not even asymptotic to a
real power function; see Section 4.4 below.
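Clearly, VC and vc are increasing: if $\mathcal{S} \subseteq \mathcal{T} \subseteq 2^X$, then $\pi_{\mathcal{S}} \le \pi_{\mathcal{T}}$ and so $\mathrm{VC}(\mathcal{S}) \le
\mathrm{VC}(\mathcal{T})$ and $\mathrm{vc}(\mathcal{S}) \le \mathrm{vc}(\mathcal{T})$. If $X'$ is an infinite subset of $X$ then $\pi_{\mathcal{S} \cap X'} \le \pi_{\mathcal{S}}$; more
generally (see [7, Proposition 2.2]):
Lemma 2.2. Let $X'$ be an infinite set and $f : X' \to X$ be a map, and let $f^{-1}(\mathcal{S}) :=
\{f^{-1}(S) : S \in \mathcal{S}\}$. Then $\pi_{f^{-1}(\mathcal{S})} \le \pi_{\mathcal{S}}$, with equality if $f$ is surjective. In particular,
$\mathrm{VC}(f^{-1}(\mathcal{S})) \le \mathrm{VC}(\mathcal{S})$ and $\mathrm{vc}(f^{-1}(\mathcal{S})) \le \mathrm{vc}(\mathcal{S})$, with equality if $f$ is surjective.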
It is easy to verify that VC(S) = 0 if and only if |S| = 1, and vc(S) = 0 if S
is finite; in fact, the converse of the latter implication also holds: if vc(S) < 1, then
S is finite [7, Proposition 2.19] (and hence actually vc(S) = 0). It is also easy to
verify (cf. [7, Proposition 2.4]) that if S1 , S2 are subsets of S with S = S1 ∪ S2 , then
vc(S) = max{vc(S1 ), vc(S2 )}. In particular, vc(S) does not change if we alter finitely
many sets from S.
2.2. Independence dimension. Let $X$ be a set. Given subsets $A_1, \dots, A_n$ of $X$,
we denote by $S(A_1, \dots, A_n)$ the set of atoms of the Boolean algebra of subsets of $X$
generated by $A_1, \dots, A_n$ (the “non-empty fields in the Venn diagram of $A_1, \dots, A_n$”);
that is, $S(A_1, \dots, A_n)$ is precisely the set of non-empty subsets of $X$ of the form
$$\bigcap_{i \in I} A_i \cap \bigcap_{i \in [n] \setminus I} (X \setminus A_i) \qquad \text{where } I \subseteq [n] = \{1, \dots, n\}.$$
Note that $S(A_1, \dots, A_n)$ does not depend on the particular order of the $A_i$, so sometimes
we abuse notation and, e.g., write $S(A_i : i = 1, \dots, n)$ instead of $S(A_1, \dots, A_n)$. We
have $0 \le |S(A_1, \dots, A_n)| \le 2^n$, and we say that the sequence $A_1, \dots, A_n$ is independent
(in $X$) if $|S(A_1, \dots, A_n)| = 2^n$, and call $A_1, \dots, A_n$ dependent (in $X$) otherwise.
Suppose now that $\mathcal{S}$ is a collection of subsets of $X$. We define $\pi^*_{\mathcal{S}} : \mathbb{N} \to \mathbb{N}$ by
$$\pi^*_{\mathcal{S}}(n) := \max\{ |S(A_1, \dots, A_n)| : A_1, \dots, A_n \in \mathcal{S} \}.$$
Note that $0 \le \pi^*_{\mathcal{S}}(n) \le 2^n$ for each $n$. We say that $\mathcal{S}$ is independent (in $X$) if $\pi^*_{\mathcal{S}}(n) = 2^n$
for every $n$, that is, if for every $n$ there is an independent sequence of elements of $\mathcal{S}$ of
length $n$. Otherwise, we say that $\mathcal{S}$ is dependent (in $X$). If $\mathcal{S}$ is dependent, we define
the independence dimension $\mathrm{IN}(\mathcal{S})$ of $\mathcal{S}$ as the largest $n$ such that $\pi^*_{\mathcal{S}}(n) = 2^n$, and if $\mathcal{S}$
is independent, we set $\mathrm{IN}(\mathcal{S}) = \infty$. If $\mathcal{S}$ is finite, then clearly $\mathrm{IN}(\mathcal{S}) \le |\mathcal{S}|$.
Example 2.3. $\mathrm{IN}(\mathcal{S}) \le 1$ iff for all $S, S' \in \mathcal{S}$ one of the following relations holds: $S \cap S' =
\emptyset$, $S \subseteq S'$, $S' \subseteq S$, or $S \cup S' = X$.
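The atoms $S(A_1, \dots, A_n)$ can be computed directly by grouping points according to which of the sets they belong to. The sketch below uses hypothetical toy data on an 8-element base set; the names are ours.

```python
def atoms(X, sets):
    """Non-empty atoms of the Boolean algebra of subsets of X generated by `sets`."""
    classes = {}
    for x in X:
        # Two points lie in the same atom iff they belong to the same sets A_i.
        membership = tuple(x in A for A in sets)
        classes.setdefault(membership, set()).add(x)
    return [frozenset(c) for c in classes.values()]

def is_independent(X, sets):
    """A sequence of n sets is independent iff it generates 2^n non-empty atoms."""
    return len(atoms(X, sets)) == 2 ** len(sets)

X = range(8)
bit_sets = [{x for x in X if (x >> i) & 1} for i in range(3)]   # A_i: numbers whose i-th bit is 1
chain = [{0}, {0, 1}, {0, 1, 2}]                                # nested sets
```

The three bit sets are independent (all 8 atoms are non-empty), whereas no two members of the chain are, consistent with Example 2.3: in a chain any two members are comparable under inclusion.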
The function $\pi^*_{\mathcal{S}}$ is called the dual shatter function of $\mathcal{S}$, since (for infinite $\mathcal{S}$) one
has $\pi^*_{\mathcal{S}} = \pi_{\mathcal{S}^*}$ for a certain set system $\mathcal{S}^*$ on $X^* = \mathcal{S}$, called the dual of $\mathcal{S}$ (cf. [7,
2.7–2.11] or [66, Section 10.3]). For the same reason, the independence dimension of
$\mathcal{S}$ is sometimes also called the dual VC dimension of $\mathcal{S}$, denoted by $\mathrm{VC}^*(\mathcal{S})$. The
correspondence between $\mathcal{S}$ and $\mathcal{S}^*$ is explained in the following subsection.
2.3. VC duality. Let $X$ and $Y$ be infinite sets, and let $\Phi \subseteq X \times Y$. For $y \in Y$ we put
$$\Phi_y := \{x \in X : (x, y) \in \Phi\},$$
and we set
$$\mathcal{S}_\Phi := \{\Phi_y : y \in Y\} \subseteq 2^X.$$
We also write
$$\Phi^* := \{(y, x) \in Y \times X : (x, y) \in \Phi\} \subseteq Y \times X$$
for the dual of the binary relation $\Phi$. In this way we obtain two set systems $(X, \mathcal{S}_\Phi)$
and $(Y, \mathcal{S}_{\Phi^*})$. To simplify notation, we denote the shatter function of $\mathcal{S}_\Phi$ by $\pi_\Phi$, and
its dual shatter function by $\pi^*_\Phi$; similarly for $\Phi^*$ in place of $\Phi$. One verifies easily that
given a finite set $A \subseteq X$, the assignment
$$A' \mapsto \bigcap_{x \in A'} \Phi^*_x \cap \bigcap_{x \in A \setminus A'} (Y \setminus \Phi^*_x)$$
defines a bijection
$$\mathcal{S}_\Phi \cap A \to S(\Phi^*_x : x \in A).$$
This implies:
Lemma 2.4. $\pi_\Phi = \pi^*_{\Phi^*}$.
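Lemma 2.4 can be tested on a small finite relation. This is a sketch only: the lemma is stated for infinite $X$ and $Y$, while here both are finite and the comparison is restricted to $n \le |X|$; the relation `phi` is an arbitrary hypothetical example and the function names are ours.

```python
from itertools import combinations, combinations_with_replacement

def shatter(phi, X, Y, n):
    """pi_Phi(n): max number of traces {x in A : (x, y) in Phi}, y in Y, over n-subsets A of X."""
    return max(
        len({frozenset(x for x in A if (x, y) in phi) for y in Y})
        for A in combinations(X, n)
    )

def dual_shatter(family, base, n):
    """pi*_S(n): max number of non-empty atoms generated in `base` by n members of the family."""
    return max(
        len({tuple(p in A for A in sets) for p in base})
        for sets in combinations_with_replacement(list(family), n)
    )

# Hypothetical relation Phi on X x Y.
X, Y = range(5), range(6)
phi = {(x, y) for x in X for y in Y if (x + 2 * y) % 5 < 2}

# The set system S_{Phi*} on Y consists of the sets Phi*_x = {y : (x, y) in Phi}.
dual_family = {frozenset(y for y in Y if (x, y) in phi) for x in X}

agree = all(
    shatter(phi, X, Y, n) == dual_shatter(dual_family, Y, n)
    for n in range(1, len(X) + 1)
)
```

The two counts agree because a trace of $\Phi_y$ on $A$ records exactly which of the sets $\Phi^*_x$, $x \in A$, contain $y$, which is the bijection described before the lemma.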
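We set $\mathrm{VC}(\Phi) := \mathrm{VC}(\mathcal{S}_\Phi)$, and similarly with IN and vc in place of VC. By the previous
lemma, $\mathrm{VC}(\Phi) = \mathrm{IN}(\Phi^*)$, hence $\mathcal{S}_\Phi$ is a VC class iff $\mathcal{S}_{\Phi^*}$ is dependent. Reversing
the role of $\Phi$ and $\Phi^*$ also yields $\pi_{\Phi^*} = \pi^*_\Phi$, hence $\mathrm{VC}(\Phi^*) = \mathrm{IN}(\Phi)$, and $\mathcal{S}_{\Phi^*}$ is a VC
class iff $\mathcal{S}_\Phi$ is dependent. The following is also well-known (see, e.g., [7, 2.13 b)]):
Lemma 2.5. $\mathrm{VC}(\Phi) < 2^{1 + \mathrm{VC}(\Phi^*)}$. (In particular $\mathcal{S}_\Phi$ is a VC class iff $\mathcal{S}_{\Phi^*}$ is a VC
class.)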
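Example 2.6. Suppose $\mathcal{S}_\Phi$ is finite (i.e., $\mathrm{vc}(\Phi) = 0$). Then $\mathcal{S}_{\Phi^*}$ is also finite. (Take
$y_1, \dots, y_N \in Y$, where $N = |\mathcal{S}_\Phi|$, such that $\mathcal{S}_\Phi = \{\Phi_{y_1}, \dots, \Phi_{y_N}\}$. Let $X_i = \Phi_{y_i}$ and
$Y_i = \{y \in Y : \Phi_y = \Phi_{y_i}\}$ for $i \in [N]$; thus $\Phi = X_1 \times Y_1 \cup \cdots \cup X_N \times Y_N$. Hence for each
$x \in X$, $\Phi^*_x$ is a union of some of the sets $Y_1, \dots, Y_N$, and so there are only finitely many choices for $\Phi^*_x$.
Thus $\mathcal{S}_{\Phi^*}$ is also finite, of size at most $2^N$.)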
Clearly every infinite set system S on X is of the form S = SΦ for some infinite set Y
and some binary relation Φ ⊆ X × Y : just take Y = S, Φ = {(x, S) : x ∈ S, S ∈ S}.
The resulting set system SΦ∗ on Y = S is called the dual S ∗ of S in [66, Section 10.3].
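By the above $\mathrm{VC}(\mathcal{S}^*) = \mathrm{VC}^*(\mathcal{S})$. If $\mathcal{S}$ is a dependent infinite set system on $X$, then
by Lemmas 2.1 and 2.4, there is a real number $r \ge 0$ such that $\pi^*_{\mathcal{S}}(n) = O(n^r)$, and the
infimum of all such $r$ is called the dual VC density of $\mathcal{S}$, denoted by $\mathrm{vc}^*(\mathcal{S})$; note that
$\mathrm{vc}(\mathcal{S}^*) = \mathrm{vc}^*(\mathcal{S})$ and $\mathrm{vc}^*(\mathcal{S}) \le \mathrm{VC}^*(\mathcal{S})$.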
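Given $\Phi \subseteq X \times Y$ we write $\neg\Phi$ for the relative complement $(X \times Y) \setminus \Phi$ of $\Phi$ in $X \times Y$.
We clearly have $\pi^*_{\neg\Phi} = \pi^*_\Phi$. It is also easy to show that given $\Phi, \Psi \subseteq X \times Y$ we have
$\pi^*_{\Phi \cup \Psi} \le \pi^*_\Phi \cdot \pi^*_\Psi$ and hence (using complementation) $\pi^*_{\Phi \cap \Psi} \le \pi^*_\Phi \cdot \pi^*_\Psi$. By passing to
duals and Lemma 2.4, this yields: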
Lemma 2.7. Let Φ, Ψ ⊆ X × Y . Then
vc(¬Φ) = vc(Φ),
vc(Φ ∪ Ψ) ≤ vc(Φ) + vc(Ψ),
vc(Φ ∩ Ψ) ≤ vc(Φ) + vc(Ψ).
VC dimension does not satisfy a similar subadditivity property for unions and intersections (cf. [27, Proposition 9.2.8]). In this way, VC density is better behaved than VC
dimension.
An important class of relations Φ such that the associated set system SΦ is dependent
are the stable ones. An n-ladder for Φ is a 2n-tuple (a1 , . . . , an , b1 , . . . , bn ) where each
ai ∈ X and each bj ∈ Y , such that for all i, j ∈ [n],
$$(a_i, b_j) \in \Phi \iff i \le j.$$
If there is an n such that there is no n-ladder for Φ, then Φ is called stable, and Φ is
said to be unstable otherwise. If Φ is stable then the largest n such that an n-ladder
for Φ exists is called the ladder dimension of Φ; if Φ is unstable then we say that the
ladder dimension of Φ is infinite. Clearly if Φ is stable then SΦ is a VC class (with
VC dimension bounded by the ladder dimension). It is well-known that Φ is stable iff
Φ∗ is stable (e.g., [88, Exercise II.2.8]), and that Boolean combinations of stable relations
are stable.
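For finite relations the ladder dimension can be found by exhaustive search. The sketch below is a hedged illustration (the function names `has_ladder` and `ladder_dimension` are introduced here and are not from the text); it uses the fact that the $a_i$, and likewise the $b_j$, of a ladder are necessarily pairwise distinct, so only injective tuples need to be tried.

```python
from itertools import permutations


def has_ladder(phi, X, Y, n):
    """True iff some a_1..a_n in X, b_1..b_n in Y satisfy (a_i, b_j) in phi <=> i <= j."""
    for a in permutations(X, n):
        for b in permutations(Y, n):
            if all(((a[i], b[j]) in phi) == (i <= j)
                   for i in range(n) for j in range(n)):
                return True
    return False


def ladder_dimension(phi, X, Y):
    """Largest n admitting an n-ladder (always finite for a finite relation)."""
    n = 0
    # An (n+1)-ladder restricts to an n-ladder, so the search can stop
    # at the first n for which no ladder exists.
    while n < min(len(X), len(Y)) and has_ladder(phi, X, Y, n + 1):
        n += 1
    return n


X = Y = list(range(4))
order = {(x, y) for x in X for y in Y if x <= y}      # a linear order
equality = {(x, y) for x in X for y in Y if x == y}   # equality

print(ladder_dimension(order, X, Y), ladder_dimension(equality, X, Y))
```

On a 4-element linear order the relation ≤ admits a ladder of every length up to 4, whereas equality admits no 2-ladder; on infinite sets the former is the basic unstable relation and the latter a stable one.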
2.4. Breadth. In many cases of interest complicated set systems are generated by simpler collections of subsets, and then the following lemma (essentially due to Dudley)
can be used to show that the resulting set system is dependent. For this let X be a set
and B be a collection of subsets of X.
Lemma 2.8. Let N > 0 and suppose S is a set system on X such that each set in S
is a Boolean combination of at most N sets in B. Then $\pi^*_{\mathcal{S}}(t) \le \pi^*_{\mathcal{B}}(Nt)$ for each t. (In
particular, if B is dependent then so is S.)
Proof. Let A1 , . . . , At ∈ S, and let each Ai be a Boolean combination of the sets
Bi1 , . . . , BiN ∈ B. Then the Boolean algebra of subsets of X generated by the sets Ai
(i ∈ [t]) is contained in the Boolean algebra generated by the sets Bij (i ∈ [t], j ∈ [N ]),
and every atom of the former Boolean algebra contains an atom of the latter.
Suppose there is a d > 0 such that every non-empty intersection B1 ∩ · · · ∩ Bn of
n > d sets from B equals an intersection of a subset consisting of d of the Bi . We call
the smallest such integer d > 0 the breadth of B. This choice of terminology is motivated
by lattice theory: Given a (meet-) semilattice (L, ∧), the smallest d > 0 (if it exists)
such that any meet b1 ∧ · · · ∧ bn of n > d elements of L equals the meet of d of the
bi is called the breadth of L; if there is no such d we say that L has infinite breadth.
(See [16, Section II.5, Exercise 6, and Section IV.10].) So if B is closed under (finite)
intersection and only contains non-empty subsets of X, then the breadth of B, viewed
as a sub-semilattice of $(2^X, \cap)$, agrees with the breadth of B as defined above. Every
set system of finite breadth is dependent:
Lemma 2.9. breadth(B) ≥ IN(B).
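Proof. Suppose d := breadth(B) < n := IN(B). Let $B_1, \dots, B_n \in \mathcal{B}$ be such that
$|S(B_1, \dots, B_n)| = 2^n$. Choose $I \subseteq [n]$ with $|I| = d$ and $\bigcap_{i\in I} B_i = \bigcap_{i\in[n]} B_i$, and
take $j \in [n] \setminus I$. Then $\bigcap_{i\in[n]\setminus\{j\}} B_i = \bigcap_{i\in[n]} B_i$ and hence $(X \setminus B_j) \cap \bigcap_{i\in[n]\setminus\{j\}} B_i = \emptyset$,
contradicting IN(B) = n.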
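Lemma 2.9 can be tested on small finite set systems. The following Python sketch makes two assumptions: the independence dimension is taken to be the largest n for which some n members generate $2^n$ non-empty atoms (as used in the proof above), and it suffices to inspect subfamilies of size d + 1 when testing breadth d (larger intersections reduce to this case by induction). The function names are hypothetical.

```python
from itertools import combinations


def breadth(family):
    """Smallest d >= 1 such that every non-empty intersection of d + 1 members
    equals the intersection of some d of them (finite families only)."""
    fam = list(set(family))

    def irredundant(sub):
        total = frozenset.intersection(*sub)
        return bool(total) and all(
            frozenset.intersection(*(sub[:i] + sub[i + 1:])) != total
            for i in range(len(sub))
        )

    d = 1
    while any(irredundant(sub) for sub in combinations(fam, d + 1)):
        d += 1
    return d


def independence_dimension(family, X):
    """Largest n such that some n members generate 2**n non-empty atoms on X."""
    fam = list(set(family))
    n = 0
    while n < len(fam) and any(
        len({tuple(x in B for B in sub) for x in X}) == 2 ** (n + 1)
        for sub in combinations(fam, n + 1)
    ):
        n += 1
    return n


X = range(5)
initial = [frozenset(range(k)) for k in range(1, 5)]                       # initial segments
intervals = [frozenset(range(a, b)) for a in X for b in range(a + 1, 6)]   # convex sets

for fam in (initial, intervals):
    print(breadth(fam), independence_dimension(fam, X))
```

In both sample families the two quantities happen to coincide; the lemma only guarantees the inequality breadth ≥ IN.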
The previous two lemmas in combination with Lemma 2.1 immediately yield the
following useful fact (cf. [25, Chapter 5, Lemma 2.6]):
Corollary 2.10. Suppose B has breadth d, let N > 0, and let S be a set system on X
with the property that each set in S is a Boolean combination of at most N sets in B.
Then
$$\pi^*_{\mathcal{S}}(t) \le \sum_{i=0}^{d} \binom{Nt}{i} \quad \text{for every } t.$$
In particular, $\pi^*_{\mathcal{S}}(t) = O(t^d)$ and hence $\mathrm{vc}^*(\mathcal{S}) \le d$.
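The bound of Corollary 2.10 can be compared with an exhaustive computation in a small case. The sketch below is an illustration under stated assumptions: the ground set is {0, …, 7}, B is the family of initial segments (breadth d = 1), and S consists of the half-open intervals [a, b), each a difference of two initial segments (so N = 2). The helper names are introduced here for the example.

```python
from itertools import combinations
from math import comb


def atoms(sets, X):
    """Number of non-empty atoms of the Boolean algebra on X generated by `sets`."""
    return len({tuple(x in A for A in sets) for x in X})


def dual_shatter(system, X, t):
    """pi*_S(t): the maximal number of atoms generated by t members of S."""
    return max(atoms(sub, X) for sub in combinations(system, t))


def corollary_bound(d, N, t):
    """Right-hand side of Corollary 2.10: sum of binom(N*t, i) for i = 0..d."""
    return sum(comb(N * t, i) for i in range(d + 1))


X = range(8)
intervals = [frozenset(range(a, b)) for a in X for b in range(a + 1, 9)]

for t in (1, 2, 3):
    print(t, dual_shatter(intervals, X, t), corollary_bound(1, 2, t))
```

With d = 1 and N = 2 the bound is 1 + 2t, which is linear in t, in line with the conclusion $\mathrm{vc}^*(\mathcal{S}) \le 1$.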
Example 2.11. Let < be a linear ordering on X. We first recall some terminology: A
subset S of X is said to be convex (with respect to <) if for all s, s′ ∈ S and x ∈ X the
implication s < x < s′ ⇒ x ∈ S holds. So ∅ and singleton subsets are convex, as are
intervals in X. Here and in the rest of the paper, an interval in X is a subset of the
form
(a, b) := {x ∈ X : a < x < b}
where a, b are elements of X ∪ {±∞} with a < b. Other examples of a convex subset
of X are its initial segments: a subset S of X is an initial segment of X if for all s ∈ S
and x ∈ X, the implication x < s ⇒ x ∈ S holds. Now let S be the family of unions
of at most N convex subsets of X, for some given N ∈ N, and let B be the collection
of all initial segments of X. Then B has breadth 1, and every set in S is a Boolean
combination of at most 2N sets in B. Thus $\pi^*_{\mathcal{S}}(t) = O(t)$ by Corollary 2.10.
Example 2.12. Let K be a field and v : K → Γ∞ := Γ ∪ {∞} be a valuation on K. By
an open ball in K we mean any subset of K of the form {x ∈ K : v(x − a) > γ} where
a ∈ K, γ ∈ Γ∞ ; similarly a set of the form {x ∈ K : v(x − a) ≥ γ} is called a closed ball
in K. A ball in K is an open or a closed ball in K. Any two given balls in K are either
disjoint, or one contains the other. Hence the collection B of balls in a given valued field
has breadth 1. Thus if S is the family of all Boolean combinations of at most N balls
in K, for some N ∈ N, then $\pi^*_{\mathcal{S}}(t) = O(t)$.
The preceding examples can be subsumed under the following general example (inspired by [2]):
Example 2.13. A family $\mathcal{B}$ of subsets of X is said to be directed if $\mathcal{B}$ has breadth 1; i.e.,
for all $B, B' \in \mathcal{B}$ with $B \cap B' \neq \emptyset$ one has $B \subseteq B'$ or $B' \subseteq B$. If $\mathcal{B} \subseteq 2^X$ is directed and
$\mathcal{S}$ is the family of Boolean combinations of at most N sets in $\mathcal{B}$, for some N ∈ N, then
$\pi^*_{\mathcal{S}}(t) = O(t)$.
We also note:
Example 2.14. Let G be a group and let H be a collection of subgroups of G with
breadth d. Let B = {gH : g ∈ G, H ∈ H} be the set of all (left) cosets of subgroups
from H. Then B also has breadth d. This follows from the general fact that if $H_1, \dots, H_n$
are subgroups of G and $g_1, \dots, g_n \in G$, then the intersection $\bigcap_{i\in[n]} g_i H_i$ is either empty or
a coset of $\bigcap_{i\in[n]} H_i$. (So if S is a family of Boolean combinations of at most N elements
of B, for some N ∈ N, then $\pi^*_{\mathcal{S}}(t) = O(t^d)$.)
In connection with the previous example it is worth recording:
Lemma 2.15 (Poizat). Let G be a group and let H be a collection of subgroups of G.
Then breadth(H) = IN(H).
Proof. By Lemma 2.9 we already know that breadth(H) ≥ IN(H). Suppose this inequality
is strict. Then there are $H_1, \dots, H_{n+1} \in \mathcal{H}$, where n = IN(H), such that
$\bigcap_{i\in[n+1]} H_i \neq \bigcap_{i\in[n+1]\setminus\{j\}} H_i$ for each $j \in [n+1]$. So for each $j \in [n+1]$ we may take
$g_j \in \bigl(\bigcap_{i\in[n+1]\setminus\{j\}} H_i\bigr) \setminus H_j$. Then for every subset I of [n + 1] the element $g_I := \prod_{i\in I} g_i$
(with $g_\emptyset = 1$) is in $\bigcap_{i\in[n+1]\setminus I} H_i \cap \bigcap_{i\in I} (G \setminus H_i)$. This contradicts IN(H) = n.
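The construction of the elements $g_I$ in this proof can be traced in a concrete finite group. The sketch below is an illustration, not part of the argument: it assumes G = Z/30Z (written additively, so products become sums) with the subgroups 2G, 3G, 5G, none of which contains the intersection of the other two, and checks that $g_I$ realizes the membership pattern claimed in the proof.

```python
from itertools import combinations

n = 30
# Subgroups H_1 = 2G, H_2 = 3G, H_3 = 5G of G = Z/30Z.
H = [frozenset(range(0, n, d)) for d in (2, 3, 5)]
# g_j lies in the intersection of the H_i with i != j, but not in H_j.
g = [15, 10, 6]


def witness_patterns():
    """Membership pattern of g_I in (H_1, H_2, H_3) for every subset I of {0, 1, 2}."""
    patterns = {}
    for r in range(len(H) + 1):
        for I in combinations(range(len(H)), r):
            g_I = sum(g[i] for i in I) % n
            patterns[I] = tuple(g_I in Hi for Hi in H)
    return patterns


patterns = witness_patterns()
# g_I lies in H_i exactly when i is not in I, so all 2**3 atoms are non-empty.
print(len(set(patterns.values())))
```

All eight Boolean atoms generated by the three subgroups are therefore non-empty, which is exactly how an irredundant intersection of subgroups forces independence.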
Example. Let S be the collection of all subgroups of (Z, +). Then S has infinite breadth,
hence infinite independence dimension by the previous lemma, and thus is not a VC class
by Lemma 2.5. In particular, the collection of arithmetic progressions a + bZ (a, b ∈ Z)
in Z is also not a VC class.
If our family B has finite breadth d, then the Helly number of B is at most d. The
Helly number of B is defined as the smallest d > 0 such that every finite subfamily
{B1 , . . . , Bn } of B with n > d which is d-consistent, is consistent, that is to say: if for
every $I \in \binom{[n]}{d}$ we have $\bigcap_{i\in I} B_i \neq \emptyset$, then $\bigcap_{i\in[n]} B_i \neq \emptyset$. Note however that conversely,
the breadth may be infinite yet the Helly number finite, even in the case of cosets: the
collection of arithmetic progressions in Z is independent, but has Helly number 2. Also,
not every VC class has finite Helly number: the family whose members are the subsets of
R with two connected components, though a VC class (of VC dimension 4), has infinite
Helly number. (For each n the elements [0, i) ∪ (i + 1, n], i = 0, . . . , n − 1, of this family
form an (n − 1)-consistent subfamily which is inconsistent.)
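The parenthetical claim can be verified mechanically for a given n. The following sketch is a hedged illustration with hypothetical helper names; since all endpoints are integers, membership in each set is constant on every open unit interval, so it is enough to test the integers and half-integers in [0, n].

```python
from fractions import Fraction


def member(i, n, x):
    """Is x in [0, i) ∪ (i + 1, n]?"""
    return 0 <= x < i or i + 1 < x <= n


def common_point(indices, n):
    """A point lying in every set with index in `indices`, or None if there is none."""
    for k in range(2 * n + 1):
        x = Fraction(k, 2)  # candidates 0, 1/2, 1, ..., n
        if all(member(i, n, x) for i in indices):
            return x
    return None


n = 5
# The whole family has empty intersection ...
print(common_point(range(n), n))
# ... but omitting any single set leaves a common point.
print([common_point([i for i in range(n) if i != j], n) is not None for j in range(n)])
```

Omitting the set with index j frees the open interval (j, j + 1), which lies in every other member of the family.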
The following example is a prototype for finite-breadth families when we have a
dimension function at our disposal:
Example 2.16. Define the height of $\mathcal{B}$ to be the largest d (if it exists) such that there
are $B_1, \dots, B_d \in \mathcal{B}$ with
$$B_1 \supsetneq B_1 \cap B_2 \supsetneq \cdots \supsetneq B_1 \cap \cdots \cap B_d \neq \emptyset.$$
So $\mathcal{B}$ has height 0 iff $\mathcal{B}$ does not contain a non-empty set, and $\mathcal{B}$ has height 1 iff $\mathcal{B}$ does
contain a non-empty set, but any two distinct elements of $\mathcal{B}$ are disjoint. Clearly if $\mathcal{B}$
has height d > 0, then the breadth of $\mathcal{B}$ is at most d. If $\mathcal{B}$ has height d > 1 and in
addition $\mathcal{B}$ has a largest element (with respect to inclusion) then the breadth of $\mathcal{B}$ is
smaller than d: to see this let $B_1, \dots, B_d \in \mathcal{B}$ with $\bigcap_{i\in[d]} B_i \neq \emptyset$ be given; if $B_1$ is the
largest element $B$ of $\mathcal{B}$ then clearly $\bigcap_{i\in[d]} B_i = \bigcap_{i\in[d]\setminus\{1\}} B_i$, and otherwise we have a
chain
$$B \supsetneq B_1 \supseteq B_1 \cap B_2 \supseteq \cdots \supseteq B_1 \cap \cdots \cap B_d \neq \emptyset,$$
hence $\bigcap_{i\in[j]} B_i = \bigcap_{i\in[j+1]} B_i$ and so $\bigcap_{i\in[d]} B_i = \bigcap_{i\in[d]\setminus\{j+1\}} B_i$, for some $j \in [d-1]$.
The following observation (the proof of which we leave to the reader) allows us to
produce new finite-breadth set systems from old ones:
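Lemma 2.17. Let $\mathcal{B}$, $\mathcal{B}'$ be set systems on $X$ and $X'$, respectively, and consider the set
system
$$\mathcal{B} \boxtimes \mathcal{B}' := \{B \times B' : B \in \mathcal{B},\ B' \in \mathcal{B}'\}$$
on $X \times X'$. Then
$$\mathrm{breadth}(\mathcal{B} \boxtimes \mathcal{B}') \le \mathrm{breadth}(\mathcal{B}) + \mathrm{breadth}(\mathcal{B}'),$$
and this inequality is an equality if both $\mathcal{B}$ and $\mathcal{B}'$ have breadth larger than 1 and contain
a largest element (with respect to inclusion).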
This lemma immediately yields:
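Corollary 2.18. Let $\mathcal{B}$, $\mathcal{B}'$ be set systems on X. Then the set system
$$\mathcal{B} \sqcap \mathcal{B}' := \{B \cap B' : B \in \mathcal{B},\ B' \in \mathcal{B}'\}$$
on X has breadth at most $\mathrm{breadth}(\mathcal{B}) + \mathrm{breadth}(\mathcal{B}')$.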
Example. Suppose < is a linear ordering of X and B is the collection of convex subsets
of X. Every element of B can be expressed as an intersection of an initial segment of
(X, <) with a final segment of (X, <) (i.e., an initial segment of the linearly ordered set
(X, >)). Hence breadth(B) = 2.
If B is a sublattice of $(2^X, \cap, \cup)$ which does not contain ∅ and X, then B and the set
system ¬B := {X \ B : B ∈ B} have the same breadth; this is an immediate consequence
of the following lemma:
Lemma 2.19. Suppose B is closed under (finite) intersections and unions, and B does
not contain the empty set. Then for each d the following are equivalent:
(1) for all $B_1, \dots, B_{d+1} \in \mathcal{B}$ there is some $i \in [d+1]$ such that $\bigcap_{j\neq i} B_j \subseteq B_i$;
(2) for all $B_1, \dots, B_{d+1} \in \mathcal{B}$ there is some $i \in [d+1]$ such that $B_i \subseteq \bigcup_{j\neq i} B_j$.
Proof. To see (1) ⇒ (2) apply (1) to $B'_i = \bigcup_{j\neq i} B_j$ ($i \in [d+1]$) in place of the $B_i$, and
for the converse implication apply (2) to $B''_i = \bigcap_{j\neq i} B_j$ ($i \in [d+1]$).
We finish our discussion of breadth by a surprising connection between breadth and
stability. We will not use this observation later in the paper, but we include it here
since it shows, under the assumption of stability, the ubiquity of set systems of infinite
breadth. The breadth of a relation between two sets is by definition the breadth of the
associated set system, cf. Section 2.3.
Proposition 2.20. Let X, Y be infinite sets and Φ ⊆ X × Y be a relation. If vc(Φ) > 0
then Φ is unstable, or at least one of Φ or ¬Φ has infinite breadth.
At the root of Proposition 2.20 is a theorem of Balogh and Bollobás [10], which we
explain first. For this we need some additional terminology: Let (X, S) and (X′, S′) be
set systems. We say that (X, S) contains (X′, S′) as a trace if there exists an injective
map f : X′ → X such that f(S′) ⊆ S ∩ f(X′). For example, if (X, S) is a set system
and A ⊆ X then (X, S) trivially contains (A, S ∩ A). Also, if (X, S) contains (X′, S′),
and (X′, S′) contains (X″, S″), then (X, S) contains (X″, S″).
For k ≥ 2 consider now the following set systems on [k]:
$C_k = \{[i] : i \in [k]\}$ (the k-chain),
$S_k = \{\{i\} : i \in [k]\}$ (the k-star),
$T_k = \{[k] \setminus \{i\} : i \in [k]\}$ (the k-costar).
Balogh and Bollobás [10, Theorem 1] showed that these set systems are unavoidable
among sufficiently large set systems. More precisely: for all integers k, l, m ≥ 2 there
is some N = N (k, l, m) such that every set system S on a finite base set with |S| ≥ N
contains the k-chain, the l-star, or the m-costar. (Note that there is no condition on
the size of the base set in this statement.)
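For small finite set systems, containment of a chain, star, or costar as a trace can be decided by trying all injective maps. The sketch below is only an illustration of the definition of "contains as a trace" given above; the function names are hypothetical, and the search is exponential, so it is suitable for toy examples only.

```python
from itertools import permutations


def chain(k):
    return [frozenset(range(1, i + 1)) for i in range(1, k + 1)]


def star(k):
    return [frozenset({i}) for i in range(1, k + 1)]


def costar(k):
    return [frozenset(range(1, k + 1)) - {i} for i in range(1, k + 1)]


def contains_trace(S, X, small, k):
    """Does (X, S) contain the set system `small` on [k] = {1, ..., k} as a trace?"""
    for image in permutations(X, k):          # injective maps f: [k] -> X
        A = frozenset(image)
        traces = {frozenset(s) & A for s in S}
        if all(frozenset(image[i - 1] for i in s) in traces for s in small):
            return True
    return False


X = range(5)
initial_segments = [frozenset(range(j)) for j in range(6)]
singletons = [frozenset({x}) for x in X]

print(contains_trace(initial_segments, X, chain(3), 3),
      contains_trace(singletons, X, star(3), 3))
```

Initial segments of a linear order contain chains but neither stars nor costars of size 3, while the family of singletons contains stars only.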
Proof of Proposition 2.20. Let S = SΦ . We first observe, for k ≥ 2:
(1) S contains Ck iff there is a k-ladder for Φ;
(2) if breadth(Φ∗ ) ≥ k then S contains Tk ; and
(3) if S contains Tk+1 then breadth(Φ∗ ) ≥ k.
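Part (1) is obvious. For (2) note that $\mathrm{breadth}(\Phi^*) \ge k$ iff there exist elements $x_1, \dots, x_k$
of X such that $\bigcap_{j\in[k]} \Phi^*_{x_j} \neq \emptyset$ and for each $i \in [k]$,
$$(Y \setminus \Phi^*_{x_i}) \cap \bigcap_{j\in[k]\setminus\{i\}} \Phi^*_{x_j} \neq \emptyset,$$
and for such choice of $x_i$, setting $X' = \{x_1, \dots, x_k\}$ we have $X' \setminus \{x_i\} \in \mathcal{S} \cap X'$ for
each i. Similarly, for (3), if $X' = \{x_1, \dots, x_{k+1}\} \in \binom{X}{k+1}$ is such that $X' \setminus \{x_i\} \in \mathcal{S} \cap X'$
for each $i \in [k+1]$, then for each such i we have
$$(Y \setminus \Phi^*_{x_i}) \cap \bigcap_{j\in[k+1]\setminus\{i\}} \Phi^*_{x_j} \neq \emptyset;$$
in particular, taking i = k + 1 we see that $\bigcap_{j\in[k]} \Phi^*_{x_j} \neq \emptyset$, and for each $i \in [k]$ we have
$(Y \setminus \Phi^*_{x_i}) \cap \bigcap_{j\in[k]\setminus\{i\}} \Phi^*_{x_j} \neq \emptyset$, hence $\mathrm{breadth}(\Phi^*) \ge k$. Also note that (2) and (3) are
true with $T_k$, $T_{k+1}$ and $\Phi^*$ replaced by $S_k$, $S_{k+1}$ and $\neg\Phi^*$, respectively.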
Suppose now that vc(Φ) > 0, i.e., S is infinite. Then S ∗ = SΦ∗ is also infinite (see
Example 2.6). Then we have $\mathrm{vc}(\mathcal{S}^*) \ge 1$, hence there are arbitrarily large n and $B \in \binom{Y}{n}$
such that $|\mathcal{S}^* \cap B| \ge n^{1/2}$. In particular, for each N there is a finite subset $B_N$ of Y with
|S ∗ ∩ BN | ≥ N . Now suppose Φ is stable; then Φ∗ is also stable. Let k0 ≥ 2 be larger
than the ladder dimension of Φ∗ . Then if k ≥ 2 and N ≥ N (k0 , k, k) then S ∗ ∩ BN (and
hence S ∗ ) contains the k-star or the k-costar. Thus by observation (3) above, at least
one of Φ or ¬Φ has infinite breadth.
Of course, the converse of the implication in this proposition also holds: if vc(Φ) = 0
then SΦ is finite, hence trivially Φ is stable, and both Φ and ¬Φ have finite breadth,
since S¬Φ is finite as well.
Example. Let < be a linear ordering of X. Suppose B is the collection of initial segments
of (X, <), as in Example 2.11. Then ¬B = {X \ B : B ∈ B} consists of final segments of
(X, <). Hence B and ¬B both have finite breadth (indeed, breadth 1). Proposition 2.20
shows that phenomena such as these are confined to unstable contexts (for infinite set
systems).
Using Lemma 2.19, Proposition 2.20 also implies:
Corollary 2.21. Suppose X and Y are infinite sets and Φ ⊆ X × Y such that SΦ is an
infinite sublattice of $(2^X, \cap, \cup)$ of finite breadth, with $\emptyset, X \notin \mathcal{S}_\Phi$. Then Φ is unstable.
3. The Model-Theoretic Context
Throughout this section we fix a first-order language L, and we let ϕ(x; y) be a partitioned L-formula (as defined in the introduction), with object variables x = (x1 , . . . , xm )
and parameter variables y = (y1 , . . . , yn ). The formula ϕ gives rise, in a given L-structure, to a set system. The associated parameters introduced in the previous section (shatter function, VC density etc.) are elementary invariants of the structure in
question. In Section 3.2 below we also introduce the VC density function vcT of a complete first-order L-theory T with no finite models: if vcT (n) is finite then vcT (n) is a
uniform bound on the VC density of all partitioned formulas in T having n parameter
variables. In Section 3.3 we illustrate this concept by computing vcT (1) for various T .
In Section 3.4 we then extend the definition of dual VC density to finite sets of formulas;
this is convenient for later sections, but, as we see in Section 3.5, does not add much
extra generality. There is some indication that computing VC density is easier when
only parameters coming from initial segments of indiscernible sequences are considered;
although we will not pursue these issues in the rest of the paper, we think that the
relationship between quantities like VC or vc and their “indiscernible” counterparts
deserves further investigation; we explore some connections in the last subsection.
3.1. VC density of definable families. Given an L-structure M and a tuple b ∈ M n ,
we denote the subset of M m defined by the L-formula ϕ(x; b) with parameters b in M
by
ϕM (M m ; b) := {a ∈ M m : M |= ϕ(a; b)}.
A subset of M m is called definable (in M ) if it is of the form ϕM (M m ; b), for some ϕ
and b. We also denote by
SϕM := {ϕM (M m ; b) : b ∈ M n }
the family of subsets of M m defined by ϕ in M , and we call (M m , SϕM ) the set system
associated with ϕ in M . More generally, to a given collection Φ(x) = {ϕi (x; yi )}i∈I
of partitioned L-formulas in the tuple of object variables x (and in various tuples of
parameter variables yi ) we may associate the set system
$$\mathcal{S}_\Phi^M := \{\varphi_i^M(M^m; b) : i \in I,\ b \in M^{|y_i|}\}$$
on M m defined by the instances of the formulas ϕi . If the L-structure M is understood
from the context, we drop the superscript M in our notation.
Suppose now M is an infinite L-structure. As usual, say that ϕ is invariant under
an extension M ⊆ N of L-structures if M |= ϕ(a; b) ⇐⇒ N |= ϕ(a; b) for all a ∈ M m
and b ∈ M n . The following is obvious:
Lemma 3.1. Suppose N is an L-structure with M ⊆ N and ϕ is invariant under
M ⊆ N . Then SϕM ⊆ M m ∩ SϕN , hence πϕM ≤ πϕN and therefore VC(SϕM ) ≤ VC(SϕN )
and vc(SϕM ) ≤ vc(SϕN ).
For each s, t ∈ N, consider the L-sentence
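$$\pi_\varphi^{s,t} := \forall x^{(1)} \cdots \forall x^{(t)}\, \forall y^{(1)} \cdots \forall y^{(s+1)} \left( \bigvee_{1\le i<j\le t}\ \bigwedge_{k=1}^{m} x_k^{(i)} = x_k^{(j)} \ \vee \bigvee_{1\le k<l\le s+1}\ \bigwedge_{1\le i\le t} \bigl(\varphi(x^{(i)}; y^{(k)}) \leftrightarrow \varphi(x^{(i)}; y^{(l)})\bigr) \right),$$
where $x^{(i)} = (x_1^{(i)}, \dots, x_m^{(i)})$ and $y^{(j)} = (y_1^{(j)}, \dots, y_n^{(j)})$ are tuples of new variables. Then,
with $\pi_\varphi^M := \pi_{\mathcal{S}_\varphi^M}$ denoting the shatter function of $\mathcal{S}_\varphi^M$, we obviously have: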
Lemma 3.2. For each s, t ∈ N,
$$M \models \pi_\varphi^{s,t} \iff \pi_\varphi^M(t) \le s.$$
In particular, if N is an L-structure with M ≡ N , then πϕM = πϕN .
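The equivalence in Lemma 3.2 can be checked by brute force in a small finite structure. The following sketch is an illustration under simplifying assumptions: it takes m = n = 1, a finite universe {0, 1, 2, 3}, and the formula x < y; the function names are hypothetical. One function evaluates the sentence $\pi_\varphi^{s,t}$ by running through all assignments, the other computes the shatter function directly.

```python
from itertools import combinations, product


def shatter(phi, M, t):
    """pi_phi(t): maximal number of traces of the sets phi(M; b) on a t-element set A."""
    return max(
        len({frozenset(a for a in A if phi(a, b)) for b in M})
        for A in combinations(M, t)
    )


def models_pi(phi, M, s, t):
    """Brute-force truth value of the sentence pi_phi^{s,t} in M (case m = n = 1)."""
    for xs in product(M, repeat=t):
        if len(set(xs)) < t:
            continue  # first disjunct holds: two of the x's coincide
        for ys in product(M, repeat=s + 1):
            # second disjunct: two parameters cut out the same trace on the x's
            if not any(
                all(phi(x, ys[k]) == phi(x, ys[l]) for x in xs)
                for k, l in combinations(range(s + 1), 2)
            ):
                return False
    return True


M = range(4)
less = lambda x, y: x < y

for t in (1, 2):
    print(t, shatter(less, M, t), [models_pi(less, M, s, t) for s in range(5)])
```

The sentence first becomes true at s equal to the value of the shatter function at t, which is the content of the lemma.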
From now until the end of this section we fix a complete L-theory T with only infinite
models, and let M range over models of T . By the previous lemma we may set
πϕ := πϕM ,
VC(ϕ) := VC(SϕM ),
vc(ϕ) := vc(SϕM ),
where M is an arbitrarily chosen model of T . We call πϕ the shatter function of ϕ
(in T ), and we call VC(ϕ) and vc(ϕ) the VC dimension of ϕ and VC density of ϕ
(in T ), respectively. If we want to stress the dependence of πϕ on T we write πϕT , and
similarly for VC and vc.
Note that the definition of πϕ only depends on the set system Sϕ and not on the particular representing formula ϕ. In particular, πϕ remains unchanged under ∅-definable
reparameterizations:
Lemma 3.3. Let γ(z; y) be an L-formula, where z = (z1 , . . . , zl ), which defines the
graph of a map g : M l → M n . Let σ(x; z) := ∃y(γ(z; y) ∧ ϕ(x; y)), so
$$\mathcal{S}_\sigma = \{\varphi^M(M^m; g(c)) : c \in M^l\}.$$
Then πσ ≤ πϕ , with equality if g is surjective.
The dual of the partitioned L-formula ϕ(x; y) is ϕ∗ (y; x) := ϕ(x; y); that is, ϕ∗ (y; x)
is syntactically the same L-formula ϕ, only with the role of the object and parameter
variables interchanged. We call VC∗ (ϕ) := VC(ϕ∗ ) and vc∗ (ϕ) := vc(ϕ∗ ) the dual VC
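dimension and dual VC density of ϕ, respectively. By Lemma 2.4 we have $\pi_{\varphi^*} = \pi^*_\varphi$
and hence $\mathrm{VC}^*(\varphi) = \mathrm{IN}(\varphi)$ and
$$\mathrm{vc}^*(\varphi) = \inf\{ r \in \mathbb{R}^{>0} : \pi^*_\varphi(n) = O(n^r) \}.$$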
If any of the quantities VC(ϕ), vc(ϕ), VC∗ (ϕ), vc∗ (ϕ) is finite, then so are all the others,
and in this case we say that ϕ is dependent or that ϕ defines a VC class. Note that
for every partitioned L-formula ϕ(x; y) we have vc(ϕ) ≥ 0, with equality if Sϕ is finite.
If Sϕ is infinite then vc(ϕ) ≥ 1. (See the remarks following Lemma 2.2.)
Letting Φ := ϕ(M m ; M n ) and X := M m , Y := M n , in the notation introduced in
the previous subsection we have SΦ = Sϕ and SΦ∗ = Sϕ∗ . Hence Lemma 2.7 yields:
Corollary 3.4. We have vc(¬ϕ) = vc(ϕ), and if ψ(x; z) is another partitioned L-formula, then vc(ϕ ∧ ψ), vc(ϕ ∨ ψ) ≤ vc(ϕ) + vc(ψ).
From Lemma 2.2 one also obtains the invariance of vc under inverse images of surjective ∅-definable maps:
Corollary 3.5. Let δ(v; x) be an L-formula, where v = (v1 , . . . , vk ), which defines the
graph of a map f : M k → M m , and let ρ(v; y) := ∃x(δ ∧ ϕ), so Sρ = f −1 (Sϕ ). Then
πρ ≤ πϕ , with equality if f is surjective.
The theory T is NIP iff every partitioned L-formula defines a VC class. The theorem
of Shelah [86] already mentioned in the introduction shows that in order for every
partitioned L-formula ϕ(x; y) to define a VC class, it is enough that this holds for all
such ϕ(x; y) with a single parameter variable (i.e., |y| = 1). Hence if for each partitioned
L-formula ϕ(x; y) with |x| = 1 the set system Sϕ has finite breadth then T is NIP, by
Lemma 2.9. The theory T is said to be stable if for every partitioned L-formula ϕ(x; y)
the associated relation Φ = ϕ(M m ; M n ) is stable (in the sense of Section 2.3); if T
is stable then for each ϕ(x; y) with Sϕ infinite, at least one of Sϕ or S¬ϕ has infinite
breadth, by Proposition 2.20. (Corollary 2.21 of the same proposition also yields that if
T is stable then all finite-breadth sublattices S of the lattice of all subsets of M m which
have the form S = Sϕ for some L-formula ϕ(x; y) with |x| = m are finite.)
3.2. VC density of a theory. We define the VC density of T to be the function
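$$\mathrm{vc} = \mathrm{vc}^T : \mathbb{N} \to \mathbb{R}^{\ge 0} \cup \{\infty\}$$
given by
$$\mathrm{vc}(n) := \sup\{\mathrm{vc}(\varphi) : \varphi(x; y) \text{ is an } L\text{-formula with } |y| = n\}.$$
Note that we could have also defined $\mathrm{vc}^T$ as
$$\mathrm{vc}(m) = \sup\{\mathrm{vc}^*(\varphi) : \varphi(x; y) \text{ is an } L\text{-formula with } |x| = m\}.$$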
In the introduction we already observed that vc(m) ≥ m for every m. If L′ is an
expansion of L and T′ ⊇ T a complete L′-theory, then $\mathrm{vc}^T \le \mathrm{vc}^{T'}$, with equality if T′
is an expansion of T by definitions. Moreover, vc does not change under expansions by
constants:
Lemma 3.6. Let $L' = L \cup \{c_i : i \in I\}$ where the $c_i$ are new constant symbols, and let
T′ ⊇ T be a complete L′-theory. Then $\mathrm{vc}^{T'} = \mathrm{vc}^T$.
Proof. Let $M' \models T'$ and $C := \{c_i^{M'} : i \in I\} \subseteq M'$. Let ϕ(x; y, z) be an L-formula
with |x| = m, and let $c \in C^{|z|}$. Then $\pi^*_{\varphi(x;y,c)}(t) \le \pi^*_{\varphi(x;y,z)}(t)$ for every t, hence
$\mathrm{vc}^*(\varphi(x; y, c)) \le \mathrm{vc}^*(\varphi(x; y, z)) \le \mathrm{vc}^T(m)$ and thus $\mathrm{vc}^{T'}(m) \le \mathrm{vc}^T(m)$.
It is clear that vc(n) ≤ vc(n + 1) for every n, by viewing a formula with n parameter
variables as one with n + 1 parameters; perhaps less obviously:
Lemma 3.7. vc(n) + 1 ≤ vc(n + 1) for every n.
Proof. By the preceding lemma we may assume that L contains a constant symbol 0.
Let ϕ(x; y) be a partitioned L-formula with |x| = m, |y| = n. We construct a formula
ψ(x, xm+1 ; y, yn+1 ) with πϕ (t) · t ≤ πψ (2t) for every t (hence vc(ϕ) + 1 ≤ vc(ψ)), which
then shows the lemma. We set
ψ := (xm+1 = 0 ∧ ϕ(x; y)) ∨ (xm+1 = yn+1 ).
Then for $b \in M^n$, $c \in M$ we have
$$\psi(M^{m+1}; b, c) = (\varphi(M^m; b) \times \{0\}) \cup (M^m \times \{c\}).$$
Let $A \subseteq M^m$ with |A| = t and $\pi_\varphi(t) = |A \cap \mathcal{S}_\varphi|$. Choose pairwise distinct elements
$a_1, \dots, a_t \in M \setminus \{0\}$ and an arbitrary element $a_0 \in M^m$, and set
$$A' := (A \times \{0\}) \cup \{(a_0, a_1), \dots, (a_0, a_t)\}.$$
Then |A′| = 2t, and for $b \in M^n$ and j = 1, . . . , t we have
$$A' \cap \psi(M^{m+1}; b, a_j) = \bigl((A \cap \varphi(M^m; b)) \times \{0\}\bigr) \cup \{(a_0, a_j)\}.$$
Take $b_1, \dots, b_k \in M^n$, $k = \pi_\varphi(t)$, such that the sets $A \cap \varphi(M^m; b_i)$, i = 1, . . . , k, are
pairwise distinct. Then the sets $A' \cap \psi(M^{m+1}; b_i, a_j)$ (where i = 1, . . . , k, j = 1, . . . , t)
are also pairwise distinct. Hence $\pi_\psi(2t) \ge |A' \cap \mathcal{S}_\psi| \ge k \cdot t = \pi_\varphi(t) \cdot t$ as claimed.
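The counting step of this proof is purely combinatorial and can be replayed on a toy example. The sketch below is an illustration only: it assumes the finite universe {0, …, 7} in place of an infinite model, takes ϕ(x; y) to be x < y with m = n = 1, and uses hypothetical function names. It builds A′ as in the proof and counts the traces of the instances of ψ on A′.

```python
from itertools import combinations

M = range(8)


def phi(x, y):
    """Toy formula phi(x; y) := x < y, with m = n = 1."""
    return x < y


def psi(x, x2, y, y2):
    """psi := (x2 = 0 and phi(x; y)) or (x2 = y2), as in the proof."""
    return (x2 == 0 and phi(x, y)) or x2 == y2


def check(t):
    """Return (k, |A'|, number of traces of psi-instances on A') for the toy example."""
    def phi_traces(A):
        return {frozenset(a for a in A if phi(a, b)) for b in M}

    # A: a t-element set with the maximal number k of traces.
    A = max(combinations(M, t), key=lambda A: len(phi_traces(A)))
    k = len(phi_traces(A))
    # a_1, ..., a_t: pairwise distinct nonzero elements; a_0: an arbitrary element.
    a0, a = 0, list(range(1, t + 1))
    A_prime = [(x, 0) for x in A] + [(a0, aj) for aj in a]
    psi_traces = {
        frozenset(p for p in A_prime if psi(p[0], p[1], b, c))
        for b in M for c in M
    }
    return k, len(A_prime), len(psi_traces)


k, size, count = check(3)
print(k, size, count, count >= k * 3)
```

The extra parameter $y_{n+1}$ multiplies the number of traces by t at the cost of doubling the size of the test set, which is what raises the VC density by at least 1.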
In this paper we prove, for many (unstable) NIP theories T of interest, that vcT (m) <
∞ for every m, and in fact, in these cases we establish that vcT (m) is bounded by a
linear function of m. Note, however, that T NIP does not imply that $\mathrm{vc}^T(m) < \infty$ for
all m: it is easy to see that for every T (whether NIP or not) we have $\mathrm{vc}^{T^{\mathrm{eq}}}(1) = \infty$,
whereas T is NIP iff $T^{\mathrm{eq}}$ is NIP. (We thank Martin Ziegler for pointing this out.)
By Laskowski’s proof [55] of Shelah’s theorem [86], the VC dimension VC(ϕ) of an
L-formula ϕ(x; y) is bounded in terms of the VC dimensions of certain L-formulas with a
single parameter variable (which, however, are astronomical, involving iterated Ramsey
numbers). This together with the examples below raises the following question, the
answer to which we don’t know:
Question. If vcT (1) < ∞, is vcT (m) < ∞ for every m?
Provided the answer to this question is positive, one may then also ask how vc(m)
depends on m and vc(1); e.g.: is there a function $\beta : \mathbb{N} \times \mathbb{R}^{\ge 0} \to \mathbb{R}^{\ge 0}$, independent of
T, with the property that if $\mathrm{vc}^T(1) < \infty$, then $\mathrm{vc}^T(m) \le \beta(m, \mathrm{vc}^T(1))$ for every m? (In
all examples which we considered where vcT is known to be real-valued, vcT grows at
worst linearly.)
3.3. Computing vcT (1). In concrete cases it is often easy to see that vcT (1) = 1:
Example 3.8. Suppose that M is strongly minimal. The collection $\mathcal{B} = \binom{M}{1}$ of one-element
subsets of M has breadth 1; so $\mathrm{vc}^T(1) = 1$. (Corollary 2.10.)
Example 3.9. Suppose that L contains a binary relation symbol “<”, M = (M, <, . . . )
is an expansion of a linearly ordered set (M, <), and T = Th(M ) is weakly o-minimal.
Then for every partitioned L-formula ϕ(x; y) with |x| = 1 there exists an integer N ≥ 0
such that for every $b \in M^{|y|}$, the set $\varphi^M(M; b)$ is a finite union of at most N convex
subsets of M . Hence vcT (1) = 1 by Example 2.11.
Example 3.10. Suppose that Ldiv is the expansion of the language {0, 1, +, −, ×} of rings
by a binary relation symbol “|”. In a field K equipped with a valuation v : K → Γ∪{∞},
we interpret | by putting a|b :⇐⇒ v(a) ≤ v(b), for all a, b ∈ K. Suppose T is a complete
theory of valued fields in an expansion of Ldiv , and T is C-minimal, i.e., for every
K |= T , every definable subset of K is a finite Boolean combination of balls in K. Then
for every partitioned L-formula ϕ(x; y) with |x| = 1 there exists an integer N ≥ 0 such
that for every $b \in K^{|y|}$, the set $\varphi^K(K; b)$ is a Boolean combination of at most N balls
in K. Thus vcT (1) = 1 by Example 2.12.
The definition of C-minimality used in the previous example agrees (for expansions
of valued fields) with the one in [44]; this definition is slightly more restrictive than
the original one, introduced in [38, 62]. Every completion of the Ldiv -theory ACVF of
non-trivially valued algebraically closed fields is C-minimal (essentially by A. Robinson’s
quantifier elimination in ACVF; see [43]). Conversely, every valued field with C-minimal
elementary theory is algebraically closed [38]. Moreover, the rigid analytic expansions
of ACVF introduced by Lipshitz [57] are C-minimal [58].
Example 3.11. Let R be a ring and suppose L = LR is the language of R-modules. (In
this paper, “R-module” always means “left R-module.”) Suppose M is an R-module,
construed as an LR -structure in the natural way. By the Baur-Monk Theorem, every
LR -formula is equivalent in T = Th(M ) to a Boolean combination of positive primitive
(p.p.) LR -formulas; given a p.p. LR -formula ϕ(x; y) and b ∈ M |y| , the set ϕ(M |x| ; b) is
a coset of ϕ(M |x| ; 0). Suppose M is p.p.-uniserial, i.e., the subgroups of M definable
by p.p. LR -formulas form a chain. By Example 2.14, if M is infinite, then we have
$\mathrm{vc}^T(1) = 1$. (In [6] this will be extended to $\mathrm{vc}^T(m) = m$ for every m.) Examples of
p.p.-uniserial abelian groups (viewed as Z-modules) include $\mathbb{Q}^{(\alpha)}$, $\mathbb{Z}_{(p)}^{(\alpha)}$, $\mathbb{Z}(p^n)^{(\alpha)}$ and
$\mathbb{Z}(p^\infty)^{(\alpha)}$, where p is a prime and α is a cardinal, possibly infinite. Here
$$\mathbb{Z}_{(p)} = \{a/b : a, b \in \mathbb{Z},\ b \neq 0,\ p \nmid b\},$$
viewed as a subgroup of the additive group of Q, $\mathbb{Z}(p^n)$ denotes the cyclic group $\mathbb{Z}/p^n\mathbb{Z}$
of order $p^n$, and $\mathbb{Z}(p^\infty)$ denotes the Prüfer p-group (the group of $p^n$th roots of unity,
for varying n, written additively). Given an R-module M and an index set I, $M^{(I)}$
denotes, as usual, the R-submodule of the direct product $M^I$ consisting of all sequences
with cofinitely many zero entries.
Examples 3.8–3.11 may be generalized as follows:
Example 3.12. A family Φ(x) = {ϕi (x; yi )}i∈I of L-formulas in the object variables x
(and in various tuples of parameter variables yi ) is said to have dual VC dimension d
if the set system S = SΦ defined by the instances of the formulas ϕi has dual VC dimension d. If Φ has dual VC dimension at most 1, then we say that Φ is VC-minimal ;
cf. Example 2.3. We also say that Φ is directed if S is directed in the sense of Example 2.13.
The L-theory T is VC-minimal if there is a VC-minimal family of L-formulas Φ(x)
with |x| = 1 such that in every M |= T every definable (possibly with parameters)
subset of M is a Boolean combination of finitely many sets in SΦ . (This definition was
introduced in [2].) If T is a VC-minimal L-theory, then for every L-formula ϕ(x; y)
with |x| = 1 there exists some N ∈ N such that in every M |= T every instance ϕ(x; b)
(b ∈ M |y| ) of ϕ defines a subset of M which is a Boolean combination of at most N sets
in SΦ , by compactness.
One says that the VC-minimal theory T is directed if one can additionally choose Φ(x)
to be directed; in that case we have vcT (1) = 1 by Example 2.13. By [2, Proposition 6],
if Φ(x) is VC-minimal and SΦ contains some ∅-definable set other than ∅ or M |x| , then
there is a directed set Ψ(x) of L-formulas such that SΦ = SΨ and S¬Φ = S¬Ψ . By
Lemma 3.6 this yields in fact vcT (1) = 1 for every complete VC-minimal T (directed or
not) without finite models.
Example 3.11 can also be generalized in a different direction:
Example 3.13. Suppose L is a language expanding the language {1, ·} of groups, and T is
a complete L-theory containing the theory of infinite groups. Suppose for every G |= T ,
every definable subset of G is a Boolean combination of cosets of $\mathrm{acl}^{\mathrm{eq}}(\emptyset)$-definable
subgroups of G. (This condition holds, in particular, if T satisfies the model-theoretic
condition known as 1-basedness, cf. [45].) By Example 2.14, if the collection of $\mathrm{acl}^{\mathrm{eq}}(\emptyset)$-definable subgroups of G has breadth at most d (in particular, by Example 2.16, if it
has height at most d), then we have vcT (1) ≤ d.
Here is a particular instantiation of the previous example:
Example 3.14. Let R be a ring, M an R-module, and T = Th(M ) in the language LR ,
as in Example 3.11. We have M ℵ0 ≡ M (ℵ0 ) (see, e.g., [42, Lemma A.1.6] or [82,
Corollary 2.24]). Set T ℵ0 := Th(M ℵ0 ) = Th(M (ℵ0 ) ). It is well-known that T = T ℵ0
iff the class of models of T is closed under direct products, iff for all p.p. LR -formulas
ϕ(x), ψ(x), either ϕ(M |x| ) ⊆ ψ(M |x| ) or the index
$$\mathrm{Inv}(M, \varphi, \psi) := [\varphi(M^{|x|}) : (\varphi \wedge \psi)(M^{|x|})]$$
is infinite. (See, e.g., [42, Lemma A.1.7].) So if T = T ℵ0 and the Morley rank MR(T )
of T is finite then the length n of every sequence
$$M \supsetneq \varphi_1(M) \supsetneq \varphi_1(M) \cap \varphi_2(M) \supsetneq \cdots \supsetneq \varphi_1(M) \cap \cdots \cap \varphi_n(M),$$
where each ϕi (x) is a p.p. LR -formula with |x| = 1, is bounded by d = MR(T ); so
by Examples 2.14 and 2.16 we see that vcT (1) ≤ d. (Note that this bound is far from
optimal: e.g., for R = Z, M = Z(pd )(ℵ0 ) we have MR(T ) = d, yet vcT (1) = 1 by
Example 3.11.) In [6] we will extend this to vcT (m) ≤ md for every m.
3.4. Dual VC density of sets of formulas. It is convenient to extend the definition of
dual VC density to finite sets of formulas. Let ∆ = ∆(x; y) be a finite set of partitioned
L-formulas ϕ = ϕ(x; y) with the object variables x and parameter variables y. We set
¬∆ := {¬ϕ : ϕ ∈ ∆}, and for B ⊆ M |y| we let
$$\Delta(x; B) := \{\varphi(x; b) : \varphi \in \Delta,\ b \in B\}.$$
Given a finite set $B \subseteq M^{|y|}$, we call a consistent subset of ∆(x; B) ∪ ¬∆(x; B) a ∆(x; B)-type. Note that our parameter sets are subsets of $M^{|y|}$, and not of M, as is more common
in model theory. (This is simply a matter of convenience, in order to be compatible with
VC duality.) Given a ∆(x; B)-type p we denote by pM ⊆ M |x| its set of realizations
in M . Since we are only dealing with finite sets ∆ and finite parameter sets B ⊆ M |y| ,
all ∆(x; B)-types have realizations in M itself (rather than in an elementary extension).
Given another finite set ∆′(x; y′) of partitioned L-formulas and a finite $B' \subseteq M^{|y'|}$, we
say that a ∆(x; B)-type p is equivalent to a ∆′(x; B′)-type q if $p^M = q^M$.
Let now B ⊆ M |y| be finite. Given a ∈ M |x| we denote the ∆(x; B)-type of a by
$$\mathrm{tp}^\Delta(a/B) := \{\varphi(x; b) : b \in B,\ \varphi \in \Delta,\ M \models \varphi(a; b)\} \cup \{\neg\varphi(x; b) : b \in B,\ \varphi \in \Delta,\ M \not\models \varphi(a; b)\}.$$
We write S ∆ (B) for the set of complete ∆(x; B)-types (in M ), that is, the set of (in M )
maximally consistent subsets of ∆(x; B) ∪ ¬∆(x; B); equivalently,
S ∆ (B) = tp∆ (a/B) : a ∈ M |x| .
If ∆ = {ϕ} is a singleton, we also write S ϕ (B) instead of S ∆ (B). The elements of
S ∆ (B) are syntactical objects (sets of formulas), but associating to a type p ∈ S ∆ (B)
its set pM of realizations in M gives a bijection from S ∆ (B) onto the set
\[ S\bigl(\varphi^{M}(M^{|x|}; b) : b \in B,\ \varphi \in \Delta\bigr) \]
of atoms of the Boolean algebra generated by the subsets ϕM (M |x| ; b) of M |x| . (See
Section 2.2.) Hence for every partitioned L-formula ϕ(x; y) we have
\[ \pi^*_{\varphi}(t) = \max\{|S^{\varphi}(B)| : B \subseteq M^{|y|},\ |B| = t\}. \]
In the general case, for every $t \in \mathbb{N}$ we also set
\[ \pi^*_{\Delta}(t) := \max\{|S^{\Delta}(B)| : B \subseteq M^{|y|},\ |B| = t\}, \]
so $0 \le \pi^*_{\Delta}(t) \le 2^{|\Delta| t}$. Similarly as in Lemma 3.2 one shows that if we pass from $M$ to an
elementarily equivalent $L$-structure then $\pi^*_{\Delta}$ does not change (justifying our notation,
which suppresses $M$).
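Let $\Delta'(x; y)$ be a finite set of partitioned $L$-formulas with $\Delta' \subseteq \Delta$, and let $B \subseteq M^{|y|}$ be
finite. Then there is a natural restriction map $S^{\Delta}(B) \to S^{\Delta'}(B)$, written as $p \mapsto p{\restriction}\Delta'$.
This map is onto: given $p \in S^{\Delta'}(B)$ let $a \in p^M$ be arbitrary; then $q := \operatorname{tp}^{\Delta}(a/B) \in S^{\Delta}(B)$ satisfies $q{\restriction}\Delta' = p$. In particular, $|S^{\Delta'}(B)| \le |S^{\Delta}(B)|$. Note also that if
$\Delta \ne \emptyset$, then the restriction maps $p \mapsto p{\restriction}\varphi$, where $\varphi \in \Delta$, combine to an injective map
$S^{\Delta}(B) \to \prod_{\varphi\in\Delta} S^{\varphi}(B)$; in particular, $|S^{\Delta}(B)| \le \prod_{\varphi\in\Delta} |S^{\varphi}(B)|$. This shows: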
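Lemma 3.15. If all $\varphi \in \Delta$ are dependent, then there exists a real number $r$ with
$0 \le r \le \sum_{\varphi\in\Delta} \mathrm{vc}^*(\varphi)$ and
\[ |S^{\Delta}(B)| = O(|B|^r) \quad\text{for all finite } B \subseteq M^{|y|}. \tag{3.1} \]
We define the dual VC density of $\Delta$ as the infimum $\mathrm{vc}^*(\Delta)$ of all real numbers $r \ge 0$
such that (3.1) holds; that is,
\[ \mathrm{vc}^*(\Delta) = \inf\{r \ge 0 : \pi^*_{\Delta}(t) = O(t^r)\}. \]
We have
\[ \max_{\varphi\in\Delta} \mathrm{vc}^*(\varphi) \le \mathrm{vc}^*(\Delta) \le \sum_{\varphi\in\Delta} \mathrm{vc}^*(\varphi). \]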
Clearly vc∗ (∆) agrees with vc∗ (ϕ) as defined previously if ∆ = {ϕ} is a singleton.
Moreover, vc∗ (∆) = 0 iff vc∗ (ϕ) = 0 for every ϕ ∈ ∆, and if vc∗ (∆) < 1 then vc∗ (∆) = 0.
(See the remarks following Lemma 2.2.) Note that in computing vc∗ (∆) there is no
harm in assuming that ∆ is closed under negation, i.e., with every ϕ ∈ ∆ the set ∆
also contains a formula equivalent (in M ) to ¬ϕ. (Passing from ∆ to ∆ ∪ ¬∆ does not
change S ∆ (B).)
Example. Suppose $\Delta(x; y) = \{x_1 = y, \dots, x_m = y\}$ where $|x| = m$ and $|y| = 1$. Then
for finite $B \subseteq M$ we have $|S^{\Delta}(B)| = (|B| + 1)^m$, hence $\mathrm{vc}^*(\Delta) = \sum_{\varphi\in\Delta} \mathrm{vc}^*(\varphi) = m$.
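The count in this example can be checked by brute force on a small instance. The following is only an illustrative sketch: the finite set `universe` is a hypothetical stand-in for the infinite structure $M$ (it merely needs an element outside $B$), and a type is recorded as the truth table of the instances $x_i = b$.

```python
from itertools import product

def delta_types(universe, m, B):
    """Delta(x;B)-types realized in universe^m for Delta = {x_1 = y, ..., x_m = y}.

    Each type is recorded as the tuple of truth values of the instances x_i = b.
    """
    types = set()
    for a in product(universe, repeat=m):
        types.add(tuple(a[i] == b for i in range(m) for b in B))
    return types

# Hypothetical finite stand-in for M; it contains elements outside every B used below.
universe = range(10)
for m in (1, 2, 3):
    for size in (0, 1, 2, 4):
        B = list(range(size))
        # Each coordinate either equals one of the |B| parameters or none of them.
        assert len(delta_types(universe, m, B)) == (size + 1) ** m
```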
We finish this subsection with an easy result about interpretations (related to Lemma 3.3 and Corollary 3.5).
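Lemma 3.16. Let $M'$ be an infinite structure in a language $L'$ and $\pi \colon X \to M'$ an
interpretation of $M'$ in $M$ without parameters, where $X \subseteq M^r$ is $\emptyset$-definable. Then
for any finite set $\Delta'(x; y)$ of $L'$-formulas there exists a finite set $\Delta(\bar x; \bar y)$ of $L$-formulas
such that $|\Delta| = |\Delta'|$, $|\bar x| = r|x|$, and $\pi^*_{\Delta'} \le \pi^*_{\Delta}$.
Proof. Let $m := |x|$ and $n := |y|$. Let $B' \subseteq (M')^n$ be finite. Choose $B \subseteq X^n$ with
$|B| = |B'|$ such that each $b = (b_1, \dots, b_n) \in B'$ has the form $(\pi(\bar b_1), \dots, \pi(\bar b_n))$ for
some $(\bar b_1, \dots, \bar b_n) \in B$. For each $L'$-formula $\varphi(x; y)$ choose an $L$-formula $\psi_\varphi(\bar x; \bar y)$, where
$\bar x = (\bar x_1, \dots, \bar x_m)$, $\bar y = (\bar y_1, \dots, \bar y_n)$ and $|\bar x_1| = \dots = |\bar x_m| = |\bar y_1| = \dots = |\bar y_n| = r$, such
that $\psi_\varphi(M^{(m+n)r}) \subseteq X^{m+n}$ and for any $\bar a_1, \dots, \bar a_m, \bar b_1, \dots, \bar b_n \in X$,
\[ M \models \psi_\varphi(\bar a_1, \dots, \bar a_m; \bar b_1, \dots, \bar b_n) \iff M' \models \varphi\bigl(\pi(\bar a_1), \dots, \pi(\bar a_m); \pi(\bar b_1), \dots, \pi(\bar b_n)\bigr). \]
Let a finite set $\Delta'(x; y)$ of $L'$-formulas be given. Set $\Delta := \{\psi_\varphi : \varphi \in \Delta'\}$. Then
$(\bar a_1, \dots, \bar a_m) \mapsto (\pi(\bar a_1), \dots, \pi(\bar a_m))$ sends realizations in $X^m$ of a type in $S^{\Delta}(B)$ to realizations of a single type in $S^{\Delta'}(B')$, and so yields a surjective map from the set of types in $S^{\Delta}(B)$ realized in $X^m$ onto $S^{\Delta'}(B')$,
hence $|S^{\Delta'}(B')| \le |S^{\Delta}(B)|$ as required.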
By Lemmas 3.6 and 3.16:
Corollary 3.17. Let $M'$ be an infinite structure in a language $L'$, interpretable in $M$
(possibly with parameters) on a definable subset of $M^r$. Then, writing $T = \mathrm{Th}(M)$ and
$T' = \mathrm{Th}(M')$, we have $\mathrm{vc}^{T'}(m) \le \mathrm{vc}^{T}(rm)$ for every $m$.
So for example if $G$ is a group (considered as a structure in the usual first-order
language of group theory) and $H$ is a definable normal subgroup of $G$, then $\mathrm{vc}^{\mathrm{Th}(G/H)} \le \mathrm{vc}^{\mathrm{Th}(G)}$ if $H$ has infinite index in $G$, and $\mathrm{vc}^{\mathrm{Th}(H)} \le \mathrm{vc}^{\mathrm{Th}(G)}$ if $H$ is infinite.
3.5. Coding finite sets of formulas. We let L, M and ∆ be as in the previous
subsection, and T = Th(M ). The following useful lemma, essentially due to Shelah
[88, Lemma II.2.1], shows that counting ∆(x; B)-types where |∆| > 1 is not really more
general than counting ∆(x; B)-types where ∆ is a singleton:
Lemma 3.18. Let $d = |\Delta|$ and $y' = (y_1, \dots, y_{2d}, z, z_1, \dots, z_{2d})$ with $|y| = |y_i| = |z_i| = |z|$ for every $i = 1, \dots, 2d$. There is an $L$-formula $\psi_\Delta(x; y')$ with the following properties:
(1) for every finite $B \subseteq M^{|y|}$ with $|B| \ge 2$ there is some $B' \subseteq M^{|y'|}$ with $|B'| = 2d|B|$ such that every $p \in S^{\Delta}(B)$ is equivalent to some $q \in S^{\psi_\Delta}(B')$;
(2) for every finite $B' \subseteq M^{|y'|}$ there is some $B \subseteq M^{|y|}$ with $|B| \le 2d|B'|$ such that
every $q \in S^{\psi_\Delta}(B')$ is equivalent to some (possibly incomplete) $\Delta(x; B)$-type $p_0$.
In particular, we have $\pi^*_{\Delta}(t) \le \pi^*_{\psi_\Delta}(2dt)$ for $t > 1$ and $\pi^*_{\psi_\Delta}(t) \le \pi^*_{\Delta}(2dt)$ for $t \ge 0$.
Thus $\mathrm{vc}^*(\Delta) = \mathrm{vc}^*(\psi_\Delta) \le \mathrm{vc}^T(m)$ where $m = |x|$.
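Proof. Write $\Delta = \{\varphi_1, \dots, \varphi_d\}$ and define $\psi_\Delta$ as follows:
\[ \psi_\Delta = \bigwedge_{k=1}^{d} \bigl(z = z_k \to \varphi_k(x; y_k)\bigr) \wedge \bigwedge_{k=d+1}^{2d} \bigl(z = z_k \to \neg\varphi_{k-d}(x; y_k)\bigr) \wedge \bigvee_{k=1}^{2d} z = z_k \wedge \bigwedge_{1 \le k < l \le 2d} \neg(z = z_k \wedge z = z_l). \]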
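For (1), suppose $B \subseteq M^{|y|}$ is finite, and $b_0 \ne b_1$ are distinct elements of $B$. For $b \in B$
and $k \in [d]$ set
\[ b^{(k)}_0 := \bigl(\underbrace{b_0}_{y_1}, \underbrace{b_0}_{y_2}, \dots, \underbrace{b}_{y_{d+k}}, \dots, \underbrace{b_0}_{y_{2d}}, \underbrace{b_1}_{z}, \underbrace{b_0}_{z_1}, \dots, \underbrace{b_1}_{z_{d+k}}, \dots, \underbrace{b_0}_{z_{2d}}\bigr) \]
and
\[ b^{(k)}_1 := \bigl(\underbrace{b_0}_{y_1}, \underbrace{b_0}_{y_2}, \dots, \underbrace{b}_{y_{k}}, \dots, \underbrace{b_0}_{y_{2d}}, \underbrace{b_1}_{z}, \underbrace{b_0}_{z_1}, \dots, \underbrace{b_1}_{z_{k}}, \dots, \underbrace{b_0}_{z_{2d}}\bigr), \]
and put
\[ B' := \{b^{(k)}_0, b^{(k)}_1 : b \in B,\ k \in [d]\} \subseteq (M^{|y|})^{4d+1}. \]
Then $|B'| = 2d|B|$, and for every $b \in B$, $k \in [d]$ we have
\[ \psi_\Delta(M^{|x|}; b^{(k)}_0) = \neg\varphi_k(M^{|x|}; b), \qquad \psi_\Delta(M^{|x|}; b^{(k)}_1) = \varphi_k(M^{|x|}; b). \]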
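Given $p \in S^{\Delta}(B)$ we set
\[ q := \{\neg\psi_\Delta(x; b^{(k)}_0),\ \psi_\Delta(x; b^{(k)}_1) : \varphi_k(x; b) \in p\} \cup \{\psi_\Delta(x; b^{(k)}_0),\ \neg\psi_\Delta(x; b^{(k)}_1) : \varphi_k(x; b) \notin p\}. \]
Then clearly $q \in S^{\psi_\Delta}(B')$, and $q$ is equivalent to $p$. The map $p \mapsto q \colon S^{\Delta}(B) \to S^{\psi_\Delta}(B')$
is injective, hence $|S^{\Delta}(B)| \le |S^{\psi_\Delta}(B')| \le \pi^*_{\psi_\Delta}(2d|B|)$.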
For (2) note that if $b_1, \dots, b_{2d}, c, c_1, \dots, c_{2d} \in M^{|y|}$ then the formula
$\psi_\Delta(x; b_1, \dots, b_{2d}, c, c_1, \dots, c_{2d})$
defines $\varphi_k(M^{|x|}; b_k)$ or $\neg\varphi_k(M^{|x|}; b_{k+d})$ for some $k \in [d]$, or $\emptyset$ (since the $c_i$'s are not necessarily distinct).
Let $B' \subseteq M^{|y'|}$ be finite, and $q \in S^{\psi_\Delta}(B')$. Set
\[ B := \{b \in M^{|y|} : b = b_i \text{ for some } (b_1, \dots, b_{2d}, c, c_1, \dots, c_{2d}) \in B'\} \]
and let $p_0$ be the set of formulas which have the form $\varphi_k(x; b)$ where $k \in [d]$, $b = b_k$
for some $\psi_\Delta(x; b_1, \dots, b_{2d}, c, c_1, \dots, c_{2d}) \in q$ with $c = c_k$, or the form $\neg\varphi_k(x; b)$ with
$k \in [d]$, $b = b_{d+k}$ for some $\psi_\Delta(x; b_1, \dots, b_{2d}, c, c_1, \dots, c_{2d}) \in q$ with $c = c_{k+d}$. Then
$|B| \le 2d|B'|$, and $p_0$ is a $\Delta(x; B)$-type equivalent to $q$. For each $q$ choose an extension $p$
of $p_0$ to a complete $\Delta(x; B)$-type. Then the map $q \mapsto p \colon S^{\psi_\Delta}(B') \to S^{\Delta}(B)$ is injective,
so $|S^{\psi_\Delta}(B')| \le |S^{\Delta}(B)| \le \pi^*_{\Delta}(2d|B'|)$.
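Part (1) of the lemma can be tested mechanically on a toy instance. The sketch below is illustrative only: the structure (an initial segment of the integers with $|y| = 1$), the two sample formulas, and the helper names `psi`, `coded`, `count_types`, and `check` are our own assumptions, and indices $k$ are 0-based. It evaluates $\psi_\Delta$, builds the coded parameter set $B'$, and checks that the coded instances define $\varphi_k(M;b)$ and its complement.

```python
from itertools import product

def psi(delta, x, params):
    """Evaluate psi_Delta(x; y_1..y_2d, z, z_1..z_2d) for a list delta of d predicates."""
    d = len(delta)
    ys, z, zs = params[:2 * d], params[2 * d], params[2 * d + 1:]
    hits = [k for k in range(2 * d) if z == zs[k]]
    if len(hits) != 1:                 # z must equal exactly one z_k
        return False
    k = hits[0]
    if k < d:                          # z = z_k  ->  phi_k(x; y_k)
        return delta[k](x, ys[k])
    return not delta[k - d](x, ys[k])  # z = z_k  ->  not phi_{k-d}(x; y_k)

def coded(d, b, k, b0, b1, positive):
    """The tuple b_1^{(k)} (positive=True) or b_0^{(k)} (positive=False)."""
    slot = k if positive else d + k
    ys, zs = [b0] * (2 * d), [b0] * (2 * d)
    ys[slot], zs[slot] = b, b1
    return tuple(ys) + (b1,) + tuple(zs)

def count_types(instances, universe):
    """Number of complete types = number of distinct truth tables of the instances."""
    return len({tuple(f(x) for f in instances) for x in universe})

def check(delta, universe, B):
    d, (b0, b1) = len(delta), B[:2]    # b0, b1: two distinct elements of B
    B_prime = set()
    for b, k in product(B, range(d)):
        neg, pos = coded(d, b, k, b0, b1, False), coded(d, b, k, b0, b1, True)
        B_prime |= {neg, pos}
        for x in universe:
            assert psi(delta, x, pos) == delta[k](x, b)
            assert psi(delta, x, neg) == (not delta[k](x, b))
    assert len(B_prime) == 2 * d * len(B)
    n_delta = count_types([lambda x, f=f, b=b: f(x, b) for f in delta for b in B], universe)
    n_psi = count_types([lambda x, p=p: psi(delta, x, p) for p in sorted(B_prime)], universe)
    assert n_delta <= n_psi            # |S^Delta(B)| <= |S^{psi_Delta}(B')|
    return n_delta, n_psi

# Sample Delta = {x < y, x = y} on a hypothetical finite linear order.
print(check([lambda x, y: x < y, lambda x, y: x == y], range(12), [2, 5, 9]))
```

In this instance the two counts coincide, since the instances of $\psi_\Delta$ over $B'$ are exactly the instances of $\Delta$ over $B$ and their negations.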
In the rest of this subsection we give some applications of this lemma. We first note:
Corollary 3.19. Let Φ be a set of L-formulas with the tuple of object variables x and
varying parameter variables such that every L-formula ϕ(x; y) is equivalent in T to a
Boolean combination of formulas in Φ. Then
\[ \mathrm{vc}^T(m) = \sup\{\mathrm{vc}^*(\Delta) : \Delta \subseteq \Phi \text{ finite}\} \]
where m = |x|.
Proof. The inequality “≤” is a consequence of the hypothesis: for each L-formula ϕ(x; y)
there is a finite subset ∆ = ∆(x; y) of Φ such that |S ϕ (B)| ≤ |S ∆ (B)| for each finite
B ⊆ M |y| . The reverse inequality follows from the previous lemma.
Let $M^* \succeq M$ be a monster model of $T$. Consider the expansion $L^{\mathrm{Sh}}$ of $L$ by a new
predicate symbol $R_{\psi,c}(x)$ for every $L$-formula $\psi(x; z)$ and every $c \in (M^*)^{|z|}$. The Shelah
expansion of $M$ is the expansion of $M$ to an $L^{\mathrm{Sh}}$-structure $M^{\mathrm{Sh}}$ where each predicate
symbol $R_{\psi,c}(x)$ as before is interpreted by $M^{|x|} \cap \psi^{M^*}((M^*)^{|x|}; c)$. Shelah showed
[89] (with another proof given in [20]) that if $T$ is NIP then $T^{\mathrm{Sh}} = \mathrm{Th}(M^{\mathrm{Sh}})$ admits
quantifier elimination and is also NIP. This provides an interesting way of constructing
new NIP theories from old ones. The previous lemma and its Corollary 3.19 allow us
to prove that $T$ and $T^{\mathrm{Sh}}$ share the same VC density function:
Corollary 3.20. $\mathrm{vc}^{T^{\mathrm{Sh}}} = \mathrm{vc}^{T}$.
Proof. Fix some $m$ and assume $|x| = m$. The inequality $\mathrm{vc}^{T^{\mathrm{Sh}}}(m) \ge \mathrm{vc}^{T}(m)$ being
obvious, we only need to show that $\mathrm{vc}^{T^{\mathrm{Sh}}}(m) \le \mathrm{vc}^{T}(m)$. Let $\Delta = \Delta(x; y)$ be a finite
set of atomic $L^{\mathrm{Sh}}$-formulas; by Corollary 3.19 and Shelah’s theorem mentioned above,
it suffices to show that $\mathrm{vc}^*(\Delta) \le \mathrm{vc}^T(m)$. Take a finite set $\Psi = \Psi(x; y, z)$ of partitioned
$L$-formulas and some $c \in (M^*)^{|z|}$ such that $\Delta = \{R_{\psi,c}(x; y) : \psi \in \Psi\}$. Let $B \subseteq M^{|y|}$ be
finite, $B^* := B \times \{c\}$, and let $p \in S^{\Delta}(B)$. Let $a$ be an arbitrary realization of $p$ (in
$M^{\mathrm{Sh}}$), and define $p^* := \operatorname{tp}^{\Psi}(a/B^*)$ (in $M^*$). Then for $\psi \in \Psi$ and $b \in B$ we have
\[ \psi(x; b, c) \in p^* \iff M^* \models \psi(a; b, c) \iff M^{\mathrm{Sh}} \models R_{\psi,c}(a; b) \iff R_{\psi,c}(x; b) \in p. \]
In particular, the map $p \mapsto p^* \colon S^{\Delta}(B) \to S^{\Psi}(B^*)$ is injective, so $\mathrm{vc}^*(\Delta) \le \mathrm{vc}^*(\Psi) \le \mathrm{vc}^T(m)$ by Lemma 3.18.
It is well-known (see, e.g., [100, Theorem 4.7]) that the direct product of two NIP
structures is again NIP. As a consequence of the last lemma we can also now estimate
the VC density of a direct product in terms of the VC densities of its factors. We refer
to [42, Section 9.1] for the definition of the product of two L-structures, and to [42,
Corollary 9.6.4] for the Feferman-Vaught Theorem used in the proof below.
Lemma 3.21. Let $M'$ be another infinite $L$-structure, $T' = \mathrm{Th}(M')$, and let $T^{\times} = \mathrm{Th}(M \times M')$ be the $L$-theory of the direct product of $M$ and $M'$. Then
\[ \mathrm{vc}^{T^{\times}} \le \mathrm{vc}^{T} + \mathrm{vc}^{T'}. \]
Proof. Given $n$-tuples $a = (a_1, \dots, a_n) \in M^n$ and $a' = (a'_1, \dots, a'_n) \in (M')^n$ we denote
by $a \times a'$ the $n$-tuple $((a_1, a'_1), \dots, (a_n, a'_n))$ of elements of $M \times M'$; every element of
$(M \times M')^n$ has the form $a \times a'$ for some $a \in M^n$, $a' \in (M')^n$.
Let $\varphi(x; y)$ be an $L$-formula. By the Feferman-Vaught Theorem there exist finitely
many pairs of $L$-formulas $(\theta_i(x; y), \theta'_i(x; y))$, $i \in [n] = \{1, \dots, n\}$, such that for all
$a \in M^{|x|}$, $a' \in (M')^{|x|}$ and $b \in M^{|y|}$, $b' \in (M')^{|y|}$,
\[ M \times M' \models \varphi(a \times a'; b \times b') \iff \text{for some } i \in [n],\ M \models \theta_i(a; b) \text{ and } M' \models \theta'_i(a'; b'). \]
Set $\Theta := \{\theta_1, \dots, \theta_n\}$, $\Theta' := \{\theta'_1, \dots, \theta'_n\}$. Let $C$ be a finite set of tuples from $(M \times M')^{|y|}$. Take $B \subseteq M^{|y|}$, $B' \subseteq (M')^{|y|}$ with $|B|, |B'| \le |C|$ such that each $c \in C$ is of
the form $c = b \times b'$ for a unique pair $(b, b') \in B \times B'$. For every $p \in S^{\varphi}(C)$ choose a
realization $a_p \times a'_p \in (M \times M')^{|x|}$ of $p$ in $M \times M'$, and put
\[ q := \operatorname{tp}^{\Theta}(a_p/B), \qquad q' := \operatorname{tp}^{\Theta'}(a'_p/B'). \]
Then for all $(b, b') \in B \times B'$ we have
\begin{align*}
\varphi(x; b \times b') \in p &\iff M \times M' \models \varphi(a_p \times a'_p; b \times b') \\
&\iff M \models \theta_i(a_p; b) \text{ and } M' \models \theta'_i(a'_p; b'), \text{ for some } i \in [n] \\
&\iff \theta_i(x; b) \in q \text{ and } \theta'_i(x; b') \in q', \text{ for some } i \in [n].
\end{align*}
Hence the map $p \mapsto (q, q')$ is an injection $S^{\varphi}(C) \to S^{\Theta}(B) \times S^{\Theta'}(B')$. In particular we
obtain $\pi^*_{\varphi}(t) \le \pi^*_{\Theta}(t) \cdot \pi^*_{\Theta'}(t)$ for every $t$ and hence $\mathrm{vc}^*(\varphi) \le \mathrm{vc}^*(\Theta) + \mathrm{vc}^*(\Theta')$; here $\pi^*_{\varphi}$
is computed in $M \times M'$ and $\pi^*_{\Theta}$, $\pi^*_{\Theta'}$ in $M$ and $M'$, respectively, and similarly for $\mathrm{vc}^*$.
By Lemma 3.18 therefore $\mathrm{vc}^{T^{\times}}(m) \le \mathrm{vc}^{T}(m) + \mathrm{vc}^{T'}(m)$ where $m = |x|$.
Remark. In a similar way one shows that if $M'$ is a finite $L$-structure and $T^{\times} = \mathrm{Th}(M \times M')$, then $\mathrm{vc}^{T^{\times}} \le \mathrm{vc}^{T}$.
We finish this subsection by noting a further restriction on the growth of vc (cf. also
Lemma 3.7):
Lemma 3.22. $d\,\mathrm{vc}(m) \le \mathrm{vc}(dm)$ for all $d, m > 0$.
Proof. Let $\Delta(x; y)$ be a finite set of $L$-formulas with $|x| = m$. Let $x_1, \dots, x_d$ be new
$m$-tuples of variables and set
\[ \Delta'(x_1, \dots, x_d; y) := \{\varphi(x_i; y) : \varphi(x; y) \in \Delta,\ i = 1, \dots, d\}. \]
Let $B \subseteq M^{|y|}$, $|B| = t \in \mathbb{N}$, such that $r := \pi^*_{\Delta}(t) = |S^{\Delta}(B)|$. Let $a_1, \dots, a_r \in M^m$ be realizations of the types in $S^{\Delta}(B)$. For each $i = (i_1, \dots, i_d) \in [r]^d$ let $a_i := (a_{i_1}, \dots, a_{i_d}) \in (M^m)^d = M^{dm}$. Then the $a_i$ realize pairwise distinct $\Delta'(x_1, \dots, x_d; B)$-types. This yields $(\pi^*_{\Delta}(t))^d = |S^{\Delta}(B)|^d \le |S^{\Delta'}(B)| \le \pi^*_{\Delta'}(t)$. Since $t$ was arbitrary, we
obtain $d\,\mathrm{vc}^*(\Delta) \le \mathrm{vc}^*(\Delta')$. Hence $d\,\mathrm{vc}(m) \le \mathrm{vc}(dm)$ by Lemma 3.18.
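The counting step of this proof can be illustrated on a small example. The sketch below is not part of the argument: the finite linear order, the single sample formula $x < y$, and the function names are our own assumptions. It records the $\Delta'$-type of a $d$-tuple as the tuple of $\Delta$-types of its coordinates and confirms that the number of $\Delta'$-types over $B$ is the $d$-th power of the number of $\Delta$-types over $B$.

```python
from itertools import product

def type_of(delta, a, B):
    """Truth table of the instances phi(a; b), phi in delta, b in B."""
    return tuple(f(a, b) for f in delta for b in B)

def count_types(delta, universe, B):
    return len({type_of(delta, a, B) for a in universe})

def count_power_types(delta, universe, B, d):
    # Delta' consists of phi(x_i; y) for phi in Delta and i < d, so the Delta'-type
    # of a d-tuple is determined by the Delta-types of its coordinates.
    return len({tuple(type_of(delta, a, B) for a in tup)
                for tup in product(universe, repeat=d)})

delta = [lambda x, y: x < y]           # sample Delta = {x < y}
universe, B = range(8), [1, 4, 6]      # hypothetical finite stand-in for M
for d in (1, 2, 3):
    assert count_power_types(delta, universe, B, d) == count_types(delta, universe, B) ** d
```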
3.6. VC density and indiscernible sequences. In this subsection we assume that
M is sufficiently saturated. Recall that πϕ (t) is the maximum size of Sϕ ∩A as A ranges
over t-element subsets of M m , and πϕ∗ (t) is the maximum size of S ϕ (B) as B ranges
over all t-element subsets of M n ; here, as above m = |x|, n = |y|. These definitions
may naturally be relativized to parameters coming from indiscernible sequences. More
precisely:
Definition 3.23. For every $t$ let $\pi_{\varphi,\mathrm{ind}}(t)$ be the maximum of $|S_\varphi \cap A|$ as $A$ ranges
over all sets of the form $A = \{a_0, \dots, a_{t-1}\}$ for some indiscernible sequence $(a_i)_{i\in\mathbb{N}}$ in
$M^m$, and let $\pi^*_{\varphi,\mathrm{ind}}(t)$ be the maximum of $|S^{\varphi}(B)|$ where $B = \{b_0, \dots, b_{t-1}\}$ for some
indiscernible sequence $(b_i)_{i\in\mathbb{N}}$ in $M^n$. We call $\pi_{\varphi,\mathrm{ind}}$ the indiscernible shatter function
of $\varphi$ and $\pi^*_{\varphi,\mathrm{ind}}$ the dual indiscernible shatter function of $\varphi$.
The indiscernible shatter functions give rise to corresponding notions of indiscernible
VC dimension $\mathrm{VC}_{\mathrm{ind}}(\varphi)$ and indiscernible VC density $\mathrm{vc}_{\mathrm{ind}}(\varphi)$ of $\varphi$ (and their duals
$\mathrm{VC}^*_{\mathrm{ind}}(\varphi)$ and $\mathrm{vc}^*_{\mathrm{ind}}(\varphi)$) in a natural way; for example, $\mathrm{vc}^*_{\mathrm{ind}}(\varphi)$ is the infimum of all
$r > 0$ having the property that there is some $C > 0$ such that for all $t$ and indiscernible
sequences $(b_i)_{i\in\mathbb{N}}$ we have $|S^{\varphi}(B)| \le Ct^r$, where $B = \{b_0, \dots, b_{t-1}\}$; if there is no such $r$
then $\mathrm{vc}^*_{\mathrm{ind}}(\varphi) = \infty$.
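As in the classical case (cf. Lemma 2.4) we see that $\pi^*_{\varphi,\mathrm{ind}} = \pi_{\varphi^*,\mathrm{ind}}$ and hence
$\mathrm{VC}_{\mathrm{ind}}(\varphi^*) = \mathrm{VC}^*_{\mathrm{ind}}(\varphi)$ and $\mathrm{vc}_{\mathrm{ind}}(\varphi^*) = \mathrm{vc}^*_{\mathrm{ind}}(\varphi)$. Directly from the definition we have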
πϕ,ind ≤ πϕ and hence VCind (ϕ) ≤ VC(ϕ) and vcind (ϕ) ≤ vc(ϕ). In particular VCind (ϕ)
and vcind (ϕ) are finite if ϕ defines a VC class. Conversely, if VCind (ϕ) is finite, then so
is VC(ϕ). (This follows by saturation of M and extraction of an indiscernible sequence;
see proof of Proposition 4 in [3].) Hence if one of the quantities VC(ϕ), vc(ϕ), VCind (ϕ),
or vcind (ϕ) is finite, then so are all the others.
Another numerical parameter associated to ϕ and defined via indiscernible sequences
is the alternation number alt(ϕ) of ϕ (in M ). This is the largest d (if it exists) such
that for some indiscernible sequence $(a_i)_{i\in\mathbb{N}}$ in $M^m$ and some $b \in M^n$ we have
\[ a_i \in \varphi(M^m; b) \iff a_{i+1} \notin \varphi(M^m; b) \quad\text{for all } i < d - 1. \]
If there is no such d we set alt(ϕ) = ∞. It is well-known (and essentially due to Poizat)
that alt(ϕ) ≤ 2 VCind (ϕ) + 1 (see, e.g., [3, Proposition 3]) and that if alt(ϕ) is finite
then ϕ defines a VC class [3, Proposition 4]. Moreover:
Lemma 3.24. vcind (ϕ) ≤ alt(ϕ) − 1.
Proof. Since this is trivial if $\varphi$ has infinite alternation number, we assume that $d := \mathrm{alt}(\varphi) < \infty$. Let $(a_i)_{i\in\mathbb{N}}$ be an indiscernible sequence in $M^m$ and $A = \{a_0, \dots, a_{t-1}\}$.
Then for each $b \in M^n$, there are fewer than $d$ indices $i < t - 1$ such that $\varphi(a_i; b)$
and $\varphi(a_{i+1}; b)$ have different truth values in $M$, and the set $A \cap \varphi(M^m; b)$ is uniquely
determined by knowledge of these indices $i$ together with the truth value of $\varphi(a_0; b)$. Thus $|A \cap S_\varphi| \le 2 \sum_{i=0}^{d-1} \binom{t}{i} = O(t^{d-1})$ and
hence $\mathrm{vc}_{\mathrm{ind}}(\varphi) \le d - 1$ as required.
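The combinatorial bound used in this proof can be verified numerically. In the following sketch (an illustration only; the function names are ours), a subset of $A = \{a_0, \dots, a_{t-1}\}$ is encoded as a 0/1 string of length $t$, and we count the strings with fewer than $d$ changes between consecutive positions.

```python
from itertools import product
from math import comb

def subsets_with_few_alternations(t, d):
    """Count 0/1 strings of length t with fewer than d changes between neighbours."""
    count = 0
    for bits in product((0, 1), repeat=t):
        changes = sum(bits[i] != bits[i + 1] for i in range(t - 1))
        if changes < d:
            count += 1
    return count

def lemma_bound(t, d):
    """The bound 2 * sum_{i<d} C(t, i) from the proof."""
    return 2 * sum(comb(t, i) for i in range(d))

for t in range(1, 11):
    for d in range(1, 5):
        assert subsets_with_few_alternations(t, d) <= lemma_bound(t, d)
```

For $t \ge 1$ the exact count is $2\sum_{i<d}\binom{t-1}{i}$ (choose the change positions among $t-1$ gaps and the initial truth value), which is why the bound in the proof has some slack.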
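Example. Suppose $S_\varphi \subseteq \binom{M^m}{d}$ where $d > 0$. Then $\mathrm{alt}(\varphi) \le 2d + 1$ and $\mathrm{vc}_{\mathrm{ind}}(\varphi) \le \mathrm{vc}(\varphi) \le d$, and all these inequalities are equalities if $S_\varphi = \binom{M^m}{d}$.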
The previous example shows that the inequality in Lemma 3.24, in general, is strict.
The inequality VCind (ϕ) ≤ VC(ϕ) may be strict if there are no non-trivial indiscernible
sequences:
Example. Suppose L = {A, S, P } where A and S are unary relation symbols and P is
a binary relation symbol, and suppose M is an L-structure, with the interpretations of
A, S and P in M denoted by the same symbols, such that
(1) $|A| = d$ and $|S| = 2^d$;
(2) for $s \in S$, $P(x, s)$ defines a subset of $A$ so that when $s$ runs through $S$ we obtain
all subsets of $A$;
(3) for $s \notin S$, $P(x, s)$ defines the empty set.
Then VC(P ) = d and VCind (P ) = 1 (as well as vc(P ) = vcind (P ) = 0).
The inequality vcind (ϕ) ≤ vc(ϕ) may also be strict, as Lemma 4.8 in the next section
shows. We do not know the answer to the following question:
Question. Is vcind (ϕ) always integral-valued?
(After a first version of this manuscript had been completed, Guingona and Hill [35]
showed that this question indeed has a positive answer.)
We finish this section with a connection between vc∗ind and the Helly number. We
already remarked (see Section 2.4) that if M = (M, <) is a dense linearly ordered set
and ϕ(x; y1 , z1 , y2 , z2 ) = (y1 < x < z1 ∨ y2 < x < z2 ) then the set system Sϕ has infinite
Helly number: that is, for each d there is a finite subfamily of Sϕ which is d-consistent
yet inconsistent. In contrast to this, we have:
Lemma 3.25. Put $d = \lfloor \mathrm{vc}^*_{\mathrm{ind}}(\varphi) \rfloor + 1$. Then for every indiscernible sequence $(b_i)_{i\in\mathbb{N}}$ in
$M^{|y|}$ the set system $\mathcal{S} = \{\varphi(M^m; b_i) : i \in \mathbb{N}\}$ has Helly number at most $d$.
Proof. Suppose for a contradiction that (bi )i∈N is an indiscernible sequence such that
S = {ϕ(M m ; bi ) : i ∈ N} has Helly number larger than d. Then some finite subfamily S0
of S is d-consistent but not consistent. By indiscernibility of (bi ), every finite subfamily
of S of size at least |S0 | has this property. In particular, we can take D ∈ N maximal
such that the set {ϕ(M m ; bi ) : i < D} is consistent. Obviously D ≥ d. Since (bi ) is
indiscernible, we obtain that for any $I_0 \in \binom{\mathbb{N}}{D}$ the set $\{\varphi(M^m; b_i) : i \in I_0\}$ is consistent,
but for any $D' > D$ and any $I_1 \in \binom{\mathbb{N}}{D'}$ the set $\{\varphi(M^m; b_i) : i \in I_1\}$ is inconsistent. Let
$t > D$ be arbitrary, and set $B_t = \{b_i : i < t\}$. For $I \in \binom{t}{D}$ let $q_I(x)$ be the unique
$\varphi$-type over $B_t$ with $\varphi(x; b_i) \in q_I$ for $i \in I$ and
$\neg\varphi(x; b_i) \in q_I$ for $i \notin I$. Since $|I| = D$
every $q_I$ is consistent. Thus $|S^{\varphi}(B_t)| \ge \binom{t}{D} = \Theta(t^D)$. Since $D \ge d$, this contradicts
$\mathrm{vc}^*_{\mathrm{ind}}(\varphi) < d$.
Remark. Note that in the context of the previous lemma, we cannot achieve the stronger
conclusion that $\mathcal{S}$ has breadth at most $d$: for the formula $\varphi(x; y) = (x \ne y)$ and any
indiscernible sequence $(b_i)$, the set system $\mathcal{S}$ always has infinite breadth.
By Lemma 3.25 and extraction of an indiscernible sequence (using that M is assumed to be sufficiently saturated) we obtain a consequence which does not mention
indiscernibles:
Corollary 3.26. Suppose the set system $S_\varphi$ is $d$-consistent, where $d = \lfloor \mathrm{vc}^*(\varphi) \rfloor + 1$.
Then there is an infinite subset of Sϕ which is consistent.
This is a weak version of a theorem of Matoušek [67], according to which, if Sϕ is
d-consistent, where d > vc∗ (ϕ), then one may write Sϕ = S1 ∪· · ·∪SN (for some N ∈ N)
where each Si is consistent.
4. Some VC Density Calculations
In this section we give an example of a formula in the language of rings which, in
every infinite field, defines a set system with fractional VC density, depending on the
characteristic of the field. The construction of this formula (which is inspired by an
example by Assouad [7], who in turn credits Frankl) proceeds in two steps: we first
associate to a given partitioned formula ϕ a bigraph (= bipartite graph with a fixed
ordering of the bipartition of the vertex set), and then we realize the set of edges of
this bigraph as a definable family $S_{\hat\varphi}$. For our example we choose $\varphi$ so as to encode
point-line incidences in the affine plane; the calculation of $\mathrm{vc}(\hat\varphi)$ in characteristic zero
uses an analogue of the Szemerédi-Trotter Theorem due to Tóth. We also discuss
the question whether VC density in NIP theories can take irrational values, and give
examples of formulas in NIP theories whose shatter function is not asymptotic to a real
power function.
Throughout this section L is a first-order language and M is an L-structure.
4.1. Associating a bigraph to a partitioned formula. We follow [59] and make a
distinction between bipartite graphs and bigraphs. A bipartite graph is a graph (V, E)
whose set V of vertices can be partitioned into two classes such that all edges connect
vertices in different classes. By a bigraph we mean a triple G = (X, Y, Φ) where X and Y
are (not necessarily disjoint) sets and Φ ⊆ X × Y . Thus a bipartite graph can be viewed
as a bigraph if we fix a partition and specify which bipartition class is first and second.
Conversely, if G = (X, Y, Φ) is a bigraph then we obtain a bipartite graph (V (G), E(G))
(the bipartite graph associated to G) by letting V (G) be the disjoint union of the sets
X and Y , and E(G) = Φ; by abuse of language we call V (G) the set of vertices of
G and E(G) the set of edges of G. We also say that G is a bigraph on V = V (G).
(What we call a bigraph G = (X, Y, Φ) is sometimes called an incidence structure, and
(V (G), E(G)) is called its Levi graph or incidence graph.) A bigraph is said to be finite
if its set of vertices is finite. It is easy to see that a finite bigraph $G$ can have at most
$\frac{1}{4} |V(G)|^2$ edges.
A bigraph $G' = (X', Y', \Phi')$ is a sub-bigraph of $G = (X, Y, \Phi)$ if $X' \subseteq X$, $Y' \subseteq Y$,
and $\Phi' \subseteq \Phi$. We say that a bigraph $G$ contains a given bigraph $G'$ (as a sub-bigraph) if
$G'$ is isomorphic to a sub-bigraph of $G$. Given a bigraph $G = (X, Y, \Phi)$ and a subset $V$
of its vertex set $V(G)$, we denote by
\[ G{\restriction}V := \bigl(X \cap V,\ Y \cap V,\ \Phi \cap (V \times V)\bigr) \]
the sub-bigraph of $G$ induced on $V$. The complement of a bigraph $G = (X, Y, \Phi)$ is the
bigraph $\neg G := (X, Y, \neg\Phi)$, and its dual is $G^* := (Y, X, \Phi^*)$ where $\neg\Phi$ and $\Phi^*$ are as in
Section 2.3.
Let ϕ(x; y) be a partitioned L-formula, where |x| = m, |y| = n. We may associate a
bigraph Gϕ = (X, Y, Φ) to ϕ and M , where X = M m , Y = M n , and
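\[ \Phi = \varphi(M^m; M^n) = \{(a, b) \in M^m \times M^n : M \models \varphi(a; b)\}. \]
Note that $G_{\neg\varphi} = \neg G_\varphi$ and $G_{\varphi^*} = (G_\varphi)^*$. If we want to stress the dependence of $G_\varphi$ on
$M$, then we write $G^M_\varphi$ instead of $G_\varphi$. If $\varphi$ is invariant under the extension $M \subseteq N$ of
$L$-structures, then $G^N_\varphi{\restriction}V = G^M_\varphi$ where $V = V(G^M_\varphi)$.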
From now on until the end of this subsection we assume that M is infinite and m = n.
The collection
\[ E(G_\varphi) = \{(a, b) : (a, b) \in \varphi(M^m; M^m)\} \subseteq M^m \times M^m \]
of edges of $G_\varphi$ then maps naturally onto the definable family
\[ S_{\hat\varphi} = \bigl\{\{a, b\} : (a, b) \in \varphi(M^m; M^m)\bigr\} \subseteq \binom{M^m}{\le 2} \]
of subsets of $M^m$ by a map whose fibers have at most 2 elements; here $\hat\varphi(v; x, y)$ is the
partitioned $L$-formula with object variables $v = (v_1, \dots, v_m)$ and parameter variables
$(x, y)$ given by
\[ \hat\varphi(v; x, y) := \varphi(x; y) \wedge (v = x \vee v = y). \]
Note that $\mathrm{VC}(\hat\varphi) \le 2$. Also, $S_{\widehat{\varphi^*}} = S_{\hat\varphi}$ and hence $\widehat{\varphi^*}$ and $\hat\varphi$ have the same VC dimension
and VC density. A bound on the number of subsets of a given finite set which are cut
out by $S_{\hat\varphi}$ may be computed as follows:
Lemma 4.1. Let $A \subseteq M^m$ be finite. Then
\[ |A_0| + \tfrac{1}{2} |E(G_\varphi{\restriction}V)| \le |A \cap S_{\hat\varphi}| \le 1 + |A_0| + |E(G_\varphi{\restriction}V)| \]
where
(1) $A_0$ is the set of all $a \in A$ such that $M \models \varphi(a; b)$ or $M \models \varphi(b; a)$ for some
$b \in M^m$, but there is no $b \in A$ with $M \models \varphi(a; b)$ or $M \models \varphi(b; a)$, and
(2) $V \subseteq V(G_\varphi)$ is the disjoint union of $A$ considered as a subset of $X$ and $A$
considered as a subset of $Y$.
Proof. Each set $S \in A \cap S_{\hat\varphi}$ is of one of the following types: $S = \emptyset$; $S = \{a\}$ where
$a \in A_0$; or $S = \{a, b\}$ where $a, b \in A$ with $M \models \varphi(a; b)$ or $M \models \varphi(b; a)$. Each set of the
last two types actually occurs in $A \cap S_{\hat\varphi}$, whereas $S = \emptyset$ only occurs iff there is some
edge $(a, b)$ of $G_\varphi$ with $a, b \notin A$.
Hence if we set
\[ \Pi_\varphi(t) := \max\{|E(G_\varphi{\restriction}V)| : V \subseteq V(G_\varphi),\ |V| = t\} \in \mathbb{N}, \]
then the lemma shows that
\[ \tfrac{1}{2}\,\Pi_\varphi(t) \le \pi_{\hat\varphi}(t) \le 1 + t + \Pi_\varphi(2t) \quad\text{for every } t. \tag{4.1} \]
This observation opens up a road to computing (upper or lower) bounds on the VC
density of the formula $\hat\varphi$: find a bound on the number of edges of the subgraph of $G_\varphi$
induced on finite subsets of its vertex set, in terms of the number of vertices. In the
following we give some applications of this approach.
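The estimate behind (4.1) can be tried out on a small instance. The sketch below is illustrative only: it takes the point-line incidence relation $x_2 = y_1 x_1 + y_2$ over the prime field $\mathbb{F}_p$ as a hypothetical finite stand-in for $M$, and the function names are ours. It computes the trace $A \cap S_{\hat\varphi}$ directly and checks the cruder two-sided estimate $\frac12 |E(G_\varphi{\restriction}V)| \le |A \cap S_{\hat\varphi}| \le 1 + |A| + |E(G_\varphi{\restriction}V)|$, which is all that (4.1) uses.

```python
from itertools import product

def incident(a, b, p):
    """phi(a; b): the point a = (a1, a2) lies on the line x2 = b1*x1 + b2 over F_p."""
    return a[1] == (b[0] * a[0] + b[1]) % p

def traces_and_edges(A, p):
    """Return (|A ∩ S_phihat|, |E(G_phi restricted to A ⊔ A)|) over F_p^2, p prime."""
    plane = list(product(range(p), repeat=2))
    A = set(A)
    # S_phihat consists of the sets {a, b} with phi(a; b); take their traces on A.
    traces = {frozenset((a, b)) & A
              for a in plane for b in plane if incident(a, b, p)}
    edges = sum(incident(a, b, p) for a in A for b in A)
    return len(traces), edges

p = 5
plane = list(product(range(p), repeat=2))
for size in (1, 4, 9, 16):
    A = plane[:size]
    n_traces, n_edges = traces_and_edges(A, p)
    assert n_edges / 2 <= n_traces <= 1 + len(A) + n_edges
```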
For positive integers $r$ and $s$ we denote by $K_{r,s} := ([r], [s], [r] \times [s])$ the complete
bigraph with the vertex set $[r] \cup [s]$. The following is a fundamental fact about finite
bigraphs:
Theorem 4.2 (Kővári, Sós and Turán [51]). Let $r \le s$ be positive integers. There exists
a real number $C = C(r, s)$ such that every finite bigraph $G$ which does not contain $K_{r,s}$
as a sub-bigraph has at most $C\,|V(G)|^{2-1/r}$ edges.
(In fact, a more precise bound is also available, in terms of the sizes of the vertex
sets X and Y , but we won’t need this.)
Corollary 4.3. Let $r \le s$ be positive integers. There is a real number $C_1 = C_1(r, s)$
with the following property: if $\varphi(x; y)$ is an $L$-formula such that $G_\varphi$ does not contain
$K_{r,s}$ as a subgraph, then $\pi_{\hat\varphi}(t) \le C_1 t^{2-1/r}$ for every $t$; in particular, $\mathrm{vc}(\hat\varphi) \le 2 - \frac{1}{r}$.
Proof. If $V \subseteq V(G_\varphi)$ is finite, and the bigraph $G_\varphi{\restriction}V$ does not contain $K_{r,s}$, then
$|E(G_\varphi{\restriction}V)| \le C\,|V|^{2-1/r}$ by Theorem 4.2, where $C = C(r, s) > 0$ is as in that theorem.
Thus $\pi_{\hat\varphi}(t) \le 1 + t + \Pi_\varphi(2t) \le 2(1 + 2^{1-1/r} C)\, t^{2-1/r}$ by (4.1).
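For the reader's convenience, the final estimate in this proof can be unwound as follows. This is only a spelled-out version of the one-line computation, under the assumption $t \ge 1$ (so that $1 \le t \le t^{2-1/r}$), with $C = C(r,s)$ the constant of Theorem 4.2.

```latex
% Assumes t >= 1 and that G_phi contains no K_{r,s}; C = C(r,s) as in Theorem 4.2.
\begin{align*}
\pi_{\hat\varphi}(t)
  &\le 1 + t + \Pi_\varphi(2t)
     && \text{by (4.1)} \\
  &\le 2t + C\,(2t)^{2-1/r}
     && \text{Theorem 4.2 with } |V| = 2t, \text{ and } 1 \le t \\
  &\le 2\,t^{2-1/r} + 2 \cdot 2^{1-1/r}\, C\, t^{2-1/r}
     && t \le t^{2-1/r} \text{ since } 2 - 1/r \ge 1 \\
  &= 2\bigl(1 + 2^{1-1/r} C\bigr)\, t^{2-1/r}.
\end{align*}
```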
Given integers r, s ≥ 1, the bigraph Gϕ contains Kr,s if and only if there are pairwise
distinct a1 , . . . , ar ∈ M m and pairwise distinct b1 , . . . , bs ∈ M m such that M |= ϕ(ai ; bj )
for all i ∈ [r], j ∈ [s]. It is interesting to note that if Gϕ does not contain Kr,s as a
sub-bigraph, for some r, s ≥ 1, then the bigraph G¬ϕ associated to ¬ϕ does contain
Kt,t , for every t ≥ 1: by an analogue of Ramsey’s Theorem for bigraphs due to Erdős
and Rado [32], for every t there exists an n such that for all bigraphs G with |V (G)| ≥ n,
one of $G$, $\neg G$ contains $K_{t,t}$ as a sub-bigraph. Hence in this case the VC density of the
formula $\widehat{\neg\varphi}$ associated to $\neg\varphi$ equals 2, by (4.1).
4.2. Point-line incidences. Let K be an infinite field, construed as a first-order structure in the language of rings as usual. The partitioned formula
\[ \varphi(x_1, x_2; y_1, y_2) := x_2 = y_1 x_1 + y_2 \]
gives rise to the bigraph $G_\varphi = (X, Y, \Phi)$ where $X = Y = K^2$ and
\[ \Phi = \{((\eta, \xi), (a, b)) \in K^2 \times K^2 : \xi = a\eta + b\}. \]
We may think of $V(G_\varphi) = X \cup Y$ as the disjoint union of the set $X$ of points $p = (\eta, \xi) \in K^2$ in the affine plane $\mathbb{A}^2(K)$ over $K$ and the set $Y$ of non-vertical lines $\ell$ in $\mathbb{A}^2(K)$;
thus $E(G_\varphi)$ is the set of point-line incidences $(p, \ell)$ where $p \in \mathbb{A}^2(K)$ and $\ell \subseteq \mathbb{A}^2(K)$ is
a non-vertical line containing $p$. The bigraph $G_\varphi$ does not contain $K_{2,2}$ as a subgraph.
(Two distinct points in $\mathbb{A}^2(K)$ lie on a unique line.) Hence by Corollary 4.3:
Corollary 4.4. There is a real number $C_1 > 0$ (independent of $K$) such that $\pi_{\hat\varphi}(t) \le C_1 t^{3/2}$ for every $t$; in particular, $\mathrm{vc}(\hat\varphi) \le \frac{3}{2}$.
Note that this bound is better than what we get from the general estimate $\mathrm{vc} \le \mathrm{VC}$,
since $\mathrm{VC}(\hat\varphi) = 2$. Also, if $K = \mathbb{R}$, then for our original formula $\varphi$ we have $\pi_\varphi(t) = 1 + t + \binom{t}{2}$ for every $t$. In particular $\mathrm{VC}(\varphi) = \mathrm{vc}(\varphi) = 2$.
A lower bound on $\mathrm{vc}(\hat\varphi)$ is given by:
Lemma 4.5. Suppose $K$ has characteristic 0. Then $\mathrm{vc}(\hat\varphi) \ge \frac{4}{3}$.
Proof. This is due to Erdős, with the following simpler argument by Elekes [31]: let $k$
be a positive integer, $t = 4k^3$, and consider the subsets
\[ P := \{(\eta, \xi) : \eta = 0, 1, \dots, k - 1,\ \xi = 0, 1, \dots, 4k^2 - 1\}, \]
\[ L := \{(a, b) : a = 0, 1, \dots, 2k - 1,\ b = 0, 1, \dots, 2k^2 - 1\} \]
of $\mathbb{Z}^2$, and set $V := P \cup L \subseteq V(G_\varphi)$. Then for each $i = 0, 1, \dots, k - 1$, each line $\xi = a\eta + b$
with $(a, b) \in L$ contains a point $(\eta, \xi) \in P$ with $\eta = i$, so
\[ |E(G_\varphi{\restriction}V)| \ge k \cdot |L| = 4k^4 = \frac{1}{4^{1/3}}\, t^{4/3} = \tfrac{1}{4}\, |V|^{4/3} \]
and hence $\mathrm{vc}(\hat\varphi) \ge \frac{4}{3}$ by (4.1).
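The incidence count in this construction is easy to confirm by machine for small $k$. The following sketch is an illustration under our own naming (`elekes_incidences` is hypothetical); it uses the incidence relation $x_2 = y_1 x_1 + y_2$ of $\varphi$, so a point $(\eta, \xi)$ lies on the line $(a, b)$ when $\xi = a\eta + b$.

```python
def elekes_incidences(k):
    """Return (|P|, |L|, number of incidences) for the grid construction with parameter k."""
    points = {(eta, xi) for eta in range(k) for xi in range(4 * k * k)}
    lines = [(a, b) for a in range(2 * k) for b in range(2 * k * k)]
    # Every point of P has first coordinate in range(k), so this counts all incidences.
    incidences = sum((eta, a * eta + b) in points
                     for (a, b) in lines for eta in range(k))
    return len(points), len(lines), incidences

for k in (1, 2, 3, 4):
    n_points, n_lines, n_inc = elekes_incidences(k)
    assert n_points == n_lines == 4 * k ** 3        # t = 4k^3 on each side
    assert n_inc == k * n_lines == 4 * k ** 4       # each line meets P in k points
    assert 4 * n_inc == round((n_points + n_lines) ** (4 / 3))  # = |V|^{4/3} / 4 with |V| = 8k^3
```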
The precise value of $\mathrm{vc}(\hat\varphi)$ depends on the characteristic of $K$:
Proposition 4.6.
(1) Suppose $K$ has characteristic 0. Then $\mathrm{vc}(\hat\varphi) = \frac{4}{3}$.
(2) Suppose $K$ has positive characteristic. Then $\mathrm{vc}(\hat\varphi) = \frac{3}{2}$.
In the proof of this proposition we use the following generalization of a famous theorem of Szemerédi and Trotter [96] (although a weaker version of this theorem from [93],
with a somewhat simpler proof, would also suffice for our purposes):
Theorem 4.7 (Tóth [97]). There exists a real number $C$ such that for all $m, n > 0$
there are at most $C(m^{2/3} n^{2/3} + m + n)$ incidences among $m$ points and $n$ lines in the
affine plane over $\mathbb{C}$.
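Proof of Proposition 4.6. The lower bound $\mathrm{vc}(\hat\varphi) \ge \frac{4}{3}$ in (1) was shown in the previous
lemma. From Theorem 4.7 and Lemma 4.1 we obtain $\mathrm{vc}^{\mathbb{C}}(\hat\varphi) \le \frac{4}{3}$. If $K$ is any field of
characteristic 0 with algebraic closure $K^{\mathrm{alg}}$, then $\pi^{K}_{\hat\varphi} \le \pi^{K^{\mathrm{alg}}}_{\hat\varphi} = \pi^{\mathbb{C}}_{\hat\varphi}$ by Lemmas 3.1 and
3.2, showing part (1) of Proposition 4.6.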
The upper bound $\mathrm{vc}(\hat\varphi) \le \frac{3}{2}$ in (2) is a consequence of Corollary 4.4. For the lower
bound we use the following observation: if $F$ is a finite subfield of $K$, say $|F| = q$, then
$|V(G^F_\varphi)| = 2q^2$ and $|E(G^F_\varphi)| = q^3$, hence
\[ |E(G^K_\varphi{\restriction}V)| = |E(G^F_\varphi)| = \frac{1}{\sqrt{8}}\, |V|^{3/2} \quad\text{where } V = V(G^F_\varphi). \]
Together with (4.1) this yields the inequality $\mathrm{vc}(\hat\varphi) \ge \frac{3}{2}$ in (2).
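The count $|E(G^F_\varphi)| = q^3$ can be checked directly when $q$ is prime, so that $F = \mathbb{Z}/q\mathbb{Z}$. The sketch below is illustrative only (the function name is ours, and prime powers are not covered, since modular arithmetic is then no longer field arithmetic).

```python
def plane_incidences(q):
    """Number of pairs (point, line) over F_q, q prime, with xi = a*eta + b."""
    return sum((a * eta + b) % q == xi
               for eta in range(q) for xi in range(q)
               for a in range(q) for b in range(q))

for q in (2, 3, 5, 7):
    edges = plane_incidences(q)
    vertices = 2 * q * q                      # q^2 points and q^2 non-vertical lines
    assert edges == q ** 3                    # each of the q^2 lines carries q points
    assert 8 * edges ** 2 == vertices ** 3    # i.e. |E| = |V|^{3/2} / sqrt(8)
```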
Proposition 4.6 shows in particular that there is no hope for a “Łoś Theorem” for
VC density: if $M$ is a non-principal ultraproduct of a family $(M_i)_{i\in I}$ of infinite $L$-structures, then one may have $\mathrm{vc}^{M}(\varphi) \ne \mathrm{vc}^{M_i}(\varphi)$ for all $i \in I$.
It is interesting to contrast Proposition 4.6 with the outcome of only considering
parameters from an indiscernible sequence:
Lemma 4.8. The formula $\hat\varphi$ has alternation number 2, hence $\mathrm{vc}_{\mathrm{ind}}(\hat\varphi) = 1$.
Proof. It suffices to show $\mathrm{alt}(\hat\varphi) = 2$, since then Lemma 3.24 yields $\mathrm{vc}_{\mathrm{ind}}(\hat\varphi) = 1$.
Suppose for a contradiction that $(a_i)_{i\in\mathbb{N}}$ is an indiscernible sequence in $K^2$ and $b = (p, \ell) \in K^2 \times K^2$ witnessing that $\mathrm{alt}(\hat\varphi) \ge 3$. We think of the elements of $K^2$ both as
points in the affine space $\mathbb{A}^2(K)$ over $K$ and as non-vertical lines in $\mathbb{A}^2(K)$, and let $i, j$
range over $\{0, 1, 2, 3\}$. The $a_i$ are pairwise distinct, $p \in \ell$, and $a_i = p$, $a_j = \ell$ for some
$i \ne j$; hence $a_i \in a_j$ for some $i \ne j$. If $a_i \in a_j$ where $i < j$, then $a_i \in a_j$ for all $i < j$
(by indiscernibility) and hence $a_0, a_1 \in a_2 \cap a_3$, and this forces $a_0 = a_1$ or $a_2 = a_3$, in
both cases a contradiction. Similarly the assumption that $a_i \in a_j$ with $i > j$ leads to a
contradiction.
Many other results in the combinatorial literature lead to non-trivial (upper and
lower) bounds on $\mathrm{vc}(\hat\varphi)$ if $\varphi$ encodes the incidence of points on various geometric objects;
see [66, Chapter 4] or [74]. For example, let $\mathbb{R} = (\mathbb{R}, 0, 1, +, -, \times, <)$ be the ordered
field of real numbers. Let
\[ \varphi(x_1, x_2; y_1, y_2) := (x_1 - y_1)^2 + (x_2 - y_2)^2 = 1, \]
so $S^{\mathbb{R}}_\varphi$ is the collection of circles with radius 1 in the plane, and $E(G^{\mathbb{R}}_\varphi)$ is the set
of incidences between points in $\mathbb{R}^2$ and circles of radius 1. Then an analogue of the
Szemerédi-Trotter Theorem [95] (or a more general result due to Pach and Sharir [73]
on families of simple plane curves) and (4.1) yields $\mathrm{vc}^{\mathbb{R}}(\hat\varphi) \le \frac{4}{3}$. (However, it is unknown
whether this bound is sharp, cf. [74, Section 2].)
4.3. Irrational VC density. In [8] it is shown that for every real number $r \ge 1$ there
exists a set system $\mathcal{S} \subseteq \binom{\mathbb{N}}{\lceil r \rceil}$ with $\mathrm{vc}(\mathcal{S}) = r$. We do not know the answer to the
following question (though we suspect the answer to be negative):
Question. Is the VC density of a formula in a NIP theory always rational?
Let $L_{\mathrm{gr}} = \{E\}$ be the language with a single binary relation symbol $E$. The $L_{\mathrm{gr}}$-structures are nothing but the (directed) graphs (with $E$ interpreted as the edge relation). Given a graph $G$ we denote by $V(G)$ its set of vertices and by $E(G)$ its set of
edges. Spencer and Shelah [91] established a 0-1-law for $L_{\mathrm{gr}}$-sentences about random
(symmetric, loopless) graphs with $n$ vertices and edge probability $n^{-\alpha}$, where $\alpha$ is an
irrational number between 0 and 1. We denote the resulting complete $L_{\mathrm{gr}}$-theory by $T_\alpha$.
It was shown by Baldwin and Shelah [9] that Tα is stable. (This can also be checked
by simply verifying that Tα is superflat in the sense of [77]; cf. Section 4.4 below.) In
particular, Tα is NIP, so it makes sense to investigate VC density of formulas in Tα . We
consider the following Lgr -formula:
ϕ(x; y) = E(x, y) ∨ x = y.
For any graph G and vertex v of G, the formula ϕ(x; v) defines the (closed) neighborhood
of v, i.e., the set consisting of v together with all vertices adjacent to it. It is tempting
to guess that vc(ϕ) = 1/α. (This would give rise to a negative answer of the question
posed above.) However, it turns out that vc(ϕ) is an integer:
Lemma 4.9. $\mathrm{vc}(\varphi) = \lfloor 1/\alpha \rfloor$.
Before we give the proof, we recall some basic facts about the theory Tα ; our main
reference is [94]. We let G be a model of Tα .
A rooted graph is a pair (R, H) where H is a finite graph and R a proper subset of its
set of vertices; the elements of R will be called roots. We consider each finite non-empty
graph H as a rooted graph by identifying it with (∅, H). Given a rooted graph (R, H), a
rooted graph (R, H′ ), where H′ is a subgraph of H whose vertex set properly contains R,
is called a rooted subgraph of (R, H).
[Figure 4.1. The rooted graph associated to a set system, with roots R = [t] and non-roots S = {S, S′ , . . . }.]
A weak embedding of a rooted graph (R, H) into G is an injective map ι : V (H) →
V (G) such that for all roots v and non-roots w of (R, H), v and w are adjacent in H iff
ι(v) and ι(w) are adjacent in G; such a weak embedding is called an embedding if also
any two non-roots v and w of (R, H) are adjacent in H iff ι(v) and ι(w) are adjacent
in G. Note that there is no requirement about edges between roots. (This terminology
does not appear in [94] which talks about “(R, H)-extensions” instead.)
Let (R, H) be a rooted graph. The average degree of (R, H) is adeg(R, H) := 2e/v
where v = v(R, H) > 0 is the number of vertices of H which are not roots and e =
e(R, H) is the number of edges of H which do not have both ends in R. The maximum
average degree mdeg(R, H) of (R, H) is defined as the maximum of adeg(R, H′ ) where
(R, H′ ) is a rooted subgraph of (R, H). If adeg(R, H) > 2/α then (R, H) is called dense,
and sparse otherwise (i.e., if adeg(R, H) < 2/α). If mdeg(R, H) < 2/α then (R, H) is
called safe, and unsafe otherwise.
Now if H is dense then G does not contain a copy of H, whereas if H is safe then G
contains a copy (indeed, an induced copy) of H. More generally, if (R, H) is unsafe then
there is no weak embedding of (R, H) into G [94, p. 69], and if (R, H) is safe then every
injective map R → V (G) extends to an embedding of (R, H) into G [94, Theorem 5.2.1].
Let now S be a non-empty set system on [t] = {1, . . . , t}, where t > 0. We associate a
rooted graph (R, H) = (RS , HS ) to S as follows: the set of vertices of H is the disjoint
union of [t] and S, the set of roots is R = [t], there are no edges between two roots and
no edges between two non-roots, and a root i ∈ [t] and a non-root S ∈ S are related
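by an edge iff i ∈ S. (Cf. Figure 4.1.) Note that this rooted graph has average degree
$$\operatorname{adeg}(R, H) = \frac{2}{|\mathcal{S}|} \sum_{S \in \mathcal{S}} |S|$$
and maximum average degree
$$\operatorname{mdeg}(R, H) = \max_{\emptyset \neq \mathcal{S}' \subseteq \mathcal{S}} \frac{2}{|\mathcal{S}'|} \sum_{S \in \mathcal{S}'} |S|.$$
So if $\mathcal{S} \subseteq \binom{[t]}{k}$ where k ∈ {0, . . . , t} then adeg(R, H) = mdeg(R, H) = 2k; hence if in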
addition k < 1/α then (R, H) is safe (so there exists an embedding of (R, H) into G)
whereas if k > 1/α then (R, H) is dense (and so there is no weak embedding of (R, H)
into G).
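The two degree formulas above can be checked on small set systems. The following is a minimal sketch, not part of the original argument: the function names adeg, mdeg and is_safe are ours, the maximum is computed by brute force over all non-empty subsystems, and the irrational α is only approximated by a floating-point number.

```python
from fractions import Fraction
from itertools import combinations

def adeg(system):
    """Average degree 2e/v of the rooted graph attached to a non-empty set system."""
    sets = list(system)
    edges = sum(len(s) for s in sets)      # one edge for each incidence i in S
    return Fraction(2 * edges, len(sets))  # v = number of non-roots = |system|

def mdeg(system):
    """Maximum of adeg over all non-empty subsystems (brute force, small inputs only)."""
    sets = list(system)
    return max(adeg(sub)
               for r in range(1, len(sets) + 1)
               for sub in combinations(sets, r))

def is_safe(system, alpha):
    """Safe means mdeg < 2/alpha; alpha is a float standing in for an irrational."""
    return mdeg(system) < 2 / alpha

# all 2-element subsets of [4]: a uniform system with k = 2
pairs = list(combinations(range(1, 5), 2))
triples = list(combinations(range(1, 5), 3))
alpha = 7 ** -0.5   # 1/alpha is about 2.65, so floor(1/alpha) = 2
```

For a k-uniform system both quantities equal 2k, so with this choice of α the system of pairs is safe while the system of triples is not. For a non-uniform system such as [(1,), (1, 2, 3)] the two quantities differ.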
Proof of Lemma 4.9. Applying the remarks above to $\mathcal{S} = \binom{[t]}{\lfloor 1/\alpha \rfloor}$, where t > 0, we see
that, as ⌊1/α⌋ < 1/α, there are pairwise distinct vertices a1 , . . . , at and bS (S ∈ S) of
G such that ai and bS are adjacent iff i ∈ S, i.e., writing A = {a1 , . . . , at } we have
A ∩ ϕ(G; bS ) = {ai : i ∈ S} and hence |A ∩ Sϕ | ≥ |S|. So $\pi_\varphi(t) \ge \binom{t}{\lfloor 1/\alpha \rfloor}$ for each t,
therefore vc(ϕ) ≥ ⌊1/α⌋.
To show the reverse inequality suppose for a contradiction that vc(ϕ) > ⌊1/α⌋. Let
ϱ be a real number with vc(ϕ) > ϱ > ⌊1/α⌋. Note that for every set A of vertices of G
the set system A ∩ Sϕ is the union of the set system
{ {b} ∪ {a ∈ A : (a, b) ∈ E(G)} : b ∈ A }    (4.2)
consisting of at most |A| sets, and
{ {a ∈ A : (a, b) ∈ E(G)} : b ∈ V (G) \ A }.    (4.3)
Since vc(ϕ) > ϱ, for every C ≥ 1 there are arbitrarily large t > 0 and A ⊆ V (G) with
|A| = t such that |A ∩ Sϕ | ≥ 2Ct^ϱ . The set system (4.2) has at most t elements; hence
(4.3) contains at least 2Ct^ϱ − t sets and so, since ϱ > ⌊1/α⌋ ≥ 1, contains at least Ct^ϱ
sets. Identifying A with [t], the set system (4.3) on A thus gives rise to a set system S
on [t] with |S| ≥ Ct^ϱ whose associated rooted graph (RS , HS ) weakly embeds into G.
On the other hand, let S be any set system on [t] such that (RS , HS ) weakly embeds
into G, and for k = 0, . . . , t consider the set system $\mathcal{S}_k := \mathcal{S} \cap \binom{[t]}{k}$ on [t]. If Sk ≠ ∅ then
(RSk , HSk ) is a rooted subgraph of (RS , HS ) and hence
2k = adeg(RSk , HSk ) ≤ mdeg(RS , HS ) < 2/α.
Therefore S does not contain a k-element subset of [t] with k > 1/α, i.e., $\mathcal{S} \subseteq \binom{[t]}{\le \lfloor 1/\alpha \rfloor}$.
Hence |S| ≤ Ct^{⌊1/α⌋} where C = Cα is a constant only depending on α. This contradicts
the previous paragraph.
We remark that a similar analysis shows that the simpler formula E(x, y) also has
VC density ⌊1/α⌋ in Tα . We chose ϕ as above because it allows us to compare Lemma 4.9
with the main result of [4], where the precise value of the VC dimension of ϕ (as it
depends on α) is computed. In particular, [4, Corollary 8] shows that if 0 < α < 93/650 then
⌊1/α⌋ + 3 ≤ VC(ϕ) ≤ ⌊1/α + 3(α + 1)⌋.
4.4. Shatter functions not growing like a power. We finish this section with two
examples of VC classes definable in NIP theories whose shatter function is not asymptotic to a real power function.
4.4.1. The hypercube. Let Q be the “infinitary hypercube”, i.e., the (symmetric, loopless) graph whose vertex set is the set of all sequences s = (sn ) in {0, 1}^ℕ with finite
support, with two sequences related by an edge iff they differ in exactly one component.
(Alternatively, Q can be represented as the set of all finite sets of natural numbers, with
an edge between two of them iff their symmetric difference is a singleton.) Note that Q is the
increasing union $Q = \bigcup_{d>0} Q_d$ of its induced subgraphs Qd with vertex set
V (Qd ) = { s = (sn ) ∈ Q : sn = 0 for n ≥ d },
which we may identify with {0, 1}^d in the natural way. (So Qd is the d-dimensional
hypercube.) We construe Q as an Lgr -structure. In [77] a condition sufficient for a
(symmetric) graph to be stable is introduced, called superflatness: a graph G is superflat
if for every m there is some n such that no subdivision of the complete graph Kn on n
vertices, obtained by placing at most m additional vertices on each edge, embeds into G.
Note that the graph Q is not superflat: in fact, for every d there is an embedding of a
subdivision of Kd+1 , obtained by placing at most one additional vertex on each edge,
into Qd , cf. [37]. However, we do have:
Proposition 4.10. Q is ω-stable.
Towards a proof of this proposition, we first introduce some notation and terminology:
Given a language L let L(P ) = L ∪ {P } where P is a new predicate symbol, and given
an L-structure M and a subset A of its domain let (M , A) be the expansion of M to
an L(P )-structure obtained by interpreting P by A. The induced structure Aind on A
is the structure whose language consists of an m-ary relation symbol Rϕ for every L-formula ϕ(x), where m = |x|, interpreted in Aind by ϕ^M (M^m ) ∩ A^m . An L(P )-formula
ϕ(x), where x = (x1 , . . . , xm ), is said to be bounded if it has the following form (slightly
abusing syntax):
ϕ(x) = ♦1 y1 ∈ P · · · ♦n yn ∈ P ψ(x1 , . . . , xm , y1 , . . . , yn ),
where ♦i ∈ {∀, ∃} and ψ is an L-formula. Casanovas and Ziegler have shown:
Theorem 4.11. Let M be a structure in a language L and let A ⊆ M .
(1) Suppose M is strongly minimal. Then in (M , A), every L(P )-formula is equivalent to a bounded formula.
(2) Suppose that in (M , A), every L(P )-formula is equivalent to a bounded formula,
and let λ ≥ |L| be a cardinal. If both M and Aind are λ-stable then (M , A) is
λ-stable.
(See [18, Corollary 5.4 and Proposition 3.1]; part (1) had actually first been shown
by Pillay [76].)
Consider now, slightly more general than necessary, an arbitrary field K, and let
LK = {0, +, (λ· )λ∈K } be the language of K-vector spaces. Let M be an infinite-dimensional K-vector space. Then M , construed as an LK -structure, has quantifier
elimination and is ω-stable. Let A be a set of linearly independent elements of M .
Then the induced structure on A is trivial: every subset of A^m definable in Aind is
definable in the empty language. Hence by the theorem above, the LK (P )-structure
(M, A) is ω-stable.
For the proof of Proposition 4.10, it now suffices to note that the Lgr -structure Q
is definable in (M, A), for suitable choice of K, M and A: Take K = F2 and let
$M = \bigoplus_{n} \mathbb{F}_2 a_n$ be a countably infinite F2 -vector space with distinguished basis A =
{an : n ≥ 0}. Then Q is definable in (M, A): identifying V (Q) with M in the natural
way, we have, for all vertices s, t of Q: (s, t) ∈ E(Q) iff s − t ∈ A. Since (M, A) is
ω-stable, so is Q.
Now consider the Lgr -formula ϕ(x; y) := E(x, y). Then
$$\mathcal{S}_{\hat\varphi} = \bigl\{ \{s, s'\} : s, s' \in Q,\ (s, s') \in E(Q) \bigr\}$$
is the collection of undirected edges of Q. In the following A denotes a subset of Q
(unlike in the proof of Proposition 4.10). Note that if A ⊆ Qd then
$$A \cap \mathcal{S}_{\hat\varphi} = \binom{A}{1} \cup E[A] \quad\text{where}\quad E[A] := \bigl\{ \{a, a'\} : a, a' \in A,\ (a, a') \in E(Q) \bigr\}.$$
Since Qd has 2^d vertices and $\frac12 d\,2^d$ undirected edges, we thus see that $\pi_{\hat\varphi}(t) \ge t + \frac12 t \log t$
for infinitely many t; in fact:
Proposition 4.12. $\pi_{\hat\varphi}(t) = \frac12 t \log t\,(1 + o(1))$ as t → ∞.
Proof. Set
$$E_d(t) := \max\bigl\{ |E[A]| : A \subseteq Q_d,\ |A| = t \bigr\} \qquad\text{for } d > 0 \text{ and } t \le 2^d.$$
Then $\pi_{\hat\varphi}(t) = t + \max_{d \ge \lceil \log t \rceil} E_d(t)$. It is known (see, e.g., [1]) that there is some
function g with $g(t) = \frac12 t \log t\,(1 + o(1))$ as t → ∞ such that Ed (t) = g(t) for all d and
t ≤ 2^d . This yields the claim.
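For small d the quantity Ed (t) can be computed by exhaustive search. The sketch below does this and compares the result with a closed form that we recall from the literature on edge-isoperimetric inequalities in the cube, namely the sum of the binary digit sums of 0, . . . , t − 1. This closed form is an assumption of the sketch and is not taken from the text above; the function names are ours, and vertices of Qd are encoded as integers below 2^d.

```python
from itertools import combinations

def induced_edges(vertices, d):
    """Number of edges of Q_d with both ends in the given vertex set."""
    vs = set(vertices)
    # every edge {a, a ^ bit} is seen once from each endpoint, so halve the count
    twice = sum(1 for a in vs for i in range(d) if a ^ (1 << i) in vs)
    return twice // 2

def max_induced_edges(d, t):
    """E_d(t) by brute force over all t-element subsets of Q_d (small d only)."""
    return max(induced_edges(A, d) for A in combinations(range(2 ** d), t))

def popcount_sum(t):
    """Sum of the binary digit sums of 0, ..., t-1 (the assumed closed form)."""
    return sum(bin(i).count("1") for i in range(t))

if __name__ == "__main__":
    for t in range(1, 9):
        print(t, max_induced_edges(3, t), popcount_sum(t))
```

At t = 2^d the closed form equals d 2^{d−1}, the number of edges of Qd, which matches the count used before Proposition 4.12.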
4.4.2. An example in R = (R, 0, 1, +, −, ×, <). For this we use another one of the rare
examples (besides the Szemerédi-Trotter Theorem) where tight bounds on the number
of incidences are known:
Theorem 4.13 (Pach and Sharir [72]). Let α be a real number with 0 < α < π. The
maximum number of times that α occurs as an angle among the ordered triples of t
points in the plane is O(t^2 log t). Furthermore, suppose $\tan(\alpha) \in \mathbb{Q}\sqrt{d}$ where d ∈ N
is not a square. Then there exists a constant C = Cα > 0 and, for every t > 3, a
t-element set St ⊆ R^2 with the property that at least Ct^2 log t ordered triples of points
from St determine the angle α.
Let x = (x1 , x2 ), y = (y1 , y2 ), z = (z1 , z2 ) and consider the formula
$$\varphi(x, y, z) := x \neq y \wedge x \neq z \wedge 2\langle y - x, z - x\rangle = \|y - x\|\,\|z - x\|$$
in the language of the ordered field of real numbers R, where ⟨ , ⟩ denotes the usual inner
product on R^2 and || || the associated norm. Then for a, b, c ∈ R^2 we have R |= ϕ(a, b, c)
iff the vectors b−a and c−a are non-zero and the angle ∠(b, a, c) between them is π/3. Let
ϕ̂(v; x, y, z) be the partitioned formula with object variables v = (v1 , v2 ) and parameter
variables (x, y, z) given by
$$\hat\varphi(v; x, y, z) := \varphi(x, y, z) \wedge (v = x \vee v = y \vee v = z),$$
so $\mathcal{S}_{\hat\varphi}$ consists of all $\{a, b, c\} \in \binom{\mathbb{R}^2}{3}$ with R |= ϕ(a, b, c). We now have:
Corollary 4.14. There exist constants C1 , C2 > 0 such that
$$C_1 t^2 \log t < \pi_{\hat\varphi}(t) < C_2 t^2 \log t$$
for every t > 0.
That is, $\pi_{\hat\varphi}(t) = \Theta(t^2 \log t)$ as t → ∞.
Proof. Let A ⊆ R2 be finite. Then
$$A \cap \mathcal{S}_{\hat\varphi} = \binom{A}{\le 2} \cup \Bigl\{ \{a, b, c\} \in \binom{A}{3} : R \models \varphi(a, b, c) \Bigr\}.$$
Since $\tan(\pi/3) = \sqrt{3}$, the second part of Theorem 4.13 applies to α = π/3. The upper
bound in Corollary 4.14 now follows from the first assertion in Theorem 4.13, and the
lower bound from the second assertion in the same theorem.
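The formula ϕ(x, y, z) used in this example can be tested exactly on points of the triangular lattice. The sketch below is only an illustration: the lattice, the coordinate convention and all function names are our assumptions, and we do not claim that this is the point configuration constructed in [72]. A point (i, j) stands for i·e1 + j·e2 with e1 = (1, 0) and e2 = (1/2, √3/2), so that ⟨e1 , e2 ⟩ = 1/2 and all quantities below are integers. The condition 2⟨u, v⟩ = ||u|| ||v|| is tested in the equivalent squared form together with positivity of the inner product.

```python
from itertools import permutations

def twice_dot(u, v):
    """2<u, v> for lattice vectors u = i*e1 + j*e2 and v = k*e1 + l*e2."""
    (i, j), (k, l) = u, v
    return 2 * i * k + 2 * j * l + i * l + j * k

def norm_sq(u):
    """Squared Euclidean norm of the lattice vector u = i*e1 + j*e2."""
    i, j = u
    return i * i + j * j + i * j

def phi(a, b, c):
    """a != b, a != c and 2<b - a, c - a> = |b - a| |c - a|, tested exactly."""
    u = (b[0] - a[0], b[1] - a[1])
    v = (c[0] - a[0], c[1] - a[1])
    if u == (0, 0) or v == (0, 0):
        return False
    d = twice_dot(u, v)
    # the right-hand side is positive, so squaring is legitimate exactly when d > 0
    return d > 0 and d * d == norm_sq(u) * norm_sq(v)

def count_triples(points):
    """Number of ordered triples (a, b, c) of distinct points satisfying phi."""
    return sum(phi(a, b, c) for a, b, c in permutations(points, 3))
```

For the equilateral triangle with vertices (0, 0), (1, 0), (0, 1) in lattice coordinates every vertex is the apex of two ordered triples, so the count is 6, whereas three collinear points contribute nothing.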
5. Theories with the VC d Property
After defining the VC d property we prove that if a theory has this property then the
dual VC density of any finite set of partitioned formulas in the tuple of object variables
x is at most d|x|. In Section 6 we will then show that various theories have the VC d
property. In the following M is a structure in a language L, ∆(x; y) is a finite non-empty
set of partitioned L-formulas, and m = |x|.
5.1. Uniform definability of types over finite sets. Given a ∆(x; B)-type q ∈
S ∆ (B), where B ⊆ M |y| , a family F = (ϕ# )ϕ∈∆ of L(M )-formulas ϕ# (y) is said to
define q if for all ϕ ∈ ∆ and b ∈ B we have
ϕ(x; b) ∈ q ⇐⇒ M |= ϕ# (b).
We also say that F is a definition of q. For a family F = F(y; v) of partitioned
L-formulas ψ(y; v), we denote by F(y; c) the family (ψ(y; c))ψ∈F of L(M )-formulas
obtained by substituting a given tuple c ∈ M |v| for the tuple of variables v. The
following generalizes a definition due to Guingona [34]:
Definition 5.1. We say that ∆ has uniform definability of types over finite sets (abbreviated as UDTFS ) in M if there are finitely many families
Fi = (ϕi (y; y1 , . . . , yd ))ϕ∈∆    (i ∈ I)
of L-formulas (where |yj | = |y| for every j = 1, . . . , d) such that for every finite set B ⊆
M |y| and q ∈ S ∆ (B) there are b1 , . . . , bd ∈ B and some i ∈ I such that Fi (y; b1 , . . . , bd )
defines q. We call the family F = (Fi )i∈I a uniform definition of ∆(x; B)-types over
finite sets in M with d parameters. If ∆ = {ϕ} is a singleton, we also speak of ϕ having
UDTFS.
The following observation shows in particular that every finite set ∆(x; y) of partitioned L-formulas which is directed (see Example 3.12) has UDTFS in T with a single
parameter:
Lemma 5.2. Let ∆(x; y) be a finite set of partitioned L-formulas, and suppose the
set system S∆ = {ϕ(M |x| ; b) : b ∈ M |y| } has breadth d. Then ∆ has UDTFS with d
parameters.
Proof. To see this set F0 (y) := (∃z(z ≠ z))ϕ∈∆ and, for each d-tuple ψ = (ψ1 , . . . , ψd ) ∈
∆^d , define
Fψ (y; y1 , . . . , yd ) := (ϕψ (y; y1 , . . . , yd ))ϕ∈∆
where
ϕψ (y; y1 , . . . , yd ) := ∀x( ψ1 (x; y1 ) ∧ · · · ∧ ψd (x; yd ) → ϕ(x; y) ).
Then F = (Fψ )ψ∈{0}∪∆^d is a uniform definition of ∆(x; B)-types over finite sets. For
suppose q ∈ S ∆ (B) where B ⊆ M |y| is finite. If ϕ(x; b) ∉ q for all ϕ ∈ ∆, b ∈ B, then
F0 (y) defines q. Otherwise, by assumption we can pick ψ1 (x; b1 ), . . . , ψd (x; bd ) ∈ q such
that
$$\bigcap_{\varphi(x;b) \in q} \varphi(M^{|x|}; b) = \psi_1(M^{|x|}; b_1) \cap \cdots \cap \psi_d(M^{|x|}; b_d).$$
Then Fψ (y; b1 , . . . , bd ), where ψ = (ψ1 , . . . , ψd ), defines q.
A uniform definition of ∆(x; B)-types over finite sets in M remains a uniform definition of ∆(x; B)-types over finite sets in any elementarily equivalent structure, and so
it makes sense to speak of uniform definability of types over finite sets in a complete
theory. If we do not care about the number of parameters, we can always do with a
single defining scheme (at least for non-trivial parameter sets):
Lemma 5.3. Let F = (Fi )i∈I be a uniform definition of ∆(x; B)-types over finite sets
with d parameters as above, and let n = |I|. Then there exists a family
F# = (ϕ# (y; y1 , . . . , yd , v, w1 , . . . , wn ))ϕ∈∆
such that for every finite set B ⊆ M |y| with |B| ≥ 2 and every q ∈ S ∆ (B) there are
b1 , . . . , bd , c, c1 , . . . , cn ∈ B such that F# (y; b1 , . . . , bd , c, c1 , . . . , cn ) defines q.
Proof. This is a simple coding trick due to Shelah (proof of Theorem II.2.12 (1) in [88],
cf. also [34, Lemma 2.5]). For every ϕ ∈ ∆ define
$$\varphi_\#(y; y_1, \dots, y_d, v, w_1, \dots, w_n) := \bigwedge_{i=1}^{n} \bigl( v = w_i \rightarrow \varphi_i(y; y_1, \dots, y_d) \bigr)$$
and let F# = (ϕ# )ϕ∈∆ . Let B ⊆ M |y| , |B| ≥ 2, and q ∈ S ∆ (B). By hypothesis,
there are b1 , . . . , bd ∈ B and i ∈ I such that Fi (y; b1 , . . . , bd ) defines q. Pick c, c′ ∈ B
with c ≠ c′ , and put ci := c, cj := c′ for j ≠ i. Then F# (y; b1 , . . . , bd , c, c1 , . . . , cn )
defines q.
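The coding trick in this proof can be mimicked with ordinary functions. In the following sketch, which is only an illustration, a defining scheme is modelled as a Python function of a tuple y and a tuple of parameters. The names combine and selector are ours, and indices start at 0 rather than 1.

```python
def combine(schemes):
    """Merge the schemes phi_1, ..., phi_n into the single scheme phi_#."""
    def phi_sharp(y, params, v, ws):
        # conjunction over i of the implications: v = w_i -> phi_i(y; params)
        return all((v != w) or phi(y, params) for phi, w in zip(schemes, ws))
    return phi_sharp

def selector(i, n, c, c_prime):
    """Extra parameters (c, c_1, ..., c_n) with c_i = c and c_j = c' for j != i."""
    return c, [c if j == i else c_prime for j in range(n)]

# three toy schemes in one parameter over a set B of integers (hypothetical examples)
schemes = [
    lambda y, p: y >= p[0],
    lambda y, p: y < p[0],
    lambda y, p: False,
]
phi_sharp = combine(schemes)
```

With c ≠ c′ exactly one of the premises v = wi holds, so phi_sharp evaluated at the parameters returned by selector(i, ...) agrees with the ith scheme. This is the only place where the hypothesis |B| ≥ 2 is used.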
Similarly, the proof of Lemma 3.18, (1) shows that if the L-formula ψ∆ which we associated there to the finite set of L-formulas ∆ admits a uniform definition of ψ∆ (x; B′ )-types over finite sets with d parameters, then ∆ itself admits a uniform definition of
∆(x; B)-types over finite parameter sets B (with at least 2 elements) having d + 2
parameters.
On the other hand, if we have tight control over the number d of parameters in our
defining schemes, then we can bound the sizes of the ∆(x; B)-type spaces over finite sets
by polynomial functions (in the size of the parameter set) of degree d: more precisely, if
∆ allows a uniform definition F = (Fi )i∈I of ∆(x; B)-types over finite sets in M with
d parameters, then |S ∆ (B)| ≤ |I| |B|^d for every finite B ⊆ M |y| , hence π∗∆ (t) ≤ |I| t^d
for each t, and so vc∗ (∆) ≤ d.
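The counting bound can be seen in the simplest case ϕ(x; y) := x < y in the dense linear order of the rationals. The sketch below is an illustration under our own conventions: a type over a finite set B is recorded as the set of b ∈ B with a < b, and the two schemes in SCHEMES form a hypothetical uniform definition with d = 1 parameter (the first scheme ignores its parameter).

```python
from fractions import Fraction

def phi(a, b):
    return a < b

def phi_type(a, B):
    """The phi(x; B)-type of a, recorded as the set of b in B with phi(a; b)."""
    return frozenset(b for b in B if phi(a, b))

# two defining schemes with one parameter b1
SCHEMES = [
    lambda y, b1: False,     # the type containing no instance phi(x; b)
    lambda y, b1: y >= b1,   # phi(x; y) belongs to the type iff y >= b1
]

def realized_types(B):
    """All phi(x; B)-types realized in the rationals, via one witness per cut of B."""
    pts = sorted(B)
    between = [Fraction(u + v, 2) for u, v in zip(pts, pts[1:])]
    witnesses = [pts[0] - 1] + between + [pts[-1] + 1] + pts
    return {phi_type(a, B) for a in witnesses}

def defined_types(B):
    """All subsets of B cut out by some scheme with some parameter from B."""
    return {frozenset(y for y in B if scheme(y, b1))
            for scheme in SCHEMES for b1 in B}
```

For B = [0, 2, 5, 9] both functions return the same five sets, in agreement with the bound |I| |B|^d = 2 · 4.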
5.2. The VC d property. We say that M has the VC d property if any ∆(x; y) with
|x| = 1 has a uniform definition of ∆(x; B)-types over finite sets with d parameters.
Clearly, if M has the VC d property, then so does every elementarily equivalent L-structure. We say that a theory T has the VC d property if every model of T has the
VC d property.
We point out that the VC 0 property only holds in a very special situation; recall
that a structure is called rigid if it has no automorphisms besides the identity.
Lemma 5.4. The following are equivalent:
(1) M has the VC 0 property;
(2) every model of Th(M ) is rigid;
(3) M is finite and rigid.
Proof. Suppose M has the VC 0 property; to see (2), it suffices to show that M is rigid.
Let ϕ(x; y) be the L-formula x = y, and let F = (ϕi (y))i∈I be a uniform definition of
ϕ(x; B)-types over finite sets. Suppose σ ∈ Aut(M ) and b ∈ M satisfy b ≠ σ(b). Let
p = tpϕ (b/B) where B = {b, σ(b)}, and choose i ∈ I such that ϕi defines p. Then
b = b ⇐⇒ M |= ϕi (b) ⇐⇒ M |= ϕi (σ(b)) ⇐⇒ b = σ(b),
a contradiction. This shows (1) ⇒ (2), and (2) ⇒ (3) is obvious. Suppose now that M
is finite, and let ∆(x; y), where |x| = 1, be a finite set of partitioned L-formulas. It is
easy to see that then for each p ∈ S ∆ (M |y| ) there is a family Fp = {ϕp (y) : ϕ ∈ ∆} of L-formulas such that for all b ∈ M |y| we have M |= ϕp (b) iff there is some automorphism σ
of M such that ϕ(x; b) ∈ σ(p). Hence if in addition M is rigid then F = (Fp )p∈S ∆ (M |y| )
is a uniform definition of ∆(x; B)-types over finite sets in M . This shows (3) ⇒ (1).
The following result and its Corollary 5.6 are useful if we have some kind of quantifier
elimination result at hand:
Lemma 5.5. Suppose ∆ = ∆(x; y) and Ψ = Ψ(x; y) are finite sets of partitioned L-formulas such that every formula in ∆ is equivalent in T to a Boolean combination of
formulas in Ψ, and Ψ has UDTFS in T with d parameters. Then ∆ has UDTFS in T
with d parameters.
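Proof. We may assume that each ϕ ∈ ∆ has the form
$$\varphi = \bigwedge_{r \in R_\varphi} \bigvee_{s \in S_{r,\varphi}} \varepsilon_{r,s,\varphi}\, \psi_{r,s,\varphi}$$
where Rϕ , Sr,ϕ are finite index sets, ε_{r,s,ϕ} is ¬ or no condition, and ψr,s,ϕ ∈ Ψ. Suppose
G = (Gi )i∈I is a uniform definition of Ψ(x; B)-types over finite sets in T , where
Gi = (ψi (y; y1 , . . . , yd ))ψ∈Ψ    for each i ∈ I.
Let F = (Fi )i∈I where
$$\mathcal{F}_i = \bigl(\varphi_i(y; y_1, \dots, y_d)\bigr)_{\varphi \in \Delta} \quad\text{where}\quad \varphi_i = \bigwedge_{r \in R_\varphi} \bigvee_{s \in S_{r,\varphi}} \varepsilon_{r,s,\varphi}\, (\psi_{r,s,\varphi})_i.$$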
Let q ∈ S ∆ (B) where B ⊆ M |y| is finite. Let a ∈ M |x| realize q, and put p := tpΨ (a/B).
Take i ∈ I and b1 , . . . , bd ∈ B such that Gi (y; b1 , . . . , bd ) defines p. One now easily verifies
that Fi (y; b1 , . . . , bd ) defines q.
Corollary 5.6. Suppose Φ is a family of partitioned L-formulas in the single object
variable x such that
(1) every partitioned L-formula in the object variable x is equivalent in T to a
Boolean combination of formulas from Φ, and
(2) every finite set of L-formulas from Φ has UDTFS in T with d parameters.
Then T has the VC d property.
The following theorem is at the root of the proof of Theorem 1.1 from the introduction; it shows that having UDTFS with a constant number of parameters for all sets of
formulas in a single object variable entails UDTFS with a linearly bounded number of
parameters for sets of formulas in an arbitrary number of object variables:
Theorem 5.7. Suppose that M has the VC d property. Then every ∆(x; y) has a
uniform definition of ∆(x; B)-types over finite sets in M with d|x| parameters.
Before we embark on the proof, we introduce some notation: for a sequence a ∈ M m
and a set B ⊆ M n we write aB := {(a, b) : b ∈ B} ⊆ M m+n and Ba := {(b, a) : b ∈
B} ⊆ M n+m .
Proof. We proceed by induction on m = |x|. The base case m = 1 holds by hypothesis.
For the inductive step write x = (x0 , x′ ) where x′ = (x1 , . . . , xm ), and let ∆(x; y) be
given. Let
∆0 (x0 ; x′ , y) = {ϕ(x0 ; x′ , y) : ϕ(x; y) ∈ ∆}.
By the VC d property applied to ∆0 , we take finitely many families
Fi = (ϕi (x′ , y; y1 , . . . , yd ))ϕ∈∆    (i ∈ I)
of L-formulas with the following property: for any a′ ∈ M^m , any finite set B ⊆ M |y|
and any q ∈ S ∆0 (a′ B), there are b1 , . . . , bd ∈ B and i ∈ I such that Fi (a′ , y; b1 , . . . , bd )
defines q, i.e., for all ϕ ∈ ∆, b ∈ B:
ϕ(x0 ; a′ , b) ∈ q ⇐⇒ M |= ϕi (a′ , b; b1 , . . . , bd ).    (5.1)
In the rest of this proof let ϕ range over ∆ and i over I. For each i, let
∆i (x′ ; y, y1 , . . . , yd ) = {ϕi (x′ ; y, y1 , . . . , yd ) : ϕ(x; y) ∈ ∆}
and apply the inductive hypothesis to each ∆i . Thus for each i there are finite families
Fij := (ϕij (y, y1 , . . . , yd ; v1 , . . . , vn ))ϕ∈∆    (j ∈ Ji )
of L-formulas, where n = md, such that for all finite subsets B ⊆ M |y| , all b̄ =
(b1 , . . . , bd ) ∈ (M |y| )^d , and every p ∈ S ∆i (B b̄), there exist some j ∈ Ji and c1 , . . . , cn ∈
B such that for each ϕ and each b ∈ B we have
ϕi (x′ ; b, b̄) ∈ p ⇐⇒ M |= ϕij (b, b̄; c1 , . . . , cn ).    (5.2)
Partition the variable tuple of the L-formulas ϕij as (y; y1 , . . . , yd , v1 , . . . , vn ), and set
F := (Fij )i∈I,j∈Ji . We claim that F is a uniform definition of ∆(x; B)-types over finite
sets in M ; since F has d + n = d(m + 1) parameters, this will then finish the inductive
step. To see this, let a finite B ⊆ M |y| and some a = (a0 , a′ ) ∈ M^{1+m} be given. We
let b range over B. We need to show that there are some i ∈ I, j ∈ Ji and b̄ ∈ B^d ,
c1 , . . . , cn ∈ B such that for all ϕ and b we have
M |= ϕ(a; b) ⇐⇒ M |= ϕij (b; b̄, c1 , . . . , cn ).    (5.3)
Let q = tp∆0 (a0 /a′ B) be the type in S ∆0 (a′ B) realized by a0 . Take i and b̄ =
(b1 , . . . , bd ) ∈ B^d such that (5.1) holds for all ϕ and all b. Set p = tp∆i (a′ /B b̄). Then
we may take j ∈ Ji and c1 , . . . , cn ∈ B such that for each ϕ and each b, the equivalence
(5.2) holds. This yields (5.3), for all ϕ and b.
In the next corollary we assume that M is infinite (so we can meaningfully talk about
VC density). We already remarked that if ∆ admits a uniform definition of ∆(x; B)-types with d parameters, then vc∗ (∆) ≤ d; in particular, if M has the VC d property
then this conclusion holds for all ∆(x; y) with |x| = 1. The previous theorem generalizes
this observation to the upper bound vc∗ (∆) ≤ d|x| for all ∆. Hence:
Corollary 5.8. If T = Th(M ) has the VC d property then m ≤ vcT (m) ≤ d · m for
every m. In particular, if T has the VC 1 property then vcT (m) = m for every m.
In a multi-sorted setting, the VC d property can be naturally localized. Suppose M
is a multi-sorted structure, and let S be one of the sorts of M . We say that M has the
VC d property in the sort S if any ∆(x; y) with x a single variable of sort S has a uniform
definition of ∆(x; B)-types over finite sets with d parameters. With this definition, the
following analogue of Theorem 5.7 holds (with the same proof):
Corollary 5.9. If M has the VC d property in the sort S, then every ∆(x; y) with each
variable xi of sort S has a uniform definition of ∆(x; B)-types over finite sets with d|x|
parameters.
So for example, if M is an infinite single-sorted structure and some expansion of
M eq has the VC d property in the home sort (where the uniform defining formulae are
in the expanded language), then vcT (m) ≤ dm for each m.
5.3. Relationship to other notions. We now want to put the VC d property into
perspective and compare it to two other strengthenings of the NIP concept, namely,
uniform definability of types over finite sets and dp-minimality. The following notion
was introduced and studied in [34]:
Definition 5.10. The structure M is said to have uniform definability of types over
finite sets (UDTFS ) if every partitioned L-formula has UDTFS in M . Clearly UDTFS
is an invariant of the elementary theory of M , and so we say that an L-theory T has
uniform definability of types over finite sets (UDTFS ) if every model of T does.
By Theorem 5.7, if M has the VC d property, for some d, then M has UDTFS.
More generally, [34, Lemma 2.6] shows that if every partitioned L-formula ϕ(x; y) with
|x| = 1 has UDTFS in M , then M has UDTFS. The proof of this lemma as given
in [34] in fact shows that if every partitioned L-formula with a single object variable
admits a uniform definition of ϕ(x; B)-types with d parameters, then every partitioned
L-formula in the object variables x admits a uniform definition of ϕ(x; B)-types with at
most (d + 1)^{|x|} − 1 parameters. (In contrast, our bound in Theorem 5.7 is linear in |x|.)
Every stable formula has UDTFS; in particular, every stable theory has UDTFS. In
fact, Laskowski [56] has shown that if T is stable then every partitioned L-formula ϕ(x; y)
has UDTFS in T with Rm (x = x, ϕ, 2) parameters. (See [88, Definition II.1.1] for the
definition of the rank Rm (−, −, 2). Stability of T is equivalent to Rm (x = x, ϕ, 2) < ω
for all ϕ(x; y), cf. [88, Theorem II.2.2].)
An ICT pattern in M consists of a pair α(x; y), β(x; y) of partitioned L-formulas,
where |x| = 1, and sequences (ai )i∈N , (bj )j∈N in M |y| such that for all i and j the set of
L(M )-formulas
{α(x; ai ), β(x; bj )} ∪ {¬α(x; ak ) : k ≠ i} ∪ {¬β(x; bl ) : l ≠ j}
is consistent (with M ). This notion and the following definition originate in [90]:
Definition 5.11. An L-theory T is said to be dp-minimal if in no model of T there is
an ICT pattern, and M is dp-minimal if Th(M ) is dp-minimal (equivalently, if there
is no ICT pattern in an elementary extension of M ).
The following proposition (which shows that in the definition of dp-minimality we
could have restricted ourselves to ICT patterns given by identical formulas α, β) and
its Corollary 5.13 are due to Dolich, Goodrick and Lippel [24]; for convenience of the
reader we indicate their proofs:
Proposition 5.12. Suppose M is a monster model of the complete L-theory T . Then
T is dp-minimal iff there do not exist an L-formula ϕ(x; y) with |x| = 1 and sequences (ci )i∈N ,
(dj )j∈N in M |y| such that for all i, j,
{ϕ(x; ci ), ϕ(x; dj )} ∪ {¬ϕ(x; ck ) : k ≠ i} ∪ {¬ϕ(x; dl ) : l ≠ j}
is consistent.
Proof. Suppose α(x; y), β(x; y) (where |x| = 1) and the sequences (ai )i∈N , (bj )j∈N are an
ICT pattern in M . Let ϕ(x; y, z) := α(x; y)∨β(x; z), and for every i, j let ci := (a2i , b2i )
and dj := (a2j+1 , b2j+1 ). Let a ∈ M realize the type
{α(x; a2i ), β(x; b2j+1 )} ∪ {¬α(x; ak ) : k ≠ 2i} ∪ {¬β(x; bl ) : l ≠ 2j + 1}.
Then a satisfies ϕ(x; ci ) and ϕ(x; dj ). If k ≠ i then a satisfies ¬α(x; a2k ) (since 2k ≠ 2i)
and ¬β(x; b2k ) (since 2k ≠ 2j + 1) and hence also ¬ϕ(x; ck ). Similarly we see that a
satisfies ¬ϕ(x; dl ) for l ≠ j.
Let us tentatively say that a complete L-theory T is vc-minimal if vc∗ (ϕ) < 2 for
every L-formula ϕ(x; y) with |x| = 1. So if vcT (1) < 2 (in particular, if T is VC-minimal), then T is vc-minimal. An example of a theory T which is not VC-minimal
yet satisfies vcT (1) = 1 (and thus is vc-minimal) was given in [24, Proposition 3.7].
Corollary 5.13. Every vc-minimal theory is dp-minimal.
Proof. Suppose M is a monster model of T = Th(M ), and T is not dp-minimal.
Take ϕ and (ci ), (dj ) with the properties in the previous proposition. For every n let
Bn := {ci : i < n} ∪ {dj : j < n}. Then |S ϕ (Bn )| ≥ n^2 ≥ ¼ |Bn |^2 , hence vc∗ (ϕ) ≥ 2.
In particular, each of the theories in Examples 3.8–3.10 (being VC-minimal) is dp-minimal. Other proofs of the dp-minimality of weakly o-minimal theories can be found
in [2, 24]. See also [48] for a generalization of Corollary 5.13 to a bound on “dp-rank”
in terms of VC density.
A characterization of dp-minimal theories among stable theories was given in [71].
The main result of [34] is that every dp-minimal theory T has UDTFS. In particular,
by Corollary 5.13, every vc-minimal T has UDTFS. (Actually, [34, Theorem 3.14] gives
a more precise result: if ϕ(x; y) is an L-formula such that $\pi_\varphi(t) \le \binom{t+1}{2}$ for some t > 0,
then ϕ has UDTFS.)
We summarize the implications between the properties of a theory T discussed above
in the following diagram:
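[Diagram of implications: VC 1 ⇒ VC d for some d > 0 ⇒ UDTFS; VC 1 ⇒ vcT (1) = 1; VC-minimal ⇒ vcT (1) = 1; vcT (1) = 1 ⇒ vc-minimal ⇒ dp-minimal ⇒ UDTFS ⇒ NIP. Three of the arrows are marked with an exclamation mark.]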
Here the arrows marked with an exclamation mark are known not to be reversible.
(For an example showing that vc(1) = 1 ⇏ VC 1 see [6, Example 3.15].) We do not
know which of the other arrows are reversible; whether the converse of the implication
UDTFS ⇒ NIP holds was first asked by Laskowski [34, Open Question 4.1].
Recall from Corollary 3.20 that the Shelah expansion M Sh of M has the same
VC density function as M . In [71] it is observed that the Shelah expansion of a dp-minimal structure is again dp-minimal. We finish this section by showing that the
analogous statement also holds for the VC d property; in fact, we have more generally:
Proposition 5.14. Suppose every finite set of partitioned L-formulas in m object variables has UDTFS in T with d parameters. Then every finite set of partitioned LSh -formulas in m object variables has UDTFS in T Sh with d parameters.
Proof. As in the definition of the Shelah expansion (cf. Section 3.5) let M ∗ be a very
saturated elementary extension of M . Let ∆ = ∆(x; y) be a finite set of partitioned LSh -formulas where m = |x|; below ϕ ranges over ∆. We need to show that ∆ has UDTFS
in T Sh with d parameters. For this, by Lemma 5.5 and since T Sh admits quantifier
elimination, we may assume that each of the LSh -formulas in ∆ is atomic; that is, there
exist L-formulas ψϕ (x, y; z), one for each ϕ, and a tuple c ∈ (M ∗ )|z| such that each ϕ
has the form ϕ(x; y) = Rψϕ ,c (x; y). Let now Ψ(x; y, z) := {ψϕ (x; y, z) : ϕ ∈ ∆} and take
a uniform definition G = (Gi )i∈I of Ψ(x; B ∗ )-types over finite sets in T Sh , where
Gi = ((ψϕ )i (y, z; (y1 , z1 ), . . . , (yd , zd )))ϕ∈∆    for each i ∈ I.
For each i ∈ I set
Fi := (Rϕi ,c (y; y1 , . . . , yd ))ϕ∈∆
where ϕi (y, y1 , . . . , yd ; z) := (ψϕ )i (y, z, (y1 , z), . . . , (yd , z)).
We claim that F = (Fi )i∈I is a uniform definition of ∆(x; B)-types over finite sets in
T Sh . To see this let p ∈ S ∆ (B) where B ⊆ M |y| is finite, and let a ∈ M m be a realization
of p in M Sh . Put B ∗ := B × {c} ⊆ M |y| × (M ∗ )|z| and let p∗ := tpΨ (a/B ∗ ) (in M ∗ ).
Take b1 , . . . , bd ∈ B and i ∈ I such that Gi (y, z; (b1 , c), . . . , (bd , c)) defines p∗ ; then for
every ϕ and b ∈ B we have
M Sh |= ϕ(a; b) ⇐⇒ M ∗ |= ψϕ (a; b, c)
⇐⇒ M ∗ |= (ψϕ )i (b, c; (b1 , c), . . . , (bd , c))
⇐⇒ M Sh |= Rϕi ,c (b; b1 , . . . , bd ).
That is, Fi (y; b1 , . . . , bd ) defines p (in M Sh ).
6. Examples of VC d: Weakly O-minimal Theories and Variants
In this section we apply Theorem 5.7 from the preceding section to give a proof of
Theorem 1.1 on VC density in weakly o-minimal theories from the introduction. We
also observe that a similar technique allows us to treat all (weakly) quasi-o-minimal
theories.
Throughout this section L is a language containing a binary relation symbol “<” and
T is a theory extending the theory of infinite linear orderings.
6.1. Weakly o-minimal theories. We begin by introducing some terminology concerning ordered sets. Let (X, <) be a linearly ordered set, and let S be a subset of
X which is a union of finitely many non-empty convex subsets of X. We refer to the
convex sets in the unique minimal such presentation of S as its (convex ) components.
Suppose S has N convex components, where N > 0. These components are ordered
by <, so for i = 1, . . . , N we can refer to the ith component of S; for i > N we declare
the ith component of S to be equal to the N th.
Recall that T is called weakly o-minimal if for any M |= T , any definable subset of
M is a finite union of convex subsets of M .
Theorem 6.1. Assume that T is weakly o-minimal. Then T has the VC 1 property,
and hence any finite set ∆(x; y) of L-formulas has dual VC density at most |x|.
Proof. Let M |= T. Fix a finite non-empty set of L-formulas ∆(x; y) with |x| = 1. We
let ϕ range over ∆ and b over $M^{|y|}$. By the weak o-minimality of T, there is an integer
N > 0 such that for any ϕ and any b, ϕ(M; b) has at most N components. For any ϕ
and i ∈ [N] there is an L-formula $\varphi^i(x; y)$ such that for every b with ϕ(M; b) ≠ ∅, the
ith component of ϕ(M; b) equals $\varphi^i(M; b)$, and such that for every b with ϕ(M; b) = ∅
we have $\varphi^i(M; b) = \emptyset$. So
\[ \varphi(M;b) = \varphi^1(M;b) \cup \dots \cup \varphi^N(M;b) \quad\text{for every } b. \tag{6.1} \]
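Set
\[ \varphi^i_{\le}(x;y) := \exists x'\,\bigl(\varphi^i(x';y) \wedge x \le x'\bigr), \qquad \varphi^i_{<}(x;y) := \forall x'\,\bigl(\varphi^i(x';y) \to x < x'\bigr). \]
Then clearly
\[ \varphi^i(M;b) = \varphi^i_{\le}(M;b) \cap \bigl(M \setminus \varphi^i_{<}(M;b)\bigr) \quad\text{for all } \varphi,\ b \text{ and } i \in [N]. \tag{6.2} \]
Now set
\[ \Psi(x;y) := \bigl\{ \varphi^i_{\square}(x;y) : \varphi \in \Delta,\ i \in [N],\ \square \in \{\le, <\} \bigr\}. \]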
For each ψ ∈ Ψ and each b, the set ψ(M ; b) is an initial segment of M ; hence SΨ
is directed, so Ψ has UDTFS with a single parameter, by Lemma 5.2. Moreover, by
(6.1) and (6.2), every ϕ is equivalent to a Boolean combination of 2N formulas from Ψ.
Hence ∆ also has UDTFS with a single parameter, by Lemma 5.5. Thus M has the
VC 1 property. By Corollary 5.8 therefore vc∗ (∆) ≤ |x| for every finite set ∆(x; y) of
L-formulas.
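The decomposition (6.2) can be checked mechanically on a finite linear order. The following Python sketch is illustrative only: it assumes M = {0, ..., 11} with the usual order (the theorem itself concerns infinite models), so that convex subsets are runs of consecutive integers, and the helper names `convex_components`, `ith_component`, `down_closure`, `strictly_below` and `check_decomposition` are hypothetical, not notation from the paper.

```python
def convex_components(S):
    """Maximal runs of consecutive integers in S, listed in increasing order."""
    comps = []
    for x in sorted(S):
        if comps and x == max(comps[-1]) + 1:
            comps[-1].add(x)
        else:
            comps.append({x})
    return comps

def ith_component(S, i):
    """The i-th component (1-indexed); beyond the last one, repeat the last."""
    comps = convex_components(S)
    if not comps:
        return set()
    return comps[min(i, len(comps)) - 1]

def down_closure(M, C):
    """Analogue of phi^i_<=: all x in M with x <= x' for some x' in C."""
    return {x for x in M if any(x <= c for c in C)}

def strictly_below(M, C):
    """Analogue of phi^i_<: all x in M with x < x' for every x' in C."""
    return {x for x in M if all(x < c for c in C)}

def is_initial_segment(M, X):
    return all(y in X for x in X for y in M if y < x)

def check_decomposition(M, S, N):
    """Check (6.2) and that both auxiliary sets are initial segments of M."""
    for i in range(1, N + 1):
        C = ith_component(S, i)
        lo, below = down_closure(M, C), strictly_below(M, C)
        if not (is_initial_segment(M, lo) and is_initial_segment(M, below)):
            return False
        if C != lo & (M - below):
            return False
    return True
```

For instance, `check_decomposition(set(range(12)), {1, 2, 3, 6, 9, 10}, 4)` runs through the three components and the repeated last one; the empty set is handled because the universally quantified condition in `strictly_below` is then vacuous.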
Remark. In the previous theorem we assume that the theory T is weakly o-minimal
(i.e., all models of T are weakly o-minimal). Recall that a weakly o-minimal structure
need not have weakly o-minimal theory. We do not know whether the conclusion of
Theorem 6.1 holds if T is merely assumed to have some weakly o-minimal model.
Put $L_{\mathrm{div},<} := L_{\mathrm{div}} \cup \{<\}$, where $L_{\mathrm{div}} = \{0, 1, +, -, \times, |\}$ is the language of rings
expanded by a divisibility predicate (see Example 3.10 above) and “<” is a binary
relation symbol. Let RCVF denote the theory of real closed fields equipped with a
proper convex valuation ring, formulated in the language $L_{\mathrm{div},<}$. The following corollary is
now immediate, as by [23], RCVF is weakly o-minimal.
Corollary 6.2. Let K |= RCVF. Then any finite set ∆(x; y) of $L_{\mathrm{div},<}$-formulas has
dual VC density at most |x| in K.
This result in turn yields a VC density bound for algebraically closed valued fields of
residue characteristic 0 (which is non-optimal by Example 3.10):
Corollary 6.3. Let $\mathrm{ACVF}_{(0,0)}$ be the theory of non-trivially valued algebraically closed
fields of residue characteristic 0, in the language $L_{\mathrm{div}}$. Let ∆(x; y) be a finite set of
$L_{\mathrm{div}}$-formulas. Then ∆ has dual VC density at most 2|x| in $\mathrm{ACVF}_{(0,0)}$.
Proof. The theory $\mathrm{ACVF}_{(0,0)}$, which is complete, is interpretable in RCVF: if K is a
model of RCVF, then its algebraic closure $K^{\mathrm{alg}}$ is a degree 2 extension of K: $K^{\mathrm{alg}} = K(i)$,
where $i^2 = -1$. So $K^{\mathrm{alg}}$ can be identified with $K^2$, and the valuation v of K can
be definably extended to one of $K^{\mathrm{alg}}$ by setting $v(a + bi) = \tfrac{1}{2} v(a^2 + b^2)$ for a, b ∈ K.
Thus, by Lemma 3.16 and Corollary 6.2, ∆(x; y) has dual VC density at most 2|x|.

We have no results on VC density for ACVF in characteristics other than (0, 0).
6.2. Quasi-o-minimal theories. We now turn to quasi-o-minimal theories: T is said
to be quasi-o-minimal if for any M |= T , any definable subset of M is a finite Boolean
combination of singletons, intervals in M , and ∅-definable sets. (See [14].)
Theorem 6.4. Assume that T is quasi-o-minimal. Then T has the VC 1 property, and
hence vcT (n) = n for each n.
Proof. Let M |= T. Fix a finite set ∆(x; y) of L-formulas with |x| = 1; we let ϕ
range over ∆ and b over $M^{|y|}$. There is some positive integer N and ∅-definable subsets
$D_1, \dots, D_N$ of M so that for any ϕ and any choice of parameters b, the set ϕ(M; b) of
realizations of ϕ(x; b) is a Boolean combination of the $D_i$ and at most N singletons and
intervals in M [14, Theorem 3].
Let $\mathcal{D}$ be the collection of sets of the form $\tilde D_1 \cap \dots \cap \tilde D_N$, where $\tilde D_i$ is either $D_i$ or
its complement in M (so $\mathcal{D}$ is a partition of M into at most $2^N$ sets). We let D range
over $\mathcal{D}$. For every D, ϕ, and b, the set D ∩ ϕ(M; b) is then a finite union of at most N
convex subsets of the ordered set D. For every i ∈ [N] and every D let $\varphi^{i,D}(x; y)$ be
an L-formula such that for every b, if the set D ∩ ϕ(M; b) is non-empty, then the ith
convex component of D ∩ ϕ(M; b) (viewed as a subset of the ordered set D) is given by
$\varphi^{i,D}(M; b)$, and if D ∩ ϕ(M; b) = ∅ then $\varphi^{i,D}(M; b) = \emptyset$. Hence for each ϕ and b we
have
\[ \varphi(M;b) = \bigcup_{D \in \mathcal{D},\ i \in [N]} \varphi^{i,D}(M;b). \]
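Now let (slightly abusing syntax)
\[ \varphi^{i,D}_{\le}(x;y) := x \in D \wedge \exists x'\,\bigl(x' \in D \wedge \varphi^{i,D}(x';y) \wedge x \le x'\bigr), \]
\[ \varphi^{i,D}_{<}(x;y) := x \in D \wedge \forall x'\,\bigl(x' \in D \wedge \varphi^{i,D}(x';y) \to x < x'\bigr). \]
Then
\[ \varphi^{i,D}(M;b) = \varphi^{i,D}_{\le}(M;b) \cap \bigl(M \setminus \varphi^{i,D}_{<}(M;b)\bigr) \quad\text{for all } \varphi,\ b,\ D \text{ and } i \in [N]. \]
Each set $\varphi^{i,D}_{\square}(M; b)$, where □ ∈ {≤, <}, is an initial segment of D, and any two distinct
elements of $\mathcal{D}$ are disjoint. Thus the set system $S_\Psi$, where
\[ \Psi(x;y) = \bigl\{ \varphi^{i,D}_{\square}(x;y) : \varphi \in \Delta,\ i \in [N],\ \square \in \{\le, <\},\ D \in \mathcal{D} \bigr\}, \]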
is directed. As in the proof of Theorem 6.1 it now follows that ∆ has UDTFS with a
single parameter.
Corollary 6.5. The following structures all have the VC 1 property:
(1) $(\mathbb{R}, <, \mathbb{Q})$ (i.e., the ordered set of reals with a predicate for the rationals);
(2) $(\mathbb{Z}^n, <, +)$ where < is the lexicographic ordering on $\mathbb{Z}^n$;
(3) $(\mathbb{Z}^n \times \mathbb{Q}, <, +)$ where < is the lexicographic ordering on $\mathbb{Z}^n \times \mathbb{Q}$.
Proof. Each of the examples has quasi-o-minimal theory: For (1) this was noted in [14,
Section 1], and for (2) and (3) this is proved (based on a quantifier-elimination result
from [99]) in [15, Theorem 15]. The corollary now follows from Theorem 6.4.
Remark. The ordered abelian groups in (2) and (3) of the previous corollary are typical
for quasi-o-minimal groups. Here and below, “quasi-o-minimal group” means “quasi-o-minimal expansion of an ordered group.” (A quasi-o-minimal group is necessarily
abelian [14, Theorem 11].) An expansion G of an ordered group is called coset-minimal
if every subset of G definable in G is a finite union of cosets of definable subgroups,
intersected with intervals. A theory expanding the theory of ordered groups is said to
be coset-minimal if all its models are. (See [15].) Now by [79, Theorem 5.3], the theory
of G is coset-minimal iff the theory of the expansion of G by constant symbols for the
elements of G is quasi-o-minimal, and in this case G is an expansion of an ordered group
elementarily equivalent to either $(\mathbb{Z}^n, <, +)$ or $(\mathbb{Z}^n \times \mathbb{Q}, <, +)$, for some n.
Part (2) of the previous corollary shows in particular that Presburger Arithmetic,
i.e., the theory of the ordered group (Z, <, +) of integers, has the VC 1 property, and
hence is dp-minimal. By Theorem 10 in [14], (Z, <, +) has no proper quasi-o-minimal
expansions. One can strengthen this statement:
Proposition 6.6. No proper expansion of (Z, <, +) is dp-minimal.
The proof is the same as in [14], replacing the use of [14, Theorem 7] by a result from
[92]; we state the latter employing some convenient terminology from [14]: Let (X, <)
be a linearly ordered set. We say that two subsets S, T of X are eventually equal (in
symbols: S ≈ T ) if there is some a ∈ X such that S ∩ (a, +∞) = T ∩ (a, +∞). Clearly
≈ is an equivalence relation on subsets of X. We say that a family of subsets of X is
eventually finite if it is partitioned into finitely many classes by ≈.
Lemma 6.7 (Simon [92, Lemma 2.9]). Suppose T is dp-minimal. Let M |= T and let
ϕ(x; y) be a partitioned L(M )-formula where |x| = 1. Then Sϕ is eventually finite.
For the benefit of the reader we now indicate the details of the proof of Proposition 6.6.
Let Z be a proper expansion of (Z, <, +). By a theorem of Michaux and Villemaire [69]
(and an easy extra argument, given in the proof of [14, Theorem 10]), there is a subset
U of Z which is definable in Z but not definable in (Z, <, +). By Simon’s lemma, the
family {a + U : a ∈ Z} is eventually finite; thus the subgroup A of Z consisting of all
a ∈ Z such that a + U ≈ U is non-zero, so A = aZ for some positive integer a. Let V
be the union of all cosets of A which contain arbitrarily large elements of U ; then V is
definable in (Z, <, +), hence it suffices to show that U ≈ V . As a + U ≈ U , we can take
α ∈ Z such that for every u ∈ Z with u ≥ α we have a + u ∈ U ⇐⇒ u ∈ U . One now
proves easily that for every u ∈ Z with u ≥ α we have u ∈ U ⇐⇒ u ∈ V .
So for example, the expansion of (Z, <, +) by the set $b^{\mathbb{N}} = \{b^n : n \ge 0\}$ of powers of
a natural number b > 1 is not dp-minimal, nor is the expansion of (Z, <, +) by the set of
factorials or by the set of Fibonacci numbers. In all these examples, the corresponding
expansion of (Z, <, +) has quantifier elimination in a natural expansion of {<, +, U}
(see [19, 78]) and is NIP (as will be shown elsewhere).
6.3. Weakly quasi-o-minimal theories. In [53], T is called weakly quasi-o-minimal
if for any M |= T , any definable subset of M is a finite Boolean combination of convex
subsets of M and ∅-definable sets. Every weakly quasi-o-minimal theory is NIP [53,
Theorem 2.3]. In fact, the proof of Theorem 6.4 (mutatis mutandis) also shows more
generally:
Theorem 6.8. All weakly quasi-o-minimal theories have the VC 1 property.
This observation can be used to strengthen [92, Proposition 4.2], where it is shown
that the complete theories of colored linearly ordered sets with monotone relations
(shown to be NIP in [85]) are dp-minimal. A binary relation R on a set X is said to be
monotone with respect to a linear ordering < of X if
\[ x' \le x \mathrel{R} y \le y' \ \Longrightarrow\ x' \mathrel{R} y' \quad\text{for all } x, x', y, y' \in X. \]
A colored linearly ordered set with monotone relations is a structure of the form
$M = (M, <, \{C_i\}_{i \in I}, \{R_j\}_{j \in J})$ where < is a linear ordering on M, the $C_i$ are unary predicates
(“colors”), and the $R_j$ are binary relations which are monotone (with respect to <). It
was shown by Simon [92, Proposition 4.1] that every such colored linearly ordered set
with monotone relations has quantifier elimination provided that each ∅-definable subset
of M is given by one of the predicates $C_i$ and each monotone ∅-definable binary relation
is given by one of the $R_j$.
Proposition 6.9. Let M be a colored linearly ordered set with monotone relations as
above. Then T = Th(M ) is weakly quasi-o-minimal, and hence has the VC 1 property.
Proof. By the result of Simon just quoted, we may assume that M admits quantifier
elimination. Now for each b ∈ M and j ∈ J the set
{x ∈ M : M |= xRj b}
is an initial segment of M , and
{x ∈ M : M |= bRj x}
is a final segment of M (i.e., its complement is an initial segment of M ). Hence any
definable subset of M is a finite Boolean combination of initial segments of M and
∅-definable sets, so T is weakly quasi-o-minimal.
As in [85, 92] this leads to a result for (partially) ordered sets of finite width. Let
P = (P, <) be an ordered set, i.e., a set P equipped with an irreflexive, asymmetric and
transitive binary relation < on P. A subset A of P is an antichain if for all a ≠ a′ in
A, neither a < a′ nor a′ < a holds, and C ⊆ P is a chain if for all c ≠ c′ in C, either
c < c′ or c′ < c. The width of P is defined to be the supremum of the cardinalities of
antichains in P, and denoted by width(P). (Dually, the height of P is defined to be
the supremum of the cardinalities of chains in P, denoted by height(P).) A colored
ordered set is a structure $P = (P, <, (C_i)_{i \in I})$ where (P, <) is an ordered set and each
$C_i$ is a unary predicate.
Corollary 6.10. Let P = (P, <, (Ci )i∈I ) be an infinite colored ordered set of finite
width. Then vcTh(P ) (m) = m for every m.
Proof. Let n = width(P) and let i, j range over [n]. By Dilworth’s Theorem there is a
partition $P = P_1 \cup \dots \cup P_n$ of P into disjoint chains $P_i$. Define a linear ordering ≺ on P
by setting a ⪯ b iff either a, b ∈ $P_i$ for some i and a ≤ b, or a ∈ $P_i$, b ∈ $P_j$ with i < j.
For all i, j the binary relation
\[ R_{ij} := \bigl\{ (a,b) \in P \times P : \exists a' \in P_i,\ b' \in P_j : a \preceq a' \le b' \preceq b \bigr\} \]
is monotone with respect to ≺, and the original ordering < is ∅-definable in the linearly
ordered set with monotone relations $(P, \prec, (R_{ij})_{i,j})$, noting that a ≤ b iff $a \mathrel{R_{ii}} a \mathrel{R_{ij}} b \mathrel{R_{jj}} b$
for some i and j. The claim now follows immediately from Proposition 6.9.
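The construction in this proof can be tried out on a small finite example. The sketch below is only an illustration under stated assumptions: the poset is the set of divisors of 12 under divisibility (width 2), the chain partition is chosen by hand rather than obtained from Dilworth's Theorem, and all function names are hypothetical. It builds the linear order ⪯ and the relations $R_{ij}$, and lets one check that each $R_{ij}$ is monotone and that ≤ is recovered from them as stated in the proof.

```python
from itertools import product

P = [1, 2, 3, 4, 6, 12]                 # divisors of 12, ordered by divisibility
chains = [[1, 2, 4, 12], [3, 6]]        # hand-picked partition into two chains
chain_of = {a: i for i, C in enumerate(chains) for a in C}

def leq(a, b):
    """The original partial order: a divides b."""
    return b % a == 0

def prec_eq(a, b):
    """The linear order: compare inside a chain, otherwise by chain index."""
    i, j = chain_of[a], chain_of[b]
    return leq(a, b) if i == j else i < j

def relation(i, j):
    """R_ij: pairs (a, b) with a prec_eq a' <= b' prec_eq b, a' in P_i, b' in P_j."""
    return {(a, b) for a, b in product(P, repeat=2)
            if any(prec_eq(a, a1) and leq(a1, b1) and prec_eq(b1, b)
                   for a1 in chains[i] for b1 in chains[j])}

R = {(i, j): relation(i, j)
     for i in range(len(chains)) for j in range(len(chains))}

def is_monotone(rel):
    """x' prec_eq x, (x, y) in rel, y prec_eq y' must imply (x', y') in rel."""
    return all((x1, y1) in rel
               for (x, y) in rel for x1 in P for y1 in P
               if prec_eq(x1, x) and prec_eq(y, y1))

def recovered_leq(a, b):
    """a <= b iff a R_ii a, a R_ij b and b R_jj b for some i, j."""
    return any((a, a) in R[i, i] and (a, b) in R[i, j] and (b, b) in R[j, j]
               for (i, j) in R)
```

Here `(a, a) in R[i, i]` holds exactly when a lies in the chain $P_i$, which is why the three conditions together pin down a ≤ b.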
Question. Is every ordered set of finite width VC 1?
By an interpretability argument, the previous corollary also leads to a (perhaps non-optimal) bound on the VC density for those distributive lattices with NIP theory. By [85,
Theorem 6] these are exactly the distributive lattices of finite breadth. From Section 2.4
recall that a semilattice (L, ∧) has breadth at most d if for all $b_1, \dots, b_{d+1} \in L$ there is
some i ∈ [d + 1] such that $b_1 \wedge \dots \wedge b_{d+1} = b_1 \wedge \dots \wedge \widehat{b_i} \wedge \dots \wedge b_{d+1}$ (the hat indicating that $b_i$ is omitted), and the smallest such d
(if it exists) is called the breadth of (L, ∧).
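For a finite semilattice the breadth can be computed by brute force directly from this definition. The following sketch is illustrative only; the function name `breadth` and the two test semilattices (a finite chain under `min`, and the subsets of a three-element set under intersection) are assumptions made for the example, and the search starts at d = 1.

```python
from functools import reduce
from itertools import product

def breadth(elements, meet):
    """Smallest d >= 1 such that in every (d+1)-tuple some entry can be
    dropped without changing the meet of the tuple."""
    elements = list(elements)
    for d in range(1, len(elements) + 1):
        ok = True
        for tup in product(elements, repeat=d + 1):
            full = reduce(meet, tup)
            redundant = any(reduce(meet, tup[:i] + tup[i + 1:]) == full
                            for i in range(d + 1))
            if not redundant:
                ok = False
                break
        if ok:
            return d
    return None  # not reached for a non-empty finite semilattice

# Two small examples: a chain, and a Boolean lattice on three atoms.
chain = range(5)
subsets = [frozenset(s) for s in
           [(), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]]
```

The loop always terminates with an answer by d = |L|, since a tuple longer than the semilattice repeats an element, and a repeated entry is redundant for the meet.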
Corollary 6.11. Let L = (L, ∧, ∨) be an infinite distributive lattice of breadth d. Then
vcTh(L) (m) ≤ dm for every m.
Proof. Let P = (P, <) be an ordered set of width d, and let A(P) be the set of antichains
of P (so each element of A(P) is a subset of P of size ≤ d). For A, A′ ∈ A(P) let A ∧ A′
denote the set of minimal elements of A ∪ A′ and A ∨ A′ the set of maximal elements of
A ∪ A′; then A ∧ A′, A ∨ A′ ∈ A(P), and (A(P), ∧, ∨) is a distributive lattice of breadth d.
Moreover, one can choose P such that the given distributive lattice L is isomorphic
to A(P) [84, Theorem 3]. Since A(P) is interpretable in P on a definable subset of $P^d$,
we have vcTh(L)(m) ≤ vcTh(P)(dm) by Corollary 3.17 and hence vcTh(L)(m) ≤ dm by
the previous corollary.
In [92] it is shown that the complete theory of each infinite tree T (viewed as an
ordered set) is dp-minimal. Here, a tree is an ordered set T = (T, <) with the property
that for each t ∈ T the set {t′ ∈ T : t′ < t} is linearly ordered (by the restriction of <).
Problem. Determine the VC density function of each (infinite) tree.
(It is known [75] that a tree T is stable iff T has finite height, and then T is superstable
of U-rank ≤ height(T ), so conceivably, the methods of [6] could be applied.)
7. A Strengthening of VC d, and P -adic Examples
In this section we first introduce a strengthening of the VC d property defined and
studied in Section 5, and we prove a more precise version of Theorem 5.7 for strong
VC d structures. The extra precision afforded by this theorem is useful in situations
where vc(1) = 1, yet we can only prove the VC d property for some d > 1. This is the
case for P -minimal theories, which are discussed in the last subsection, where we prove
Theorem 1.2 from the introduction.
7.1. The strong VC d property. In the following M is a structure in a language L
and ∆ = ∆(x; y) is a finite non-empty set of partitioned L-formulas. Let $F = (F_i)_{i \in I}$
be a uniform definition of ∆(x; B)-types over finite sets, where
\[ F_i = \bigl(\varphi_i(y; y_1, \dots, y_d)\bigr)_{\varphi \in \Delta} \qquad (i \in I). \]
Recall from Definition 5.1 above that this means the following: for every finite set $B \subseteq M^{|y|}$
and $q \in S^\Delta(B)$ there are $b_1, \dots, b_d \in B$ and some i ∈ I such that $F_i(y; b_1, \dots, b_d)$
defines q. If in addition for every choice of $b_1, \dots, b_d \in M^{|y|}$ and i ∈ I, the set
\[ p_i(x; b_1, \dots, b_d) := \bigl\{ \varphi(x; b) : \varphi \in \Delta,\ b \in M^{|y|},\ M \models \varphi_i(b; b_1, \dots, b_d) \bigr\} \]
of L(M)-formulas is consistent (with M), then we say that F is a coherent definition
of ∆(x; B)-types over finite sets. (In this case, every restriction of $p_i(x; b_1, \dots, b_d)$ to a
finite parameter set $B \subseteq M^{|y|}$ extends to a complete ∆(x; B)-type, but $F_i(y; b_1, \dots, b_d)$
does not in general define such an extension.)
Remark. Often all our defining formulas $\varphi_i$ have the syntactic form
\[ \varphi_i(y; y_1, \dots, y_d) = \forall x\,\bigl(\chi_i(x; y_1, \dots, y_d) \to \varphi(x; y)\bigr) \]
where $\chi_i$ is an L-formula. In this case, the coherency condition for F is automatically
satisfied provided $\chi_i^M(x; \bar b) \ne \emptyset$ for all i ∈ I and $\bar b \in (M^{|y|})^d$. For example if $S_\Delta$ has
breadth d and is d-consistent, then ∆ has a coherent definition of ∆(x; B)-types over
finite sets with d parameters. (See Lemma 5.2.)
We say that M has the strong VC d property if any ∆(x; y) with |x| = 1 has a coherent
definition of ∆(x; B)-types over finite sets with d parameters. Clearly, the strong VC d
property is a property of the elementary theory of M . We say that a theory T has the
strong VC d property if every model of T has the strong VC d property.
Remark. Suppose |x| = 1 and $F = (F_i)_{i \in I}$ is a coherent definition of ∆(x; B)-types
over finite sets in M, where $F_i = (\varphi_i)_{\varphi \in \Delta}$. Then for every ∆′ ⊆ ∆, the families
$F'_i := (\varphi_i)_{\varphi \in \Delta'}$ form a coherent definition of ∆′(x; B)-types over finite sets in M. This
shows in particular that in order to check that M has the strong VC d property, one
may restrict oneself to sets of L-formulas ∆ which are closed under negation.
In the rest of this subsection we assume that M is infinite. We have the following
result on counting types in structures with the strong VC d property:
Theorem 7.1. Suppose that M has the strong VC d property, and let $r \in \mathbb{R}$ be such that
\[ \pi^*_\Delta(t) = O(t^r) \quad\text{for every } \Delta(x;y) \text{ with } |x| = 1. \]
Then we have
\[ \pi^*_\Delta(t) = O\bigl(t^{d(|x|-1)+r}\bigr) \quad\text{for every } \Delta(x;y). \]
The proof of Theorem 7.1 proceeds by counting types via sets of representatives (and
with an induction supported by the strong VC d property): given a finite set $B \subseteq M^{|y|}$,
we say that $R \subseteq M^{|x|}$ is a set of representatives for $S^\Delta(B)$ if for every $q \in S^\Delta(B)$ there
is α ∈ R realizing q. Equivalently, for every $a \in M^{|x|}$ there is α ∈ R such that for every
ϕ ∈ ∆ and every b ∈ B, M |= ϕ(a; b) if and only if M |= ϕ(α; b). Thus $|S^\Delta(B)| \le K$
iff there is a set of representatives for $S^\Delta(B)$ of size at most K.
Proof. The proof is similar to that of Theorem 5.7. We again induct on m = |x|, with
the case m = 1 holding by hypothesis. For the inductive step write $x = (x_0, x')$ where
$x' = (x_1, \dots, x_m)$, and let ∆(x; y) be given. We may assume that ∆ is closed under
negation. As in the proof of Theorem 5.7 let
\[ \Delta_0(x_0; x', y) = \{ \varphi(x_0; x', y) : \varphi(x; y) \in \Delta \}. \]
By the strong VC d property applied to $\Delta_0$, we can take finitely many families
\[ F_i = \bigl(\varphi_i(x', y; y_1, \dots, y_d)\bigr)_{\varphi \in \Delta} \qquad (i \in I) \]
of L-formulas with the following two properties: for any $a' \in M^m$, any finite $B \subseteq M^{|y|}$
and any $q \in S^{\Delta_0}(a'B)$, there are $b_1, \dots, b_d \in B$ and i ∈ I such that $F_i(a', y; b_1, \dots, b_d)$
defines q, i.e., for all ϕ ∈ ∆, b ∈ B:
\[ \varphi(x_0; a', b) \in q \iff M \models \varphi_i(a', b; b_1, \dots, b_d); \tag{7.1} \]
and for all i ∈ I, $a' \in M^m$ and $\bar b \in (M^{|y|})^d$, the set
\[ p_i(x_0; a', \bar b) := \bigl\{ \varphi(x_0; a', b) : \varphi \in \Delta,\ b \in M^{|y|},\ M \models \varphi_i(a', b; \bar b) \bigr\} \]
of L(M)-formulas is consistent (with M). In the rest of this proof let ϕ range over ∆
and i over I. For each i, let
\[ \Delta_i(x'; y, y_1, \dots, y_d) = \{ \varphi_i(x'; y, y_1, \dots, y_d) : \varphi(x; y) \in \Delta \} \]
and apply the inductive hypothesis to each $\Delta_i$. Thus there are constants $K_i$ such that
for any finite $C \subseteq (M^{|y|})^{d+1}$ there is a set of representatives for $S^{\Delta_i}(C)$ of size at most
$K_i |C|^{d(m-1)+r}$.
Now let a finite $B \subseteq M^{|y|}$ be given. We let b range over B and $\bar b = (b_1, \dots, b_d)$ over
$B^d$. For each $\bar b$ and each i, let $R_i(B\bar b)$ be a set of representatives for $S^{\Delta_i}(B\bar b)$. Thus for
any $a' \in M^m$ and i there is some $\alpha \in R_i(B\bar b)$ such that for any ϕ and b,
\[ M \models \varphi_i(a'; b, \bar b) \iff M \models \varphi_i(\alpha; b, \bar b). \tag{7.2} \]
Notice that there are $|B|^d$ sequences $\bar b$, and $|B\bar b| = |B|$ for each $\bar b$. As above, we may
suppose $|R_i(B\bar b)| \le K_i |B\bar b|^{d(m-1)+r} = K_i |B|^{d(m-1)+r}$.
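For each i, given $\alpha \in M^m$ and $\bar b$, let $\delta_{i,\alpha,\bar b} \in M$ realize the restriction of the type
$p_i(x_0; \alpha, \bar b)$ to the (finite) parameter set αB. Let
\[ R_\Delta = \bigl\{ (\delta_{i,\alpha,\bar b}, \alpha) : i \in I,\ \bar b \in B^d,\ \alpha \in R_i(B\bar b) \bigr\} \]
and observe that
\[ |R_\Delta| \le \sum_i |B|^d\, K_i\, |B|^{d(m-1)+r} = \Bigl(\sum_i K_i\Bigr) |B|^{md+r}. \]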
Thus we are finished once we have shown that $R_\Delta$ is a set of representatives for $S^\Delta(B)$.
For this, let $a = (a_0, a') \in M^{1+m}$ be given; we need to show that there is an element
$(\delta_{i,\alpha,\bar b}, \alpha) \in R_\Delta$ such that for every ϕ and every b,
\[ M \models \varphi(a; b) \iff M \models \varphi(\delta_{i,\alpha,\bar b}, \alpha; b). \tag{7.3} \]
Let $q = \mathrm{tp}^{\Delta_0}(a_0/a'B) \in S^{\Delta_0}(a'B)$. Take i and $\bar b$ such that (7.1) holds for all ϕ and all
b, and then take a representative $\alpha \in R_i(B\bar b)$ for $\mathrm{tp}^{\Delta_i}(a'/B\bar b)$. Note that by (7.1) and
since ∆ is assumed to be closed under negation, for each ϕ and b, either $M \models \varphi_i(a', b; \bar b)$
or $M \models \psi_i(a', b; \bar b)$, where ψ ∈ ∆ is equivalent to ¬ϕ; hence also either $M \models \varphi_i(\alpha; b, \bar b)$
or $M \models \psi_i(\alpha; b, \bar b)$, by (7.2), and thus
\[ M \models \varphi_i(\alpha; b, \bar b) \iff M \models \varphi(\delta_{i,\alpha,\bar b}; \alpha, b), \tag{7.4} \]
by choice of $\delta_{i,\alpha,\bar b}$. Combining (7.1), (7.2) and (7.4) now yields (7.3) as required.
In each of the cases treated in Theorems 6.1 and 6.4 one can show that the theory
in question has, indeed, the strong VC 1 property. Since this is of limited interest for
computing VC density (the VC 1 property already gives the optimal result vc(m) = m),
we do not give the details, and instead now turn to an application of Theorem 7.1 to
p-adic examples.
7.2. P-minimal theories. Let $L_{\mathrm{rings}}$ be the language of rings, let $L_p = L_{\mathrm{rings}} \cup \{P_n : n > 1\}$
where the $P_n$ are unary predicates, and let L be a language containing $L_p$. Here
and below, p is a fixed prime number. Let pCF denote the $L_p$-theory of $\mathbb{Q}_p$, where each
$P_n$ is interpreted as the set of nth powers in $\mathbb{Q}_p$:
\[ \mathbb{Q}_p \models \forall x\,\bigl(P_n(x) \leftrightarrow \exists y\,(y^n = x)\bigr). \]
By a theorem of Macintyre [60], pCF has elimination of quantifiers. Following [39], an
L-theory T containing pCF is called P -minimal if, in every model of T , every definable
subset in one variable is quantifier-free definable just using the language Lp . (In fact,
the setting of [39] also allowed for p-adically closed fields of arbitrary fixed p-rank, and
our methods here could be adjusted to that.)
By [26], a motivating example of a P-minimal theory is the theory $\mathrm{pCF}_{\mathrm{an}}$ first investigated
by Denef and van den Dries [22]. This is the theory of the p-adic numbers
equipped with, for every n > 0 and power series $\sum_\nu a_\nu X^\nu \in \mathbb{Q}_p[[X]]$ such that $|a_\nu| \to 0$
as $|\nu| \to \infty$, a function symbol f of arity n taking value identically zero off $\mathbb{Z}_p^n$, and
such that $f(x) = \sum_\nu a_\nu x^\nu$ for all $x \in \mathbb{Z}_p^n$. (Here, $X = (X_1, \dots, X_n)$, |a| denotes the
p-adic norm of $a \in \mathbb{Q}_p$, $\nu = (\nu_1, \dots, \nu_n) \in \mathbb{N}^n$ is a multi-index, $|\nu| = \nu_1 + \dots + \nu_n$, and
$x^\nu = x_1^{\nu_1} \cdots x_n^{\nu_n}$.)
Our main result about VC density in P -minimal theories is:
Theorem 7.2. Let T be a P -minimal L-theory with definable Skolem functions. Then
T has the strong VC 2 property, and any finite set ∆(x; y) of L-formulas has dual VC
density at most 2|x| − 1.
Before proving this theorem, we introduce some notation and establish some auxiliary
facts. We fix a model K of pCF, with valuation $v : K \to \Gamma_\infty$. We view Z as a convex
subgroup of Γ, by identifying 1 with v(p). In the following, by a ball in K we always
mean a closed ball, i.e., a set of the form
\[ B = B_\rho(c) = \{ x \in K : v(x - c) \ge \rho \} \quad\text{where } c \in K \text{ and } \rho \in \Gamma. \]
Its radius, denoted rad(B), is ρ. By convention rad(K) := −∞. Let $\mathcal{B}$ denote the set
of all balls in K. There is a natural semilinear partial order on $\mathcal{B}$, with B ≤ B′ if
and only if B ⊇ B′. A ball $B = B_\rho(a)$ as above has a unique immediate predecessor,
namely $B_{\rho-1}(a)$, and p immediate successors, namely $B_{\rho+1}(a_i)$ where $a_i = a + ir$ for
i = 0, . . . , p − 1; here r is an arbitrary element of K with v(r) = ρ. Thus, if we form a
graph with vertex set $\mathcal{B}$, with vertices B, B′ adjacent if and only if one of B, B′ is an
immediate successor of the other in the partial order, each of its connected components
is an unrooted tree of valency p + 1. For B, B′ ∈ $\mathcal{B}$ we write dist(B, B′) = d if B and
B′ are at distance d in this graph; for each B ∈ $\mathcal{B}$, there are at most $(p + 1)^d$ balls at distance
d to B, and hence at most $\beta_d := \sum_{i=0}^{d} (p + 1)^i = \frac{1}{p}\bigl((p + 1)^{d+1} - 1\bigr)$ balls at distance at most d to B.
Note that dist is a metric on each connected component of $\mathcal{B}$.
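The closed form for $\beta_d$ is a geometric sum and is easy to confirm numerically. The sketch below is illustrative only, with hypothetical function names. For comparison, `vertices_within` counts the vertices within distance d of a fixed vertex in a tree all of whose vertices have degree p + 1; that count never exceeds $\beta_d$, so $\beta_d$ serves as a uniform upper bound on the number of balls within distance d.

```python
def beta(p, d):
    """beta_d as the sum of (p+1)^i for i = 0, ..., d."""
    return sum((p + 1) ** i for i in range(d + 1))

def beta_closed_form(p, d):
    """The same quantity via ((p+1)^(d+1) - 1) / p; the division is exact."""
    return ((p + 1) ** (d + 1) - 1) // p

def vertices_within(p, d):
    """Number of vertices within distance d of a vertex in a (p+1)-regular
    tree: the vertex itself, plus (p+1) * p^(i-1) vertices at distance i."""
    return 1 + sum((p + 1) * p ** (i - 1) for i in range(1, d + 1))
```

For p = 2 and d = 2 this gives `beta(2, 2) == 13`, while the regular tree has 1 + 3 + 6 = 10 vertices within distance 2.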
Lemma 7.3. Let A ⊆ K be finite, A ≠ ∅. Then there are at most |A| − 1 distinct balls of
the form $B_{v(a-b)}(a)$ where a, b ∈ A, a ≠ b.
Proof. We may assume |A| > 1. For a ∈ A set
\[ \mu_a := 1 + \max\{ v(a - b) : b \in A,\ a \ne b \}. \]
Let $\mathcal{B}_A$ be the smallest connected subgraph of the graph of balls $\mathcal{B}$ containing all $B_{\mu_a}(a)$, a ∈ A. Note
that the balls $B_{\mu_a}(a)$, a ∈ A, are pairwise disjoint; in particular, they are the leaves of
the tree $\mathcal{B}_A$. The balls $B_{v(a-b)}(a)$, where a, b ∈ A, a ≠ b, are vertices of $\mathcal{B}_A$, and all but
at most one of them have degree greater than 2. Now use the fact that any (undirected) tree with
finite vertex set V with |V| > 1 has $2 + \sum_{v \in V,\ \deg(v) > 2} (\deg(v) - 2)$ leaves. (This follows
immediately from the well-known formula $2(|V| - 1) = \sum_{v \in V} \deg(v)$.)
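The bound of Lemma 7.3 can be observed experimentally for finite sets of integers, viewed inside $\mathbb{Z}_p$. The sketch below is illustrative and rests on one encoding assumption: for an integer a and ρ ≥ 0, the ball $B_\rho(a)$ is represented by the pair (ρ, a mod p^ρ), which determines it uniquely inside $\mathbb{Z}_p$. The function names are hypothetical.

```python
from itertools import permutations

def val(p, x):
    """p-adic valuation of a nonzero integer x."""
    x, k = abs(x), 0
    while x % p == 0:
        x //= p
        k += 1
    return k

def difference_balls(p, A):
    """Distinct balls B_{v(a-b)}(a) for a != b in A, each encoded as
    (radius, centre modulo p**radius)."""
    balls = set()
    for a, b in permutations(set(A), 2):
        rho = val(p, a - b)
        balls.add((rho, a % p ** rho))
    return balls
```

For example, with p = 3 and A = {0, 1, 3} the pairs at valuation 0 all give the ball $\mathbb{Z}_3$ itself and the pair {0, 3} gives $3\mathbb{Z}_3$, so there are two balls, matching the bound |A| − 1 = 2.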
We also recall the following basic fact (a consequence of the Newton formulation
of Hensel’s Lemma, see [39, Lemma 2.3]) about the subgroups $P_n^\times = P_n \setminus \{0\}$ of the
multiplicative group $K^\times = K \setminus \{0\}$ of K:
Lemma 7.4. Suppose n > 1, and let x, y, a ∈ K with v(y − x) > 2v(n) + v(y − a). Then
$(x - a)(y - a)^{-1} \in P_n^\times$.
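A special case of Lemma 7.4 can be tested numerically. The sketch below is illustrative only and makes several assumptions beyond the text: it takes $K = \mathbb{Q}_p$ with p an odd prime, n = 2 (so v(n) = 0), and x, y, a ranging over integers. It uses the standard criterion that a nonzero p-adic number is a square, for odd p, exactly when its valuation is even and its unit part is a quadratic residue modulo p. All function names are hypothetical.

```python
import math

def val(p, x):
    """p-adic valuation of an integer (infinity for 0)."""
    if x == 0:
        return math.inf
    x, k = abs(x), 0
    while x % p == 0:
        x //= p
        k += 1
    return k

def is_square_in_Qp(p, num, den):
    """For an odd prime p and nonzero integers num, den: is num/den a square in Q_p?"""
    vn, vd = val(p, num), val(p, den)
    if (vn - vd) % 2:
        return False
    s, t = num // p ** vn, den // p ** vd      # unit parts of num and den
    u = (s % p) * pow(t % p, -1, p) % p        # residue of the unit s/t modulo p
    return pow(u, (p - 1) // 2, p) == 1        # Euler's criterion

def check_lemma_for_squares(p, bound):
    """Check the n = 2 case of the lemma for integers in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    for a in rng:
        for x in rng:
            for y in rng:
                if x == a or y == a:
                    continue
                if val(p, y - x) > val(p, y - a):   # hypothesis, with v(2) = 0
                    if not is_square_in_Qp(p, x - a, y - a):
                        return False
    return True
```

Under the hypothesis the quotient equals 1 + (x − y)/(y − a) with the second summand of positive valuation, so its residue is 1, which is why the check succeeds.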
Suppose now that T is an L-theory satisfying the hypothesis of Theorem 7.2, and
K |= T. Employing definability of Skolem functions and the explicit description of
immediate predecessors and successors in the partial order on the set $\mathcal{B}$ of balls given above, one easily
shows, by induction on d:
Lemma 7.5. Let $(B_b)_{b \in K^m}$ be a ∅-definable family of subsets of K. Then there exist
∅-definable functions $c_i, r_i : K^m \to K$, i ∈ N, with the following property: if $b \in K^m$ is
such that $B = B_b$ is a ball in K, then the balls $B_{v(r_i(b))}(c_i(b))$, where $i = 1, \dots, \beta_d$, are
exactly the balls of distance at most d to B.
The assumption of definable Skolem functions also guarantees that any model of
the theory T has cell decomposition [70]. Let ∆(x; y) be a finite set of L-formulas,
where |x| = 1, closed under negation. Then there are integers N, n > 0, and for each
i = 1, . . . , N there are ∅-definable functions $f_i, g_i, c_i : K^{|y|} \to K$ and elements $\lambda_i$ of a
fixed set of representatives of the cosets of the subgroup $P_n^\times$ of $K^\times$ with the following
properties: for any ϕ ∈ ∆ and $b \in K^{|y|}$, the set ϕ(K; b) of realizations of ϕ(x; b) is a
finite union of some of the cells $U_1(b), \dots, U_N(b)$ defined by the data given above, i.e.,
sets of the form
\[ U_i(b) = \bigl\{ x \in K : v(f_i(b)) \mathrel{\square_{i1}} v(x - c_i(b)) \mathrel{\square_{i2}} v(g_i(b)) \ \&\ P_n(\lambda_i(x - c_i(b))) \bigr\} \tag{7.5} \]
where each symbol $\square_{ij}$ is ≤, <, or no condition. Note that this includes the cases where
$U_i(b) = \{c_i(b)\}$ is a singleton, or where
\[ U_i(b) = \bigl\{ x \in K : P_n(\lambda_i(x - c_i(b))) \bigr\}. \tag{7.6} \]
(7.6)
The center of the cell $U_i(b)$ is given by $c_i(b)$. Since the value group of K has smallest
positive element v(p), using the equivalences
\[ v(a) < v(a') \iff v(pa) \le v(a'), \qquad v(a) \le v(a') \iff v(a) < v(pa'), \]
valid for all a, a′ ∈ K, not both zero, one sees that we may assume that in (7.5), $\square_{i1}$ is ≤
or no condition, and $\square_{i2}$ is < or no condition. From now on, we assume for convenience
that all our cells have this particular form.
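The two equivalences rest only on the fact that v(p) is the least positive element of the value group, so that v(pa) = v(a) + 1. As a sanity check, they can be verified on integers with the usual p-adic valuation; the sketch below is illustrative, assumes the standard model (value group $\mathbb{Z}$, with v(0) = ∞), and uses hypothetical function names.

```python
import math

def v(p, a):
    """p-adic valuation of an integer, with v(0) = infinity."""
    if a == 0:
        return math.inf
    a, k = abs(a), 0
    while a % p == 0:
        a //= p
        k += 1
    return k

def check_equivalences(p, bound):
    """Check both equivalences for all integer pairs (a, a2) in
    [-bound, bound]^2 other than (0, 0)."""
    for a in range(-bound, bound + 1):
        for a2 in range(-bound, bound + 1):
            if a == 0 and a2 == 0:
                continue  # excluded: the first equivalence fails at (0, 0)
            if (v(p, a) < v(p, a2)) != (v(p, p * a) <= v(p, a2)):
                return False
            if (v(p, a) <= v(p, a2)) != (v(p, a) < v(p, p * a2)):
                return False
    return True
```

The excluded pair shows why the hypothesis "not both zero" is needed: for a = a′ = 0 the strict inequality ∞ < ∞ is false while ∞ ≤ ∞ is true.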
Based on the above data for the cell decomposition, we now describe a uniform
definition of ∆(x; B)-types over finite sets in K, in terms of the graph of balls $\mathcal{B}$. A
special ball is a ball having one of the following forms:
\[ B_{v(c_i(b) - c_j(b'))}(c_i(b)), \qquad B_{v(f_i(b))}(c_i(b)), \qquad B_{v(g_i(b))}(c_i(b)) \qquad (b, b' \in K^{|y|}). \]
Note that each special ball can be defined by using at most two parameter tuples.
We say that $B_{v(c_i(b) - c_j(b'))}(c_i(b))$, $B_{v(f_i(b))}(c_i(b))$ and $B_{v(g_i(b))}(c_i(b))$ are special balls
defined over {b, b′}. We also say that a special ball is defined over a subset B of $K^{|y|}$ if it
is defined over {b, b′} where b, b′ ∈ B. By Lemma 7.3, given a finite $B \subseteq K^{|y|}$, there are
no more than 3N · |B| − 1 special balls defined over B.
Let us say that a ball B′ is near a ball B if dist(B, B′) ≤ n + 4v(n) + 2; each ball has
at most $\beta := \beta_{n+4v(n)+2}$ balls near it. In particular, given $b_1, b_2 \in K^{|y|}$ there are at most
M := (6N − 1) · β balls near special balls defined over $\{b_1, b_2\}$. Set $I^{(1)} := [M] = \{1, \dots, M\}$
and $I^{(2)} = I^{(3)} := [N]$. By Lemma 7.5 there are L-formulas $\chi_i(x; y_1, y_2)$, $i \in I^{(1)}$, such
that for each $b_1, b_2 \in K^{|y|}$, the formulas $\chi_i(x; b_1, b_2)$ define exactly the balls near special
balls defined over $\{b_1, b_2\}$. For each $i \in I^{(1)}$ and ϕ ∈ ∆ define
\[ \varphi^{(1)}_i(y; y_1, y_2) := \forall x\,\bigl(\chi_i(x; y_1, y_2) \to \varphi(x; y)\bigr). \]
So for $b, b_1, b_2 \in K^{|y|}$ we have:
\[ K \models \varphi^{(1)}_i(b; b_1, b_2) \iff \varphi(K; b) \supseteq \chi_i(K; b_1, b_2). \]
For each $i \in I^{(2)} = I^{(3)}$ and ϕ ∈ ∆ set
\[ \varphi^{(2)}_i(y; y_1) := \varphi(c_i(y_1); y) \]
and
\[ \varphi^{(3)}_i(y; y_1) := \forall x\,\bigl(P_n(\lambda_i(x - c_i(y_1))) \to \varphi(x; y)\bigr). \]
Now set $F^{(j)}_i := (\varphi^{(j)}_i(y; y_1, y_2))_{\varphi \in \Delta}$ for each $i \in I^{(j)}$, j = 1, 2, 3. The first part of
Theorem 7.2 will be proved once we show the following:
Claim. $F = (F^{(j)}_i)$ is a coherent definition of ∆(x; B)-types over finite sets.
Since the coherency condition is obviously satisfied, it is enough to show that F is a
uniform definition of ∆(x; B)-types over finite sets. For this let $B \subseteq K^{|y|}$ be finite and
non-empty, and let $q \in S^\Delta(B)$. Let
\[ c(B) := \{ c_i(b) : b \in B,\ i = 1, \dots, N \} \]
be the set of centers of the cells $U_i(b)$. In the following we let i range over [N] =
{1, . . . , N} and b (possibly with decorations) over B. By a “special ball” we always
mean a special ball defined over B, and similarly a “near ball” is a ball near a special
ball (defined over B).
We first eliminate two special cases (which are taken care of by the families $(F^{(2)}_i)$
and $(F^{(3)}_i)$): Suppose first that $q^K \cap c(B) \ne \emptyset$, say $c_i(b_1) \in q^K$ for some i and $b_1 \in B$.
In this case, with such choice of i and $b_1$, $F^{(2)}_i(y; b_1)$ defines q. Similarly, if |c(B)| = 1,
and for all i and b with $q^K \subseteq U_i(b)$, the condition $\square_{i1}$ is vacuous and $g_i(b) = 0$, then all
such cells $U_i(b)$ have the form as in (7.6), and for suitable i and $b_1$, $F^{(3)}_i(y; b_1)$ defines q.
So from now on we may assume that:
(a) $q^K$ is disjoint from c(B); and
(b) if c(B) is a singleton, then for some i and b with $q^K \subseteq U_i(b)$ the condition $\square_{i1}$
is ≤ or $g_i(b) \ne 0$.
Under these assumptions, it is enough to show: there is a near ball D such that $D \subseteq q^K$.
We first note:
Lemma 7.6. Let a ∈ K \ c(B), and let $B_1$ be a ball containing a which is maximal
subject to the condition $B_1 \cap c(B) = \emptyset$; that is, $B_1 = B_\delta(a)$ where
\[ \delta = 1 + \max\{ v(a - c) : c \in c(B) \}. \]
Let also $B_0 := B_{\delta + 2v(n)}(a)$. Then for all x ∈ K and c ∈ c(B) we have:
(1) $x \in B_1 \Rightarrow v(x - c) = v(a - c)$;
(2) $x \in B_0 \Rightarrow (x - c)(a - c)^{-1} \in P_n$.
In particular, if a cell $U_i(b)$ contains a then it contains $B_0$.
The proof of (1) is obvious, and to deduce (2) from (1) use Lemma 7.4.
Let a ∈ K realize q, and define δ, $B_0$ and $B_1$ as in the previous lemma. Also, take
c ∈ c(B) such that δ = 1 + v(a − c).
Lemma 7.7. Let $B_2$ be a ball. Then
(1) $B_2$ properly contains $B_1$ iff it contains both a and c;
(2) if $B_2$ contains c but not a, then $\mathrm{dist}(B_1, B_2) = \mathrm{rad}(B_2) - \delta + 2$.
The proof of (1) is clear, and for (2) note that if $c \in B_2$ and $a \notin B_2$, then $B_1, B_2 \subseteq B_{\delta-1}(a)$.
We first assume that there is a special ball E such that $\mathrm{dist}(B_1, E) \le 2v(n) + n + 1$.
Then by Lemma 7.6, $D := B_0 = B_{\delta+2v(n)}(a)$ is contained in those cells $U_i(b)$ which
contain a; hence $D \subseteq q^K$. Also, dist(D, E) ≤ n + 4v(n) + 2, so D is near E. Hence the
ball D has the required properties.
So from now on, we may suppose that for any special ball E we have $\mathrm{dist}(B_1, E) > 2v(n) + n + 1$.
We distinguish two cases:
Case 1: there is a special ball which contains $B_1$. We let C be the smallest such special
ball, with radius ρ. We have c ∈ C (since C properly contains $B_1$) and hence $C = B_\rho(c)$.
The idea now is to replace a and the ball $D = B_0$ by another realization a′ ∈ C of q and
a ball D′ which is contained in and near the special ball C. As Γ is a Z-group, there is
a unique δ′ ∈ Γ such that
\[ \rho + 2v(n) + 1 < \delta' \le \rho + 2v(n) + 1 + n \quad\text{and}\quad \delta' \equiv \delta \bmod n. \]
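The choice of δ′ is the usual observation that a window of n consecutive values contains exactly one representative of each residue class modulo n. The sketch below spells this out under the simplifying assumption that Γ = $\mathbb{Z}$ (the standard model), with `vn` standing for v(n); the function names are hypothetical.

```python
def delta_prime(delta, rho, n, vn):
    """The unique integer t with rho + 2*vn + 1 < t <= rho + 2*vn + 1 + n
    and t congruent to delta modulo n, found by searching the window."""
    lo = rho + 2 * vn + 1
    candidates = [t for t in range(lo + 1, lo + n + 1) if (t - delta) % n == 0]
    assert len(candidates) == 1  # n consecutive integers hit each residue once
    return candidates[0]

def delta_prime_closed_form(delta, rho, n, vn):
    """The same value computed directly, without searching."""
    lo = rho + 2 * vn + 1
    return lo + 1 + (delta - lo - 1) % n
```

For instance, with δ = 20, ρ = 3, n = 4 and v(n) = 0 the window is {5, 6, 7, 8} and δ′ = 8; since δ − ρ = 17 exceeds 2v(n) + n + 1 = 5, this also satisfies δ > δ′ > ρ + 1.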
By assumption we have
\[ 2v(n) + n + 1 < \mathrm{dist}(B_1, C) = \delta - \rho \]
and hence δ > δ′ > ρ + 1. Now choose $d \in P_n$ with v(d) = δ′ − δ. (From now on until
the end of the proof of Theorem 7.2 we temporarily suspend our promise of d always
denoting a natural number.) Put
\[ D' := B_{\delta' + 2v(n) + 1}(a') \quad\text{where } a' := d(a - c) + c. \]
Then D′ is contained in the special ball C, and D′ is near C. Indeed, $D' \subseteq B_{\delta'-1}(a')$,
and these two balls are at distance 2v(n) + 2. The latter ball also contains c, so
\[ \mathrm{dist}\bigl(B_{\delta'-1}(a'), C\bigr) = \mathrm{dist}\bigl(B_{\delta'-1}(c), B_\rho(c)\bigr) = \delta' - 1 - \rho \le 2v(n) + n, \]
hence dist(D′, C) ≤ 4v(n) + 2 + n. Thus D′ has the right properties, provided we manage
to show:
Claim 1. Let Ui (b) be a cell as in (7.5) which contains a. Then D0 ⊆ Ui (b).
Towards the proof of this claim, we first show two auxiliary claims:
Claim 2. Let c′ ∈ c(B). Then
\[
v(a' - c') =
\begin{cases}
\delta' - 1 & \text{if } v(c - c') \geq v(a - c), \\
v(c - c') \leq \rho < \delta' - 1 & \text{otherwise.}
\end{cases}
\]
Proof. If v(c − c′) ≥ v(a − c) then
v(c − c′) ≥ v(a − c) = δ − 1 > δ′ − 1
and hence
v(a′ − c′) = v(d(a − c) + (c − c′)) = δ′ − 1.
So suppose v(c − c′) < v(a − c). Then v(a − c′) = v(c − c′),
and hence a is contained in the special ball E := Bv(c−c′) (c). In fact, for every x ∈ B1
we have
v(x − a) ≥ δ = v(a − c) + 1 > v(a − c) > v(c − c′)
and hence B1 ⊆ E. By minimality of C we thus have C ⊆ E. This yields δ′ − 1 > ρ ≥ v(c − c′)
and thus v(a′ − c′) = v(c − c′) < δ′ − 1.
Claim 3. For every c′ ∈ c(B) we have v(a − c′) ≥ v(a′ − c′).
Proof. Certainly, v(a − c′) ≥ min{v(a − a′), v(a′ − c′)}. But the minimum is always
achieved by v(a′ − c′), as
v(a − a′) = v((d − 1)(a − c)) = δ′ − 1 ≥ v(a′ − c′)
by Claim 2.
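The middle equality in the last display relies on v(d) being negative in Case 1, so that v(d − 1) = v(d). Spelled out as a sketch, using only the definitions of d and of the new realization d(a − c) + c (written a′ below) given above:

```latex
\begin{align*}
a - a' &= (a - c) - d(a - c) = (1 - d)(a - c), \\
v(d) &= \delta' - \delta < 0 = v(1)
  \quad\Longrightarrow\quad v(d - 1) = \min\{v(d), v(1)\} = v(d), \\
v(a - a') &= v(d - 1) + v(a - c)
  = (\delta' - \delta) + (\delta - 1) = \delta' - 1 .
\end{align*}
```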
By Claim 2 and Lemma 7.6 (applied to a′ in place of a), in order to show Claim 1, it is
enough to prove that a′ ∈ Ui (b). We abbreviate c′ = ci (b). Suppose that □i1 is ≤. Then
fi (b) ≠ 0, and by Lemma 7.6, (1), all elements x of B1 satisfy the condition v(fi (b)) ≤
v(x − c′); hence C is contained in the special ball Bv(fi (b)) (c′), by the minimality of C.
Since D′ ⊆ C, all elements x of D′ also satisfy v(fi (b)) ≤ v(x − c′); in particular, of
course, v(fi (b)) ≤ v(a′ − c′). If □i2 is <, then by Claim 3, v(a′ − c′) ≤ v(a − c′) < v(gi (b)),
as required. It remains to check that a − c′ and a′ − c′ lie in the same coset of Pn× . We
distinguish two cases. If v(c − c′) ≤ ρ then v(a − c) = δ − 1 > ρ ≥ v(c − c′), hence
v(a − c′) = v(c − c′) and thus
v(a − a′) = δ′ − 1 > 2v(n) + ρ ≥ 2v(n) + v(a − c′);
therefore a − c′ and a′ − c′ are in the same Pn×-coset, by Lemma 7.4. Suppose v(c − c′) > ρ.
Then by Claim 2 we have v(c − c′) ≥ v(a − c) = δ − 1 and v(a′ − c′) = δ′ − 1. Now
consider the special ball E := Bv(c−c′) (c). Note that a ∉ E: otherwise v(c − c′) =
v(a − c) = δ − 1 and hence B1 = Bδ (a) ⊆ E with dist(B1 , E) = 1, contrary to our initial
assumption (made before Case 1). Thus, by Lemma 7.7, (2) and said assumption, we
obtain 2v(n) + n + 1 < v(c − c′) − δ + 2. Hence
v(c − c′) > 2v(n) + δ − 1 = 2v(n) + v(a − c) ≥ 2v(n) + v(a′ − c′),
with the last inequality by Claim 3. So by Lemma 7.4, a − c′ and a − c are in the same
Pn×-coset, as are a′ − c and a′ − c′. Certainly, as a′ − c = d(a − c) and d ∈ Pn× , the
elements a − c and a′ − c lie in the same coset of Pn× . Hence a − c′ and a′ − c′ also lie
in the same coset of Pn× . This finishes the proof of Claim 1, and hence of Case 1.
Case 2: no special ball contains B1 . In this case, for every c′ ∈ c(B), the special ball
C = Bv(c−c′) (c) does not contain a, so v(c′ − c) > v(a − c) = δ − 1 and hence
v(a − c′) = min{v(a − c), v(c − c′)} = v(a − c) = δ − 1.
Since C is of distance greater than 2v(n) + n + 1 from B1 , by part (2) of Lemma 7.7 we
also obtain
v(c − c′) > δ + 2v(n) + n − 1. (7.7)
Similarly, since each special ball Bv(gi (b)) (ci (b)) does not contain a, we get
v(gi (b)) > δ + 2v(n) + n − 1,
and since a ∉ Bv(fi (b)) (ci (b)), the condition □i1 is vacuous for each i and b with a ∈ Ui (b).
Fix a special ball E of the form Bv(gi (b)) (ci (b)) with minimal radius γ = v(gi (b)), if
there is such a special ball; otherwise let γ = ∞. Also, if |c(B)| > 1, let C be a special
ball of the form Bv(c−c′) (c), where c′ ∈ c(B), with minimal radius ρ = v(c − c′); we set
ρ = ∞ if |c(B)| = 1. Note that by our general assumption (made before Lemma 7.6),
not both of γ and ρ are ∞. We now distinguish two subcases:
Case 2a: ρ − 2v(n) ≤ γ. Let δ′ ∈ Γ be such that
ρ − 2v(n) − n < δ′ ≤ ρ − 2v(n), δ′ ≡ δ mod n,
choose d ∈ Pn with v(d) = δ′ − δ, and set
D′ := Bδ′+2v(n) (a′) where a′ := d(a − c) + c.
By (7.7) we have δ′ > δ. Moreover, for each c″ ∈ c(B) we have v(c − c″) ≥ ρ > δ′ − 1 =
v(d(a − c)) and hence v(a′ − c″) = δ′ − 1. Note that the ball Bδ′−1 (a′) contains D′ and
is of distance 2v(n) + 1 to D′. The ball Bδ′−1 (a′) contains c, hence
dist(Bδ′−1 (a′), C) ≤ ρ − (δ′ − 1) < 2v(n) + n + 1
and thus dist(D′, C) < 4v(n) + n + 2, so D′ is near C. Let Ui (b) be a cell containing a; it
remains to show that then a′ ∈ Ui (b). We already noted that condition □i1 is vacuous.
As to □i2 , suppose that condition is <. Writing c′ = ci (b) we then have
v(a′ − c′) = δ′ − 1 < ρ − 2v(n) ≤ γ ≤ v(gi (b))
as required. Finally, by (7.7) and Lemma 7.4, a − c′ and a − c are in the same Pn×-coset,
and since
v(c − c′) ≥ ρ > 2v(n) + δ′ − 1 = 2v(n) + v(a′ − c′),
the elements a′ − c and a′ − c′ are also in the same Pn×-coset. As a′ − c = d(a − c) and
a − c are in the same Pn×-coset, finally a′ − c′ and a − c′ are in the same Pn×-coset, as
required.
Case 2b: ρ − 2v(n) > γ. In this case we let δ′ ∈ Γ be such that
γ − 2v(n) − n < δ′ ≤ γ − 2v(n), δ′ ≡ δ mod n,
and with this choice of δ′ define d, a′ and D′ as in Case 2a. Note that ρ > γ, so for each
c″ ∈ c(B) we have v(c − c″) > δ′ − 1 = v(d(a − c)) and hence v(a′ − c″) = δ′ − 1. Since
dist(Bδ′−1 (a′), E) ≤ γ − (δ′ − 1) < 2v(n) + n + 1
we see, similarly as in Case 2a, that dist(D′, E) < 4v(n) + n + 2, so D′ is near E. Let Ui (b)
be a cell containing a, and suppose □i2 is <. Then v(a′ − c′) = δ′ − 1 < γ ≤ v(gi (b)),
and as at the end of Case 2a one sees that a′ − c′ and a − c′ are in the same coset
of Pn× .
To complete the proof of the theorem, we apply Theorem 7.1 with r = 1. By what we
have shown above, for every finite non-empty B ⊆ M^|y| , each type in S^∆ (B) is uniquely
determined by either a center ci (b), where b ∈ B, or a near ball. However, there are
at most N |B| = O(|B|) centers, and at most (3N |B| − 1) · β = O(|B|) near balls; thus
|S^∆ (B)| = O(|B|) as required.
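The final count can be made explicit. As a sketch, assuming only that N and β are the constants fixed earlier in the proof, that they do not depend on B, and that β ≥ 0, the two bounds add up to a linear bound with an explicit constant:

```latex
\[
|S^{\Delta}(B)|
  \;\le\; N\,|B| + \bigl(3N\,|B| - 1\bigr)\,\beta
  \;\le\; (N + 3N\beta)\,|B| .
\]
```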
From Theorem 7.2 and Corollary 5.13 we obtain:
Corollary 7.8. Every P -minimal theory with definable Skolem functions is dp-minimal.
Remark 7.9. In [24, Section 6], Dolich, Goodrick and Lippel already showed that pCF =
Th(Qp ) is dp-minimal. By 3.6 of [22], the P -minimal theory pCFan has definable Skolem
functions. (Formally, [22] handles the corresponding subanalytic structure on Zp , but
the translation is straightforward.) Note that the proof of [22, 3.6] takes place in the
ground model Zp , where all elements are named by constants, so ‘definable’ means ‘∅-definable’, and curve selection, as stated there, gives definable Skolem functions. Hence
the conclusions of Theorem 7.2 and Corollary 7.8 apply to it and its reducts. (See also
Lemma 3.6.) Cell decomposition in pCFan is also proved in [21].
References
[1] K. A. S. Abdel-Ghaffar, Maximum number of edges joining vertices on a cube, Inform. Process.
Lett. 87 (2003), no. 2, 95–99.
[2] H. Adler, Theories controlled by formulas of Vapnik-Chervonenkis codimension 1, preprint (2008).
[3] H. Adler, An introduction to theories without the independence property, preprint (2008).
[4] M. Anthony, G. Brightwell, and C. Cooper, The Vapnik-Chervonenkis dimension of a random
graph, Discrete Math. 138 (1995), no. 1-3, 43–56.
[5] M. Aschenbrenner, L. van den Dries, Closed asymptotic couples, J. Algebra 225 (2000), 309–358.
[6] M. Aschenbrenner, A. Dolich, D. Haskell, D. Macpherson, S. Starchenko, Vapnik-Chervonenkis
density in some theories without the independence property, II, preprint (2011).
[7] P. Assouad, Densité et dimension, Ann. Inst. Fourier (Grenoble) 33 (1983), no. 3, 233–282.
[8] P. Assouad, Observations sur les classes de Vapnik-Cervonenkis et la dimension combinatoire de
Blei, in: Seminaire d’Analyse Harmonique, 1983–1984, pp. 92–112, Publications Mathématiques
d’Orsay, vol. 85-2, Université de Paris-Sud, Département de Mathématiques, Orsay, 1985.
[9] J. Baldwin, S. Shelah, Randomness and semigenericity, Trans. Amer. Math. Soc. 349 (1997), no.
4, 1359–1376.
[10] J. Balogh, B. Bollobás, Unavoidable traces of set systems, Combinatorica 25 (2005), no. 6, 633–
643.
[11] S. Basu, R. Pollack, M.-F. Roy, On the number of cells defined by a family of polynomials on a
variety, Mathematika 43 (1996), no. 1, 120–126.
[12] L. Bélair, Types dans les corps valués munis d’applications coefficients, Illinois J. Math. 43
(1999), no. 2, 410–425.
[13] L. Bélair, M. Bousquet, Types dans les corps valués, C. R. Acad. Sci. Paris Sér. I Math. 323
(1996), no. 8, 841–844.
[14] O. Belegradek, Y. Peterzil, F. Wagner, Quasi-o-minimal structures, J. Symbolic Logic 65 (2000),
no. 3, 1115–1132.
[15] O. Belegradek, V. Verbovskiy, F. Wagner, Coset-minimal groups, Ann. Pure Appl. Logic 121
(2003), no. 2-3, 113–143.
[16] G. Birkhoff, Lattice Theory, 3rd ed., American Mathematical Society Colloquium Publications,
vol. XXV, American Mathematical Society, Providence, R.I., 1967.
[17] H. Brönnimann, M. T. Goodrich, Almost optimal set covers in finite VC-dimension, Discrete
Comput. Geom. 14 (1995), no. 4, 463–479.
[18] E. Casanovas, M. Ziegler, Stable theories with a new predicate, J. Symbolic Logic 66 (2001), no.
3, 1127–1140.
[19] G. Cherlin, F. Point, On extensions of Presburger arithmetic, in: B. I. Dahn (ed.), Proceedings of
the fourth Easter conference on model theory (Gross Köris, 1986 ), pp. 17–34, Seminarberichte,
vol. 86, Humboldt Universität, Sektion Mathematik, Berlin, 1986.
[20] A. Chernikov, P. Simon, Externally definable sets and dependent pairs, preprint (2010), available
online at http://front.math.ucdavis.edu/1007.4468.
[21] R. Cluckers, Analytic p-adic cell decompositions and integrals, Trans. Amer. Math. Soc. 356
(2004), 1489–1499.
[22] J. Denef, L. van den Dries, p-adic and real subanalytic sets, Ann. Math. 128 (1988), 70–138.
[23] M. A. Dickmann, Elimination of quantifiers for ordered valuation rings, J. Symbolic Logic 52
(1987), 116–128.
[24] A. Dolich, J. Goodrick, D. Lippel, Dp-minimal theories: basic facts and examples, Notre Dame
J. Formal Logic 52 (2011), no. 3, 267–288.
[25] L. van den Dries, Tame Topology and O-minimal Structures, London Mathematical Society Lecture Note Series, vol. 248, Cambridge University Press, Cambridge, 1998.
[26] L. van den Dries, D. Haskell and H. D. Macpherson, One-dimensional p-adic subanalytic sets, J.
London Math. Soc. (2) 59 (1999), 1–20.
[27] R. M. Dudley, A course on empirical processes, in: P. L Hennequin (ed.), École d’été de probabilités de Saint-Flour XII, pp. 1–142, Lecture Notes in Mathematics, vol. 1097, Springer-Verlag,
Berlin, 1984.
[28] R. M. Dudley, Uniform Central Limit Theorems, Cambridge Studies in Advanced Mathematics, vol.
63, Cambridge University Press, Cambridge, 1999.
[29] J.-L. Duret, Les corps faiblement algébriquement clos non séparablement clos ont la propriété
d’indépendence, in: L. Pacholski et al. (eds.), Model Theory of Algebra and Arithmetic (Proc.
Conf., Karpacz, 1979), pp. 136–162, Lecture Notes in Mathematics, vol. 834, Springer-Verlag,
Berlin, 1980.
[30] H. Edelsbrunner, Algorithms in Combinatorial Geometry, EATCS Monographs on Theoretical
Computer Science, vol. 10, Springer-Verlag, Berlin, 1987.
[31] Gy. Elekes, SUMS versus PRODUCTS in number theory, algebra and Erdős geometry, in: G.
Halász et al. (eds.), Paul Erdős and his Mathematics, II (Budapest, 1999), pp. 241–290, Bolyai
Soc. Math. Stud., vol. 11, János Bolyai Math. Soc., Budapest, 2002.
[32] P. Erdős, R. Rado, A partition calculus in set theory, Bull. Amer. Math. Soc. 62 (1956), 427–489.
[33] Z. Füredi, J. Pach, Traces of finite sets: extremal problems and geometric applications, in: P.
Frankl et al. (eds.), Extremal Problems for Finite Sets (Visegrád, 1991), pp. 251–282, Bolyai Soc.
Math. Stud., vol. 3, János Bolyai Math. Soc., Budapest, 1994.
[34] V. Guingona, On uniform definability of types over finite sets, preprint (2010), available online
at http://front.math.ucdavis.edu/1005.4924.
[35] V. Guingona, C. Hill, Local dp-rank and VC-density over indiscernible sequences, preprint (2011),
available online at http://front.math.ucdavis.edu/1108.2554.
[36] Y. S. Gurevich, P. H. Schmitt, The theory of ordered abelian groups does not have the independence property, Trans. Amer. Math. Soc. 284 (1984), 171–182.
[37] J. Hartman, The homeomorphic embedding of Kn in the m-cube, Discrete Math. 16 (1976), no.
2, 157–160.
[38] D. Haskell, H. D. Macpherson, Cell decompositions of C-minimal structures, Ann. Pure Appl.
Logic 66 (1994), no. 2, 113–162.
[39] D. Haskell, H. D. Macpherson, A version of o-minimality for the p-adics, J. Symb. Logic 62 (1997), 1075–1092.
[40] D. Haskell, H. D. Macpherson, VC density in real closed valued fields, Prépublications de la séminaire de structures
algébriques ordonnées 83 (2008–2009), Equipe de logique, Université Paris VII.
[41] D. Haussler, Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension, J. Combin. Theory Ser. A 69 (1995), no. 2, 217–232.
[42] W. Hodges, Model Theory, Encyclopedia of Mathematics and its Applications, vol. 42, Cambridge
University Press, Cambridge, 1993.
[43] J. E. Holly, Canonical forms for definable subsets of algebraically closed and real closed valued
fields, J. Symbolic Logic 60 (1995), no. 3, 843–860.
[44] E. Hrushovski, D. Kazhdan, Integration in valued fields, in: V. Ginzburg (ed.), Algebraic Geometry and Number Theory, pp. 261–405, Progress in Mathematics, vol. 253, Birkhäuser Boston,
Inc., Boston, MA, 2006.
[45] U. Hrushovski, A. Pillay, Weakly normal groups, in: Ch. Berline et al. (eds.), Logic Colloquium ’85
(Orsay, 1985), pp. 233–244, Stud. Logic Found. Math., vol. 122, North-Holland, Amsterdam,
1987.
[46] G. Jeronimo, J. Sabia, On the number of sets definable by polynomials, J. Algebra 227 (2000),
no. 2, 633–644.
[47] H. R. Johnson, M. C. Laskowski, Compression schemes, stable definable families, and o-minimal
structures, Discrete Comput. Geom. 43 (2010), no. 4, 914–926.
[48] I. Kaplan, A. Onshuus, A. Usvyatsov, Additivity of the dp-rank, preprint (2011), available online
as no. 251 at http://www.logique.jussieu.fr/modnet/Publications/Preprint%20server/.
[49] M. Karpinski, A. Macintyre, Polynomial bounds for VC dimension of sigmoidal and general
pfaffian neural networks, J. Comput. System Sci. 54 (1997), 169–176.
[50] M. Karpinski, A. Macintyre, Approximating volumes and integrals in o-minimal and P-minimal theories, in: A. Macintyre (ed.), Connections between Model Theory and Algebraic and Analytic Geometry, pp. 149–
177, Quad. Mat., vol. 6, Dept. Math., Seconda Univ. Napoli, Caserta, 2000.
[51] T. Kővari, V. T. Sós, and P. Turán, On a problem of K. Zarankiewicz, Colloquium Math. 3
(1954), 50–57.
[52] K. Kudaĭbergenov, On the independence property, Siberian Math. J. 41 (2000), no. 1, 113.
[53] K. Kudaĭbergenov, Weakly quasi-o-minimal models, Siberian Adv. Math. 20 (2010), no. 4, 285–292.
[54] F.-V. Kuhlmann, Abelian groups with contractions, II: Weak o-minimality, in: A. Facchini, C.
Menini (eds.): Abelian Groups and Modules, Kluwer, Dordrecht (1995), 323–342.
[55] M. C. Laskowski, Vapnik-Chervonenkis classes of definable sets, J. London Math. Soc. (2) 45
(1992), no. 2, 377–384.
[56] M. C. Laskowski, unpublished notes.
[57] L. Lipshitz, Rigid subanalytic sets, Amer. J. Math. 115 (1993), no. 1, 77–108.
[58] L. Lipshitz, Z. Robinson, One-dimensional fibers of rigid subanalytic sets, J. Symbolic Logic 63
(1998), 83–88.
[59] L. Lovász and B. Szegedy, Regularity partitions and the topology of graphons, in: I. Bárány et al.
(eds.), An Irregular Mind. Szemerédi is 70, pp. 415–446, Bolyai Society Mathematical Studies,
vol. 21, Springer-Verlag, Berlin; János Bolyai Mathematical Society, Budapest, 2010.
[60] A. J. Macintyre, On definable subsets of p-adic fields, J. Symbolic Logic 41 (1976), 605–610.
[61] D. Macpherson, D. Marker and C. Steinhorn, Weakly o-minimal structures and real closed fields,
Trans. Amer. Math. Soc. 352 (2000), no. 12, 5435–5483.
[62] D. Macpherson, C. Steinhorn, On variants of o-minimality, Ann. Pure Appl. Logic 79 (1996),
no. 2, 165–209.
[63] D. Marker, Model Theory, Graduate Texts in Mathematics, vol. 217, Springer-Verlag, New York,
2002.
[64] J. Matoušek, Tight upper bounds for the discrepancy of half-spaces, Discrete Comput. Geom. 13
(1995), no. 3-4, 593–601.
[65] J. Matoušek, Geometric set systems, in: A. Balog et al. (eds.), European Congress of Mathematics, II
(Budapest, 1996), 1–27, Progr. Math., vol. 169, Birkhäuser, Basel, 1998.
[66] J. Matoušek, Lectures on Discrete Geometry, Graduate Texts in Mathematics, vol. 212, Springer-Verlag, New York, 2002.
[67] J. Matoušek, Bounded VC-dimension implies a fractional Helly theorem, Discrete Comput. Geom. 31
(2004), no. 2, 251–255.
[68] J. Matoušek, E. Welzl, L. Wernisch, Discrepancy and approximations for bounded VC-dimension,
Combinatorica 13 (1993), no. 4, 455–66.
[69] C. Michaux, R. Villemaire, Presburger arithmetic and recognizability of sets of natural numbers
by automata: new proofs of Cobham’s and Semenov’s theorems, Ann. Pure Appl. Logic 77 (1996),
no. 3, 251–277.
[70] M.-H. Mourgues, Cell decomposition for P -minimal fields, MLQ Math. Log. Q. 55 (2009), no. 5,
487–492.
[71] A. Onshuus, A. Usvyatsov, On dp-minimality, strong dependence and weight, J. Symbolic Logic
76 (2011), no. 3, 737–758.
[72] J. Pach, M. Sharir, Repeated angles in the plane and related problems, J. Combin. Theory Ser. A
59 (1992), no. 1, 12–22.
[73] J. Pach, M. Sharir, On the number of incidences between points and curves, Combin. Probab. Comput. 7
(1998), no. 1, 121–127.
[74] J. Pach, M. Sharir, Geometric incidences, in: J. Pach (ed.), Towards a Theory of Geometric Graphs,
pp. 185–223, Contemporary Mathematics, vol. 342, American Mathematical Society, Providence,
RI, 2004.
[75] M. Parigot, Théories d’arbres, J. Symbolic Logic 47 (1982), no. 4, 841–853.
[76] A. Pillay, The model-theoretic content of Lang’s conjecture, in: E. Bouscaren (ed.), Model Theory
and Algebraic Geometry, pp. 101–106, Lecture Notes in Mathematics, vol. 1696, Springer-Verlag,
Berlin, 1998.
[77] K.-P. Podewski, M. Ziegler, Stable graphs, Fund. Math. 100 (1978), no. 2, 101–107.
[78] F. Point, On decidable extensions of Presburger arithmetic: from A. Bertrand numeration systems to Pisot numbers, J. Symbolic Logic 65 (2000), no. 3, 1347–1374.
[79] F. Point, F. O. Wagner, Essentially periodic ordered groups, Ann. Pure Appl. Logic 105 (2000),
no. 1–3, 261–291.
[80] B. Poizat, Cours de Théorie des Modèles, Nur al-Mantiq wal-Marifah, Villeurbanne (1985).
[81] R. Pollack, M.-F. Roy, On the number of cells defined by a set of polynomials, C. R. Acad. Sci.
Paris Sér. I Math. 316 (1993), no. 6, 573–577.
[82] M. Prest, Model Theory and Modules, London Mathematical Society Lecture Note Series, vol.
130, Cambridge University Press, Cambridge, 1988.
[83] N. Sauer, On the density of families of sets, J. Combinatorial Theory Ser. A 13 (1972), 145–147.
[84] J. H. Schmerl, ℵ0 -categorical distributive lattices of finite breadth, Proc. Amer. Math. Soc. 87
(1983), no. 4, 707–713.
[85] J. H. Schmerl, Partially ordered sets and the independence property, J. Symbolic Logic 54 (1989), no.
2, 396–401.
[86] S. Shelah, Stability, the f.c.p., and superstability; model theoretic properties of formulas in first-order theory, Ann. Math. Logic 3 (1971), no. 3, 271–362.
[87] S. Shelah, A combinatorial problem; stability and order for models and theories in infinitary languages, Pacific J. Math. 41 (1972), 247–261.
[88] S. Shelah, Classification Theory and the Number of Nonisomorphic Models, 2nd ed., Studies in
Logic and the Foundations of Mathematics, vol. 92, North-Holland Publishing Co., Amsterdam,
1990.
[89] S. Shelah, Dependent first order theories, continued, Israel J. Math. 173 (2009), 1–60.
[90] S. Shelah, Strongly dependent theories, Israel J. Math., to appear, available online at
http://arxiv.org/abs/math/0504197.
[91] S. Shelah, J. Spencer, Zero-one laws for sparse random graphs, J. Amer. Math. Soc. 1 (1988),
no. 1, 97–115.
[92] P. Simon, On dp-minimal ordered structures, J. Symbolic Logic 76 (2011), no. 2, 448–460.
[93] J. Solymosi, T. Tao, An incidence theorem in higher dimensions, preprint (2011), available online
at http://front.math.ucdavis.edu/1103.2926.
[94] J. Spencer, The Strange Logic of Random Graphs, Algorithms and Combinatorics, vol. 22,
Springer-Verlag, Berlin, 2001.
[95] J. Spencer, E. Szemerédi, W. T. Trotter, Jr., Unit distances in the Euclidean plane, in: B. Bollobás
(ed.), Graph Theory and Combinatorics, pp. 293–303, Academic Press, Inc., London, 1984.
[96] E. Szemerédi, W. T. Trotter, Jr., A combinatorial distinction between the Euclidean and projective
planes, European J. Combin. 4 (1983), no. 4, 385–394.
[97] C. Tóth, The Szemerédi-Trotter theorem in the complex plane, preprint (2003), available online
at http://front.math.ucdavis.edu/0305.5283.
[98] V. N. Vapnik, A. Ja. Červonenkis, The uniform convergence of frequencies of the appearance of
events to their probabilities, Theor. Probability Appl. 16 (1971), 264–280.
[99] V. Weispfenning, Elimination of quantifiers for certain ordered and lattice-ordered abelian groups,
Bull. Soc. Math. Belg. Sér. B 33 (1981), 131–155.
[100] J. Wierzejewski, On stability and products, Fund. Math. 93 (1976), no. 2, 81–95.
Department of Mathematics, University of California, Los Angeles, Box 951555, Los Angeles, CA 90095-1555, U.S.A.
E-mail address: [email protected]
Department of Mathematics, East Stroudsburg University, Science & Technology Center, Room 118, East Stroudsburg, PA 18301, U.S.A.
Department of Mathematics and Statistics, McMaster University, 1280 Main St W, Hamilton ON L8S 4K1, Canada
E-mail address: [email protected]
School of Mathematics, University of Leeds, Leeds LS2 9JT, U.K.
E-mail address: [email protected]
Department of Mathematics, University of Notre Dame, 255 Hurley Building, Notre
Dame, IN 46556-4618, U.S.A.
E-mail address: [email protected]