18.434 - PRESENTATION 1
MATROID THEORY AND THE GREEDY ALGORITHM
ARIEL SCHVARTZMAN
Matroids are combinatorial structures that generalize the notion of linear
independence in vector spaces. In fact, the word matroid comes from the word
matrix and the suffix -oid (from Greek, 'likeness'). Many concepts from linear
algebra, such as basis and rank, are similarly abstracted in matroid theory.
There are many equivalent definitions of a matroid. I will not have time to go
over all these definitions and show that they are equivalent. I will be using the
definitions provided in [Goemans] which focus on independent sets.
Definition. A matroid M is defined on a finite set of elements E, called the
ground set, together with a collection I ⊆ 2^E of subsets of E, which are said
to be independent. The usual notation for this is M = (E, I). In order for M to
be a matroid, three properties must be met:
• (I1) ∅ ∈ I
• (I2) If A ∈ I and B ⊆ A, then B ∈ I.
• (I3) If A, B ∈ I and |A| > |B|, then ∃e ∈ A \ B such that B ∪ {e} ∈ I.
The third property is usually referred to as the exchange property. Given two
independent sets of different sizes, there exists at least one element in the
larger one that can be added to the smaller one without breaking independence.
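The three properties can be checked mechanically on very small examples. The following is a minimal brute-force sketch, not an efficient test: the names `powerset` and `is_matroid` are hypothetical helpers introduced here, and the running time is exponential in |E|.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground`, as frozensets."""
    items = list(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_matroid(ground, independent):
    """Brute-force check of (I1)-(I3); only meant for tiny ground sets."""
    fam = {frozenset(s) for s in independent}
    if any(not s <= frozenset(ground) for s in fam):
        return False  # every independent set must be a subset of E
    # (I1) the empty set is independent
    if frozenset() not in fam:
        return False
    # (I2) every subset of an independent set is independent
    for a in fam:
        if any(b not in fam for b in powerset(a)):
            return False
    # (I3) exchange: a larger set can always extend a smaller one
    for a in fam:
        for b in fam:
            if len(a) > len(b) and not any(b | {e} in fam for e in a - b):
                return False
    return True

E = {1, 2, 3}
at_most_two = [s for s in powerset(E) if len(s) <= 2]
# Satisfies (I1) and (I2) but not (I3): {2, 3} cannot extend {1}.
broken = [frozenset(), frozenset({1}), frozenset({2}),
          frozenset({3}), frozenset({2, 3})]

print(is_matroid(E, at_most_two))  # True
print(is_matroid(E, broken))       # False
```

The second family shows why the exchange property is not implied by the first two: it is closed under taking subsets, yet no element of {2, 3} can be added to {1}.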
Definition. A maximal independent set is called a base of the matroid. Recall
that maximal (as in matching theory) means that adding any further element to
the set makes it no longer independent. A maximum set is a base of largest
possible size.
(analogy with matching theory - 6-cycle)
Theorem 1. All bases of a matroid have the same size.
Proof. Suppose for the sake of contradiction that we are given two bases B_1, B_2
such that |B_1| < |B_2|. Then, by the exchange property, there is an element
b ∈ B_2 \ B_1 such that B_1 ∪ {b} is also independent. But this implies that B_1
was not a maximal independent set, which contradicts the hypothesis that B_1
was a base.
1. Examples
1.1. Uniform Matroids. A uniform matroid M = (E, I) is defined in the following way:
I = {X ⊆ E : |X| ≤ k}
for a given integer k ≥ 0. The three properties above hold trivially.
1.2. Matrices. Linear matroids are defined from a matrix A. The ground set E
is the set of indices of the columns of A. For a subset X ⊆ E, let A_X denote the
restriction of A to the columns indexed by X. Then the collection of independent
sets is
I = {X ⊆ E : rank(A_X) = |X|}
The first property is easily satisfied. The second corresponds to the fact that
a subset of a set of independent vectors is still independent. The last follows
from a well-known result in linear algebra: given two independent sets of columns
X and Y such that rank(A_X) > rank(A_Y), there must be a column in X that can be
added to Y to increase the rank by 1.
Consider defining a matroid on the following matrix A.
        [ 1 0 1 2 0 ]
    A = [ 0 1 1 2 0 ]
        [ 0 1 1 2 1 ]
Let the column vectors be labelled E = {1, 2, 3, 4, 5} (from left to right).
Therefore, the ground set of the matroid is E, and I ⊆ 2^E. What are some members
of I? {1, 2}, {3}, {1, 2, 5} and many more. Which sets are not in I? {3, 4}
(column 4 is twice column 3), {1, 2, 3} (column 3 is the sum of columns 1 and 2)
and so on. Notice how the exchange property applies to the examples above.
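The membership claims for this matrix can be verified by computing ranks directly. The sketch below assumes exact arithmetic over the rationals; `rank` and `is_independent` are hypothetical helper names, and the elimination routine is a plain textbook Gaussian elimination rather than anything taken from the source.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 0, 1, 2, 0],
     [0, 1, 1, 2, 0],
     [0, 1, 1, 2, 1]]

def is_independent(cols):
    """True if the columns of A indexed by `cols` (1-based) are independent."""
    sub = [[row[j - 1] for j in sorted(cols)] for row in A]
    return rank(sub) == len(cols)

print(is_independent({1, 2}))     # True
print(is_independent({1, 2, 5}))  # True
print(is_independent({3, 4}))     # False
print(is_independent({1, 2, 3}))  # False
print(is_independent({3, 1}))     # True: exchange of column 1 into {3}
```

The last line illustrates the exchange property with the larger set {1, 2, 5} and the smaller set {3}: column 1 can be added to {3} while keeping independence.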
1.3. Graphic Matroids. A graphic matroid is defined on a graph G = (V, E), where
the ground set is the set of edges E of the graph. A set of edges X ⊆ E is said
to be independent if it contains no cycles (i.e., it is a forest). In this case,
the first two properties follow easily. The last property requires some
combinatorial analysis.
Theorem 2. Given two forests F_1, F_2 defined on the same vertex set such that
|F_1| > |F_2|, there exists an edge e ∈ F_1 \ F_2 such that F_2 ∪ {e} is still a forest.
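First, we need the following useful lemma.
Lemma 1. If F is a forest, then the number of connected components of (V, F) is
given by κ(V, F) = |V| − |F|.
This lemma can be easily proved by induction on the size of F. I will skip this
proof. From the lemma and |F_1| > |F_2| we have that |V| − |F_1| < |V| − |F_2|,
so (V, F_1) has fewer components than (V, F_2). If every edge of F_1 had both
endpoints inside a single component of (V, F_2), then each component of (V, F_1)
would lie inside a component of (V, F_2), and (V, F_1) would have at least as
many components. Therefore, there must be an edge e ∈ F_1 \ F_2 that connects two
distinct components of (V, F_2); adding it to F_2 creates no cycle and yields a
larger forest.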
If the original graph is connected, bases correspond to spanning trees. If the
original graph is disconnected, a base corresponds to taking a spanning tree of
each connected component.
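The independence test for a graphic matroid (is this edge set a forest?) and the exchange step of Theorem 2 can be sketched with a union-find structure. This is an illustrative sketch under assumptions: vertices are numbered 0..n−1, edges are (u, v) tuples, and the function names are hypothetical.

```python
def is_forest(num_vertices, edges):
    """True if the edge list contains no cycle (vertices are 0..num_vertices-1)."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # u and v already connected: this edge closes a cycle
        parent[ru] = rv   # merge the two components
    return True

def exchange_edge(num_vertices, f1, f2):
    """Return an edge of f1 \\ f2 that keeps f2 a forest when added, or None."""
    for e in f1:
        if e not in f2 and is_forest(num_vertices, list(f2) + [e]):
            return e
    return None

F1 = [(0, 1), (1, 2), (2, 3)]  # a path on 4 vertices, 1 component
F2 = [(0, 1), (0, 2)]          # 2 components: {0, 1, 2} and {3}
print(exchange_edge(4, F1, F2))  # (2, 3)
```

Here |V| − |F1| = 1 and |V| − |F2| = 2, matching the component counts of Lemma 1, and the edge (2, 3) is the one that joins two distinct components of (V, F2).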
2. Matroid Optimization
Given a matroid M = (E, I) and a weight function w : E → R, we are interested
in finding an independent set S ∈ I of maximum cost w(S) = Σ_{e∈S} w(e). If all
w(e) ≥ 0, the problem is equivalent to finding a maximum cost base. If there is
an e ∈ E with w(e) < 0, then no maximum cost independent set will contain it,
since by the second property (I2) removing e leaves an independent set of
strictly larger cost. Therefore, we can eliminate all elements of negative cost.
The greedy algorithm does exactly what one would expect it to do: it picks
at every stage the largest element that can be added to our current set without
altering independence. So, if we start with an independent set and at every step
we don’t alter the independence we will end up with an independent set.
The steps are as follows.
• Sort the elements and renumber them such that w(e_1) ≥ w(e_2) ≥ ... ≥
w(e_|E|).
• Define S = ∅.
• For i = 1 to |E|, look at the i-th heaviest element, e_i. Check if
S ∪ {e_i} ∈ I. If so, add e_i to S. Otherwise, move on.
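The steps above can be sketched as follows, assuming the matroid is given through an independence oracle. The names `greedy_max_weight`, `weights`, and `is_independent` are illustrative choices, and dropping negative-weight elements inside the loop is one way to implement the elimination step described earlier.

```python
def greedy_max_weight(ground, weights, is_independent):
    """Greedy algorithm over a matroid given by an independence oracle.

    ground         -- iterable of elements
    weights        -- dict mapping each element to a number
    is_independent -- function taking a set and returning True/False
    """
    chosen = set()
    # Consider elements from heaviest to lightest.
    for e in sorted(ground, key=weights.get, reverse=True):
        if weights[e] < 0:
            break  # the remaining elements are negative and are discarded
        if is_independent(chosen | {e}):
            chosen.add(e)
    return chosen

# Example: the uniform matroid with k = 2 (independent means at most 2 elements).
w = {"a": 5, "b": 1, "c": 4, "d": -2}
best = greedy_max_weight(w.keys(), w, lambda s: len(s) <= 2)
print(sorted(best))  # ['a', 'c']
```

The oracle is called once per element, so the running time is one sort plus |E| independence checks; how expensive each check is depends on the matroid.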
Theorem 3. The greedy algorithm defined above always returns a maximum cost
independent set.
Proof. We can assume without loss of generality that all elements of the matroid
have positive weights. Why? Define a new matroid on the ground set
E' = {e ∈ E : w(e) > 0}, whose independent sets are the members of I contained
in E'. This still satisfies the definition of a matroid.
Suppose that the algorithm returns a base B = {b_1, b_2, ..., b_n}, where
w(b_1) ≥ w(b_2) ≥ ... ≥ w(b_n). For the sake of contradiction, suppose that there
is another base C = {c_1, c_2, ..., c_n}, with w(c_1) ≥ w(c_2) ≥ ... ≥ w(c_n),
such that w(C) > w(B). Notice that the bases must have the same number of
elements by Theorem 1. The weight condition says that
w(c_1) + w(c_2) + ... + w(c_n) > w(b_1) + w(b_2) + ... + w(b_n)
Let j be the first index such that w(c_j) > w(b_j). (Why must there exist one?
Otherwise w(c_k) ≤ w(b_k) for every k, and summing gives w(C) ≤ w(B).)
Consider the following sets: B_{j−1} = {b_1, b_2, ..., b_{j−1}} and
C_j = {c_1, c_2, ..., c_j}. Naturally, these sets are independent because they
are subsets of B and C respectively. Moreover, |C_j| > |B_{j−1}|. Therefore, by
the exchange property there exists an index i ≤ j such that c_i ∉ B_{j−1} and
B_{j−1} ∪ {c_i} is independent. Moreover, we know that w(c_i) ≥ w(c_j) > w(b_j),
so w(B_{j−1} ∪ {c_i}) > w(B_j) and the algorithm considered c_i before b_j. At
that point its current set was a subset of B_{j−1}, so by (I2) adding c_i would
have kept it independent. So the algorithm lied to us and should have chosen c_i
instead of b_j. This is a contradiction.
I will present but not prove a stronger theorem:
Theorem 4. A non-empty collection I of subsets of E is the set of independent
sets of a matroid, if and only if, the following conditions hold:
• If X ∈ I and Y ⊆ X, then Y ∈ I.
• For all non-negative weight functions w : E → R, the greedy algorithm
selects a member A ∈ I such that w(A) ≥ w(B) for all B ∈ I.
3. Applications
3.1. Relation to Kruskal's Algorithm. As we have shown above, one can define a
graphic matroid on a graph. Moreover, if the original graph is connected and
weighted, the notion of a maximum cost independent set corresponds to a maximum
spanning tree. Thus, our greedy algorithm provides a way to get a maximum
spanning tree. Moreover, the proof above can be easily modified to show that the
greedy algorithm can also find the minimum cost base. This is the explanation for
Kruskal's algorithm.
In fact, to find MSTs one can simply apply the following change of weights:
w'_i = w_max − w_i. We are therefore making the lightest element the first to be
considered, and the heaviest the last. A spanning tree T has n − 1 edges, so its
cost under the new weights is (n − 1) w_max − Σ_{i∈T} w_i. So the tree with the
maximum weight in this case will have the smallest 'actual' cost.
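The change of weights can be tried out on a small graph. The sketch below runs the greedy algorithm on the graphic matroid with the flipped weights and maps the result back; the function names and the (weight, u, v) edge format are assumptions made for this example, and the cycle test uses a simple union-find.

```python
def max_weight_spanning_forest(num_vertices, weighted_edges):
    """Greedy on the graphic matroid; edges are (weight, u, v) tuples."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    forest = []
    for w, u, v in sorted(weighted_edges, reverse=True):  # heaviest first
        ru, rv = find(u), find(v)
        if ru != rv:          # adding the edge keeps the set a forest
            parent[ru] = rv
            forest.append((w, u, v))
    return forest

def min_spanning_tree(num_vertices, weighted_edges):
    """Minimum spanning tree via the weights w' = w_max - w."""
    w_max = max(w for w, _, _ in weighted_edges)
    flipped = [(w_max - w, u, v) for w, u, v in weighted_edges]
    tree = max_weight_spanning_forest(num_vertices, flipped)
    return [(w_max - w, u, v) for w, u, v in tree]  # restore original weights

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
mst = min_spanning_tree(4, edges)
print(sum(w for w, _, _ in mst))  # 7
```

In this example w_max = 4 and n = 4, so the chosen tree has flipped cost 3 · 4 − 7 = 5, which agrees with the formula (n − 1) w_max − Σ w_i.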
3.2. Unit time scheduling. The problem of scheduling unit time tasks with deadlines and penalties for a single processor can also be solved with the greedy algorithm. The problem is as follows:
• A set S = {a_1, a_2, ..., a_n} of n unit tasks.
• A set of n integer deadlines d_1, ..., d_n such that d_i satisfies 1 ≤ d_i ≤ n
and task a_i is supposed to finish by time d_i.
• A set of nonnegative weights or penalties w_1, ..., w_n, such that we incur a
penalty of w_i if task a_i is not finished by time d_i, and we incur no penalty
if a task finishes by its deadline.
We want to find a schedule for S that minimizes the total penalty incurred. We
can define a matroid M in the following way. Let the ground set E be the set of
tasks S. We say that a set of tasks A ⊆ S is independent if there exists a
schedule for these tasks such that no tasks are late.
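A possible sketch of how this definition plugs into the greedy algorithm is given below. It relies on two assumptions not spelled out above: a set of unit tasks can be scheduled with none late exactly when, after sorting by deadline, the k-th task has deadline at least k; and minimizing the total penalty of late tasks is the same as maximizing the total penalty of on-time tasks. The function names and the example data are illustrative.

```python
def can_schedule_on_time(deadlines):
    """True if unit tasks with these deadlines can all finish on time.

    Running tasks in order of increasing deadline, the k-th task (1-based)
    finishes at time k, so it needs deadline >= k.
    """
    return all(d >= k for k, d in enumerate(sorted(deadlines), start=1))

def min_penalty_schedule(tasks):
    """tasks: list of (deadline, penalty). Returns (on-time indices, penalty paid)."""
    # Greedy: consider tasks from largest penalty to smallest.
    order = sorted(range(len(tasks)), key=lambda i: tasks[i][1], reverse=True)
    on_time = []
    for i in order:
        if can_schedule_on_time([tasks[j][0] for j in on_time + [i]]):
            on_time.append(i)
    paid = sum(p for i, (_, p) in enumerate(tasks) if i not in on_time)
    return sorted(on_time), paid

# Seven tasks as (deadline, penalty) pairs; indices are 0-based.
tasks = [(4, 70), (2, 60), (4, 50), (3, 40), (1, 30), (4, 20), (6, 10)]
print(min_penalty_schedule(tasks))  # ([0, 1, 2, 3, 6], 50)
```

In this example the tasks with penalties 30 and 20 are the ones left late, so the total penalty is 50.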