University of Melbourne
Department of Mathematics and Statistics

Column Generation Methods for
Planning Sustainable Vegetable
Crop Rotations

Author:
Nicholas Leong

Supervisors:
Alysson Costa
Gerardo Berbeglia

A thesis submitted for the degree of
Master of Science (Mathematics and Statistics)

October 2015
Abstract
There are many concerns about the environmental impacts of current industrial
agricultural practices. Crop rotation is a common method for increasing the
sustainability of modern farming. This thesis investigates methods for planning
vegetable crop rotations that respect a number of biological sustainability
constraints. We present two approaches: first, a conventional column generation
approach and, second, a new constraint programming based column generation
approach. Heuristics to capture potential practical restrictions are also presented.
All methods are evaluated on test instances based on real-world data.
Contents

1 Introduction
  1.1 Crop Rotations
  1.2 Sustainable Vegetable Crop Rotations
  1.3 Problem Specification

2 Theoretical Background
  2.1 Linear Programming
    2.1.1 Column Generation
  2.2 Mixed Integer Programming
    2.2.1 Branch & Bound
    2.2.2 Improvements
  2.3 Constraint Programming
    2.3.1 Defining a Constraint Satisfaction Problem
    2.3.2 Constraint Propagation Algorithms
    2.3.3 The Search Procedure
    2.3.4 Interval and Sequence Variables
    2.3.5 Constraint Programming Based Column Generation

3 Mixed Integer Program Formulation
  3.1 Crop Rotation Schedule
  3.2 Supplying Demand
  3.3 Column Generation

4 Constraint Programming Formulation
  4.1 Pricing Problem Formulation
  4.2 Constraint Programming Based Column Generation

5 Plot Reduction Heuristics
  5.1 Eliminating Small Plots
  5.2 Minimising Number of Plots

6 Computation Tests and Results
  6.1 Mixed Integer Programming
  6.2 Constraint Programming Results
    6.2.1 Optimising the Pricing Problem with Constraint Programming
    6.2.2 Optimising Pricing Problem Adding Intermediate Solutions
    6.2.3 Adding Other Sets of Columns in Each Iteration
    6.2.4 Introducing Diversity in Columns
    6.2.5 Hybrid Approaches
  6.3 Plot Reduction Heuristics
    6.3.1 Eliminating Small Plots
    6.3.2 Minimising Number of Plots

7 Conclusions and Future Work

A Crop Data
B MIP Results
C CP Optimisation Results
D Pricing Problem Solve Times when Optimising
E CP Multiple Column Results
F Results of CP Multiple with Restart Search
Chapter 1
Introduction
Current industrial agricultural practices have considerable detrimental effects on the
environment, such as air and water pollution, soil depletion and diminishing
biodiversity. Agriculture is directly responsible for 20% of human-generated
greenhouse gas emissions (Horrigan et al., 2002). As a result of these negative
environmental impacts and increasingly scarce resources, there is a growing
requirement for agricultural decisions to be based on technically well-grounded
evidence (Carravilla et al., 2013).
1.1 Crop Rotations
Crop rotations are often identified as one way of increasing the sustainability of
modern farming practices (Horrigan et al., 2002; McBratney et al., 2005). A crop
rotation is the specification of a sequence of crops to be grown in the same parcel of
land in successive periods (Hildreth and Reither, 1951). There are many advantages
to using a crop rotation planning strategy, including: breaking the dominance of
weeds and pathogens (Marcroft et al., 2004), improved soil structure (Hamza and
Anderson, 2005), reduced erosion, prevention of groundwater pollution (Leteinturier
et al., 2006) and contributions to biodiversity (Albrecht, 2003). Together, these
benefits reduce the use of fertilizers and agrotoxics. Motivated by these benefits,
there are a number of desirable attributes that a crop rotation should possess,
including: a combination of deep-rooted and shallow-rooted crops, crops with a high
level of ground cover, a combination of water-demanding crops and those that
require less water, crops that leave a large amount of plant residue after harvest,
and nitrogen-fixing legumes (Mudgal et al., 2010).
Operations Research (OR), particularly linear programming, has been used to
support agricultural planning decisions for many years; indeed, agriculture is one of
the fields in which linear programming was first employed. Work in the area is
ongoing: Bjørndal et al. (2012) provide a survey of OR applications in the natural
resources industry.
Determining the best combination of crops is identified by Heady (1954) as an
application of OR techniques. Optimal distribution of arable land is also addressed
by Kantorovich (1960) as an application of “Mathematical Methods”, where crops
are assigned to plots of land to achieve maximum output. There is also a brief mention of using crop rotations to extend the planning to multiple year periods. Hildreth
and Reither (1951) present a model for crop planning based on pre-determined crop
rotations. The reliance on pre-defined crop rotations is a significant drawback when
the number of crops increases and it becomes infeasible to enumerate all possible
rotations. El-Nazer and McCarl (1986) removed the reliance on predefined crop
sequences by assuming the yield of a crop depends on the previous three crops that
were cultivated in the same plot. Haneveld and Stegeman (2005) focus on crop
sequences that are inadmissible; from this set of inadmissible sequences, the set of
admissible sequences and the feasible ways these sequences can be combined are
determined. Detlefsen and Jensen (2007) approach the crop scheduling problem
using network flows: a network interpretation of the problem is presented and
solved using associated network algorithms, with arcs corresponding to infeasible
crop sequences explicitly removed.
Besides trying to find the best crop rotation through optimisation, a number of
tools have been developed to aid decision makers in choosing crop rotations.
Dogliotti et al. (2003) have developed a tool called ROTAT, for generating all feasible crop rotations under a set of agronomic criteria. Jones et al. (2003) present
DSSAT, a software system for managing agricultural information to aid decision
making, which allows crop rotations to be evaluated. Stöckle et al. (2003) present
CropSyst, a cropping system simulation model which captures climate, soil, water and nitrogen considerations and allows crop rotation to be assessed in a wider
context. Bachinger and Zander (2007) have developed a tool called ROTOR for
determining and evaluating crop rotations in the context of organic farming, where
no nitrogen fertiliser or synthetic pesticides are used. More recently there has been
research relating specifically to planning vegetable crop rotations.
1.2 Sustainable Vegetable Crop Rotations
Santos et al. (2008) introduce botanical constraints that are specific to vegetable
crop rotations. Vegetable crops differ from traditional seasonal crops in that:
• they usually have short and variable production times,
• they have specific allowable planting times,
• they belong to a wide range of botanical families.
In Santos et al. (2008), a fixed configuration of plots is assumed and a binary
linear optimisation model is presented to determine crop rotations which maximise
the occupation of plots. Biological constraints restrict the succession of crops of
the same botanical family, both in the same plot and in neighbouring plots. Each
rotation must include a period of fallow and green manure crops. A heuristic based
on column generation is proposed and shown to perform very well on test instances
(column generation is explained in Section 2.1.1). On all test instances with more
than four plots, the heuristic found better solutions than exact methods within the
time limit of one hour. As the number of plots increased, exact methods struggled
to find feasible solutions whilst the heuristic still performed well.
Santos et al. (2010) consider a vegetable crop production problem to determine
planting schedules that meet demand conditions, subject to the botanical constraints
detailed above. Heterogeneous land characteristics due to climatic variation mean
that allowable planting periods and crop yields can vary between locations. A
linear formulation is presented and solved using column generation to deal with the
exponential number of variables corresponding to each feasible cropping schedule.
Post-optimisation heuristics are proposed to deal with the potentially large number
of plots with very small areas present in the final solution. Very small plots are
undesirable as they are infeasible in practice. The heuristics presented provide
significant reductions in the number of plots from the optimal solution while making
insignificant changes to the amount of demand that is met. Computational tests
on a range of test instances, based on real data, find that the solution method is
efficient.
Costa et al. (2014) extend the framework presented above to allow parts of the
current harvest to be stored in order to supply future demand. This accounts for the
limited lifespan of vegetables kept in storage and losses associated with the duration
of storage. This paper also introduces a framework to deal with uncertainties in
the demand market. A two-stage stochastic programming with recourse model is
presented in which the first stage decision variables are related to allocation of land
and the second stage decision variables are associated with assigning harvests to
inventory or demand fulfillment. The column generation approach developed by
Santos et al. (2008) was extended and found to perform well, even under the added
complexity. Results show that stocking reduces the amount of unmet demand and
increases the possibility of extra production. It was determined that the stochastic
component of the model provided benefit under uncertainty.
Santos et al. (2015) present a branch-price-and-cut method to enforce cropping
areas to be multiples of a minimum cropping size. This represents the reality for
farmers, where it is impractical to cultivate a large number of very small plots.
Enforcing a minimal cropping size means that the underlying formulation is an
integer program which is considerably harder to solve. As such, the solution method
is enhanced with valid subadditive inequalities, two primal heuristics and a strong
branching rule. This combination of tools results in a robust and efficient method
capable of solving real-world sized instances.
1.3 Problem Specification
This thesis addresses an adaptation of the problem presented by Santos et al.
(2010). We aim to determine crop rotations which will supply a known demand for
a set of vegetable crops. The planning horizon is one year and demand is specified
for each week in the planning period. A set of crops is given and for each crop the
following characteristics are available:
• total production time,
• time from planting until first harvest,
• the harvest obtained in each period after the first period of harvest,
• its associated botanical family,
• earliest and latest allowable planting times.
Details of the crops included in this study are presented in Appendix A.
There are three biological constraints which each crop rotation must follow:
1. Two crops from the same botanical family cannot be cultivated successively in
the same plot of land. This is motivated by reducing the dominance of weeds
and pathogens which usually thrive on one botanical family (Marcroft et al.,
2004).
2. Each crop rotation must include the cultivation of a green manure (Mudgal
et al., 2010). This is a nitrogenous crop grown to replenish the soil, without
an associated demand.
3. Each crop rotation must include a period of fallow (Tilman et al., 2002).
The following chapters discuss solution methods to determine optimal crop rotation plans. In Chapter 2 we present a theoretical background of the concepts relevant
to this thesis. This includes linear programming, mixed integer programming and
constraint programming. Chapter 3 presents a mixed integer programming approach
to the problem using column generation. In Chapter 4 we present a constraint programming formulation for the pricing problem. Post-optimisation heuristics are
presented in Chapter 5 to eliminate small plots and reduce the number of plots. In
Chapter 6 we present the results of computational experiments. Finally, in Chapter
7 we draw conclusions and discuss possible directions of future research.
Chapter 2
Theoretical Background
In this chapter we provide a theoretical background to the concepts of linear programming, mixed integer programming and constraint programming. This provides the basis for the formulations of the crop planning problem presented in Chapters 3, 4 and 5.
2.1 Linear Programming
A linear program (LP) is an optimisation problem in which we wish to maximise or
minimise a linear function of decision variables subject to linear equality or
inequality constraints on those variables. An LP in standard form is one where all
constraints are equalities and all variables are non-negative. Any LP can easily be
converted into standard form with the inclusion of auxiliary variables. The
general form of a standard LP is:
    min   c^T x          (2.1)
    s.t.  Ax = b         (2.2)
          x ≥ 0          (2.3)
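As a concrete illustration, a standard-form LP of this shape can be solved directly with an off-the-shelf solver. The sketch below assumes SciPy is available; the problem data are arbitrary, chosen only for the example:

```python
import numpy as np
from scipy.optimize import linprog

# A small standard-form LP: min c^T x  subject to  Ax = b, x >= 0.
# The data below are invented purely for illustration.
c = np.array([1.0, 2.0, 0.0])
A = np.array([[1.0, 1.0, 1.0],
              [2.0, 1.0, 0.0]])
b = np.array([4.0, 5.0])

# Equality constraints and non-negativity bounds match the standard form above.
res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3, method="highs")
print(res.x, res.fun)
```

For this data the optimum is x = (2.5, 0, 1.5) with objective value 2.5, which can be verified by substituting 5x1 = 15 − ... into the objective by hand.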
Dantzig et al. (1955) presented the now well-known simplex algorithm, which is
efficient at solving LPs of considerable size. Since the introduction of simplex there
have been vast improvements, both algorithmically and in terms of computing power.
Bixby (2002) analyses the evolution of the efficiency of commercial solvers over the
two preceding decades and attributes three orders of magnitude of improvement in
solving power to computing hardware and a further three orders of magnitude to new
and improved algorithms, including the dual simplex algorithm, primal-dual
log-barrier methods and improvements in linear algebra, which also benefit the
simplex algorithm. Despite these improvements,
there are still problems that cannot be solved as simple LPs, especially when the
number of variables grows exponentially with the size of the problem. One method
for dealing with a huge number of variables is to use column generation.
2.1.1 Column Generation
The following is a brief explanation of column generation, summarised from Lübbecke
and Desrosiers (2005). We now consider a linear program of the following form,
called the Master Problem (MP):
    z* := min   Σ_{j∈J} c_j λ_j          (2.4)
          s.t.  Σ_{j∈J} a_j λ_j ≥ b      (2.5)
                λ_j ≥ 0 ,  j ∈ J         (2.6)
where J is the set of indices of all feasible columns and λ_j is the decision variable
associated with column j. At each iteration of the simplex algorithm we seek to
move a non-basic variable into the basis. If the vector of dual variables is given by
π, then we wish to find:

    arg min { c_j − π^T a_j  :  j ∈ J }          (2.7)
An issue arises when the number of columns in the master problem is very large
and therefore determining which variable should enter the basis is inefficient. The
idea of column generation is to deal with a Restricted Master Problem (RMP) in
which only a relatively small subset J′ of the full set of column indices J is
considered. Solving the RMP over this reduced set of columns means it remains
efficient to determine which variable should enter the basis. Assuming the
feasibility of the RMP, let λ̄ and π̄ be the optimal primal and dual solutions to the
RMP, respectively. The pricing problem is defined as finding the feasible column a_j
which is:

    arg min { c_j − π̄^T a_j  :  j ∈ J }          (2.8)
where c_j is the cost of including the column a_j in the solution to the RMP. The
pricing problem finds the most promising column. If the optimal objective value of
the pricing problem is non-negative, then no column has negative reduced cost, and
λ̄ is therefore an optimal solution to the MP. Otherwise, the column a_j found by
the pricing problem has negative reduced cost and should be added to the RMP. The
process is then iterated, re-solving the RMP and again looking for new promising
columns.
A classical example of column generation is the linear relaxation of the cutting
stock problem. We have n items, each of length l_i with associated demand d_i. These
items must be cut from rolls of length L. We wish to meet the demand for each
item using the smallest number of rolls possible. This is expressed by the following
linear program:
    min   Σ_{p∈P} λ_p                              (2.9)
    s.t.  Σ_{p∈P} b_{pi} λ_p ≥ d_i ,  i = 1, …, n  (2.10)
          λ_p ≥ 0 ,  p ∈ P                         (2.11)

where λ_p is the number of rolls that should be cut to pattern p, P is the set of
all possible patterns for cutting a single roll, and b_{pi} is the number of times
item i appears in pattern p.
The obvious issue here is that the size of the set P is potentially huge, since it
contains every feasible pattern. Column generation is used to deal with this issue.
An initial set of patterns is used to create a Restricted Master Problem. A naive,
but certainly feasible, initial set assigns to each item a pattern in which that item
is cut once from the roll. The RMP is solved on this initial set of columns to
obtain a dual variable for each constraint (π_i). An economic interpretation of a
dual variable is the price one would pay to decrease the right-hand side of a
constraint by one unit (for a minimisation problem). As such, we wish to find the
pattern which minimises the function 1 − Σ_{i=1}^n π_i b_i. The value 1 is the cost
of including a pattern in the objective function of the RMP, from which we subtract
the benefit, in the objective function of the RMP, across all items present in the
pattern. To determine which pattern should be added to the set P′, the following
pricing problem must be solved:
    min   1 − Σ_{i=1}^n π_i b_i              (2.12)
    s.t.  Σ_{i=1}^n l_i b_i ≤ L              (2.13)
          b_i ∈ Z_+ ,  i = 1, …, n           (2.14)
where constraint (2.13) ensures that all the items fit into one roll.
Whenever the optimal solution to the pricing problem has a negative objective
value, corresponding to a column with negative reduced cost, the pattern is added
to the RMP. The RMP is then re-solved to obtain new dual variables. This process
iterates until no pattern with negative reduced cost can be found, at which point
the optimal solution to the RMP is also the optimal solution to the MP.
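The full loop can be sketched for the cutting stock example. The implementation below is illustrative rather than taken from the thesis: the roll length, item lengths and demands are invented, the RMP is solved with SciPy's linprog (HiGHS, whose `ineqlin.marginals` field supplies the duals), and the pricing problem (2.12)–(2.14) is solved exactly by an unbounded-knapsack dynamic program:

```python
import numpy as np
from scipy.optimize import linprog

# Column generation for the LP relaxation of cutting stock.
# Illustrative data (not from the thesis): roll length L, item lengths l, demands d.
L, l, d = 10, np.array([3, 4, 5]), np.array([9, 7, 5])
n = len(l)

# Naive initial patterns: each item cut as many times as fits on one roll.
patterns = [np.eye(n, dtype=int)[i] * (L // l[i]) for i in range(n)]

def solve_rmp(patterns):
    """Solve the restricted master problem; return objective and duals pi."""
    B = np.array(patterns).T                      # column j is pattern j
    res = linprog(np.ones(len(patterns)), A_ub=-B, b_ub=-d,
                  bounds=[(0, None)] * len(patterns), method="highs")
    return res.fun, -res.ineqlin.marginals        # duals of B @ lam >= d

def price(pi):
    """Pricing: maximise sum pi_i * b_i subject to sum l_i b_i <= L, b integer."""
    best, choice = np.zeros(L + 1), {}
    for cap in range(1, L + 1):                   # unbounded knapsack DP
        for i in range(n):
            if l[i] <= cap and best[cap - l[i]] + pi[i] > best[cap]:
                best[cap], choice[cap] = best[cap - l[i]] + pi[i], i
    pattern, cap = np.zeros(n, dtype=int), L
    while cap in choice:                          # reconstruct the best pattern
        i = choice[cap]
        pattern[i] += 1
        cap -= l[i]
    return best[L], pattern

while True:
    obj, pi = solve_rmp(patterns)
    value, pattern = price(pi)
    if value <= 1 + 1e-9:       # reduced cost 1 - value is non-negative: optimal
        break
    patterns.append(pattern)    # add the negative-reduced-cost column

print(f"LP bound on number of rolls: {obj:.2f}")
```

For this data the loop converges to the LP bound 8.25 rolls, which can be certified by the dual-feasible prices π = (1/4, 1/2, 1/2) with 9·(1/4) + 7·(1/2) + 5·(1/2) = 8.25.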
2.2 Mixed Integer Programming
A mixed integer program (MIP) is a linear program with the additional constraint
that some of the variables take integer values. This is very often a requirement
when modelling real-world problems. It is important to note that the somewhat
intuitive idea of simply rounding the solution of the LP relaxation can lead to very
poor or infeasible integer solutions in many cases. Capturing the integrality of some
variables considerably increases the complexity of the problem.
2.2.1 Branch & Bound
Branch & bound is a general technique for finding solutions to constrained
optimisation problems by intelligently exploring the search space of all feasible
solutions. The feasible space is split into smaller subsets and a bound on the best
solution in each subset is calculated by some means. If the bound on a subset is
worse than a known feasible solution then no further exploration of that subset is
required. The subsets are further partitioned until no promising subsets remain
(Lawler and Wood, 1966).
The branch & bound paradigm can be applied to solving general MIPs; this was
first presented by Land and Doig (1960). In this case we solve the linear relaxation
of the MIP, ignoring the integrality constraints, to find a bound on the solution of
the MIP.
If the solution to the LP relaxation has all integer components, then it is feasible
for the MIP and is therefore optimal. Otherwise, we choose one of the non-integer
components, x_i, with current value x̄_i, and create two subproblems by adding the
branching constraints x_i ≤ ⌊x̄_i⌋ and x_i ≥ ⌈x̄_i⌉. We solve the linear relaxation
of each subproblem and find ourselves in one of the following situations:
• the subproblem is infeasible;
• the subproblem has an integer optimal solution: we compare this solution to
the best feasible solution so far (the incumbent solution) and update the
incumbent if necessary;
• the subproblem has an optimal solution, not necessarily integer, that is worse
than the incumbent: this branch of the search is bounded by that value and
there is no benefit in pursuing it further;
• the subproblem has a non-integer optimal solution that is better than the
incumbent: we choose one non-integer variable to branch on and create two
more subproblems.
This process continues until all branches have been explored. We then know that
the incumbent solution is the optimal solution to the MIP.
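This procedure can be sketched in a few lines for a pure integer program. The example below is illustrative only: it reuses the constraints 7x1 + 8x2 ≤ 28 and 5x1 + 3x2 ≤ 15 that appear in the cutting-plane example later in this section, with an invented objective max 5x1 + 4x2 (written as a minimisation of the negated objective), and assumes SciPy is available. Each node solves the LP relaxation with linprog and branches on the first fractional variable:

```python
import math
import numpy as np
from scipy.optimize import linprog

# Illustrative integer program: max 5x1 + 4x2
# s.t. 7x1 + 8x2 <= 28, 5x1 + 3x2 <= 15, x1, x2 non-negative integers.
c = np.array([-5.0, -4.0])                  # negate for minimisation
A = np.array([[7.0, 8.0], [5.0, 3.0]])
b = np.array([28.0, 15.0])

best_val, best_x = math.inf, None           # incumbent objective and solution

def branch_and_bound(bounds):
    global best_val, best_x
    res = linprog(c, A_ub=A, b_ub=b, bounds=bounds, method="highs")
    if not res.success or res.fun >= best_val - 1e-9:
        return                              # infeasible, or bound no better than incumbent
    frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
    if not frac:                            # integral LP solution: new incumbent
        best_val, best_x = res.fun, np.round(res.x)
        return
    i, v = frac[0], res.x[frac[0]]          # branch on the first fractional variable
    lo, hi = bounds[i]
    branch_and_bound(bounds[:i] + [(lo, math.floor(v))] + bounds[i + 1:])
    branch_and_bound(bounds[:i] + [(math.ceil(v), hi)] + bounds[i + 1:])

branch_and_bound([(0, None), (0, None)])
print(best_x, -best_val)
```

Here the LP relaxation has the fractional optimum (36/19, 35/19); branching eventually proves that (3, 0), with value 15, is the integer optimum.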
2.2.2 Improvements
Current commercial solvers still use the branch & bound algorithm; however, they
also implement a variety of algorithmic improvements on top of the basic idea.
Since the first implementations of MIP solvers there have been huge improvements
in the complexity of problems which can be solved. This is due both to hardware
improvements and to progress in the theory of solving MIPs, as stated above (Bixby,
2002). The main improvements are outlined below (Lodi, 2010).
Presolve and Constraint Strengthening
In presolve, solvers try to make changes to the input of the problem that result in
improved performance, without altering the set of optimal solutions. One method
is to detect and remove redundant constraints and variables from the model. A
second method is to increase the strength of constraints. It is possible that bounds
on variables are excessively large and could be tightened. There are also much more
complex techniques which attempt to detect sub-structure in the problem and then
make changes to capitalise on this sub-structure (Savelsbergh, 1994).
Cutting Plane Generation
Cutting plane generation is the addition of constraints to the model which do not
remove any integer solutions, but do remove solutions which are feasible for the
linear relaxation. There are many ways these cuts can be generated, including
Chvátal-Gomory cuts, Gomory mixed-integer cuts, rounding cuts and split cuts
(Cornuéjols, 2008). Given the large number of cuts it is possible to create, the real
challenge for the solver is choosing how many and which cuts should be generated
to increase solve speed. The benefit of including cuts is that the optimal solution
to the linear
relaxation will be closer to integer solutions, providing better bounds. In the ideal
case, the solution to the LP relaxation will be an integer point and we have solved
the original problem.

[Figure 2.1: Example of a cutting plane. The constraints are drawn as solid lines,
the feasible integer points as black dots, and the cut as a dashed line.]
For example, consider a problem with two variables, x1 and x2, and two constraints,
7x1 + 8x2 ≤ 28 and 5x1 + 3x2 ≤ 15, where both x1 and x2 must take integer
values. The constraints are shown in Figure 2.1 by solid lines and the feasible
integer points by black dots. We note that including the constraint x1 + x2 ≤ 3,
shown by the dashed line, does not remove any feasible integer points, but does
remove some points which would be feasible for the linear relaxation. This new
constraint is a cutting plane and its addition to the integer program will improve
the formulation, leading to more efficient solving. This idea generalises to higher
dimensions, where finding a cutting plane is more challenging.
Branching Strategies
When conducting the search in a branch & bound framework there are two main
decisions to be made: which subproblem to deal with next (node selection) and
which non-integer variable to branch upon (variable selection). The two ends of
the spectrum for node selection are best-bound, where the node which has the
best LP relaxation solution is chosen and depth-first, where one continues down
a single branch until the possibilities are exhausted. Modern techniques use some
combination of these two approaches. For variable selection, the classical choice was
to choose the most fractional variable, i.e. the variable with fractional part closest
to 0.5. However, this has been empirically shown by Achterberg et al. (2005) to be
no better than selecting a variable randomly. A number of techniques tackle
variable selection, including strong branching, pseudocost branching and
reliability branching.
Primal Heuristics
The goal of using heuristics in MIP is to find “good” feasible integer solutions
quickly. This means that if the solve is terminated before optimality can be proved,
we still have a good solution. Moreover, in the branch & bound context, having a
good feasible solution means that parts of the tree may be disregarded due to their
bounds.
Some commonly implemented routines are rounding and diving heuristics. Rounding
heuristics seek to round variables which are close to integer in order to achieve a
feasible solution. Diving heuristics generally round a single variable then re-solve the
LP relaxation. This process is iterated until a feasible solution is achieved (Fischetti
et al., 2005).
2.3 Constraint Programming
Constraint Programming (CP) is a framework for solving combinatorial problems.
A problem is modelled as a Constraint Satisfaction Problem (CSP), which consists
of a set of variables restricted by a set of constraints. Once a problem is modelled
as a CSP, we solve it using algorithms that reduce and explore the search space.
In recent years constraint programming has been used to solve real-world problems
in many areas, including project scheduling (Berthold et al., 2010), train scheduling
(Rodriguez, 2007), bin packing (Pisinger and Sigurd, 2007) and employee scheduling
(Demassey et al., 2005). The following explanation of CP is based on Van Hoeve
et al. (2006) and Berbeglia (2009).
2.3.1 Defining a Constraint Satisfaction Problem
A CSP is defined by a triple P = <X, D, C>, where X = (x_1, . . . , x_n) is a tuple
of n variables, D = (D_1, . . . , D_n) is a tuple of n domains (such that x_i ∈ D_i for
all i ∈ {1, . . . , n}) and C = {C_1, . . . , C_m} is a set of m constraints. A constraint
C_i is defined on a subset of the variables and describes the allowable combinations
of values they may take. The variables here are usually restricted to binary or
integer values.
A simple example of a CSP is the graph colouring problem. Suppose we wish
to colour the eight states and territories of Australia with four colours such that
no two states or territories with a common border share the same colour. We
define a variable for each state, X = (x_wa, x_nt, x_sa, x_que, x_nsw, x_act, x_vic, x_tas),
all with domain {1, . . . , 4}. Our set of constraints prevents any two adjacent states
from having the same colour. For example, Western Australia is adjacent to the
Northern Territory and South Australia, giving the two constraints:
• x_wa ≠ x_nt
• x_wa ≠ x_sa
The full formulation would include constraints for the adjacencies of the remaining
states and territories.
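A complete colouring can be found with a simple backtracking search. The sketch below is a minimal illustration in plain Python rather than a CP solver, using the standard adjacencies for Australian states and territories:

```python
# Backtracking search for the Australia map-colouring CSP described above.
states = ["wa", "nt", "sa", "que", "nsw", "act", "vic", "tas"]
adjacent = {("wa", "nt"), ("wa", "sa"), ("nt", "sa"), ("nt", "que"),
            ("sa", "que"), ("sa", "nsw"), ("sa", "vic"),
            ("que", "nsw"), ("nsw", "vic"), ("nsw", "act")}

def consistent(assignment, state, colour):
    """Check that giving `state` this colour violates no adjacency constraint."""
    for a, b in adjacent:
        if a == state and assignment.get(b) == colour:
            return False
        if b == state and assignment.get(a) == colour:
            return False
    return True

def solve(assignment):
    """Extend a partial assignment to a full one, backtracking on dead ends."""
    if len(assignment) == len(states):
        return assignment
    state = next(s for s in states if s not in assignment)
    for colour in range(1, 5):                    # domain {1, ..., 4}
        if consistent(assignment, state, colour):
            result = solve({**assignment, state: colour})
            if result is not None:
                return result
    return None                                   # dead end: backtrack

solution = solve({})
print(solution)
```

A real CP solver would interleave this search with constraint propagation, as described in the next subsection, rather than checking constraints only at assignment time.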
2.3.2 Constraint Propagation Algorithms
In order to solve a CSP we use algorithms to reduce the search space. These
algorithms are called constraint propagation (or domain filtering) algorithms, and
they determine the effect that the domain of one variable has on the domain of
another variable through a constraint.
For example, consider two variables X = (x_1, x_2) with domains D_1 = {1, 2, 3}
and D_2 = {2, 3} respectively, and the constraint x_1 ≥ x_2. A constraint
propagation algorithm may determine that there is no value of x_2 consistent with
x_1 = 1; it will then remove 1 from the domain of x_1, updating D_1 to {2, 3}.
This reduces the search space while maintaining all feasible solutions.
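A toy version of such a propagator, under the assumption that neither domain is emptied, might look like:

```python
# A toy bounds propagator for the constraint x1 >= x2 from the example above:
# drop from D1 every value below min(D2), then from D2 every value above max(D1).
# (Assumes neither domain becomes empty; a real solver would report infeasibility.)
def propagate_ge(d1, d2):
    d1 = {v for v in d1 if v >= min(d2)}
    d2 = {v for v in d2 if v <= max(d1)}
    return d1, d2

d1, d2 = propagate_ge({1, 2, 3}, {2, 3})
print(d1, d2)   # D1 is filtered to {2, 3}; D2 is unchanged
```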
There are two classes of constraints in CP: constraints that relate a fixed number
of variables and constraints that relate a non-fixed number of variables. A constraint
that affects only one variable is called unary (e.g. x_1 > 3), and a constraint that
affects two variables is called binary (e.g. x_1 + x_2 > 7); both are constraints of
the first class. Constraints of the second class are called global constraints and are
much more useful. One example of a global constraint is allDifferent(x_1, . . . , x_n),
which specifies that the variables {x_1, . . . , x_n} must take pairwise different values.
The effect of a global constraint can always be achieved by using a number of simpler
constraints; however, it is usually more efficient to use global constraints, since
specific filtering algorithms have been developed for them. These may allow the
elimination of more values from variable domains than would otherwise be possible.
There are many global constraints that have been developed
for different purposes. A catalog of global constraints is presented in Beldiceanu
et al. (2005).
After the execution of constraint propagation algorithms, there are three
possibilities:
• the domain of some variable is reduced to the empty set (D_i = ∅), meaning
the CSP is infeasible;
• the domain of every variable is reduced to a single value, giving a feasible
solution;
• the domain of every variable is non-empty and at least one variable has a
domain with more than one value.
In the last case, further steps are required to find a feasible solution; this is where
a search procedure is needed.
2.3.3 The Search Procedure
During the search procedure a CSP, P, is split into two new CSPs such that the
set of solutions to the original problem is exactly the union of the sets of solutions
of the two new problems. This is achieved by selecting a variable x_1 which has at
least two values in its domain and a value v_1 in that domain, then splitting the
CSP into two subproblems: one which is P with the added constraint x_1 = v_1,
and a second with the added constraint x_1 ≠ v_1.
There are various variable selection criteria to determine which variable to branch
on, and value selection criteria to determine which value in its domain to try first.
These criteria can be tailored to the specific problem to improve solving efficiency.
Once we have split the problem we have two CSPs which must be solved. Each
one may be split into further subproblems, so we obtain a tree of CSPs. At each
node we can apply constraint filtering to reduce the domains and prune the search
tree. The way that we explore the tree can have a significant impact on how long
it takes to solve the CSP, and different search procedures are useful for different
problems.
Optimisation
The primary goal of constraint programming is to find feasible solutions. In order
to solve an optimisation problem with CP we look for a chain of feasible solutions
where there is successive improvement in the objective value. Initially, we look for
any feasible solution. We then include the constraint that the next solution must
have objective value strictly better than the current solution (strictly smaller for a
minimisation problem, strictly greater for a maximisation problem). When it is not
possible to find a feasible solution with better objective value, we conclude that the
current solution is optimal.
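The improvement loop can be sketched as follows; `find_feasible` is a hypothetical stand-in for any CSP solver that accepts a strict upper bound on the objective (for minimisation), and the toy oracle below only illustrates the control flow.

```python
# Optimisation as a chain of feasibility problems: each call must strictly beat
# the incumbent's objective; when no such solution exists, the incumbent is
# optimal.

def minimise(find_feasible, objective):
    incumbent = find_feasible(bound=None)          # any feasible solution
    if incumbent is None:
        return None                                # the CSP itself is infeasible
    while True:
        better = find_feasible(bound=objective(incumbent))
        if better is None:
            return incumbent                       # proof of optimality
        incumbent = better

# Toy feasibility oracle: x in {0, ..., 9} with x >= 3, objective value x.
def find_feasible(bound):
    for x in range(10):
        if x >= 3 and (bound is None or x < bound):
            return x
    return None
```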
Finding Multiple Solutions
Within constraint programming it is possible to find multiple solutions in a single
search. In order to find a feasible solution, a value must be assigned to each variable.
This process is called labelling. We move down the search tree fixing variables to
values, building a feasible solution. Once a solution is found, the last variable to
be assigned a value is unlabelled and we backtrack up the search tree until there is
another variable which can be labelled or a different value to assign to the current
variable. We then move down that branch to find a new feasible solution. This
process can be repeated many times to find many feasible solutions.
To illustrate this process, consider a CSP with three binary variables (a, b, c).
Figure 2.2 illustrates the process of finding two feasible solutions. Once we have
found the first feasible solution (0, 1, 1) we take one step back up the tree then
assign c to the new value 0.
The potential downside is that these multiple solutions may have very similar
structure. The variables a and b have no chance of changing in this scenario and
we have found two feasible solutions that are very similar. If there were a larger
number of variables it is easy to anticipate how the majority of the solution would
remain unchanged when we move to the next solution.
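The labelling-and-backtracking process can be sketched on a three-variable example. This enumerator is a toy illustration (depth-first labelling in a fixed variable order), not the solver used here; the names are invented.

```python
# Depth-first labelling over binary variables: values are assigned left to
# right; exhausting a branch is the implicit backtracking step.

def all_solutions(n_vars, feasible):
    solutions = []

    def label(prefix):
        if len(prefix) == n_vars:
            if feasible(prefix):
                solutions.append(prefix)
            return
        for value in (0, 1):        # try 0 first, then backtrack and try 1
            label(prefix + (value,))

    label(())
    return solutions
```

With the constraint a + b + c = 2, the search yields (0, 1, 1), (1, 0, 1), (1, 1, 0) in order; consecutive solutions share much of their structure, which is exactly the lack of diversity discussed above.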
Restarted Search Procedure
One method for forcing the solutions of a CSP to be different is to use a restarted
search procedure with random value and random variable selection criteria. Each
time a feasible solution is found, the search is restarted, returning to the root node.
Random variable and value selection criteria are used to ensure that each time we restart the search we find a new solution.
The way this procedure introduces diversity is illustrated in Figure 2.3 where
we extend the example presented above. Once we find the first feasible solution,
(0, 1, 1), we restart the search procedure. At this point the variables a and b can be
assigned new values according to the random value selection criteria and we find a new solution. It is also possible that these variables will be labelled in a different order, under the random variable selection.

[Figure 2.2: Normal Search Procedure — the search finds the first feasible solution (0,1,1), backtracks one step, and finds the second feasible solution (0,1,0).]
2.3.4 Interval and Sequence Variables
More recently, two new types of variables called interval variables and sequence
variables have been introduced (Laborie and Rogerie, 2008; Laborie et al., 2009).
Interval variables are specifically designed for scheduling optional tasks or activities.
Previously, such problems have been modelled using additional variables to represent
the presence of an activity in the schedule.
An interval variable a is a variable whose domain is a subset of {⊥} ∪ {[s, e) | s, e ∈ ℤ, s ≤ e}, where:
• ⊥ indicates that the activity is not executed,
• [s₀, e₀) indicates the activity is executed, starting at time s₀ and completing at time e₀, where [s₀, e₀) is a subset of [s, e).
The significance of allowing the variable to take the value ⊥ is that if the activity
is not executed, we do not want the variable to take part in any other constraints.
If an interval a is executed and a = [s, e) then there are several characteristics
of the variable which are defined.
[Figure 2.3: Restarted Search Procedure — after finding the first feasible solution (0,1,1) the search restarts from the root and finds the second feasible solution (1,0,0).]
• presenceOf(a) = 1, indicates the activity a is executed
• startOf(a) = s, represents the start time of activity a
• endOf(a) = e, represents the end time of activity a
• duration(a) = d, represents the duration of activity a, such that d = e − s
If the interval is not executed then presenceOf(a) = 0 and all other characteristics
are not defined.
A sequence variable p is defined on a set of interval variables A. The idea of the
variable p is to represent the permutation of A which is the order in which activities
are executed. Let n be the number of activities in A that are executed. The value that a sequence variable p takes is a function from A to [0, n]. If an activity a ∈ A is not executed then p(a) = 0, otherwise p(a) is the position which activity a takes
in the sequence of executed activities.
For example, consider three activities represented by interval variables:
• a an optional activity with duration 3
• b a compulsory activity with duration 2
• c a compulsory activity with duration 4
Each activity must lie in the interval from 0 to 10. There is a sequence variable p defined on the set {a, b, c}. A feasible solution is:

Interval  Value   p(•)
a         ⊥       0
b         [6, 8)  2
c         [1, 5)  1

where a is not executed, c is executed starting at period 1, and b is executed starting at period 6.
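A minimal encoding of this example can make the semantics concrete. This is not CP Optimizer's API; the `Interval` class and `sequence_positions` helper are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Interval:
    """An interval variable value: None encodes the absent value (⊥),
    otherwise a half-open execution window [start, end)."""
    name: str
    value: Optional[Tuple[int, int]]

def sequence_positions(intervals):
    """p(a) = 0 if a is not executed, else a's 1-based position when the
    executed activities are ordered by start time."""
    executed = sorted((iv for iv in intervals if iv.value is not None),
                      key=lambda iv: iv.value[0])
    positions = {iv.name: 0 for iv in intervals}
    for rank, iv in enumerate(executed, start=1):
        positions[iv.name] = rank
    return positions

a = Interval("a", None)       # optional activity, not executed
b = Interval("b", (6, 8))     # compulsory, duration 2
c = Interval("c", (1, 5))     # compulsory, duration 4
```

`sequence_positions([a, b, c])` reproduces the table above: p(a) = 0, p(b) = 2, p(c) = 1.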
There are a number of constraints that are specific to interval variables. One
such constraint is the endBeforeStart constraint. This constraint takes two interval
variables as arguments and ensures that if both intervals are executed, then the
completion time of the first interval is less than the start of the second interval.
There is the additional option that we can enforce a time gap between the completion
of the first activity and the start of the second activity.
For example, given two interval variables a and b, the constraint endBeforeStart(a, b, k) enforces that at least one of the following holds:
• a is not executed, or
• b is not executed, or
• endOf(a) + k ≤ startOf(b)
Another constraint specific to interval variables is the noOverlap constraint.
This constraint is used to ensure that activities do not overlap in time and can also
enforce setup times between different types of jobs. We define a transition matrix
M , where the entry M [i, j] represents the time taken to transition from processing
an activity of type i to an activity of type j. The constraint noOverlap(p, M ), for
a sequence variable p enforces the following:
• any two activities in p which are executed do not overlap in time,
• for any pair of activities which are executed the associated transition time is
enforced between the two jobs. For example, if job a of type i is executed
before job b of type j then we have endOf(a) + M [i, j] ≤ startOf(b).
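The semantics of noOverlap with a transition matrix can be captured by a small checking function. This is a sketch of the *meaning* of the constraint, not the filtering algorithm a CP solver uses; the names below are invented.

```python
# Checks that executed activities, ordered by start time, respect both
# non-overlap and the type-to-type transition times M[i][j].

def respects_no_overlap(activities, transition):
    """activities: list of (type, start, end) for executed intervals only."""
    ordered = sorted(activities, key=lambda act: act[1])
    for (type_i, _, end_i), (type_j, start_j, _) in zip(ordered, ordered[1:]):
        if end_i + transition[type_i][type_j] > start_j:
            return False
    return True
```

With a transition time of 1 between activities of the same type (as with the identity transition matrix used later in Chapter 4), back-to-back same-type activities must leave a one-period gap.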
All of the constraints that are expressed over interval variables and sequence
variables have in the past been implemented using binary and integer variables.
The benefits of the new variable types are twofold; firstly, they allow the simplification of the modelling process through a more intuitive interpretation of variables and
secondly, they allow more efficient solving, through constraint filtering algorithms
which are specific to these constraints. Greater information regarding the structure
of the problem is communicated to the solver and as such, more efficient constraint
filtering and search operations can be conducted (Laborie et al., 2009).
2.3.5 Constraint Programming Based Column Generation
Constraint programming based column generation (CPCG) is a combination of the
two methods we have previously discussed. In regular column generation both the
master problem and the pricing problem are solved using the traditional mathematical programming techniques of linear programming and mixed integer programming.
In CPCG, the master problem is solved using linear programming, allowing us to
access the dual variables. The pricing problem, however, is solved using constraint
programming.
CPCG was developed by Yunes et al. (2000) and Junker et al. (1999) both in the
context of crew assignment for the airline industry. The complex rules relating to
allowable schedules for airline staff are easily managed by constraint programming
and real world instances (with 150 trips and 12 million feasible duties (Yunes et al.,
2000)) are solved in reasonable times. The main benefit of using CP to solve the
pricing problem is the ease with which CP can handle complex logical constraints or
non-linearity of the problem (Gualandi and Malucelli, 2013). Since its introduction,
constraint programming based column generation has found applications in many
areas, such as, airline planning (Grönkvist, 2006), two dimensional bin packing
(Pisinger and Sigurd, 2007), graph colouring (Gualandi and Malucelli, 2012), vehicle
routing (Rousseau et al., 2004) and machine assignment (Sadykov and Wolsey, 2006).
Chapter 3
Mixed Integer Program Formulation
Santos et al. (2010) have developed a Mixed Integer Program (MIP) to model the
problem detailed in Section 1.3. The formulation has an exponential number of
variables and as such a column generation approach is used to find optimal solutions. The master problem determines what area should be allocated to each feasible
schedule in order to meet demand while minimizing the total area of all schedules.
The pricing problem determines what crops and planting times should be included
in a schedule. Firstly, we will define a crop rotation schedule, secondly, we will define a formulation to supply demand and finally, we describe the column generation
procedure.
3.1 Crop Rotation Schedule
Firstly, we present the constraints which a feasible planting schedule must follow.
This follows the criteria defined in Santos et al. (2010) and explained in section 1.3.
A planting schedule is defined by a set of crops that are planted in sequence on the
same area of land in a year long cycle.
The rest of the thesis will use the following notation:

M — number of periods in a crop cycle
C — set of crops that are available
G — set of crops available for green manure
N — cardinality of C ∪ G
n = N + 1 — artificial crop associated with fallow
NF — number of botanical families
F(k) — set of crops in the botanical family k, k = 1, …, NF
t_i — total production time of crop i
f_i — botanical family associated with crop i
We define x_{i,j} to be a binary decision variable representing whether or not crop i is planted in period j:

x_{i,j} = \begin{cases} 1, & \text{if crop } i \text{ is planted in period } j \\ 0, & \text{otherwise} \end{cases} \qquad (3.1)
A planting schedule is feasible if the following constraints are respected:

\sum_{i=1}^{n} \sum_{r=0}^{t_i - 1} x_{i,j-r} \le 1, \qquad j = 1 \ldots M \qquad (3.2)

\sum_{i \in F(k)} \sum_{r=0}^{t_i} x_{i,j-r} \le 1, \qquad k = 1 \ldots NF, \; j = 1 \ldots M \qquad (3.3)

\sum_{i \in G} \sum_{j=1}^{M} x_{i,j} = 1 \qquad (3.4)

\sum_{j=1}^{M} x_{n,j} = 1 \qquad (3.5)
Constraints (3.2) prevent two crops from being planted at the same time. These constraints consider all the crops that could have been planted in the past and would still occupy the land at time j, and require that at most one of these crops is planted. Constraints (3.3) prevent two crops from the same botanical family
being planted in direct succession. This uses the same logic as (3.2) but extends the
production time by one period to enforce a gap between crops of the same botanical
family. Constraint (3.4) ensures that a green manure crop is planted. Constraint
(3.5) ensures that a fallow period is included in the schedule.
Consider an example with three crops A, B, C and in which F represents a fallow
period. Crops A and B are from the same botanical family and C is associated with
green manure. Production times for crops A, B, C and F are 4, 3, 2 and 1 respectively
and there are 12 periods in the planning cycle. Assume that A is planted in period
2, occupying the land until period 5. Looking at constraint (3.2) when j = 5 results
in:
x_{A,5} + x_{A,4} + x_{A,3} + x_{A,2} + x_{B,5} + x_{B,4} + x_{B,3} + x_{C,5} + x_{C,4} + x_{F,5} \le 1 \qquad (3.6)
Here x_{A,2} = 1 implies, for example, that crop B cannot be planted in period 5, as we
would expect. Further, constraint (3.3) for j = 5 and the botanical family of A and
B becomes:
x_{A,6} + x_{A,5} + x_{A,4} + x_{A,3} + x_{A,2} + x_{B,6} + x_{B,5} + x_{B,4} + x_{B,3} \le 1 \qquad (3.7)
We note that since crops A and B are members of the same botanical family, crop
B cannot be planted at period 6, even though crop A has completed its production
time. Also, crop A cannot be planted repeatedly without a break period. A feasible schedule is presented in Figure 3.1: crop A is planted in period 2, crop C (green manure) in period 6, crop B in period 8, and there is a period of fallow in period 1.
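The index set in the expansion of constraint (3.2) can be generated mechanically. A small sketch using the example production times (the helper name is invented):

```python
# Constraint (3.2) at period j sums x[i, j - r] for r = 0 .. t_i - 1: every
# planting decision that would leave some crop occupying the land in period j.

t = {"A": 4, "B": 3, "C": 2, "F": 1}   # production times from the example

def occupancy_terms(j, production_times):
    return {(crop, j - r)
            for crop, ti in production_times.items()
            for r in range(ti)}
```

`occupancy_terms(5, t)` reproduces the ten variables appearing in the expansion (3.6) above.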
[Figure 3.1: An example of a feasible planting schedule — F in period 1, A in periods 2–5, C in periods 6–7, B in periods 8–10.]
3.2 Supplying Demand
We now turn to satisfying demand. Given a set of feasible planting schedules we
determine what area should be assigned to each schedule in order to meet demand
for each crop for each period in the year, whilst minimising the total area assigned
to all schedules. In order to do so we need to calculate the harvest of each crop in
each time period for a given planting schedule. We use the following notation:
o_i — number of periods between planting and first harvest of crop i
h_{i,r} — production per unit area of crop i in its rth harvest period
d_{i,j} — demand for crop i in period j
Define a^s_{i,j} as the production of crop i in period j under planting schedule s. Equation (3.8) shows how to calculate a^s_{i,j} from the planting schedule.

a^s_{i,j} = \sum_{r=1}^{t_i - o_i} h_{i,r} \, x^s_{i,\, j - o_i - r + 1} \qquad (3.8)

In order to obtain the rth harvest of crop i in period j, the crop must have been planted (o_i + r − 1) periods before.
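As a sketch (function names invented), the production of example crop A, planted in period 2 with o_A = 2, t_A = 4 and harvests (1, 3), can be computed directly; here the rth harvest is taken to land o_i + r − 1 periods after planting, consistent with the worked example below in which A commences harvest in period 4.

```python
# Production of a single crop in period j: its r-th harvest, r = 1 .. t - o,
# lands o + r - 1 periods after the planting period.

def production(plant_period, o, t, h, j):
    return sum(h[r - 1]
               for r in range(1, t - o + 1)
               if plant_period + o + r - 1 == j)

# Crop A planted in period 2 yields h_1 = 1 in period 4 and h_2 = 3 in period 5.
harvests = [production(2, 2, 4, (1, 3), j) for j in range(1, 13)]
```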
We now present a linear program formulation to ensure demand is met, making
use of the set of all feasible planting schedules S. Here λs is the area assigned to
planting schedule s.
\min \sum_{s \in S} \lambda_s \qquad (3.9)

\text{s.t.} \sum_{s \in S} a^s_{i,j} \lambda_s \ge d_{i,j}, \qquad i \in C, \; j = 1 \ldots M \qquad (3.10)

\lambda_s \ge 0, \qquad s \in S \qquad (3.11)
The objective function (3.9) is to minimise the total area. Constraints (3.10) ensure
demand for each crop for each period is met.
We can now extend the example presented above to include information about
harvests. Table 3.1 presents the data for the crops in this example.
Crop  f_i  t_i  o_i  h_{i,1}  h_{i,2}
A     1    4    2    1        3
B     1    3    1    3        2
C     G    2    2    -        -
F     -    1    1    -        -

Table 3.1: Example crop data
As such, crop A will commence harvest 2 periods after it is planted, i.e. in period
4. Figure 3.2 illustrates the harvest for each crop.
[Figure 3.2: An example of a harvest schedule — crop A yields 1 and 3 in periods 4 and 5; crop B yields 3 and 2 in periods 9 and 10.]
3.3 Column Generation
Formulation (3.9) - (3.11) is viewed as a master problem in a column generation
framework. It is a linear optimisation problem where there are an exponential
number of columns corresponding to all feasible planting schedules. A restricted
master problem (RMP) is solved with a small number of columns and the dual
variables to constraints (3.10) are used to find new columns to add to the RMP.
The master problem and pricing problem are solved iteratively until no column that
improves the objective function of the master problem can be found.
We define the dual variable associated with satisfying demand of crop i in period
j as πi,j . This represents the local benefit gained in the objective function if the
demand for crop i in period j could be reduced by one unit. Each column of
the master problem relates to one planting schedule. We now view the feasibility
problem presented in formulation (3.2) - (3.5) as an optimisation problem, to find the
feasible schedule that has the largest reduced cost, where we introduce the reduced cost as the objective function. The reduced cost is defined as 1 − \sum_{i=1}^{n} \sum_{j=1}^{M} \pi_{i,j} a_{i,j}.
The pricing problem is then:

\min \; 1 - \sum_{i=1}^{n} \sum_{j=1}^{M} \pi_{i,j} a_{i,j} \qquad (3.12)

\text{s.t.} \; (3.2) - (3.5) \qquad (3.13)
The master problem is initially solved by some heuristic to generate an initial
feasible solution. The heuristic used here is to create a schedule with unit production
for each period of demand for each crop. This heuristic is guaranteed to generate a
feasible solution which is all that we require to start the column generation procedure. The dual variable for each constraint is extracted and used in the objective
function of the pricing problem to find the schedule with the most negative reduced
cost. This new schedule is included in the master problem as a new column and the
master problem is re-solved. This process iterates until there is no solution to the
pricing problem with negative reduced cost. At this point we can say the current
solution to the master problem is optimal.
The solution procedure is as follows:
1. Choose an initial set of columns which provide a feasible solution to the restricted master problem.
2. Solve the master problem and extract the dual variables (π) associated with
constraints (3.10).
3. Solve the pricing problem to obtain the objective value and optimal column a^s.

4. If the objective value is negative, add a^s to the restricted master problem and go to 2. Otherwise, stop; the current solution is optimal.
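The four steps can be written as a short control loop; `solve_master` and `solve_pricing` are hypothetical stand-ins for the LP restricted master solve and the pricing solve, so this is a structural sketch only.

```python
# Column generation control loop: iterate master and pricing solves until the
# best column's reduced cost is non-negative (to a small tolerance).

def column_generation(columns, solve_master, solve_pricing, tol=1e-9):
    while True:
        duals = solve_master(columns)                # step 2: solve RMP, get duals
        reduced_cost, column = solve_pricing(duals)  # step 3: price a schedule
        if reduced_cost >= -tol:                     # step 4: no improving column
            return columns
        columns.append(column)                       # add a^s and re-solve
```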
Chapter 4
Constraint Programming Formulation
In this chapter we present a formulation of the pricing problem to be solved using constraint programming. This is an alternative to the MIP approach we have
presented in the previous chapter. We then discuss constraint programming based
column generation for the sustainable vegetable crop supply problem.
4.1 Pricing Problem Formulation
In this formulation we represent each crop by an interval decision variable. This
interval variable consists of a start time, end time and duration. Each interval
variable is optional, meaning that it does not have to be present in the solution. In
order to allow repetitions of a crop, we define one or more crops which will represent
each crop species. Here, a crop can only appear once in a solution. However, we
may assign more than one crop to a species, indicating that a crop species appears
more than once. The number of crops corresponding to a species is determined by
the maximum number of repetitions that will fit in a single rotation. We represent
the set of crops that correspond to a species i by the set of crops Ri .
We now consider production times, harvests, time to first harvest and allowable planting periods, which are attributes of crops rather than of crop species. There is a direct relationship between
attributes of crops and crop species. If crop p corresponds to crop species i (p ∈ Ri ),
then the characteristics of crop p are exactly the characteristics of crop species i.
For each crop p, we introduce an optional interval variable xp to represent if and
when crop p is included in the schedule. We define a sequence variable seq over all
crops, which represents the order in which the crops are planted. When defining
this sequence variable we define the type of each crop as the botanical family of that
crop. We introduce the function F which maps a crop p to its botanical family fp .
We also define an auxiliary variable sp which takes the value of the start time of
crop p if it is present, or M + 1 if the crop is not present. This variable is used to
define the objective function.
The objective of the pricing problem is to find the feasible schedule that has the
most negative reduced cost. As such we define the objective function to be:
\min \; 1 - \sum_{p} B(p, s_p) \qquad (4.1)
where B(p, s_p) represents the benefit in the master problem objective function of
having crop p planted at period sp . It is calculated using equation (4.2), where we
look at all periods where there will be production if planting occurs in period sp and
sum the dual variables for crop p in that period multiplied by the production for
that period.
B(p, s_p) = \sum_{r=1}^{t_p - o_p} h_{p,r} \, \pi_{p,\, s_p + o_p + r - 1} \qquad (4.2)
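A direct evaluation of this benefit, as a sketch with invented names and a dual vector indexed by period:

```python
# B(p, s_p): dual-weighted production of crop p when planted at period s_p;
# its r-th harvest, r = 1 .. t_p - o_p, falls in period s_p + o_p + r - 1.

def benefit(s_p, o_p, t_p, h, pi):
    return sum(h[r - 1] * pi[s_p + o_p + r - 1]
               for r in range(1, t_p - o_p + 1))

# With o = 2, t = 4 and harvests h = (1, 3), planting at s_p = 2 collects
# 1 * pi[4] + 3 * pi[5].
```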
Define start_p and end_p as the beginning and end of the allowable planting period for crop p. The variables of the CP pricing problem are as follows:

x_p — an interval variable in the range [start_p, end_p + t_p] of duration t_p
seq — a sequence variable over {x_p}
s_p — an auxiliary variable which takes the value startOf(x_p) if crop p is present in the solution, or M + 1 if p is not present
The constraints of the CSP are:

noOverlap(seq, T) \qquad (4.3)

\sum_{p \in G} presenceOf(x_p) \ge 1 \qquad (4.4)

presenceOf(x_n) = 1 \qquad (4.5)

endBeforeStart(p_1, p_2, -M) \qquad \forall p_1, p_2 \qquad (4.6)

endBeforeStart(p_1, p_2, -M + 1) \qquad \forall p_1, p_2 \in F(k), \; \forall k \in \{1 \ldots NF\} \qquad (4.7)

presenceOf(p_j) \implies presenceOf(p_{j-1}) \qquad \forall j \in R_i \setminus \{1\}, \; \forall i \in C \qquad (4.8)

endBeforeStart(p_{j-1}, p_j, 1) \qquad \forall j \in R_i \setminus \{1\}, \; \forall i \in C \qquad (4.9)
The function of constraint (4.3) is twofold; firstly, it prevents two crops from
being planted at the same time in this schedule and secondly, it prevents crops
from the same family being planted in direct succession. The second argument of
the constraint, T, is a matrix of transition times between botanical families. T is the identity matrix of size NF, indicating that there is a set-up time of one period whenever we move between two crops of the same botanical family; otherwise there is no set-up time.
For example, the constraint noOverlap([p1 , p2 ], T ) enforces one of the following:
• crop p1 is not present, or
• crop p2 is not present, or
• if p_1 appears before p_2 in the schedule then p_2 starts at least T(F(p_1), F(p_2)) time units after p_1 ends, where T(F(p_1), F(p_2)) is the transition time between the botanical families associated with p_1 and p_2.
Constraint (4.4) enforces that one of the green manure crops is present in the
schedule. Constraint (4.5) ensures that there is a period of fallow in the schedule.
A number of extra constraints are required to represent the cyclic nature of the
problem. The nature of the sequence variable and the noOverlap constraint does
not allow for cyclic scheduling.
Constraints (4.6) enforce that for any pair of crops, the start time of one crop is
greater than the end time of the other crop shifted by M periods. This is important
when a crop is planted near the end of the year and finishes at a time greater than
M, e.g. 53. In reality this represents the crop being present until period 5 at the
start of the year. As such we want to ensure that no crops are planted before period
53 − 48 = 5.
In addition, it is also necessary to introduce constraints to prevent the direct
succession of crops from the same family, when one is planted near the end of the
year. Constraints (4.7) enforce this.
Each crop species consists of a number of crops. In order to break symmetry in
the solution we introduce a number of constraints that enforce crops corresponding
to a single crop species are included in sequence. That is, if crops 1, 2 and 3 all
contribute to species 1, then crop 1 must be present before crops 2 or 3 can be
included. This is represented in constraints (4.8).
Finally, constraints (4.9) enforce an order on crops corresponding to the same
crop species, i.e. if crops 1, 2 and 3 are all present in the solution, crop 1 will appear
before crop 2 which will appear before crop 3. This constraint is made slightly
stronger due to the fact that crops from the same botanical family must have a gap
of at least one period, so we enforce a break of one period.
4.2 Constraint Programming Based Column Generation
In order to utilise constraint programming to solve the sustainable vegetable supply
crop problem we include it in a constraint programming based column generation
framework. The master problem described in (3.9) - (3.11) is solved using linear
programming as was the case for standard column generation. However, we use the
CP formulation (4.3) - (4.9) to determine schedules that have negative reduced cost
and should be included in the master problem. When solving the pricing problem
using MIP we are only able to generate one optimal solution and then include this
solution in the master problem. The nature of CP allows more flexibility when
solving the pricing problem. We can optimise and find the column with the most
negative reduced cost, as was the case for MIP. Alternatively we can look for a
certain number of solutions (p = 5, 10, 15, 20, 30) each with negative reduced cost
and include all of these solutions in the master problem at each iteration. We aim
to explore the effect of trying different strategies in solving the pricing problem on
overall solution time.
Chapter 5
Plot Reduction Heuristics
The formulations for solving the sustainable vegetable crop scheduling problem detailed above both rely on solving a linear master problem. In reality we might like
to impose some restrictions on the types of plots that we generate. This chapter
focuses on two restrictions; firstly, we aim to eliminate plots with very small area
and secondly, we aim to minimise the number of plots that are used. Both methods
are presented as post-optimisation heuristics. We take the set of schedules
that is generated through the column generation procedure and then solve a MIP
on this set of schedules. In order for this method to be exact we would have to solve
the MIP within a branch-and-price framework.
5.1
Eliminating Small Plots
It is feasible in the master problem to allow crop schedules which are assigned a
very small area. In practice such a plot would be impractical and is undesirable.
An additional constraint we would like to impose is that all plots are larger than a
minimum plot size (k). A MIP is solved after the column generation procedure on
the set of schedules created (S).
The binary decision variable ys represents if schedule s is assigned a non-zero
area. K is an upper bound on the size of any plot.
\min \sum_{s \in S} \lambda_s \qquad (5.1)

\text{s.t.} \sum_{s \in S} a^s_{i,j} \lambda_s \ge d_{i,j}, \qquad i \in C, \; j = 1 \ldots M \qquad (5.2)

\lambda_s \le K y_s, \qquad s \in S \qquad (5.3)

k y_s \le \lambda_s, \qquad s \in S \qquad (5.4)

\lambda_s \ge 0, \qquad s \in S \qquad (5.5)

y_s \in \{0, 1\}, \qquad s \in S \qquad (5.6)
The objective function (5.1) remains to minimise the total cropping area across all schedules. Constraints (5.2) ensure that demand is met for each crop in each period. Constraints (5.3) ensure that if λ_s takes a non-zero value then y_s takes the value 1. Constraints (5.4) enforce that any schedule which is assigned non-zero area receives at least the minimum plot size k.
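The combined effect of linking constraints (5.3) and (5.4) can be sanity-checked on a candidate solution; the following is an illustrative sketch with invented names, not part of the heuristic itself.

```python
# A (lambda_s, y_s) pair satisfies (5.3)-(5.4) exactly when: unused schedules
# (y = 0) receive zero area, and used schedules (y = 1) receive an area in
# the interval [k, K].

def plots_valid(areas, used, k, K):
    return all(
        (y == 1 and k <= area <= K) or (y == 0 and area == 0)
        for area, y in zip(areas, used)
    )
```

A plot with positive area below the minimum size k is therefore infeasible, which is precisely the restriction this heuristic adds to the master problem.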
5.2 Minimising Number of Plots
In the previous formulations there is no restriction on the number of plots present
in the final solution. In this heuristic, we minimise the number of plots which
are assigned non-zero area. To facilitate this, we allow a small increase in the
area required to supply demand. This increase is represented as a multiple of the
optimal area found with the column generation procedure. We want to determine
what reduction in the number of plots can be achieved for a percentage increase in
the total area.
As above, y_s is a binary decision variable representing whether schedule s is assigned a non-zero area. K is an upper bound on the area of any single plot. z* is the optimal
solution of the original problem and α is the percentage increase in total area we
wish to allow.
\min \sum_{s \in S} y_s \qquad (5.7)

\text{s.t.} \sum_{s \in S} a^s_{i,j} \lambda_s \ge d_{i,j}, \qquad i \in C, \; j = 1 \ldots M \qquad (5.8)

\lambda_s \le K y_s, \qquad s \in S \qquad (5.9)

\sum_{s \in S} \lambda_s \le (1 + \alpha) z^* \qquad (5.10)

\lambda_s \ge 0, \qquad s \in S \qquad (5.11)

y_s \in \{0, 1\}, \qquad s \in S \qquad (5.12)
The objective function (5.7) becomes to minimise the number of plots with non-zero area. Constraints (5.8) ensure we meet demand, while constraints (5.9) ensure the variable y_s takes the value 1 whenever λ_s is non-zero. Constraint (5.10) limits the total area to at most (1 + α) times the optimal area z*.
Chapter 6
Computational Tests and Results
The following chapter describes the results of computational tests to evaluate the
mixed integer programming and constraint programming approaches to modelling
the sustainable vegetable crop supply problem. We evaluate the post-optimisation
heuristics to eliminate small plots and reduce the number of plots and also discuss
the changes when applied to different column generation methods. All methods were
tested on instances with a planning horizon of one year divided into 48 periods. The
following dataset is sourced from Santos et al. (2015). In each instance a percentage
(df ) of possible harvesting periods was assumed to have positive demand, with
df = 20 to 50% or 60 to 90%. Each of these demand profiles has 15 instances with
different numbers of crops (5 each for n = 10, 15, 20). This results in 6 different
groups of instances, shown in Table 6.1. A set of 26 crops with data regarding
botanical family, production time, harvest schedule and allowable planting times is
used. The details of each crop are included in Appendix A.
Each instance is assigned an identifier, “r1bD153” for example. The presence of
“b” before the D indicates that df = 20 − 50%, the two digits after the D identifies
the number of crops (15) and the final digit is a counter (i.e. the third instance in
this group).
Instance ID   Group  Number of Instances  Number of Crops  df
r1bD100 - 4   1      5                    10               20 - 50 %
r1bD150 - 4   2      5                    15               20 - 50 %
r1bD200 - 4   3      5                    20               20 - 50 %
r1D100 - 4    4      5                    10               60 - 90 %
r1D150 - 4    5      5                    15               60 - 90 %
r1D200 - 4    6      5                    20               60 - 90 %

Table 6.1: Instance Groups
6.1 Mixed Integer Programming
The column generation procedure described in Chapter 3 was implemented in Optimization Programming Language (OPL, 2009) and used CPLEX (CPLEX, 2009)
to solve both the master and pricing problems. This technique was very efficient
with all instances solving in less than 4 minutes with an average time of 61 seconds
and an average of 144 iterations. Solve times and number of iterations required for
each instance are detailed in Appendix B.
The most computationally intensive part of the column generation procedure is solving the pricing problem. On average the master problem contributed 15% of
the time, while 85% of time was spent on the pricing problem. Figure 6.1 shows the
times taken to solve the master and pricing problems at each iteration, for instance
r1bD200. We see a steady increase in the time taken by the master problem as the
number of columns increases. There is large variation in the time taken to solve the
pricing problem and no strong trends emerge as iterations progress. This is expected
as the only change at each iteration is the value of dual variables which appear only
in the objective function.
[Figure 6.1: Times to solve master and pricing problem for instance r1bD200 (time in seconds against number of iterations).]
The objective function for both the pricing problem (indicating the reduced cost
of the new column) and the master (indicating the total area) are illustrated in
Figure 6.2 and Figure 6.3 respectively. Initially, the pricing problem finds columns
with large negative reduced costs, due to the poor quality of the initial heuristic
solution. This means that we are able to find a schedule which is very promising
and will likely result in a large reduction in the overall area. This is reflected in
the plot of the master problem objective as the area decreases from an initially very
large value and plateaus out as the initial heuristic solutions are superseded.
[Figure 6.2: Pricing problem objective function for instance r1bD200 (objective value against number of iterations).]
[Figure 6.3: Master problem objective function for instance r1bD200 (area against number of iterations).]
6.2 Constraint Programming Results
The constraint programming pricing problem was implemented in OPL and solved
using ILOG CP Optimizer (Optimizer, 2014). The master problem was solved using
CPLEX. A number of different approaches regarding how many solutions to find in
each iteration and whether or not to optimise the pricing problem were tested with
constraint programming. The following sections discuss the performance of each
method and describe the benefits of each approach.
6.2.1 Optimising the Pricing Problem with Constraint Programming
The initial strategy was similar to the MIP approach: in each iteration we find the optimal solution to the pricing problem. CP solves an optimisation problem by finding any feasible solution, then finding another feasible solution under the additional constraint that the objective value is strictly better than that of the current feasible solution. This process is iterated until there is no feasible solution with a strictly better objective value. At this point the search terminates and the current solution is optimal.
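This solve-and-tighten process can be mimicked in plain Python over a small finite search space; the candidate list, objective and feasibility test below are toy stand-ins for the CP model, used only to show the mechanism.

```python
# Mimicking CP optimisation: find any feasible solution, then repeatedly
# demand a strictly better objective until no feasible solution remains.

def objective(x):
    return x[0] + 2 * x[1]          # toy objective over pairs

def first_feasible(candidates, is_feasible, bound=None):
    """First candidate that is feasible and strictly beats the bound."""
    for x in candidates:
        if is_feasible(x) and (bound is None or objective(x) < bound):
            return x
    return None

def cp_optimise(candidates, is_feasible):
    best = None
    while True:
        bound = None if best is None else objective(best)
        nxt = first_feasible(candidates, is_feasible, bound)
        if nxt is None:             # exhaustive failure = proof of optimality
            return best
        best = nxt

space = [(a, b) for a in range(5) for b in range(5)]
opt = cp_optimise(space, lambda x: x[0] + x[1] >= 3)   # returns (3, 0)
```

The final (failed) pass over the whole space corresponds to the costly optimality proof discussed below.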
Optimising the pricing problem with constraint programming served to validate the constraint programming formulation, as the optimal area matched the MIP solution in every case, as expected. Although the search concept is the same, the schedules found by CP differ from those of MIP. At each iteration there are often multiple optimal solutions to the pricing problem, and MIP and CP often find different ones. The number of iterations taken to find the minimal area was comparable to the MIP approach.
Optimisation with CP produced solve times that were longer than those of the comparable MIP approach. On average, solve times increased to 269 seconds. Details of solve times and numbers of iterations are included in Appendix C. This difference was more apparent on more challenging instances, characterised by larger numbers of crops and demand periods. The downside of optimising with constraint programming is that it can take a long time to prove optimality. Once the optimal solution is found, the search continues for a solution with a smaller reduced cost; since no better solution exists, many branches must be investigated before we can conclusively say that there is no feasible solution with a lower objective value.
6.2.2 Optimising the Pricing Problem While Adding Intermediate Solutions
As described above, in order to find the optimal solution with CP, a number of
intermediate solutions are encountered along the way. It was anticipated that including these intermediate solutions in the master problem would speed up the
column generation process. The more columns that the master problem has, the
more information that the dual variables contain. This directs the pricing problem
towards schedules that will have the best impact in the master problem.
Adding multiple columns at each execution of the pricing problem resulted in a
significant reduction of the number of iterations. The number of iterations reduced
from an average of 135 to an average of 50, a 67% reduction. This served to reduce
the overall solve time for each instance from an average of 269 seconds to an average
of 110 seconds, a 59% reduction. It was found that the average number of solutions
added in each pricing problem ranged from 2.3 to 7.7 with a maximum of 12.
We see a sharper increase in the time taken to solve the master problem as multiple solutions are included in each iteration, as shown in Figure 6.4. However, the master problem was still very efficient, with each solve taking less than 0.05 seconds. The decrease in the number of iterations outweighed the increase in master problem solve times and resulted in an overall time reduction.
Figure 6.5 shows that the time taken to optimise the pricing problem alone is effectively the same as the time taken to optimise while also keeping track of all intermediate solutions. The average time to solve the pricing problem in each instance, for both optimisation and optimisation with intermediate solutions, is included in Appendix D and supports the hypothesis that there is no increase.
Figure 6.4: Master problem solve times under CP optimisation.
Figure 6.5: Pricing problem solve times under CP optimisation.
6.2.3 Adding Other Sets of Columns in Each Iteration
Results suggested that including multiple new schedules at each iteration was beneficial. In this search procedure we aim to find a certain number of profitable columns,
instead of optimising the pricing problem. The rationale is that adding multiple
columns will provide more information to the master problem and this benefit will
outweigh the drawback of not solving the pricing problem to optimality. A profitable
column is any feasible column that has negative reduced cost. In each iteration of
the pricing problem p feasible columns are found (p varies from 5 to 30) and added
to the master problem. In this case it is unlikely that the columns which are found
will contain the optimal solution to the pricing problem. However, this method is
still exact as we do not terminate the column generation procedure until we have
proved there are no feasible schedules that have negative reduced cost.
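The search for p profitable columns, rather than the single optimal one, can be sketched as follows; the integer candidates and the reduced-cost function are toy stand-ins for the CP schedules and duals.

```python
# Collect up to p profitable columns (negative reduced cost) and hand them
# all to the master problem. The procedure remains exact because the outer
# column generation loop only terminates once no candidate at all has
# negative reduced cost.

def profitable_columns(candidates, reduced_cost, p):
    found = []
    for col in candidates:
        if reduced_cost(col) < 0:
            found.append(col)
            if len(found) == p:     # stop as soon as p columns are in hand
                break
    return found

# Columns are integers here; a dual price of 10 makes columns 0..9 profitable.
cols = profitable_columns(range(20), lambda c: c - 10, p=5)   # [0, 1, 2, 3, 4]
```

Note that the early break is exactly why the returned set is unlikely to contain the optimal column, as observed above.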
Multiple values of p were tested to determine the best number of columns to add to the master problem at each iteration. As expected, the more columns added, the fewer times the master problem had to be solved. For p = 5 the average number of iterations was 161, reducing to an average of 58 for p = 30. The trend in the number of iterations is illustrated in Figure 6.6. This reduction is due to the fact that the more columns included in the master problem, the more informative the dual variables, so the search in the pricing problem is more directed towards the required space.
The effect on solution time is more complex. We found that adding a small
number of columns had large solve times as many iterations were required. However,
adding a large number of columns was also not ideal, as this increases time spent
finding many columns using the same dual variables. It was found that adding 10 columns in each iteration was usually the best option. In Figure 6.7 we plot the standardised solve time against the number of columns added in each iteration, for each group of instances. We define the standardised solve time, for each number of added columns, as the average time to solve each instance divided by the total time to solve all instances in that group across all numbers of added columns. This standardisation simply serves to plot the different groups on a single scale, allowing easy comparison.
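The standardisation just defined can be written out directly; the timing numbers below are made up purely for illustration.

```python
# Standardised solve time for one group of instances: for each column-count
# setting, the average per-instance solve time divided by the group's total
# solve time across all settings.

def standardised_times(times_by_setting):
    """Map each column-count setting p to its standardised solve time."""
    total = sum(sum(ts) for ts in times_by_setting.values())
    return {p: (sum(ts) / len(ts)) / total
            for p, ts in times_by_setting.items()}

group = {5: [10.0, 30.0], 10: [8.0, 12.0], 30: [20.0, 20.0]}
std = standardised_times(group)   # {5: 0.2, 10: 0.1, 30: 0.2}
```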
An analysis of the proportion of time spent solving the master problem compared
to the pricing problem indicated that the increase in time is due to more time spent
searching for a large number of columns in the pricing problem. Despite the fact
that the master problem linear program had a large number of columns to explore,
it was still able to find the optimal solution in a very short period of time (less than
0.5 seconds in all cases). This is due to the efficiency of linear program solvers for
the size of problems we are considering. Full details of solve times and number of
iterations required for each instance is included in Appendix E.
Figure 6.6: Change in number of iterations as number of columns added increases.
Figure 6.7: Change in standardised time as number of columns added increases.
Examining the columns created with this procedure indicated that the majority of columns were very similar. They usually contained the same crops, with start times varying only slightly. The similarity in columns is due to the way that constraint programming finds multiple feasible solutions (explained in Section 2.3.3) and the fact that there are many feasible solutions to the pricing problem (simply changing the start time of a single crop by a single period will likely produce a new feasible solution). Because there are many feasible solutions, it is likely we do not backtrack many steps after a feasible solution is found and, as such, the majority of variables do not change their values.
6.2.4 Introducing Diversity in Columns
In order to determine the true effect of adding multiple columns to the master
problem at each iteration we would like to have a way of finding a diverse set of
columns. One way of introducing diversity in the columns found is to use a restarted
search procedure (explained in Section 2.3.3). A crop is chosen at random, then a
value for its start time, or non-inclusion, is selected also at random. Once a feasible
solution is found we restart the search procedure, returning to the root node. This
significantly reduces the probability that the new solution will be similar to the
current solution.
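The restarted procedure can be sketched as follows. The one-line schedule completion is a toy stand-in for the CP search, and the crop names and feasibility test are made up; what the sketch shows is the structure of choosing a random crop and random start (or non-inclusion), then restarting from the root after every feasible solution.

```python
import random

# Restarted search sketch: pick a crop at random, fix a random start time
# (or leave the crop out entirely), complete the schedule, and restart from
# the root after every feasible solution is found.

def random_restart_column(crops, periods, rng, is_feasible, tries=1000):
    for _ in range(tries):                          # each pass restarts at the root
        crop = rng.choice(crops)
        start = rng.choice([None] + list(periods))  # None = crop not included
        schedule = {c: (start if c == crop else min(periods)) for c in crops}
        if is_feasible(schedule):
            return schedule
    return None

rng = random.Random(7)
col = random_restart_column(["beet", "leek"], range(4), rng,
                            is_feasible=lambda s: s["beet"] is not None)
```

Re-assigning every variable on each restart is exactly the extra work that drives up the solve times reported below.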
The process of restarting the search many times increases the time spent in the
constraint programming routine. Indeed, solve times increased for all instances when
restarts were introduced. In order to find each feasible solution we must assign values
to each variable every time. Despite the increase in solve times, there is the benefit
that finding diverse columns resulted in a decrease in the number of iterations taken
to solve each instance. Introducing diversity resulted in an average 57% reduction
in the number of iterations required as compared to the case where we do not force
diversity. Figure 6.8 illustrates the improvement in number of iterations and Figure
6.9 illustrates the increase in average time to solve the pricing problem.
Figure 6.8: Decrease in average number of iterations when diverse columns are found.
Figure 6.9: Increase in average time to solve pricing problem when diverse columns are found.
The reduction in iterations when diverse columns are found indicates the potential for a more efficient solution method. If a diversity criterion could be included as a secondary objective, this could lead to a reduction in the overall solve time. However, it is not easy to define exactly what constitutes a diverse set of columns, and the diversity criterion would have to be efficiently integrated into the CP search procedure.
6.2.5 Hybrid Approaches
A number of approaches that combined the different methods for solving the pricing problem described above were tested. These approaches used a different method each time the pricing problem was solved; for example, alternating between the MIP pricing problem and finding 10 solutions with constraint programming. Some of the methods tested were to alternate between:
• MIP and finding multiple solutions with CP,
• MIP and finding multiple solutions with restarted CP search,
• optimising with CP and finding multiple solutions with CP,
• optimising with CP and finding multiple solutions with restarted CP search.
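The alternation in these hybrids can be sketched as a simple round-robin selector; the strategy names are illustrative labels only, not calls into the actual solvers.

```python
from itertools import cycle

# A hybrid pricing scheme alternates between strategies on successive
# iterations of the column generation loop.

def make_alternator(strategies):
    order = cycle(strategies)
    return lambda: next(order)

pick = make_alternator(["mip", "cp_multi"])
sequence = [pick() for _ in range(4)]   # ["mip", "cp_multi", "mip", "cp_multi"]
```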
It was anticipated that a combination of finding the best schedule and finding multiple schedules could lead to a more efficient solution method, either in the number of iterations or the time taken. However, this was not the case. None of the methods improved on the pure MIP pricing problem or on always optimising with constraint programming. There were also no reductions in the number of iterations. There was some improvement over always finding multiple solutions, but this is to be expected, as using MIP or optimising with CP were generally faster methods. That is, we are alternating to a more efficient method and would expect some speed up as a result.
The results when alternating between using MIP and finding multiple columns with CP are shown in Table 6.2. We illustrate the trend as the number of columns found each time we use CP varies from 3 to 10. We observe the same trend in the number of iterations as in the pure CP methods: the more columns added in each iteration, the fewer iterations we require. There is a clear increase in the time taken between adding 5 columns and adding 10 columns, because we approach the pure CP method with its longer associated solve times. The change between adding 3 columns and 5 columns is less clear. It appears that the decrease in the number of iterations balances the longer time spent searching for 5 columns.
Method         Average Time (s)   Average Number of Iterations
MIP - 3 CP     178.7              153.7
MIP - 5 CP     178.4              129.7
MIP - 10 CP    315.2              103.0

Table 6.2: Average solve time and number of iterations for a hybrid method.
6.3 Plot Reduction Heuristics
In this section we discuss the results of eliminating small plots and minimising the number of plots used in the final solution. Both methods were programmed in OPL and solved using CPLEX.
6.3.1 Eliminating Small Plots
The elimination of small plots was applied to the sets of columns generated by various types of column generation:
• the mixed integer programming pricing problem,
• optimising the pricing problem with constraint programming, including intermediate solutions,
• finding 10 diverse columns with constraint programming using restarted search in each iteration.
The minimum plot size (k) was set to one unit area. We consider any plot which
has area less than one unit to be a small plot. Prior to the elimination of small plots,
the solutions from the above methods had some variation in the number of small
plots, even though they had the same overall area. The column generation method
that resulted in the largest number of small plots was finding 10 diverse columns,
while MIP and optimisation with CP resulted in a similar number of small plots. In
these solutions there were generally few small plots, with all instances except one
having fewer than 15 small plots under each method. The largest number of small
plots was 39 on a single instance. There were also several instances that had no
small plots in the optimal solution.
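The small-plot test itself is a simple threshold check against the minimum plot size k; the plot areas below are made-up example values, with k = 1 unit matching the experiments.

```python
# A plot is "small" when its area falls strictly below the minimum size k.

def small_plots(areas, k=1.0):
    return [i for i, area in enumerate(areas) if area < k]

areas = [3.2, 0.4, 1.0, 0.05, 7.9]
small = small_plots(areas)   # plots 1 and 3 fall below one unit area
```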
When solving formulation (5.1) - (5.5), the elimination of all small plots was possible in every instance with an increase in area of less than 1%. The solve times for the elimination of small plots were less than one second for all instances except one. The only instance with longer solve times was the instance that had 39 small plots before plot elimination. In this case, solve times ranged between 2.5 and 15 minutes when the heuristic was applied to the sets of columns from the different column generation methods.
In general there was no difference in the ability to eliminate small plots between the different column generation methods. The constraint programming approaches, which produced many more columns, did not provide any benefit, as it was easy to eliminate small plots even in the MIP case where only the optimal solution was included at each iteration. The number of small plots in the original optimal solution and the percentage increase in area required to eliminate them, for each column generation method, is shown in Table 6.3.
            MIP                     CP - Optimisation       CP - Diverse
Instance    Small     % Area        Small     % Area        Small     % Area
            Plots     Increase      Plots     Increase      Plots     Increase
r1bD100     0         0.00%         0         0.00%         0         0.00%
r1bD101     3         0.00%         1         0.00%         1         0.00%
r1bD102     0         0.00%         0         0.00%         1         0.00%
r1bD103     0         0.00%         0         0.00%         3         0.00%
r1bD104     0         0.00%         0         0.00%         2         0.00%
r1bD150     0         0.00%         1         0.00%         3         0.00%
r1bD151     3         0.01%         2         0.01%         4         0.00%
r1bD152     0         0.00%         2         0.00%         3         0.01%
r1bD153     2         0.00%         0         0.00%         2         0.00%
r1bD154     2         0.00%         0         0.00%         1         0.00%
r1bD200     1         0.02%         1         0.02%         4         0.02%
r1bD201     6         0.01%         6         0.01%         9         0.01%
r1bD202     14        0.03%         9         0.03%         10        0.02%
r1bD203     0         0.00%         0         0.00%         1         0.00%
r1bD204     3         0.00%         4         0.00%         3         0.00%
r1D100      39        0.76%         37        0.62%         31        0.26%
r1D101      2         0.00%         0         0.00%         0         0.00%
r1D102      12        0.04%         8         0.04%         14        0.04%
r1D103      3         0.03%         3         0.03%         1         0.03%
r1D104      0         0.00%         0         0.00%         0         0.00%
r1D150      4         0.01%         4         0.00%         3         0.00%
r1D151      6         0.00%         5         0.00%         10        0.01%
r1D152      4         0.00%         5         0.01%         11        0.01%
r1D153      8         0.05%         5         0.04%         4         0.04%
r1D154      1         0.00%         2         0.00%         1         0.00%
r1D200      4         0.00%         3         0.00%         13        0.00%
r1D201      16        0.02%         14        0.03%         30        0.02%
r1D202      5         0.00%         3         0.00%         5         0.00%
r1D203      5         0.00%         6         0.01%         3         0.01%
r1D204      3         0.00%         1         0.00%         6         0.00%

Table 6.3: Number of small plots in optimal solution and increase in area required to eliminate small plots.
6.3.2 Minimising Number of Plots
Plot minimisation was tested when allowing the area required to supply demand to increase by 0%, 0.1%, 1% and 10% from the area required in the original optimal solution (α = 0%, 0.1%, 1%, 10%). In the case of α = 0% we do not allow any increase
in the area, but still seek to reduce the number of plots. This heuristic was again
applied to the set of columns generated by the types of column generation detailed
above. In general this was a much harder problem to solve than the elimination of
small plots. On many instances optimal solutions were not found within a two hour
time limit. Despite the challenging nature of the problem there were considerable
reductions in the number of plots required under all column generation methods,
even when α = 0.1%. This occurred even when the optimal solution was not found,
but a good feasible solution was found. The average reduction in number of plots
from the optimal solution is detailed in Table 6.4.
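The area allowance that parameterises this heuristic amounts to a budget constraint: the selected schedules may use at most (1 + α) times the area of the column generation optimum. The areas below are illustrative numbers only.

```python
# Area budget for plot minimisation with tolerance alpha.

def within_area_budget(chosen_areas, base_area, alpha):
    return sum(chosen_areas) <= (1 + alpha) * base_area

ok = within_area_budget([40.0, 55.0], base_area=100.0, alpha=0.001)      # True
too_big = within_area_budget([60.0, 45.0], base_area=100.0, alpha=0.001) # False
```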
As we allow a larger increase in area, the number of plots required to supply demand reduces; there is a trade-off between the area required to supply demand and the number of plots required to do so. The changes as we increase α, across each group of instances, are shown in Figure 6.10. The column "None" corresponds to doing nothing to the solution of column generation.
              Allowed Area Increase (α)
Method        0.10%     1%        10%
MIP           12%       23%       41%
Random        11%       24%       41%
CP Opt        12%       24%       42%

Table 6.4: Reduction in number of plots under different column generation methods and allowed area increase.
Figure 6.10: Changes in number of plots as allowed area increases.
The differences when plot minimisation is applied after different types of column generation were not clear in terms of the objective function; in general there was little difference in the reduction in the number of plots. However, the solution times varied significantly. When 10 columns are added in each iteration of the pricing problem we end up with a much larger set of schedules. It was anticipated that this might provide some benefit when reducing the number of plots, since there are more schedules to choose from. The results show that this was not the case: the larger number of columns simply meant that the solver took longer to find the optimal solution, or did not find it within the two hour time limit. The times taken to solve the plot minimisation heuristic for α = 0.1% are shown in Table 6.5.
Instance    MIP        Random 10   CP opt
r1bD100     0.01       0.03        0.01
r1bD101     1.28       *           4.14
r1bD102     0.22       12.75       1.47
r1bD103     0.04       0.40        0.02
r1bD104     0.02       0.52        0.13
r1bD150     0.36       141.18      14.19
r1bD151     5605.80    *           *
r1bD152     4.09       22.23       6.40
r1bD153     0.84       181.14      5.64
r1bD154     0.12       3.39        0.16
r1bD200     1.90       20.18       1.31
r1bD201     *          *           *
r1bD202     *          *           *
r1bD203     0.10       4.10        0.16
r1bD204     8.25       *           6.10
r1D100      6316.13    *           5059.06
r1D101      0.94       40.44       1.46
r1D102      46.90      *           72.96
r1D103      0.02       0.09        0.03
r1D104      0.02       0.09        0.03
r1D150      17.31      *           66.63
r1D151      15.93      *           2037.53
r1D152      81.09      *           *
r1D153      6.41       109.12      8.97
r1D154      0.04       0.28        0.10
r1D200      16.76      *           164.25
r1D201      56.87      *           2503.48
r1D202      2.69       48.88       2.71
r1D203      4.14       1558.39     114.37
r1D204      7.40       222.84      9.29
* indicates the optimal solution was not found in two hours

Table 6.5: Solve times for α = 0.1% under different column generation strategies.
Chapter 7
Conclusions and Future Work
In this thesis, we have outlined the sustainable vegetable crop supply problem and
discussed multiple solution methods. We provided an overview of the mathematical theory used to deal with the problem: mixed integer programming, column
generation and constraint programming. Firstly, a column generation approach is
discussed, with a linear formulation for the master problem and a mixed integer
program formulation for the pricing problem. Secondly, we develop a constraint
programming formulation for the pricing problem. We make use of interval and
sequence variables, specific to scheduling problems, and propose a constraint programming based column generation solution method. Finally, we present two post
optimisation heuristics to include potential real world restrictions. These eliminate
small plots and minimise the total number of plots in the final solution. The two
heuristics are formulated as mixed integer programs that are solved on the set of
schedules found using column generation.
Computational tests were conducted on a set of instances based on real data.
The most efficient method was mixed integer programming. However, optimisation
using constraint programming when including the intermediate solutions at each
iteration was also an efficient method, with slightly longer solve times than the
mixed integer programming approach.
The heuristic for the elimination of small plots was able to remove all plots with
less than unit area with a very small increase in the total area (less than 1%). This
is partly due to the fact that the majority of instances had few plots with small area.
The plot minimisation heuristic found that significant reductions in the number of
plots were possible even when only a small increase in the area was permitted. This
reduction in the number of plots was relatively stable across the types of column
generation. Using constraint programming to find many columns did not provide
any benefit when we attempt to reduce the number of plots.
The models studied here do not account for many real-world restrictions, such as labour considerations, use of water, uncertainties in demand and variations in harvest. A potential direction for further research is to explore the impact of these
constraints on each solution method. We expect that constraint programming methods would be less affected by the introduction of new constraints than the mixed
integer programming method. Constraint programming is generally more expressive than mixed integer programming and does not have the restriction of linear
constraints. It is possible that the introduction of new constraints would be easier
to model as a constraint satisfaction problem than a mixed integer program.
Another direction of future research is to focus on the impact of column diversity
on overall solve time. We found that the inclusion of multiple diverse columns at
each iteration, with the restarted search procedure, required far fewer iterations to
find optimal solutions. However, the method for finding diverse columns used here
resulted in increased solve times. We have strong evidence, from computational tests, that if a more efficient method for driving diversity in columns were established, this would lead to an efficient solution method. The challenge is to develop some
measure of diversity and to integrate this measure into the constraint programming
search procedure in an efficient manner. One possibility is Limited Discrepancy
Search which is identified by Gualandi and Malucelli (2013) as a mechanism to
generate diverse columns in the constraint programming based column generation
setting. Another approach may be to introduce specific diversity constraints similar
to the approach taken by Sellmann et al. (2002) in the context of airline crew
scheduling.
Appendix A
Crop Data
Table A.1: Data for 26 crops: crop name, botanical family, allowable planting periods and production time. (Source: Santos et al. (2015))

     Crop                  Botanical        Planting   Planting   Production
                           Family           Begin      End        Time
1    Crisp Head Lettuce    Asteraceae       Jan        Dec        7
2    Loose Leaf Lettuce    Asteraceae       Jan        Dec        7
3    Butter Head Lettuce   Asteraceae       Jan        Dec        7
4    Endive                Asteraceae       Jan        Dec        9
5    Watercress            Brassicaceae     Feb        July       32
6    Chinese Cabbage       Brassicaceae     Feb        Sep        32
7    Brocoli               Brassicaceae     Feb        Oct        20
8    Cauliflower           Brassicaceae     Mar        Oct        18
9    Beet                  Chenopodiaceae   Feb        Sep        11
10   Spinach               Chenopodiaceae   Feb        Sep        20
11   Zucchini              Cucurbitaceae    Oct        Feb        14
12   Pumpkin               Cucurbitaceae    Nov        Jan        19
13   Cucumber              Cucurbitaceae    Sep        Mar        13
14   Garlic                Liliaceae        Mar        Apr        24
15   Onion                 Liliaceae        Mar        Jul        24
16   Leek                  Liliaceae        Apr        Apr        12
17   Okra                  Malvaceae        Nov        Jan        27
18   Tomato                Solanaceae       Jan        Dec        24
19   Carrot                Umbelliferae     Jan        Dec        16
20   Parsley               Umbelliferae     Oct        Feb        21
21   Bean                  Leguminosae      Oct        Feb        12
22   Black Velvet Bean     Leguminosae      Oct        Jan        16
23   Jack Bean             Leguminosae      Oct        Feb        12
24   Lupine                Leguminosae      Mar        Jul        18
25   Hairy Vetch           Leguminosae      Mar        Jul        20
26   Sunnhemp              Leguminosae      Oct        Jan        16
Table A.2: Data for 21 crops with associated demand: crop name, time until first harvest and harvest in each period of production.
(Source: Santos et al. (2015))
Crop
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
Crisp Head Lettuce
Loose Leaf Lettuce
Butter Head Lettuce
Endive
Watercress
Chinese Cabbage
Brocoli
Cauliflower
Beet
Spinach
Zucchini
Pumpkin
Cucumber
Garlic
Onion
Leek
Okra
Tomato
Carrot
Parsley
Bean
Time to
First Harvest 1
5
5
5
5
12
12
10
16
8
5
9
16
8
23
23
8
14
15
13
8
11
9
9
9
9
1
1
1
1
1
3
0.2
0.5
1
0.3
3
3
0.2
0.8
1.5
14
0.3
2
3
3
3
3
2
2
1
3
2
4
0.3
0.5
2
3
0.3
0.8
2
16
3
4
5
6
7
8
Harvesting
9
10 11
2
2
2
2
2
2
2
2
3
2
2
3
2
2
3
2
2
2
2
2
2
2
2
1
4
4
4
4
1
4
4
4
4
0.3 0.3 0.2
0.5
2
1
1
3
0.3
1
1.5
16
12
13 14 15 16 17 18 19 20
2
2
2
2
2
2
2
2
2
2
4
4
3
3
2
3
0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.2 0.2
1
1.2 1.2 1
0.8 0.2
16
16
16
16
17
17
17
17
14
10
2
2
2
2
2
2
1
1
1
1
Appendix B
MIP Results
Table B.1: Solve details for mixed integer programming

Instance    Group   # Crops   Time (s)   Area    Iterations
r1bD100     1       10        2.21       1089    16
r1bD101     1       10        7.70       494     75
r1bD102     1       10        10.77      1201    86
r1bD103     1       10        2.94       778     41
r1bD104     1       10        4.71       760     41
r1bD150     2       15        38.28      1892    145
r1bD151     2       15        51.18      814     166
r1bD152     2       15        49.59      1960    152
r1bD153     2       15        27.15      1703    118
r1bD154     2       15        14.50      1484    72
r1bD200     3       20        42.26      1971    126
r1bD201     3       20        106.99     1018    216
r1bD202     3       20        131.70     1032    257
r1bD203     3       20        34.09      1902    94
r1bD204     3       20        79.16      1606    188
r1D100      4       10        25.82      247     148
r1D101      4       10        12.93      389     74
r1D102      4       10        31.57      751     144
r1D103      4       10        4.64       280     38
r1D104      4       10        2.76       2499    27
r1D150      5       15        69.84      5452    166
r1D151      5       15        58.97      3721    184
r1D152      5       15        125.89     3010    298
r1D153      5       15        60.30      1595    166
r1D154      5       15        28.21      2701    93
r1D200      6       20        177.33     2327    251
r1D201      6       20        230.48     1759    300
r1D202      6       20        124.74     2239    204
r1D203      6       20        141.59     3065    217
r1D204      6       20        132.89     2011    208
Appendix C
CP Optimisation Results
Table C.1: Solve times and number of iterations for constraint programming optimisation and optimisation with intermediate solutions.

                              CP Optimisation        CP Optimisation with Intermediates
Instance    Group   # Crops   Time (s)   Iterations  Time (s)   Iterations
r1bD100     1       10        2.3        15          2.3        9
r1bD101     1       10        38.1       65          18.1       26
r1bD102     1       10        40.2       78          31.8       37
r1bD103     1       10        4.5        30          6.7        24
r1bD104     1       10        15.1       35          14.2       22
r1bD150     2       15        288.8      140         122.5      48
r1bD151     2       15        243.4      162         114.2      56
r1bD152     2       15        186.2      126         93.0       50
r1bD153     2       15        129.4      116         76.7       52
r1bD154     2       15        46.7       64          35.3       40
r1bD200     3       20        250.2      140         127.1      60
r1bD201     3       20        468.7      198         230.9      69
r1bD202     3       20        951.2      251         330.4      66
r1bD203     3       20        146.9      95          82.0       38
r1bD204     3       20        455.7      186         164.6      53
r1D100      4       10        83.2       130         36.8       47
r1D101      4       10        18.8       67          19.1       49
r1D102      4       10        88.2       130         39.8       49
r1D103      4       10        6.4        36          3.6        18
r1D104      4       10        3.3        28          2.4        17
r1D150      5       15        180.9      163         84.2       71
r1D151      5       15        309.2      198         107.8      61
r1D152      5       15        541.9      289         199.8      90
r1D153      5       15        157.9      152         95.8       83
r1D154      5       15        64.2       85          38.1       48
r1D200      6       20        699.1      223         257.1      67
r1D201      6       20        1251.1     304         336.3      72
r1D202      6       20        356.0      171         154.2      67
r1D203      6       20        625.4      208         233.6      60
r1D204      6       20        411.9      172         232.8      66
Appendix D
Pricing Problem Solve Times when Optimising
Table D.1: Average pricing problem solve times: constraint programming optimisation and optimisation with intermediate solutions

                              CP Optimisation   CP Optimisation with Intermediates
Instance    Group   # Crops   Time (s)          Time (s)
r1bD100     1       10        0.08              0.12
r1bD101     1       10        0.39              0.30
r1bD102     1       10        0.32              0.34
r1bD103     1       10        0.06              0.10
r1bD104     1       10        0.29              0.32
r1bD150     2       15        1.50              1.32
r1bD151     2       15        0.94              0.94
r1bD152     2       15        1.02              0.99
r1bD153     2       15        0.71              0.62
r1bD154     2       15        0.43              0.38
r1bD200     3       20        1.13              1.03
r1bD201     3       20        1.53              1.53
r1bD202     3       20        2.63              2.51
r1bD203     3       20        1.04              1.10
r1bD204     3       20        1.65              1.60
r1D100      4       10        0.35              0.29
r1D101      4       10        0.13              0.14
r1D102      4       10        0.49              0.32
r1D103      4       10        0.04              0.07
r1D104      4       10        0.03              0.02
r1D150      5       15        0.56              0.44
r1D151      5       15        0.89              0.72
r1D152      5       15        0.97              0.80
r1D153      5       15        0.51              0.40
r1D154      5       15        0.35              0.26
r1D200      6       20        1.75              1.57
r1D201      6       20        2.54              2.15
r1D202      6       20        1.13              0.92
r1D203      6       20        1.84              1.64
r1D204      6       20        1.35              1.38
Appendix E
CP Multiple Column Results
Table E.1: Results of constraint programming adding multiple columns at each iteration.
(Time-p / Iter-p denote solve time in seconds and number of iterations when p columns are added in each iteration.)

Instance  Group  # Crops  Time-5  Iter-5  Time-10  Iter-10  Time-15  Iter-15  Time-20  Iter-20  Time-30  Iter-30
r1bD100   1      10       3       14      4        11       6        11       6        10       11       10
r1bD101   1      10       260     164     203      95       234      80       229      67       250      52
r1bD102   1      10       114     103     91       61       107      51       111      43       139      37
r1bD103   1      10       15      37      14       24       14       18       18       18       21       13
r1bD104   1      10       61      71      58       44       60       35       77       33       101      30
r1bD150   2      15       989     210     820      132      810      98       1055     99       956      71
r1bD151   2      15       611     196     442      114      538      98       469      76       553      64
r1bD152   2      15       376     151     368      101      415      86       334      63       402      53
r1bD153   2      15       278     139     322      102      330      81       335      68       417      58
r1bD154   2      15       116     89      117      61       121      47       107      36       114      27
r1bD200   3      20       390     138     444      101      443      79       420      64       477      52
r1bD201   3      20       1209    246     1108     164      1391     147      1291     112      1369     95
r1bD202   3      20       3454    394     2890     233      2885     190      3654     166      3660     140
r1bD203   3      20       268     108     254      69       353      67       335      53       398      45
r1bD204   3      20       872     205     956      144      1125     120      1019     98       1332     87
r1D100    4      10       355     205     400      144      440      119      314      84       297      60
r1D101    4      10       50      84      48       53       49       42       51       37       57       30
r1D102    4      10       168     142     118      81       142      71       167      64       162      49
r1D103    4      10       10      30      12       27       15       24       29       23       17       17
r1D104    4      10       5       21      8        18       10       18       9        15       12       13
r1D150    5      15       312     162     304      107      321      86       287      68       330      56
r1D151    5      15       944     272     731      162      798      132      882      118      821      86
r1D152    5      15       1309    323     1035     194      1150     165      1143     137      830      87
r1D153    5      15       249     140     178      77       174      60       163      50       198      41
r1D154    5      15       75      69      83       47       86       39       75       32       115      28
r1D200    6      20       1267    248     1274     164      1205     129      1189     107      1957     108
r1D201    6      20       2260    292     2277     192      2006     155      2264     133      2604     115
r1D202    6      20       663     190     587      122      548      91       571      80       652      63
r1D203    6      20       916     199     839      131      959      106      883      88       977      71
r1D204    6      20       802     193     594      111      658      95       606      75       817      69
Appendix F
Results of CP Multiple Columns with Restarted Search
Table F.1: Results of constraint programming adding multiple columns at each iteration with restarted search.
                                          Number of Columns Added
                                 5               10               20               30
Instance   Group  # Crops  Time (s) Iter.  Time (s) Iter.  Time (s) Iter.  Time (s) Iter.
r1bD100      1      10          5     15        7     13        9     12       14     12
r1bD101      1      10        199     74      240     41      240     24      221     18
r1bD102      1      10        229     68      318     43      293     26      355     20
r1bD103      1      10         24     37       25     24       28     18       38     16
r1bD104      1      10        294     49      295     33      231     25      294     22
r1bD150      2      15       4984    117     5427     65     5773     37     6723     28
r1bD151      2      15       1240    141     1191     83     1259     53     1357     37
r1bD152      2      15        858    118      712     64      946     39      775     30
r1bD153      2      15        978     92     1127     51     1079     30      946     23
r1bD154      2      15        179     57      166     34      149     19      171     16
r1bD200      3      20        900    103      777     56      821     38      933     29
r1bD201      3      20       2772    167     2909     89     3038     51     3728     39
r1bD202      3      20      11934    217    14020    117    14708     68    12646     50
r1bD203      3      20        916     72      900     44     1083     26     1401     20
r1bD204      3      20       4106    142     3448     77     4728     44     6020     31
r1D100       4      10        296    118      265     61      296     37      393     29
r1D101       4      10         25     36       17     21       17     13       19     11
r1D102       4      10        188    104      169     63      210     42      232     34
r1D103       4      10         13     26       13     20       16     17       23     15
r1D104       4      10          7     16        6     11        8      8       10      8
r1D150       5      15        328    129      270     77      294     52      366     43
r1D151       5      15       1174    162     1039     83      891     47      788     32
r1D152       5      15        977    232      828    135      651     77      902     62
r1D153       5      15        215     94      219     63      261     44      321     38
r1D154       5      15        135     51      130     30      176     20      149     16
r1D200       6      20       5730    168     4327     92     4962     50     6750     36
r1D201       6      20      11099    190    11648    104    20521     60    16896     42
r1D202       6      20        761    126      695     69      930     42     1059     33
r1D203       6      20       4568    152     5316     81     4109     46     8640     34
r1D204       6      20       1621    126     1607     69     2407     42     2458     33
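The parameter varied across the columns of Table F.1 is how many negative-reduced-cost columns the pricing step contributes to the restricted master per iteration (5, 10, 20 or 30); as the table shows, iteration counts fall as more columns are added at once. The control flow can be sketched as follows. This is a toy Python sketch only: the master-LP solve and the constraint-programming pricing subproblem are replaced by a stub (`make_toy_pricing`, with invented reduced costs), so it illustrates the loop structure rather than the thesis's implementation.

```python
# Schematic column-generation driver: each iteration asks the pricing
# routine for candidate columns and adds at most `max_new_cols` of the
# negative-reduced-cost ones to the restricted master.

def make_toy_pricing(n_columns=40):
    """Stub pricing subproblem (hypothetical). A real implementation would
    solve the CP pricing model against the current dual values; here every
    column not yet in the pool simply keeps a fixed negative reduced cost."""
    universe = {i: -1.0 - (i % 3) for i in range(n_columns)}  # toy reduced costs
    def price(current):
        added = {cid for _, cid in current}
        return [(rc, cid) for cid, rc in universe.items() if cid not in added]
    return price

def column_generation(price, max_new_cols, max_iters=1000):
    """Add up to max_new_cols improving columns per iteration; stop when
    pricing finds no negative-reduced-cost column (restricted master is
    LP-optimal). Returns the column pool and the iteration count."""
    columns = []
    for iteration in range(1, max_iters + 1):
        improving = sorted(c for c in price(columns) if c[0] < 0)
        if not improving:
            return columns, iteration
        columns.extend(improving[:max_new_cols])  # most negative first
    return columns, max_iters

price = make_toy_pricing(40)
_, it5 = column_generation(price, 5)    # many cheap iterations
_, it30 = column_generation(price, 30)  # few, heavier iterations
```

Adding more columns per round trades fewer master-LP re-solves against a larger restricted master, which matches the time/iteration trade-off visible in the table.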
Bibliography
Achterberg, T., Koch, T., and Martin, A. (2005). Branching rules revisited. Operations
Research Letters, 33(1):42–54.
Albrecht, H. (2003). Suitability of arable weeds as indicator organisms to evaluate species
conservation effects of management in agricultural ecosystems. Agriculture, Ecosystems
& Environment, 98(1):201–211.
Bachinger, J. and Zander, P. (2007). ROTOR, a tool for generating and evaluating crop
rotations for organic farming systems. European Journal of Agronomy, 26(2):130–143.
Beldiceanu, N., Carlsson, M., and Rampon, J.-X. (2005). Global constraint catalog.
Berbeglia, G. (2009). Complexity analyses and algorithms for pickup and delivery problems. PhD thesis, HEC Montreal.
Berthold, T., Heinz, S., Lübbecke, M. E., Möhring, R. H., and Schulz, J. (2010). A
constraint integer programming approach for resource-constrained project scheduling.
In Integration of AI and OR Techniques in Constraint Programming for Combinatorial
Optimization Problems, pages 313–317. Springer.
Bixby, R. E. (2002). Solving real-world linear programs: A decade and more of progress.
Operations Research, 50(1):3–15.
Bjørndal, T., Herrero, I., Newman, A., Romero, C., and Weintraub, A. (2012). Operations
research in the natural resource industry. International Transactions in Operational
Research, 19(1-2):39–62.
Carravilla, M., Oliveira, J., et al. (2013). Operations research in agriculture: Better
decisions for a scarce and uncertain world. AGRIS on-line Papers in Economics and
Informatics, 2:37–46.
Cornuéjols, G. (2008). Valid inequalities for mixed integer linear programs. Mathematical
Programming, 112(1):3–44.
Costa, A. M., dos Santos, L. M. R., Alem, D. J., and Santos, R. H. (2014). Sustainable
vegetable crop supply problem with perishable stocks. Annals of Operations Research,
219(1):265–283.
CPLEX, I. I. (2009). V12.1: User's manual for CPLEX. International Business Machines
Corporation, 46(53):157.
Dantzig, G. B., Orden, A., Wolfe, P., et al. (1955). The generalized simplex method
for minimizing a linear form under linear inequality restraints. Pacific Journal of
Mathematics, 5(2):183–195.
Demassey, S., Pesant, G., and Rousseau, L.-M. (2005). Constraint programming based
column generation for employee timetabling. In Integration of AI and OR Techniques
in Constraint Programming for Combinatorial Optimization Problems, pages 140–154.
Springer.
Demassey, S., Pesant, G., and Rousseau, L.-M. (2006). A cost-regular based hybrid
column generation approach. Constraints, 11(4):315–333.
Detlefsen, N. K. and Jensen, A. L. (2007). Modelling optimal crop sequences using
network flows. Agricultural Systems, 94(2):566–572.
Dogliotti, S., Rossing, W., and Van Ittersum, M. (2003). ROTAT, a tool for systematically
generating crop rotations. European Journal of Agronomy, 19(2):239–250.
El-Nazer, T. and McCarl, B. A. (1986). The choice of crop rotation: A modeling approach
and case study. American Journal of Agricultural Economics, 68(1):127–136.
Fischetti, M., Glover, F., and Lodi, A. (2005). The feasibility pump. Mathematical
Programming, 104(1):91–104.
Grönkvist, M. (2006). Accelerating column generation for aircraft scheduling using constraint propagation. Computers & Operations Research, 33(10):2918–2934.
Gualandi, S. and Malucelli, F. (2012). Exact solution of graph coloring problems via
constraint programming and column generation. INFORMS Journal on Computing,
24(1):81–100.
Gualandi, S. and Malucelli, F. (2013). Constraint programming-based column generation.
Annals of Operations Research, 204(1):11–32.
Hamza, M. and Anderson, W. (2005). Soil compaction in cropping systems: a review of
the nature, causes and possible solutions. Soil and Tillage Research, 82(2):121–145.
Haneveld, W. and Stegeman, A. W. (2005). Crop succession requirements in agricultural
production planning. European Journal of Operational Research, 166(2):406–429.
Heady, E. O. (1954). Simplified presentation and logical aspects of linear programming
technique. Journal of Farm Economics, 36(5):1035–1048.
Hildreth, C. and Reiter, S. (1951). On the choice of a crop rotation plan. In T. C.
Koopmans, editor, Proceedings of the Conference on Linear Programming held in Chicago,
1949, pages 177–188.
Horrigan, L., Lawrence, R. S., and Walker, P. (2002). How sustainable agriculture can
address the environmental and human health harms of industrial agriculture. Environmental Health Perspectives, 110(5):445.
Jones, J. W., Hoogenboom, G., Porter, C. H., Boote, K. J., Batchelor, W. D., Hunt,
L., Wilkens, P. W., Singh, U., Gijsman, A. J., and Ritchie, J. T. (2003). The DSSAT
cropping system model. European Journal of Agronomy, 18(3):235–265.
Junker, U., Karisch, S. E., Kohl, N., Vaaben, B., Fahle, T., and Sellmann, M. (1999).
A framework for constraint programming based column generation. In Principles and
Practice of Constraint Programming–CP’99, pages 261–274. Springer.
Kantorovich, L. V. (1960). Mathematical methods of organizing and planning production.
Management Science, 6(4):366–422.
Laborie, P. and Rogerie, J. (2008). Reasoning with conditional time-intervals. In FLAIRS
Conference, pages 555–560, Gentilly Cedex, France.
Laborie, P., Rogerie, J., Shaw, P., and Vilim, P. (2009). Reasoning with conditional
time-intervals. Part II: An algebraical model for resources. In FLAIRS Conference,
pages 201–206, Gentilly Cedex, France.
Land, A. H. and Doig, A. G. (1960). An automatic method of solving discrete programming problems. Econometrica: Journal of the Econometric Society, 28:497–520.
Lawler, E. L. and Wood, D. E. (1966). Branch-and-bound methods: A survey. Operations
Research, 14(4):699–719.
Leteinturier, B., Herman, J., Longueville, F. d., Quintin, L., and Oger, R. (2006). Adaptation of a crop sequence indicator based on a land parcel management system. Agriculture, Ecosystems & Environment, 112(4):324–334.
Lodi, A. (2010). Mixed integer programming computation. In 50 Years of Integer Programming 1958-2008, pages 619–645. Springer.
Lübbecke, M. E. and Desrosiers, J. (2005). Selected topics in column generation. Operations Research, 53(6):1007–1023.
Marcroft, S., Sprague, S., Pymer, S., Salisbury, P., and Howlett, B. (2004). Crop isolation, not extended rotation length, reduces blackleg (Leptosphaeria maculans) severity of canola (Brassica napus) in south-eastern Australia. Animal Production Science,
44(6):601–606.
McBratney, A., Whelan, B., Ancev, T., and Bouma, J. (2005). Future directions of
precision agriculture. Precision Agriculture, 6(1):7–23.
Mudgal, S., Lavelle, P., Cachia, F., Somogyi, D., Majewski, E., Fontain, L., Bechini, L.,
and Debaeke, P. (2010). Environmental impacts of different crop rotations in the European Union. Technical report, European Commission (DG Env), 20-22 Villa Deshayes
- 75014 Paris - France.
OPL, I. I. (2009). V6.3. IBM ILOG OPL Language User's Manual, IBM Corporation.
Optimizer, I. I. C. C. (2014). V12.6. IBM ILOG CPLEX Optimization Studio CP
Optimizer User's Manual, IBM Corporation.
Pisinger, D. and Sigurd, M. (2007). Using decomposition techniques and constraint
programming for solving the two-dimensional bin-packing problem. INFORMS Journal
on Computing, 19(1):36–51.
Rodriguez, J. (2007). A constraint programming model for real-time train scheduling at
junctions. Transportation Research Part B: Methodological, 41(2):231–245.
Rousseau, L.-M., Gendreau, M., Pesant, G., and Focacci, F. (2004). Solving vrptws with
constraint programming based column generation. Annals of Operations Research,
130(1-4):199–216.
Sadykov, R. and Wolsey, L. A. (2006). Integer programming and constraint programming
in solving a multimachine assignment scheduling problem with deadlines and release
dates. INFORMS Journal on Computing, 18(2):209–217.
Santos, L. M., Munari, P., Costa, A. M., and Santos, R. H. (2015). A branch-price-and-cut
method for the vegetable crop rotation scheduling problem with minimal plot sizes.
European Journal of Operational Research, 245(2):581–590.
Santos, L. M. R., Costa, A. M., Arenales, M. N., and Santos, R. H. S. (2010). Sustainable
vegetable crop supply problem. European Journal of Operational Research, 204(3):639–
647.
Santos, L. M. R., Michelon, P., Arenales, M. N., and Santos, R. H. S. (2008). Crop rotation
scheduling with adjacency constraints. Annals of Operations Research, 190(1):165–180.
Savelsbergh, M. W. (1994). Preprocessing and probing techniques for mixed integer
programming problems. ORSA Journal on Computing, 6(4):445–454.
Sellmann, M., Zervoudakis, K., Stamatopoulos, P., and Fahle, T. (2002). Crew assignment
via constraint programming: integrating column generation and heuristic tree search.
Annals of Operations Research, 115(1-4):207–225.
Stöckle, C. O., Donatelli, M., and Nelson, R. (2003). CropSyst, a cropping systems
simulation model. European Journal of Agronomy, 18(3):289–307.
Tilman, D., Cassman, K. G., Matson, P. A., Naylor, R., and Polasky, S. (2002). Agricultural sustainability and intensive production practices. Nature, 418(6898):671–677.
Van Hoeve, W.-J., Pesant, G., and Rousseau, L.-M. (2006). On global warming: Flow-based soft global constraints. Journal of Heuristics, 12(4-5):347–373.
Yunes, T. H., Moura, A. V., and de Souza, C. C. (2000). Solving very large crew scheduling problems to optimality. In Proceedings of the 2000 ACM Symposium on Applied
Computing - Volume 1, pages 446–451. ACM.