Implementing the Flow-Covering Location-Allocation
Model with Geographic Information Systems
Daniel Turner
Master’s Project
Submitted in partial satisfaction of the requirements for the degree of
Master in Geographic Information Sciences
Introduction
Despite burgeoning commercial and cross-disciplinary academic interest, as well as the
relatively long existence of a transportation-oriented sub-discipline, GIS-T, network
analysis capabilities remain relatively under-developed in GIS (Miller 1999). Out-of-the-box network analysis extensions in ArcGIS 9.1, for example, are largely limited to
elementary utility network modeling and shortest path calculations. The reasons for this
are manifold, but certainly major obstacles include the difficulty in collecting data (such
as precise traffic flow data) and the computational resources required to produce analytic
results.1 Network location problems, for example, face serious issues in how to aggregate
demand, how to identify feasible sites, the viability of the different solution
methodologies, and devising the data structures necessary to facilitate a range of different
potential applications efficiently (Church and Sorenson 1994). This project addresses the
latter two problems: solution methodologies and, to a lesser degree, data structures. It
implements a flow-covering location-allocation model (FCLM) in ArcGIS 9.1 using
variants of a greedy heuristic which are known to be efficient for FCLM problems.
The FCLM locates a set of facilities at nodes on a network such that the maximum
amount of flow moving across that network is “covered” by a facility or, put another
way, moves through a node with a facility. Practical applications of the FCLM include
retail store and billboard placement, evaluation of communication network robustness,
evaluation of hazardous waste transportation risk, location of speed traps, placement of
medical facilities across evacuation routes, and location of pipeline monitoring stations
(Goodchild and Noronha 1987; Hodgson 1990; Hodgson, Rosing et al. 1996). A related
set of models, network flow interdiction models, has a long history of use in the military
as a means of choosing targets for reducing the capacity of transportation and
communication networks (Cormican, Morton et al. 1998). Another related set of models
are flow interception models, which attempt to maximally cover flows as close to their
inception point as possible (Berman, Hodgson et al. 1995).
The FCLM is considered a special case of the maximal covering location model
(MCLM), to which it is structurally similar. The MCLM locates a set of facilities that
can cover or satisfy demand on a network within a certain “service distance” such that the
maximum amount of demand is satisfied. Both models fall within a class of complexity
problems known as NP (non-deterministic polynomial time), which can be solved using
linear programming techniques but which require prohibitive amounts of processing time
to solve optimally for all but the smallest data sets. The MCLM is known to be NP-hard,
so the same can be expected of the FCLM (Hodgson 1990). Selection of a heuristic
procedure which can quickly estimate a good (though not necessarily optimal) solution is
therefore of particular importance when considering the application of GIS technology to
solving flow covering problems.

1 However, many problems with data collection are becoming easier; GPS and RFID technologies can track
vehicles, enabling the gathering of more accurate traffic data, for example.
This paper will address these issues by discussing the FCLM in more detail, then
providing a literature review of the formulations of various covering problems, the
applications of those problems, and the solution methods and solution difficulty for real-world problem instances. This is followed by the presentation of a dataset to which both
optimal and heuristic solution procedures are applied. A description of the ArcGIS 9.1
implementation is provided. Conclusions and a discussion of possible future research
avenues follow.
Research Goals
This project implements the FCLM in an add-on application for ArcGIS 9.1. It can
utilize one of two variants of a greedy algorithm to obtain solutions. While such
heuristics can be expected to produce sub-optimal solutions, they can be calculated
relatively quickly and can often produce solutions that are “good enough” for many
applications. The solutions calculated using the tool developed here are compared with
optimal results obtained from ILOG CPLEX, an industry-standard linear programming
application, in order to evaluate the effectiveness of the heuristics and the
implementation.
The FCLM presents some unique problems compared to the more well-known MCLM.
In the MCLM, systems of facilities are located based on the demand capable of being
allocated to the particular configuration of locations. In the MCLM and most models like
it, demand is expressed as a weight at the nodes. However, this is not always the case.
The FCLM addresses location-allocation problems where demand is expressed as flows.
From an implementation standpoint, this requires some tailoring. Since flows can run
through multiple nodes, the application must keep track of which flows run through
which nodes.
Literature Review
The literature surrounding the FCLM that pertains to this research falls broadly into three
categories: 1) the formulation of the FCLM itself and its predecessor, the MCLM, 2)
solution procedures for the FCLM and MCLM, and 3) the implementation of
combinatorially complex location models in GIS.
Formulations of the FCLM and MCLM
Church and ReVelle (Church and ReVelle 1974) present their seminal formulation of the
MCLM, a model that locates a fixed number of facilities with a defined service distance
for covering the maximum population (either with or without a minimum service distance
constraint). A greedy adding and a greedy adding with substitution algorithm are
considered and compared to an optimal linear programming solution.
Hodgson (Hodgson 1990) defined his formulation of the FCLM as a special case of the
MCLM. A basic linear programming algorithm and both cannibalizing and non-cannibalizing
greedy heuristics are considered, with the non-cannibalizing heuristic noted as
being highly efficient. The article also points out how significant the issue of flow
cannibalization is to the application of the model, as in some cases redundant flow
capturing is beneficial.
Berman and Krass (Berman and Krass 2002) present a generalization of the maximal
cover location problem, where degrees of coverage can be designated based on other
values such as distance to the nearest facility. This paper marks an effort to formulate a
more generalized version of the MCLM.
Church and ReVelle (Church and ReVelle 1976) show that the MCLM can be structured
and solved as a p-median problem. They demonstrate how solution techniques for p-median problems can be applied to MCLM problems by essentially editing the distance
matrix used in the formulation. They argue this correspondence between the two
problems suggests that all location problems can, in fact, be structured as p-median
problems.
Chung (Chung 1986) discusses the viability of some non-traditional applications of the
MCLM outside of location problems. His proposals include data abstraction, cluster
identification and analysis, and quantitative classification/categorization.
Solution Procedures
Adenso-Diaz and Rodriguez (Adenso-Diaz and Rodriguez 1997) describe the use of the
TABU search heuristic in locating facilities under differing objective functions. They
conclude the heuristic to be effective even with small TABU tenures (the range the stair-climbing method can search to avoid becoming captured in local optima). While the
results are not optimal, they conclude the low computing time using TABU search
validates use of the heuristic.
Berman and Krass (Berman and Krass 2002) evaluate a number of integer programming
heuristics for location problems, demonstrating that heuristic methods usually produce
optimal or near-optimal solutions, especially as the number of selected facilities
increases.
Current and Schilling (Current and Schilling 1990) discuss the difficulties associated with
aggregating demand data in p-median problems. Models like the MCLM are shown to be
sensitive to small changes in source-demand distances, so aggregation processes can
introduce significant error into the data and solution. They propose measures for
mitigating these aggregation problems and, particularly, describe a procedure for
aggregating demand data to eliminate the loss of locational information while minimizing
solution time.
Implementing Location Problems in GIS
Church and Sorenson (Church and Sorenson 1994) discuss the problems of integrating
location models into GIS. They point out that these problems fall into four major
categories: how to aggregate demand, how to identify feasible sites, the viability of
different solution methodologies, and the data structures necessary to facilitate a range of
different potential applications. They focus, however, primarily on an evaluation of
heuristic methods, concluding that Teitz and Bart (Teitz and Bart 1968) or the GRIA
heuristic (Densham and Rushton 1992) are the best candidates.
Hodgson, Rosing, and Zhang (Hodgson, Rosing et al. 1996) consider a practical
application of the FCLM in locating vehicle inspection stations on a transportation
network. They differentiate a punitive inspection approach (which would use MCLM)
versus a preventive inspection approach (FCLM). They implement the FCLM using a
mixed integer program method and compare it to a greedy heuristic. The greedy heuristic
is found to be insufficiently robust for the example, so other heuristics are suggested,
including a vertex substitution heuristic (Teitz and Bart 1968).
Pirkul and Schilling (Pirkul and Schilling 1991) propose a capacitated maximal covering
location problem in which workload limit constraints are placed on facilities in the
model. They demonstrate that such a constraint can significantly affect location
assignment, and develop a solution procedure for the variant problem based on
Lagrangian relaxation.
Revelle, Schweitzer, and Snyder (ReVelle, Schweitzer et al. 1996) present three
variations of covering location problems, two variations of a Maximal Conditional
Covering Problem (MCCP I and II) and the Multi-objective Conditional Covering
Problem (MOCCP). MCCP I and II are covering problems where a goal of the
formulation is to maximize the number of facilities with supporting coverage subject to a
constraint on the number of facilities. MCCP I does not allow more than one facility to
be sited on a node, whereas MCCP II allows multiple facilities at the same node.
MOCCP calculates an optimum compromise between primary and supporting coverage.
Maximal Covering Location Model
The MCLM from which the FCLM is derived is defined as follows (Church and ReVelle
1974):
Maximize z = Σ_{i ∈ I} a_i y_i    (that is, satisfy as much demand as possible)

subject to:

Σ_{j ∈ N_i} x_j ≥ y_i for all i ∈ I    (that is, demand is satisfied at location i if there is
at least one facility at a location j in the set of locations N_i)

Σ_{j ∈ J} x_j = P    (that is, only P facilities are located)

x_j ∈ {0, 1} for all j ∈ J
y_i ∈ {0, 1} for all i ∈ I

Where:
I = the set of demand nodes;
J = the set of facility sites;
S = the distance beyond which a demand point is considered “uncovered” (the value of S
can be chosen differently for each demand point if desired);
d_ij = the shortest distance from node i to node j;
x_j = a binary variable,
  = 1 if a facility is allocated to site j,
  = 0 otherwise;
N_i = { j ∈ J | d_ij ≤ S } (the set of facility sites j capable of satisfying demand at location
i);
a_i = the population to be serviced from demand node i;
y_i = a binary variable,
  = 1 if demand can be satisfied at site i,
  = 0 if not;
P = the number of facilities to be located.
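To make the notation concrete, the coverage sets N_i and the covered-demand objective can be sketched in a few lines. This is an illustrative Python sketch, not part of the project's C# implementation; the distance matrix, service distance, and function names are hypothetical.

```python
# Hypothetical sketch: building the MCLM coverage sets N_i from a
# shortest-distance matrix and evaluating covered demand for a
# candidate facility set X.

def coverage_sets(d, S):
    """N_i = the set of facility sites j with d[i][j] <= S."""
    return {i: {j for j, dij in enumerate(row) if dij <= S}
            for i, row in enumerate(d)}

def covered_demand(a, N, X):
    """Sum a_i over demand nodes whose coverage set intersects X."""
    return sum(a[i] for i in N if N[i] & X)

# Toy 3-node distance matrix (demand nodes x candidate sites).
d = [[0, 4, 9],
     [4, 0, 5],
     [9, 5, 0]]
N = coverage_sets(d, S=5)
# With populations a = [10, 20, 30], a single facility at site 1
# covers all three demand nodes under this service distance.
```

A full MCLM solver would, of course, search over facility sets X subject to |X| = P; the sketch only evaluates the objective for a given X.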
Flow Covering Location Model
The FCLM is very similar to the MCLM, the major difference being that set N is
composed of the set of nodes capable of covering a given demand node in the MCLM,
and in the FCLM set N is the set of nodes capable of capturing a given flow between two
nodes. The flow capturing location-allocation model is defined as follows (Hodgson
1990):
Maximize z = Σ_{q ∈ Q} f_q y_q    (that is, capture as much flow as possible)

subject to:

Σ_{k ∈ N_q} x_k ≥ y_q for all q ∈ Q    (that is, flow on path q is captured if there is at
least one facility at a node k on path q)

Σ_{k ∈ K} x_k = p    (that is, only p facilities are located)

y_q ∈ {0, 1} for all q ∈ Q
x_k ∈ {0, 1} for all k ∈ K

Where:
q is a particular origin-destination (OD) pair;
Q is the set of all OD pairs;
f_q is the flow between OD pair q;
y_q is a binary variable,
  = 1 if f_q is captured,
  = 0 if not;
k indicates a potential facility location;
K is the set of all potential facility locations;
x_k is a binary variable,
  = 1 if there is a facility at location k,
  = 0 if not;
N_q is the set of nodes capable of capturing f_q (that is, the set of nodes on path q
between its origin and destination);
p = the number of facilities to be located.
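The objective is straightforward to evaluate once the sets N_q are known. The sketch below (illustrative Python, not the project's implementation; the paths and flow values are invented toy data) computes z for a given facility set.

```python
# Hypothetical sketch: evaluating the FCLM objective for a facility set.
# Each OD pair q maps to (f_q, N_q): its flow volume and the set of
# nodes on its shortest path.
paths = {
    ("A", "D"): (100, {"A", "B", "C", "D"}),
    ("B", "E"): (60,  {"B", "C", "E"}),
    ("A", "E"): (40,  {"A", "B", "E"}),
}

def captured_flow(paths, facilities):
    """z = sum of f_q over OD pairs whose path meets the facility set."""
    return sum(f for f, nq in paths.values() if nq & facilities)

# A single facility at node "B" lies on all three paths, so it
# captures the full 200 units of flow in this toy instance.
z = captured_flow(paths, {"B"})
```

Note that each flow counts at most once here no matter how many selected facilities lie on its path, matching the binary y_q in the formulation.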
Applications
The FCLM can be applied to some problems which can be reduced to a network structure
with flows moving across that network. Many scheduling and workflow problems, for
example, might provide potential applications. In the domain of GIS, most FCLM
applications have to do with retail store placement. Goodchild and Noronha, for
example, discuss the utility of the FCLM in locating gas stations (Goodchild and
Noronha 1987). Other examples include the location of convenience stores, automatic
teller machines, and other businesses that rely on impulse or “convenience” sales.
Berman, Hodgson, and Krass discuss in detail a scenario in which vehicle inspection
stations are located on a network such that a maximal amount of the total network flow
can be inspected (Berman, Hodgson et al. 1995). Speculatively, the FCLM could be
applied to locating medical facilities along evacuation routes and monitoring stations on
pipelines.
Solution Procedures
One major obstacle to the implementation of more extensive network analysis
capabilities in GIS is that many such functions are prohibitively time-consuming to
calculate optimally on networks of real-world size—even on present-day processors.
However, heuristic procedures can be used that can produce relatively efficient solutions.
Optimal solution procedures
Enumeration
The location models discussed here, the FCLM and MCLM, are among a class of
complexity problems in which the number of possible permutations will be:
C(n, p) = n! / (p! (n - p)!)
where n = the number of potential facility locations and p = the number of facilities one
wants to locate. For even modestly sized networks, the number of permutations becomes
so large that a brute force enumeration method which simply checks every possible
arrangement of facility sites is impractical.
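To make the growth concrete, this count can be evaluated directly with Python's `math.comb`; the node counts below mirror the 72-node and 243-node test networks used later in this paper.

```python
# How many facility-site arrangements a brute-force enumeration
# would have to check for p = 7 facilities.
from math import comb

n_72 = comb(72, 7)    # about 1.47 billion arrangements on 72 nodes
n_243 = comb(243, 7)  # thousands of times more on 243 nodes
```

Even at millions of evaluations per second, exhaustively checking every arrangement on the larger network is out of reach, which motivates the LP and heuristic approaches below.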
Linear Programming
The development of linear programming (LP) techniques since the 1950s has enabled
more efficient computation of many combinatorially complex problems like the FCLM.
LP requires that the problem be reduced to a set of linear functions. The standard form
for LP problems requires an objective function, which is a function to be maximized or
minimized, along with a set of linear constraint functions. For simple problems this can
be represented graphically (Figure 1).
Figure 1: LP Feasible Region
In this example, four constraint functions, the two non-negativity constraints and the two
sloped functions, define the convex hull of a feasible region of possible x1 and x2
combinations. Corners of the feasible region are called feasible solutions.
Perhaps the most common method of solving LP problems is the simplex algorithm. In
effect, the simplex algorithm can be thought to move along the boundaries of the feasible
region, evaluating each adjacent corner or feasible solution against the objective function,
and returning the x1, x2 combination that results in the highest (or lowest) result in the
objective function. In enumeration, each combination of x1 and x2 in the feasible region
must be inspected, whereas, using simplex, only four (or fewer) combinations are evaluated
in this example.
Clearly, LP procedures like simplex are much more efficient than enumeration.
However, location problems like the FCLM still have an n-dimensional feasible region,
so while LP can reduce the number of permutations that are evaluated, it does not reduce
the problems’ inherent combinatorial complexity. Similarly, further refinements of
simplex, such as a branch and bound procedure that divides the feasible region into
smaller regions (branching) and determines the upper and lower bounds of the objective
function for each sub-region (bounding), can further reduce the solution space (and
facilitate integer solutions), but again do not affect the problems’ complexity.
Heuristic solution procedures
Greedy Heuristics
Greedy heuristics can be understood to be simple solution algorithms that choose a local
optimum at each step of the procedure in the hopes of achieving a global optimum. For
example, a greedy heuristic algorithm for the traveling salesman problem (an NP-hard
problem where a salesman must visit a set of cities while covering a minimal distance)
would select the closest city which has not already been visited. The problem with
greedy heuristics is that they do not operate on all the data and are liable to “commit” to a
solution path that is optimal locally, but not globally.
There are many variants of greedy algorithms, including the GRASP algorithm which
selects randomly from a set of the k best candidates. When k=1, GRASP is identical to a
basic greedy heuristic. When k=2 or greater, GRASP functions as a randomized greedy
heuristic. By collecting results from many searches with this algorithm, the best
permutation can be expected to be a good result (Church and Sorenson 1994).
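The GRASP idea described above can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the project's implementation: candidates are scored by an arbitrary function, and the restricted candidate list simply holds the k best remaining candidates.

```python
# Hypothetical GRASP sketch: at each step, pick randomly among the
# k best-scoring remaining candidates instead of strictly the best.
import random

def grasp_select(candidates, score, p, k=2, seed=0):
    rng = random.Random(seed)
    chosen, pool = [], list(candidates)
    for _ in range(p):
        pool.sort(key=score, reverse=True)
        pick = rng.choice(pool[:k])  # restricted candidate list of size k
        chosen.append(pick)
        pool.remove(pick)
    return chosen

# With k = 1 this reduces to the plain greedy heuristic:
best = grasp_select(range(10), score=lambda n: n, p=3, k=1)
# best == [9, 8, 7]
```

Running the k ≥ 2 variant many times with different seeds and keeping the best result is the "collect results from many searches" strategy described above.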
Interchange Heuristics
Interchange heuristics start with a random feasible solution (i.e. a set of potential facility
locations, or a permutation in the feasible region), evaluate the objective function for
that solution, interchange an element in the solution with one outside it, and
evaluate the objective function again for the new permutation (Teitz and Bart 1968).
Variants of interchange heuristics generally differ on the rules for whether an interchange
is accepted or not, rules that aim to minimize the possibility of the algorithm becoming
caught in local optima without achieving a global optimum. A simple greedy interchange
heuristic, for example, would update the feasible solution whenever a new permutation
with a better objective function value is found. Results can vary depending on the initial
random feasible solution, so use of interchange heuristics often includes multiple runs
seeded with differing initial solutions.
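The greedy interchange just described can be sketched as follows. This is an illustrative Python sketch in the spirit of Teitz and Bart, not the project's code; the objective and data are toy placeholders.

```python
# Hypothetical greedy-interchange sketch: repeatedly apply the best
# single swap between the current solution and the outside set until
# no swap improves the objective (a local optimum).

def greedy_interchange(universe, start, objective, max_rounds=100):
    current = set(start)
    for _ in range(max_rounds):
        best_swap, best_val = None, objective(current)
        for inside in current:
            for outside in universe - current:
                trial = (current - {inside}) | {outside}
                val = objective(trial)
                if val > best_val:
                    best_swap, best_val = (inside, outside), val
        if best_swap is None:
            break  # no single swap improves the objective
        inside, outside = best_swap
        current = (current - {inside}) | {outside}
    return current

# Toy objective: total of the selected numbers.
U = set(range(8))
sol = greedy_interchange(U, start={0, 1}, objective=sum)
# sol == {6, 7}, the best 2-element subset under this objective
```

For a real location problem the objective would be the covered demand or captured flow for the trial facility set, and multiple seeded starts would guard against poor local optima.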
Tabu Heuristics
Tabu heuristics can be thought of as elaborated interchange heuristics in which solution
paths (or permutation subsets) which have already been explored are not reevaluated—
the paths are made tabu (Church and Sorenson 1994). In effect, this alters the local
neighborhood structure of potential solution paths, reducing the possibility of the
algorithm cycling through solutions it has already evaluated and enabling the algorithm to
back out of local optima. Variants of this algorithm can be configured based on the size
of permutation subsets or the number of steps a particular path is “remembered”.
Data
The data used to test and evaluate the application is a network with 72 nodes and 93
arcs (Figure 2). The original source is a TIGER-formatted set of streets in suburban
Dallas from the 2000 U.S. Census Bureau dataset. This vector data is in the ESRI
Geometric Network format, converted from an original shapefile format.
Figure 2: 72-Node Test Network
Flow data consists of a random number between 0 and 100 for each origin-destination
node pair. Flow is separate for each direction for each node pair (that is, flow is
considered separately for flow from node 1 to node 2 and for flow from node 2 to node
1). Flow data is stored as a tab-delimited table with row/column positions assumed to
correspond to OID values in the geometric network.
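A reader could parse a flow matrix in this layout along the following lines. This is a hedged Python sketch, not the project's C# loader; the sample data, zero-flow handling, and function name are assumptions for illustration.

```python
# Hypothetical sketch: reading a tab-delimited flow matrix where row
# position = origin node and column position = destination node.
import io

def read_flow_matrix(fileobj):
    """Return {(origin, destination): flow} for non-zero, non-diagonal entries."""
    flows = {}
    for i, line in enumerate(fileobj):
        for j, cell in enumerate(line.rstrip("\n").split("\t")):
            value = float(cell)
            if value > 0 and i != j:
                flows[(i, j)] = value
    return flows

sample = "0\t12\t7\n3\t0\t0\n9\t4\t0\n"
flows = read_flow_matrix(io.StringIO(sample))
# Direction matters: the flow from node 0 to node 1 (12.0) is stored
# separately from the flow from node 1 to node 0 (3.0).
```

The keys here are 0-based row/column positions; mapping them onto the Geometric Network's OID values would be an additional step.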
Methods
Heuristics
Basic greedy algorithms are known to be very robust for the FCLM, producing
consistently near-optimal results. Hodgson, for example, finds greedy heuristic solutions
never fall below 99.4% of the optimal solution. This is believed to be related to the way
networks structure flows, but the precise mechanism is poorly understood (Hodgson
1990). In any case, these are very good results for a greedy algorithm. This project
implements two variants of a greedy heuristic. The primary difference between the two
variants is how they each handle flow coverage.
The Concept of Cannibalization/Multiple Coverage
Original applications of the FCLM were envisaged for retail store placement, such as gas
stations and convenience stores, businesses which attract sales from passing motorists. A
primary concern in such a scenario might be that placing too many stores along a heavily
traveled corridor would result in diminishing returns as the different stores cannibalized
each other’s sales. This is called flow cannibalization. Here we call it multiple coverage,
since flows covered in one step may be counted again in subsequent selections. The
algorithms used here include both multiple coverage and non-multiple coverage versions.
Multiple Coverage Greedy Heuristic
This heuristic simply chooses the p nodes with the greatest number of flows passing
through them in a step-wise fashion. Ties are resolved randomly. Flows passing through
a node selected in one step may be counted again in a subsequent step, hence there is
“cannibalistic” or multiple coverage.
Non-Multiple Coverage Greedy Heuristic
This heuristic also chooses the p nodes with the greatest number of flows in a step-wise
fashion. However, when a node is selected, all flows that pass through that node are
removed from further consideration in subsequent steps (hence, non-multiple and “non-cannibalistic” coverage).
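Both variants can be sketched in miniature. This is illustrative Python, not the project's C# implementation; the node-to-flow mapping and toy data are hypothetical, and ties are broken by dictionary order here rather than randomly as in the actual tool.

```python
# Hypothetical sketch of the two greedy FCLM variants. `node_flows`
# maps each node to the set of OD pairs whose path passes through it;
# `flows` gives the volume f_q of each OD pair.

def greedy_fclm(node_flows, flows, p, cannibalize):
    node_flows = {n: set(q) for n, q in node_flows.items()}  # work on a copy
    selected = []
    for _ in range(p):
        # pick the node whose remaining flows sum to the most volume
        best = max(node_flows,
                   key=lambda n: sum(flows[q] for q in node_flows[n]))
        selected.append(best)
        captured = node_flows.pop(best)
        if not cannibalize:
            # non-multiple coverage: captured flows leave consideration
            for qs in node_flows.values():
                qs -= captured
    return selected

flows = {"AB": 50, "AC": 30, "BC": 20}
node_flows = {1: {"AB", "AC"}, 2: {"AB", "BC"}, 3: {"AC", "BC"}}
# Multiple coverage simply ranks nodes by total passing flow; the
# non-multiple variant discounts already-captured flows between steps.
```

The only difference between the variants is the flow-removal step, which is exactly the cannibalization distinction discussed above.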
CPLEX: Heuristic vs. Optimal
Results from the heuristic methods are compared against optimal solutions produced by
ILOG CPLEX, an industry-standard LP application. CPLEX uses a simplex algorithm to
perform calculations specified using a scripting language. Below is the script used to
calculate optimal, non-multiple coverage FCLM solutions for this project:
int P = ...;                          // number of facilities to locate
int numsites = ...;                   // number of candidate nodes
range IJRange 1..numsites;
{int} N[IJRange,IJRange] = ...;       // N[i,j]: set of nodes on the path between i and j
int+ f[IJRange,IJRange] = ...;        // f[i,j]: flow between OD pair (i,j)
var int+ x[IJRange] in 0..1;          // x[k] = 1 if a facility is sited at node k
var int+ y[IJRange,IJRange] in 0..1;  // y[i,j] = 1 if the flow between i and j is captured
maximize
sum(i,j in IJRange) (f[i][j] * y[i][j])
subject to {
forall(i,j in IJRange)
sum(k in N[i][j]) x[k] >= y[i][j];
sum(j in IJRange) x[j] = P;
};
Software
The project is implemented in C#. The code is compiled as a COM assembly which can
be registered for use in ArcMap. The user interface is displayed in Figure 3.
Figure 3: GUI
Procedure
Data Input
In the first step, the user selects a data file for the network and a data file for the flow
information. The network must be an ESRI Geometric Network containing only
SimpleEdge and SimpleJunction objects. The flow matrix file consists of a simple tab-delimited text file where the position of the rows (the origin nodes) and columns (the
destination nodes) correspond to the OID + 1 of the nodes in the Geometric Network.
Once a geometric network is assigned, a “cost” field in the network arc dataset can be
selected. Shortest path calculations will be based on the values in this field.
Heuristic Configuration
In this step, the user selects the type of algorithm they wish to apply to the FCLM and
whether they wish to use a multiple coverage or non-multiple coverage variant. They
also select the number of facilities they wish to select in the solution.
Data Output
The user can choose to have solution results output to a text file, a separate window, or as
node selections in ArcMap itself.
Calculation
When all the parameters have been assigned, the user clicks the “Calculate” button. A
simplified flow chart of the procedure is displayed in Figure 4.
The procedure first associates origin-destination pairs with flow volumes from the flow
matrix, calculates the shortest path for each OD pair over the geometric network using
the selected cost field, and associates each node with the OD pairs whose paths pass
through it. It then iteratively selects the node with the greatest flow; under multiple
coverage the selected node is simply removed from the set of candidate nodes, while
under non-multiple coverage all flows that pass through the selected node are also
removed. Selection repeats until the requested number of nodes has been chosen, and
the solution is output.

Figure 4: Processing Flow Chart
Analysis
Using a random, node-to-node flow matrix combined with calculation of the shortest
paths between each origin-destination pair, an initial dataset was generated for the
network (Figure 5). The network structures flows such that some nodes are clearly on
more shortest paths than others.
Figure 5: Test Data Initial Node-Flow Structure
Multiple Coverage Greedy Heuristic
Using the multiple coverage greedy heuristic, the algorithm simply selects nodes
sequentially from the node with the largest number of flows running through it to the
least.
Non-Multiple Coverage Greedy Heuristic
The non-multiple coverage greedy heuristic generates more interesting and complex
results. Again, it operates in a step-wise fashion, selecting the node with the largest
number of flows running through it, removing those flows, and repeating. This can be
seen visually (Figures 6 and 7).
Figure 6: Non-Multiple Coverage Greedy Heuristic p = 1
The first node selected is node 29. All flows which have paths starting or ending at node
29 are removed, as are all flows passing through it. Flows at neighboring nodes are
visibly reduced, but potentially all nodes’ flows are affected.
Figure 7 displays the network after the second node selection, illustrating the process
further.
Figure 7: Non-Multiple Coverage Greedy Heuristic p = 2
Results
Multiple Coverage Greedy Heuristic
Test results using the multiple coverage greedy heuristic were identical to the results
produced by CPLEX configured to produce multiple coverage results, both in terms of
the locations selected and Z values. The solution diverges from the non-multiple
coverage optimal result at p = 3. Since flows are not removed using this heuristic, Z
values comparable to those of the non-multiple coverage optimal or greedy heuristic
algorithms are not possible without reconfiguring the application. However, Hodgson
notes that the multiple coverage variant is consistently outperformed by the non-multiple
coverage algorithm and, indeed, will make selections for some values of p where there is
no increase in the Z value, all flow passing through the chosen location having already
been covered (Hodgson 1990).
Non-Multiple Coverage Greedy Heuristic
The non-multiple coverage greedy heuristic finds the optimal solution for all values of p
from 1 to 7. When p is increased further, the greedy heuristic commits to a different
solution path than the optimal solution produced by CPLEX, although the heuristic
solutions are very near optimal.
At p = 7, the network configuration with the previous selections is displayed in Figure 8.
Figure 8: Non-Multiple Coverage Greedy Heuristic p = 7
At this point, the Z value of the optimization algorithm, literally the number of flows
covered by the first seven selections, is 212604 of 248250 total network flows, so most of
the flows have already been covered. Node 6 has the largest number of flows with 5799
and will be selected by the greedy heuristic. However, the CPLEX optimal solution
selects node 61 with 5171 flows. Here is an example of the greedy heuristic choosing a
local optimum as opposed to a global optimum. Because each selection affects the
structure of flows in the network differently, from that point forward the two algorithms
are calculating solutions on fundamentally different data.2 The different selections at p =
8 will therefore place the two algorithms on different solution paths in subsequent
selections. Indeed, the CPLEX optimal solution achieves 100% flow coverage at p = 34,
as opposed to the non-multiple coverage greedy heuristic, which achieves full coverage at
p = 38. While the heuristic produces a sub-optimal solution for p = 8 to 38, the solution
is never less than 99.4% optimal and averages 99.8% optimal for p = 1 to 38 (Figure 9).

2 The notion of comparing the algorithms step-by-step like this is useful for explanation, but technically
incorrect. The CPLEX algorithm does not in fact compute a solution in a step-wise fashion the way the
heuristic does. Instead, it computes a solution for each p for all locations simultaneously. Since an optimal
permutation at one p may differ from the optimum at p + 1 by more than one location (permutation
element), a side-by-side comparison of the actual solution selections is difficult. At p = 12, for example,
the optimal solution with this data selects a solution set that differs by two elements from that at p = 11.
Figure 9: Non-Multiple Coverage Greedy Heuristic Optimality (Z Heuristic / Z Optimal,
plotted as a percentage of optimal against p)
Figure 10: Total Flow Coverage (% of total flow covered by the optimal and heuristic
solutions, plotted against p)
As Figure 10 illustrates, both the optimal and heuristic algorithms rapidly cover most of
the total network flow within a few p, and have almost identical results in terms of flow
coverage.
Processing Times
Processing times for both heuristic variants with the 72-node test network vary from
about 9 to 13 seconds given normal system resources on a 1.5 GHz Windows XP computer with
512 MB of RAM. The multiple coverage heuristic takes slightly less time to calculate
because flow removal processes are not carried out. Calculation of shortest paths
accounts for about 65% of the total processing time for the non-multiple coverage
heuristic and about 80% for the multiple coverage heuristic. This is attributable to the
difference in heuristic calculation times.
Processing Time Averages (20 samples; milliseconds)

                             Multiple Coverage    Non-Multiple Coverage
Total Processing Time        9541.6830            12589.9066
Shortest Path Calculation    7578.7993            8166.8485
Data Loading                 1924.9731            2173.4943
Heuristic Calculation        29.9709              2233.5381
For comparison, experimentation with a 243-node test network yielded the following
processing times.
Non-Multiple Coverage: 243-Node Network (milliseconds)

Total Processing Time        1397527.059
Shortest Path Calculation    1015064.236
Data Loading                 12207.3098
Heuristic Calculation        370245.4986
Conclusions and Future Research
Heuristics
The heuristics performed in line with results found by Hodgson (Hodgson 1990).
However, some details remain open issues.
One issue with the non-multiple coverage greedy heuristic is the handling of ties. The
algorithm determines a node selection based on which node has the most flow. However,
in the case of ties, selecting one node and removing the flows which pass through it may
well have a different effect on the overall flow structure of the system compared to
selecting another node with the same number of flows. The current implementation
handles ties by selecting one randomly. In this case, however, such contingencies might
best be handled by an elaboration of the heuristic such as the addition of an interchange
or tabu search algorithm to explore different solutions.
Another issue is the possibility of including a weighting scheme for flow cannibalization.
The current implementation takes an all-or-nothing approach to cannibalization; covered
flows are either considered in future solution steps or they are not. With the non-multiple
coverage greedy heuristic, for example, it may be desirable that covered flows are still
taken into account in subsequent selections, but at diminished values depending on such
factors as the number of times they have been covered in previous steps, the “distance”
the node under consideration is from other covering nodes, etc. Indeed, some
applications of the FCLM, such as billboard placement, might regard multiple coverage
as beneficial.
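A diminished-value scheme of the kind described above could be as simple as a geometric discount on previously covered flows. The decay factor and function below are assumptions for illustration, not part of the implementation described in this project.

```python
def discounted_flow(base_flow, times_covered, decay=0.5):
    """Discount a flow's value by how often it has already been covered.

    decay=0 reproduces the current all-or-nothing behavior (a flow is
    worthless after its first coverage); decay=1 ignores prior coverage
    entirely. The geometric form is an illustrative assumption; a real
    scheme might also weight by distance from other covering nodes.
    """
    return base_flow * (decay ** times_covered)
```

For a billboard-placement application, where multiple coverage is beneficial, `decay` could even exceed zero substantially so that repeatedly covered flows keep contributing to later selections.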
Software
Unimplemented Heuristics/Location Problems
While the heuristics implemented here produce near-optimal results, the option to use
other heuristics may be desirable. This would especially be the case were the application
expanded to include other location problems similar to the FCLM, since greedy
heuristics are known not to be robust for some variants of the flow interception problem
(Hodgson, Rosing et al. 1996).
As was pointed out earlier in this discussion, the FCLM is structurally very similar to the
MCLM. There is no reason why the current implementation could not serve as the basis
for a tool that processes other types of location-allocation models such as the MCLM.
However, a significant issue might be the data structures necessary to facilitate a range of
different potential applications. Church and Sorenson specifically point to this issue as
one of the primary obstacles in implementing general location-allocation problems in GIS
(Church and Sorenson 1994).
Processing Time Optimization
The current implementation has considerable room for processing optimization. While
this is a minor issue for smaller datasets, processing times increase prohibitively as the
datasets get larger. As discussed in the Results section, for example, processing time on a
243-node test network using the non-multiple coverage greedy heuristic is just over 23
minutes. Nearly three-quarters of this time is spent calculating shortest paths between all
the nodes in the network. A matrix-based all-pairs shortest path algorithm should be faster. In
addition, all data searches are currently implemented as sequential searches. Use of a
data sorting scheme and a binary search algorithm could potentially improve processing
times significantly, especially with larger datasets.
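Both optimizations can be sketched briefly. The first function is a standard Floyd-Warshall all-pairs shortest path computation over a distance matrix, one concrete form the "matrix-based" solution mentioned above could take; the second shows a binary search over sorted node ids replacing a sequential scan. The data layouts (a dense matrix, a sorted id list) are assumptions for illustration.

```python
import bisect

def floyd_warshall(dist):
    """All-pairs shortest paths over an n x n distance matrix.

    dist[i][j] is the direct edge length (float('inf') if no edge,
    0 on the diagonal). O(n^3) overall, but it computes every pair in
    one pass rather than repeating per-pair searches.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def find_node(sorted_ids, node_id):
    """Binary search over a sorted id list: O(log n) per lookup versus
    O(n) for the sequential searches the current implementation uses.
    Returns the index of node_id, or -1 if absent."""
    i = bisect.bisect_left(sorted_ids, node_id)
    return i if i < len(sorted_ids) and sorted_ids[i] == node_id else -1
```

Keeping node records sorted once at load time makes every subsequent lookup logarithmic, which is where the largest gains on bigger datasets would likely come from.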
References
Adenso-Diaz, B. and F. Rodriguez (1997). "A simple search heuristic for the MCLP:
Application to the location of ambulance bases in a rural region." Omega-International Journal of Management Science 25(2): 181-187.
Berman, O., M. J. Hodgson, et al. (1995). Flow interception problems. Facility Location:
A Survey of Applications and Methods. Z. Drezner, Springer-Verlag: 427-452.
Berman, O. and D. Krass (2002). "The generalized maximal covering location problem."
Computers & Operations Research 29(6): 563-581.
Chung, C. H. (1986). "Recent Applications of the Maximal Covering Location Planning
(MCLP) Model." Journal of the Operational Research Society 37(8): 735-746.
Church, R. L. and C. ReVelle (1974). "The maximal covering location problem." Papers
of the Regional Science Association 32: 101-118.
Church, R. L. and C. S. ReVelle (1976). "Theoretical and Computational Links between
P-Median, Location Set-Covering, and Maximal Covering Location Problem."
Geographical Analysis 8(4): 406-415.
Church, R. L. and P. Sorenson (1994). Integrating Normative Location Models into GIS:
Problems and Prospects with the p-median Model. Santa Barbara, CA, National
Center for Geographic Information and Analysis.
Cormican, K., D. P. Morton, et al. (1998). "Stochastic network interdiction." Operations
Research 46: 184-197.
Current, J. R. and D. A. Schilling (1990). "Analysis of Errors Due to Demand Data
Aggregation in the Set Covering and Maximal Covering Location-Problems."
Geographical Analysis 22(2): 116-126.
Densham, P. and G. Rushton (1992). "A more efficient heuristic for solving large
p-median problems." Papers of the Regional Science Association 71: 307-329.
Goodchild, M. F. and V. T. Noronha (1987). Location-allocation and impulsive
shopping: the case of gasoline retailing. Spatial Analysis and Location-Allocation
Models. A. Ghosh and G. Rushton, Van Nostrand Reinhold: 121-136.
Hodgson, M. J. (1990). "A flow-capturing location-allocation model." Geographical
Analysis 22(3): 270-279.
Hodgson, M. J., K. E. Rosing, et al. (1996). "Locating Vehicle Inspection Stations to
Protect a Transportation Network." Geographical Analysis 28(4): 299-314.
Miller, H. J. (1999). "Potential contributions of spatial analysis to geographic information
systems for transportation (GIS-T)." Geographical Analysis 31: 373-399.
Pirkul, H. and D. A. Schilling (1991). "The Maximal Covering Location Problem with
Capacities on Total Workload." Management Science 37(2): 233-248.
ReVelle, C., J. Schweitzer, et al. (1996). "The maximal conditional covering problem."
Infor 34(2): 77-91.
Teitz, M. B. and P. Bart (1968). "Heuristic Methods for Estimating the Generalized
Vertex Median of a Weighted Graph." Operations Research 16(5): 955-961.