The max-p-region problem (draft)

Juan C. Duque¹, Luc Anselin², Sergio J. Rey³
January 29, 2007

¹ Regional Analysis Laboratory (REGAL), San Diego State University. jduque@rohan.sdsu.edu
² Spatial Analysis Laboratory (SAL), University of Illinois Urbana Champaign. anselin@uiuc.edu
³ Regional Analysis Laboratory (REGAL), San Diego State University, and Regional Economics Applications Laboratory (REAL), University of Illinois Urbana Champaign. serge@rohan.sdsu.edu

1 Introduction
A common problem when dealing with a set of areal units is to combine those units into mutually exclusive and exhaustive groups. Numerous clustering algorithms have been suggested in the literature which offer solutions to this problem by assigning the basic units into groups in such a way that intragroup homogeneity is maximized while intergroup heterogeneity is also maximized (Murtagh, 1985; Gordon, 1996, 1999). A subset of this literature concerns itself with an additional constraint concerning contiguity relations between units within each region (Duque et al., 2007; Shirabe, 2005; Cova and Church, 2000).
In this paper we introduce the exact formulation and a heuristic solution for a new type of constrained clustering called the max-p-region problem. The max-p-region problem is a special case of constrained clustering where a finite number of geographical areas, n, are aggregated into the maximum number of regions, p, such that each region satisfies the following constraints:
1. The areas within a region must be geographically connected. This constraint is commonly known as the spatial contiguity constraint.
2. The regional value of a predefined attribute must be greater than or equal to a minimum predefined threshold value. This regional value is obtained by adding up the areal values of the attribute of the areas assigned to each region.
3. Each area must be assigned to one and only one region.
4. Each region must contain at least one area.
A unique feature of this model is that the number of regions formed becomes an endogenous variable. As we detail below, that solution will be dependent on the specification of the threshold value (constraint two) as well as the spatial and value distributions of the attribute variables.
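The four constraints can be checked mechanically for any proposed partition. The following is a minimal sketch, not the authors' implementation: the function names, the dictionary-based adjacency structure, and the four-area example are all assumptions made for illustration.

```python
from collections import deque

def is_connected(areas, neighbors):
    """Return True if `areas` forms one connected block under `neighbors`."""
    areas = set(areas)
    if not areas:
        return False
    start = next(iter(areas))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nb in neighbors[current]:
            if nb in areas and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == areas

def is_feasible(regions, neighbors, values, threshold):
    """Check constraints 1 to 4 for a list of regions (each a set of areas)."""
    assigned = [a for region in regions for a in region]
    # Constraint 3: every area appears in exactly one region.
    if sorted(assigned) != sorted(neighbors):
        return False
    for region in regions:
        # Constraint 4 (non-empty) and constraint 1 (spatial contiguity).
        if not is_connected(region, neighbors):
            return False
        # Constraint 2: the summed attribute reaches the threshold.
        if sum(values[a] for a in region) < threshold:
            return False
    return True

# Hypothetical example: four areas in a row, 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = {0: 5, 1: 7, 2: 4, 3: 9}
print(is_feasible([{0, 1}, {2, 3}], neighbors, values, threshold=10))  # True
print(is_feasible([{0, 2}, {1, 3}], neighbors, values, threshold=10))  # False
```

The second partition fails because areas 0 and 2 do not share a border, even though both of its regions reach the threshold.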
2 Formulation of the max-p-region problem

Based on the description of the max-p-region problem, it is possible to enumerate the elements that are required in order to solve this problem:
1. Aggregation variables: A set of attribute variables characterizing each area. The variables included in this set are used to calculate the dissimilarity between areas. This dissimilarity measure is required since one of the objectives in the max-p-region problem is that the areas assigned to a given region should be similar in order to reduce the aggregation bias. Finally, because of our methodological approach, the dissimilarity measure is not necessarily related to the position of the areas in the geographical space.¹
2. Heterogeneity measure: Once the dissimilarity between areas is estimated, it is necessary to determine, first, how these values will be combined in order to measure the level of heterogeneity for each region and, second, how these regional heterogeneities are combined into a single measure of global heterogeneity. In this regard, Gordon (1999) presents a wide range of ways to estimate regional heterogeneity.
3. Constrained attribute: The attribute variable whose value at the regional level is constrained to be greater than or equal to a predefined threshold value. The constrained attribute does not necessarily have to belong to the set of aggregation variables.
4. Neighbouring structure in the areal units: The information about the binary relationships of spatial adjacency/contiguity between areas. This information is used within the model to control the spatial contiguity constraint. Besides the conventional rook and queen contiguity matrices, there are other alternative methods for defining spatial contiguity between areas; for example, Zoltners and Sinha (1983) seek to take into account natural obstacles such as mountains, lakes, and rivers by using the road and highway network as a way to represent how areas are interconnected.

¹ This topic will be covered later in this section.

All the elements stated above are then included in the following mixed integer programming (MIP) model to solve the max-p-region problem:
Parameters:

$i, I$ = index and set of areas, $I = \{1, \ldots, n\}$;
$k, K$ = index and set of potential regions, $K = \{1, \ldots, n\}$;
$o, O$ = index and set of contiguity order, $O = \{0, \ldots, q\}$, with $q = (n - 1)$;
$w_{ij}$ = 1, if areas $i$ and $j$ share a border, with $i, j \in I$ and $i \neq j$; 0, otherwise;
$N_i$ = $\{j \mid w_{ij} = 1\}$;
$D_{ij}$ = dissimilarity relationships between areas $i$ and $j$, with $i, j \in I$ and $i < j$;
$h$ = $1 + \lfloor \log_{10}(\sum_{i} \sum_{j \mid j > i} D_{ij}) \rfloor$, which is the number of digits of the floor function of $\sum_{i} \sum_{j \mid j > i} D_{ij}$;
$L_i$ = constrained attribute value of area $i$;
$\mathit{threshold}$ = regional threshold value.

Decision variables:

$T_{ij}$ = 1, if areas $i$ and $j$ belong to the same region $k$, with $i < j$; 0, otherwise;
$x_i^{ko}$ = 1, if area $i$ is assigned to region $k$ in order $o$; 0, otherwise.

Minimize:

$$Z = \left(n - \sum_{k=1}^{n} \sum_{i=1}^{n} x_i^{k0}\right) * 10^{h} + \sum_{i} \sum_{j \mid j > i} D_{ij} T_{ij} \qquad (1)$$

Subject to:

$$\sum_{i=1}^{n} x_i^{k0} \leq 1 \quad \forall k = 1, \ldots, n \qquad (2)$$

$$x_i^{ko} \leq \sum_{j \in N_i} x_j^{k(o-1)} \quad \forall i = 1, \ldots, n; \forall k = 1, \ldots, n; \forall o = 1, \ldots, q \qquad (3)$$

$$\sum_{k=1}^{n} \sum_{o=0}^{q} x_i^{ko} = 1 \quad \forall i = 1, \ldots, n \qquad (4)$$

$$\sum_{i=1}^{n} \sum_{o=0}^{q} x_i^{ko} L_i \geq \mathit{threshold} * \sum_{i=1}^{n} x_i^{k0} \quad \forall k = 1, \ldots, n \qquad (5)$$

$$T_{ij} \geq \sum_{o=0}^{q} x_i^{ko} + \sum_{o=0}^{q} x_j^{ko} - 1 \quad \forall i, j = 1, \ldots, n \mid i < j; \forall k = 1, \ldots, n \qquad (6)$$

$$x_i^{ko} \in \{0, 1\} \quad \forall i = 1, \ldots, n; \forall k = 1, \ldots, n; \forall o = 0, \ldots, q \qquad (7)$$

$$T_{ij} \in \{0, 1\} \quad \forall i, j = 1, \ldots, n \mid i < j \qquad (8)$$

2.1 Objective function
The MIP model is formulated as a minimization problem with an objective function that comprises two terms: one term that controls the number of regions, and a second term that controls the global heterogeneity of the solution.
The number of regions, which is to be maximized, is transformed into a minimization objective by subtracting the number of regions created by the model from the theoretical maximum number of regions that can be created. The number of regions created by the model is calculated by adding the number of areas assigned in order zero, $\sum_{k=1}^{n} \sum_{i=1}^{n} x_i^{k0}$. Those areas are called "core areas", and each region must have one and only one core area. The theoretical maximum number of regions, $n$, can occur only if every area has a constrained attribute value ($L_i$) greater than or equal to the regional threshold value (threshold).
The second term, which accounts for the global heterogeneity, is taken from Gordon (1999), who presents several ways to measure the global heterogeneity of a given regional configuration. In this paper we adopt a global heterogeneity measure that consists of the sum of all the pairwise dissimilarities ($D_{ij}$) between the areas assigned to the same region ($T_{ij}$). Taking into account all the pairwise dissimilarities seeks to enforce homogeneity between the areas assigned to the same region, regardless of their geographical location within their regions.
Finally, those two terms are merged into a single objective function value ($Z$) by adding the terms in such a way that the values of each term will not overlap. This is achieved by multiplying the first term by a power of ten ($10^{h}$) that ensures that the addition of both terms will not modify their values.
To illustrate how the objective function works, let us suppose we have twenty areas, $n = 20$; the sum of the dissimilarity relationships is $\sum_{i} \sum_{j \mid j > i} D_{ij} = 2{,}548.24$, which is the maximum global heterogeneity we can obtain (i.e. all the areas assigned to the same region); and $h = 1 + \lfloor \log_{10}(2{,}548.24) \rfloor = 4$. The objective function values for three hypothetical feasible solutions ($Z_1$, $Z_2$ and $Z_3$) are:
First solution ($Z_1$):
• number of regions = 6
• global heterogeneity = 1,385.6
• $Z_1 = (20 - 6) * 10^{4} + 1{,}385.6 = 141{,}385.6$
Second solution ($Z_2$):
• number of regions = 11
• global heterogeneity = 648.2
• $Z_2 = (20 - 11) * 10^{4} + 648.2 = 90{,}648.2$
Third solution ($Z_3$):
• number of regions = 11
• global heterogeneity = 530.9
• $Z_3 = (20 - 11) * 10^{4} + 530.9 = 90{,}530.9$
As can be seen, $Z_3 < Z_2 < Z_1$. Thus, the model in the first instance prefers an increment in the number of regions and, given the same number of regions, the model will prefer the solution with the lower global heterogeneity value.
There are two main reasons that justify the separability of the two terms. First, it is a way to add two quantities with different measures. Second, it avoids the "unfair" competition between global heterogeneity values from solutions with different scales.
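The worked example can be reproduced with a few lines of code. This is a minimal sketch under the assumption that the logarithm is taken in base 10, as the "number of digits" definition of h implies; the function names are hypothetical.

```python
import math

def scale_exponent(total_dissimilarity):
    """h = 1 + floor(log10(total)): digits in the integer part of the total."""
    return 1 + math.floor(math.log10(total_dissimilarity))

def objective(n, p, heterogeneity, total_dissimilarity):
    """Z = (n - p) * 10**h + heterogeneity, following equation (1)."""
    h = scale_exponent(total_dissimilarity)
    return (n - p) * 10 ** h + heterogeneity

total = 2548.24  # sum of all pairwise dissimilarities in the example
z1 = objective(20, 6, 1385.6, total)
z2 = objective(20, 11, 648.2, total)
z3 = objective(20, 11, 530.9, total)
print(scale_exponent(total))  # 4
print(z3 < z2 < z1)           # True
```

Because the heterogeneity term can never reach $10^{h}$, one extra region always outweighs any difference in heterogeneity.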
2.2 Constraints

Constraint (2) forces each region to have no more than one core area, that is, an area assigned to region $k$ in order zero ($x_i^{k0}$). Constraint (3) requires that area $i$ can be assigned to region $k$ at order $o$ if and only if there exists an area $j$, in the neighborhood of $i$, assigned to the same region $k$ in order ($o - 1$). Constraint (4) imposes that each area $i$ must be assigned to one region $k$ and one contiguity order $o$. Constraint (5) ensures that each region satisfies the regional threshold value requirement. Constraint (6) selects the pairwise dissimilarities that must be taken into account for calculating the global heterogeneity measure. Thus, the binary variable $T_{ij}$ will be one if both area $i$ and area $j$ are assigned to the same region. Finally, constraints (7) and (8) guarantee variable integrality.
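To make the role of the contiguity order concrete, the sketch below builds one valid assignment of orders for a given region by walking outward from a chosen core area, then checks the condition described for constraint (3). It is an illustration only: the breadth-first strategy, the function names, and the 2 x 3 grid are assumptions, and other order assignments can be equally valid.

```python
from collections import deque

def contiguity_orders(region, core, neighbors):
    """Assign each area in `region` an order o, starting from o = 0 at `core`.

    Returns a dict area -> order, or None if some area cannot be reached
    from the core through areas of the same region.
    """
    region = set(region)
    order = {core: 0}
    queue = deque([core])
    while queue:
        current = queue.popleft()
        for nb in neighbors[current]:
            if nb in region and nb not in order:
                order[nb] = order[current] + 1
                queue.append(nb)
    return order if set(order) == region else None

def satisfies_constraint_3(order, neighbors):
    """Every area with order o >= 1 needs a same-region neighbor at o - 1."""
    return all(
        o == 0 or any(order.get(nb) == o - 1 for nb in neighbors[a])
        for a, o in order.items()
    )

# Hypothetical 2 x 3 grid with rook contiguity:
# 0 1 2
# 3 4 5
neighbors = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
orders = contiguity_orders({0, 1, 2, 5}, core=0, neighbors=neighbors)
print(orders)  # {0: 0, 1: 1, 2: 2, 5: 3}
print(satisfies_constraint_3(orders, neighbors))  # True
print(contiguity_orders({0, 5}, core=0, neighbors=neighbors))  # None
```

A disconnected set such as {0, 5} admits no order assignment, which is how the ordered-area idea rules out non-contiguous regions.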
Ensuring that the regions satisfy the spatial contiguity constraint is perhaps the most challenging aspect of this formulation. This constraint has been studied in previous literature, mostly within the context of districting problems. Duque et al. (2007) provide an extensive literature review of the different approaches to ensure spatial contiguity. In our model we use the ordered-area formulation proposed by Duque et al. (2006), who extend the spatial contiguity constraint proposed by Cova and Church (2000) for solving single-region site search problems.
This approach for ensuring the spatial contiguity constraint also satisfies some additional attributes we want to incorporate into a regional configuration:
1. In contrast to those formulations that achieve spatial contiguity by maximizing a measure of regional compactness, the ordered-area formulation does not incorporate any assumption regarding the regional shape. This characteristic allows our model to design elongated and even concentric regions if needed.
2. Although each region is required to contain one "core area", the presence of such an area does not have any economic interpretation. The core area is just an artifice to ensure spatial contiguity. Thus, in the max-p-region model, all the areas have the same status. This concept differs from those approaches that seek to identify a very important area to which other neighbouring areas are attached.
Tables 1 and 2 present two examples of how the decision variables take a value equal to one to generate different regional shapes. The example in table 1 presents three concentric regions, and the example in table 2 shows that there exist multiple ways to get a specific regional shape (regardless of the location of the core area). The second column of each table shows the pairwise dissimilarity measures that should be taken into account to calculate the global heterogeneity.
Table 1: Decision variables with value 1 in a feasible solution. Concentric regions
[Table body not recoverable from the source. The first column maps the $x_i^{ko}$ variables equal to one onto 25 areas grouped into three concentric regions. The second column lists the $T_{ij}$ variables equal to one: every pair $i < j$ drawn from areas {1, 2, 3, 4, 5, 6, 10, 11, 15, 16, 20, 21, 22, 23, 24, 25}, and every pair $i < j$ drawn from areas {7, 8, 9, 12, 14, 17, 18, 19}.]
2.3 Possible ways to reduce the exact formulation

The MIP formulation of the max-p-region model is computationally expensive. It has $3n + (n-1)n^{2} + n\frac{n^{2}-n}{2}$ constraints and $(n-1)n^{2} + n^{2} + \frac{n^{2}-n}{2}$ variables, which makes it easily intractable when the number of areas increases. However, there are some options that can be considered to reduce the size of the problem:
1. Each area $i$ with $L_i \geq \mathit{threshold}$ can be assigned to a different region $k$ by adding constraints of the type $x_i^{k0} = 1$.
2. The upper limits of the indexes $k$ and $o$ can be reduced, since they were set for very extreme cases. Currently there is no specific method to define how much the upper limits of $k$ and $o$ can be reduced without affecting optimality.
3. It is clear that, for a given solution, the objective function will not be affected if we modify the index of the region as long as the set of areas per region is not modified. This implies that, when using the branch and bound method, the optimal solution will exist in multiple branches of the solution tree. Thus a depth-first branching direction may reduce the solution time.

Table 2: Decision variables with value 1 in a feasible solution. Two identical regions
[Table body not recoverable from the source. The first column maps the $x_i^{ko}$ variables equal to one onto 18 areas grouped into two regions of identical shape. The second column lists the $T_{ij}$ variables equal to one: every pair $i < j$ drawn from areas 1 to 9, and every pair $i < j$ drawn from areas 10 to 18.]
3 Solution method

In this section we present a heuristic solution for the max-p-region problem. The heuristic consists of two phases, a construction phase and a local search phase. The construction phase generates a set of feasible solutions, and a local search process is applied to a subset of solutions generated in the construction phase.

3.1 Construction phase

The construction phase starts by selecting at random an unassigned area $i$; this area is the "seed" of a growing region ($G_k$). If the constrained attribute value of the selected area ($L_i$) is greater than or equal to the regional threshold value (threshold), the area becomes a region by itself; otherwise, at each iteration, one neighboring unassigned area is added to the growing region until the regional threshold value is satisfied.
The area to be added is determined by ordering the candidate areas ($C$) with respect to an adaptive greedy function $g(\cdot)$. The greedy function measures the benefit of selecting each candidate. In our model, the greedy function is given by equation (9).

$$g(i) = \sum_{j \in G_k} d_{ij} \quad \forall i \in C \qquad (9)$$

where:

$$C = \{i \in U \mid G_k \cap N_i \neq \emptyset\} \qquad (10)$$

and $U$ is the set of unassigned areas, which is updated every time an area is assigned. The greedy function measures the increment in the global heterogeneity due to the assignment of an unassigned area to a growing region. Thus, the candidate area with the lowest greedy function value is assigned to the growing region $G_k$. The greedy function is adaptive because the list of candidates and their greedy function values are updated at each iteration of the construction phase.
Equation (10) constrains the set of candidates to those unassigned areas that share a border with at least one already assigned area in $G_k$. This condition ensures that the construction phase will lead to a feasible solution in terms of spatial contiguity.
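The growth of a single region can be sketched as follows. This is an illustrative reading of equations (9) and (10), not the authors' code: the function name, the data structures, the example data, and the tie-breaking by lowest area index are assumptions.

```python
def grow_region(seed, unassigned, neighbors, d, values, threshold):
    """Grow one region from `seed` until its summed value reaches `threshold`.

    Returns (region, feasible). The set `unassigned` is updated in place.
    """
    region = {seed}
    unassigned.discard(seed)
    total = values[seed]
    while total < threshold:
        # Equation (10): unassigned areas bordering the growing region.
        candidates = [i for i in unassigned
                      if any(nb in region for nb in neighbors[i])]
        if not candidates:
            return region, False  # the region cannot reach the threshold
        # Equation (9): added heterogeneity if candidate i joins the region.
        # Ties are broken by the lowest area index (an assumption).
        best = min(candidates, key=lambda i: (sum(d[i][j] for j in region), i))
        region.add(best)
        unassigned.discard(best)
        total += values[best]
    return region, True

# Hypothetical data: four areas in a row, dissimilarity = |attribute gap|.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
attribute = [1.0, 2.0, 8.0, 9.0]
d = [[abs(a - b) for b in attribute] for a in attribute]
values = {0: 4, 1: 4, 2: 4, 3: 4}
unassigned = {0, 1, 2, 3}
print(grow_region(1, unassigned, neighbors, d, values, threshold=8))
# ({0, 1}, True)
print(unassigned)  # {2, 3}
```

Starting from area 1, both area 0 and area 2 are candidates, and area 0 is chosen because it adds less heterogeneity (1.0 against 6.0).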
Once the current growing region becomes feasible in terms of its regional threshold value, a new seed is selected from the set of unassigned areas ($U$) to start growing a new region $G_k$. Note that the construction phase creates one region at a time; this implies that a candidate area cannot be assigned to a region that has already reached the regional threshold value (this condition is relaxed later).
The process of selecting seeds to grow regions stops when either $U = \emptyset$ or it is not possible to grow a new feasible region from the remaining unassigned areas. Those areas are known in the literature as "enclaves," and they have to be assigned to one of the $p$ already existing feasible regions. The assignment of enclaves uses the following greedy function:

$$g(i, k) = \sum_{j \in G_k} d_{ij} \quad \forall i \in C; \forall k \mid G_k \cap N_i \neq \emptyset \qquad (11)$$

where:

$$C = \{i \in U \mid \exists k, G_k \cap N_i \neq \emptyset\} \qquad (12)$$

At the end of the construction phase, we have a feasible solution with $p$ regions and a respective global heterogeneity value. Table 3 presents an illustration of the main steps that comprise the construction phase.
The strategy of creating regions from the selection of an initial area appeared in the early 1960s with Vickrey (1961) for solving districting problems. Variations of this methodology have been proposed by Thoreson and Liittschwager (1967), Gearhart and Liittschwager (1969), Taylor (1973), Openshaw (1977a,b), and Rossiter and Johnston (1981).²
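The enclave step described by equations (11) and (12) can be sketched as below. The order in which enclaves are processed (cheapest enclave and region pair first) is an assumption, as are the function name and the example data; the text only fixes the greedy function itself.

```python
def assign_enclaves(regions, enclaves, neighbors, d):
    """Attach each enclave to an adjacent region, cheapest pair first.

    `regions` is a list of sets and is modified in place.
    """
    enclaves = set(enclaves)
    while enclaves:
        # Equations (11) and (12): score every (enclave, adjacent region) pair.
        options = [
            (sum(d[i][j] for j in region), i, k)
            for i in enclaves
            for k, region in enumerate(regions)
            if any(nb in region for nb in neighbors[i])
        ]
        if not options:
            raise ValueError("remaining enclaves touch no region")
        _, i, k = min(options)
        regions[k].add(i)
        enclaves.remove(i)
    return regions

# Hypothetical data: five areas in a row; areas 2 and 4 are enclaves.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
attribute = [1.0, 2.0, 3.0, 8.0, 9.0]
d = [[abs(a - b) for b in attribute] for a in attribute]
regions = [{0, 1}, {3}]
print(assign_enclaves(regions, {2, 4}, neighbors, d))
# [{0, 1, 2}, {3, 4}]
```

Area 2 borders both regions and joins the first one because its summed dissimilarity to {0, 1} is 3.0, against 5.0 for {3}.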
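Table 3: Construction phase
[Panel illustrations not recoverable from the source. Surviving step captions: 5. Start new region; 6. Grow region until feasibility; 7. Three feasible regions (p = 3) and enclaves; 8. Assignation of enclaves. The captions of steps 1 to 4 are lost apart from the fragment "Feasible re-".]

3.2 Local search phase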
The local search phase consists of an iterative process that seeks to improve (minimize) the global heterogeneity value associated with a solution generated by the construction phase. At each iteration, the local search phase evaluates a set of neighboring feasible solutions that can be obtained from a current solution. Within the context of spatially constrained clusters, a neighboring solution is one obtained by moving one area from its current region (donor region) to another neighboring region (recipient region). Such a movement has to satisfy the following conditions: First, the donor region must have at least two areas to allow one area to leave. Second, the removal of an area from the donor region cannot break the spatial contiguity of that donor region, nor can it lead to an infeasibility in the regional threshold value requirement. And third, the area to be moved must share a border with the recipient region.³

² See Duque et al. (2007) for a more detailed explanation of the methods implemented by these authors.
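The three conditions on a move translate directly into a feasibility test. The following is a minimal sketch with hypothetical names and data; it assumes regions are stored as sets and adjacency as a dictionary of neighbor lists.

```python
from collections import deque

def is_connected(areas, neighbors):
    """Breadth-first check that a non-empty set of areas is one connected block."""
    areas = set(areas)
    start = next(iter(areas))
    seen, queue = {start}, deque([start])
    while queue:
        for nb in neighbors[queue.popleft()]:
            if nb in areas and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == areas

def move_is_feasible(area, donor, recipient, neighbors, values, threshold):
    """Apply the three conditions for moving `area` from donor to recipient."""
    remaining = donor - {area}
    # First condition: the donor must keep at least one area.
    if not remaining:
        return False
    # Second condition: the donor stays contiguous and above the threshold.
    if not is_connected(remaining, neighbors):
        return False
    if sum(values[a] for a in remaining) < threshold:
        return False
    # Third condition: the area borders the recipient region.
    return any(nb in recipient for nb in neighbors[area])

# Hypothetical 2 x 3 grid with rook contiguity:
# 0 1 2
# 3 4 5
neighbors = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
values = {a: 5 for a in neighbors}
donor, recipient = {0, 1, 2}, {3, 4, 5}
print(move_is_feasible(0, donor, recipient, neighbors, values, 10))  # True
print(move_is_feasible(1, donor, recipient, neighbors, values, 10))  # False
```

Moving area 1 is rejected because it would split the donor region into the disconnected areas 0 and 2.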
The local search phase implemented in our model uses a tabu search algorithm (Glover, 1977, 1989, 1990), which has been used in a wide variety of combinatorial problems. The tabu search algorithm is a metaheuristic with a good capacity for escaping from locally optimal solutions by allowing a temporary worsening of the objective function value (in our case the global heterogeneity) in the hope of discovering a new solution better than the best solution obtained so far.
The main steps in the tabu-based local search phase are:
1. Start with an initial feasible solution from the construction phase. This solution, with its respective global heterogeneity value, will have the status of current solution.
2. Generate a list of all potential feasible neighboring solutions from the current solution. This list is known as the candidate list of moves.
3. Evaluate each candidate by calculating its impact on the global heterogeneity value. Since each neighboring solution implies the move of an area $i$ from its current region $d$ to the recipient region $r$, its impact on the global heterogeneity measure can be calculated with equation (13).

$$\Delta Z_{i,r} = \sum_{j \in G_r} d_{ij} - \sum_{j \in G_d} d_{ij} \qquad (13)$$

where $G_r$ is the set of areas in the recipient region, and $G_d$ is the set of areas in the donor region.
Although each candidate satisfies the feasibility conditions (step 2), there is an additional condition to take into account in order to label a candidate move as an admissible candidate move: if the candidate is not a tabu move, then it automatically becomes an admissible candidate; otherwise the move must lead to a new solution better than the best solution obtained so far (the aspirational criteria).
4. Choose the best admissible candidate and perform the move. This move leads to a new current solution. This new current solution is designated as the new aspirational criteria if it improves the current one. The reverse move (i.e., moving the area back to the donor region) is prohibited/tabu during the next $R$ iterations ($R$ is a parameter known as the length of the tabu list).
5. The algorithm stops if the aspirational criteria have not been improved during a predefined number of iterations (conv); otherwise go to step 2.

³ Other types of neighboring solutions are provided by Nagel (1965), Sammons (1978) and Horn (1995).
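Steps 3 and 4 rest on two small computations: the change in global heterogeneity from equation (13) and the admissibility test for tabu moves. The sketch below illustrates both under stated assumptions: the function names, the representation of a move as an (area, recipient index) pair, and the example data are hypothetical, and $d_{ii}$ is taken to be zero.

```python
from itertools import combinations

def global_heterogeneity(regions, d):
    """Sum of pairwise dissimilarities between areas sharing a region."""
    return sum(d[i][j] for region in regions
               for i, j in combinations(sorted(region), 2))

def delta_z(i, donor, recipient, d):
    """Equation (13): change in heterogeneity if i moves donor -> recipient."""
    gained = sum(d[i][j] for j in recipient)
    lost = sum(d[i][j] for j in donor if j != i)
    return gained - lost

def is_admissible(move, current_value, delta, best_value, tabu):
    """Non-tabu moves are admissible; tabu moves must beat the best value."""
    return move not in tabu or current_value + delta < best_value

# Hypothetical data: four areas, dissimilarity = |attribute gap|.
attribute = [1.0, 2.0, 8.0, 9.0]
d = [[abs(a - b) for b in attribute] for a in attribute]
before = [{0, 1, 2}, {3}]
after = [{0, 1}, {2, 3}]
change = delta_z(2, before[0], before[1], d)
print(change)  # -12.0
print(global_heterogeneity(after, d) - global_heterogeneity(before, d))  # -12.0
print(is_admissible((2, 1), 14.0, change, 14.0, tabu={(2, 1)}))  # True
```

Evaluating a move through equation (13) only touches the dissimilarities of the moved area, so the full heterogeneity never has to be recomputed during the search.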
Note that the main goal of the local search phase is to improve the global heterogeneity value at a given scale $p$. It is during the construction phase that the maximum number of regions is created.
Figure 1 contains a simple illustration of the main components in the local search phase.

3.3 Assembling the max-p-region algorithm

Finally, in this section we insert the construction and local search phases within an algorithmic structure that seeks to obtain a near-optimal and consistent solution for the max-p-region model.
Figure 1: Illustration of a local search phase
[Figure not recoverable from the source. Surviving annotations appear to read "Solution from construction phase" and "Escape from local optimum solution after three non-improving moves".]
Pseudocode: MAX-P-REGION
input : (maxitr, N_i, D_ij, L_i, threshold, R, conv)
output: (bestOF, bestSolution, maxp)
bestOF = ∞
bestSolution = ∅
maxp = 0
S = ∅
for c = 1, 2, ..., maxitr
    do  p, solution = CONSTRUCTION(N_i, D_ij, L_i, threshold)
        if p > maxp
            then  maxp = p
                  S = {solution}
        else if p = maxp
            then  S = S ∪ {solution}
for sol in S
    do  of, solution = LOCAL(sol, N_i, D_ij, L_i, threshold, R, conv)
        if of < bestOF
            then  bestOF = of
                  bestSolution = solution
return bestOF, bestSolution, maxp
The algorithm above takes into consideration the following statements:
• The solution obtained with the construction phase depends on the way that the initial seeds were selected during the iterations. Since the seed for each region is selected at random, it is possible to obtain different feasible solutions if we run the construction phase several times. Thus, repeating the construction phase multiple times forces the algorithm to make a more intensive inspection of the feasible solution set.
• Given that (a) one of the main goals in the max-p-region model is to maximize the number of regions, (b) the number of regions can vary if we run the construction phase multiple times, and (c) the local search does not modify the number of regions, the local search can be performed after generating a set of feasible solutions, by running the construction phase multiple times, and only for those solutions with the maximum number of regions.
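The control flow of the pseudocode can be expressed as a short driver. This is a sketch of the structure only: the construction and local search phases are passed in as caller-supplied functions, and the toy stand-ins used to run it are hypothetical and carry no spatial meaning.

```python
import math
import random

def max_p_region(maxitr, construction, local_search, rng):
    """Keep the constructions with the most regions, then refine each one.

    `construction(rng)` must return (p, solution); `local_search(solution)`
    must return (objective_value, solution). Both are supplied by the caller.
    """
    maxp, pool = 0, []
    for _ in range(maxitr):
        p, solution = construction(rng)
        if p > maxp:
            maxp, pool = p, [solution]  # a larger p discards earlier solutions
        elif p == maxp:
            pool.append(solution)
    best_of, best_solution = math.inf, None
    for sol in pool:
        of, refined = local_search(sol)
        if of < best_of:
            best_of, best_solution = of, refined
    return best_of, best_solution, maxp

# Hypothetical stand-ins so the driver can be run on its own.
def toy_construction(rng):
    p = rng.choice([2, 3])
    return p, {"p": p, "cost": rng.randint(10, 20)}

def toy_local_search(solution):
    improved = dict(solution, cost=solution["cost"] - 1)
    return improved["cost"], improved

print(max_p_region(50, toy_construction, toy_local_search, random.Random(0)))
```

Running the local search only on the pool of solutions with the largest p reflects statement (c): the local search cannot change the number of regions, so refining solutions with fewer regions would be wasted effort.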
4 Calibration of parameters

5 Case study

6 Conclusions
References

Cova, T. and Church, R. (2000). Contiguity constraints for single-region site search problems. Geographical Analysis, 32(4):306-329.
Duque, J., Church, R., and Middleton, R. (2006). Exact models for the regionalization problem. In Western Regional Science Association annual meetings.
Duque, J., Ramos, R., and Surinach, J. (2007). Supervised regionalization methods: a survey. International Regional Science Review, Forthcoming.
Gearhart, B. C. and Liittschwager, J. M. (1969). Legislative districting by computer. Behavioral Science, 14(5):404-417.
Glover, F. (1977). Heuristics for integer programming using surrogate constraints. Decision Sciences, 8:156-166.
Glover, F. (1989). Tabu search - part I. ORSA Journal on Computing, 1:190-206.
Glover, F. (1990). Tabu search - part II. ORSA Journal on Computing, 2:4-32.
Gordon, A. (1996). A survey of constrained classification. Computational Statistics & Data Analysis, 21:17-29.
Gordon, A. (1999). Classification. Chapman & Hall/CRC, London, 2nd edition.
Horn, M. (1995). Solution techniques for large regional partitioning problems. Geographical Analysis, 27(3):230-248.
Murtagh, F. (1985). A survey of algorithms for contiguity-constrained clustering and related problems. Computer Journal, 28(1):82-88.
Nagel, S. (1965). Simplified bipartisan computer redistricting. Stanford Law Review, 17(5):863-899.
Openshaw, S. (1977a). A geographical solution to scale and aggregation problems in region-building, partitioning and spatial modeling. Transactions of the Institute of British Geographers, 2(4):459-472.
Openshaw, S. (1977b). Optimal zoning systems for spatial interaction models. Environment and Planning A, 9(2):169-184.
Rossiter, D. J. and Johnston, R. J. (1981). Program GROUP - the identification of all possible solutions to a constituency-delimitation problem. Environment and Planning A, 13(2):231-238.
Sammons, R. (1978). Spatial Representation and Spatial Interaction, chapter A simplistic approach to the redistricting problem, pages 71-94. Leiden; Boston: M. Nijhoff Social Sciences Division.
Shirabe, T. (2005). A model of contiguity for spatial unit allocation. Geographical Analysis, 37(1):2-16.
Taylor, P. J. (1973). Some implications of spatial organization of elections. Transactions of the Institute of British Geographers, (60):121-136.
Thoreson, J. D. and Liittschwager, J. M. (1967). Legislative districting by computer simulation. Behavioral Science, 12(3):237-247.
Vickrey, W. (1961). On the prevention of gerrymandering. Political Science Quarterly, 76(1):105-110.
Zoltners, A. and Sinha, P. (1983). Sales territory alignment - a review and model. Management Science, 29(11):1237-1256.