Spatial models for pandemic influenza: extending human movement models across
international borders

Thomas Maples

In partial fulfillment of the requirements for graduation with the Dean’s Scholars Honors
Degree in Biology at The University of Texas at Austin

_____________________________________________________   ______________________
Dr. Rosalind Eggo, Supervising Researcher                Date

_____________________________________________________   ______________________
Dr. Lauren Ancel Meyers, Principal Investigator          Date

_____________________________________________________   ______________________
Dr. Ruth Buskirk, Honor’s Advisor in Biology             Date

1 Abstract

Modeling the spread of pandemic influenza can help public health officials decide
when and where to concentrate prevention, detection, and intervention efforts. As
influenza is transmitted from person to person, models for the spread of influenza and
influenza-like viruses require an understanding of human movement and travel patterns. A
recent movement model, known as the radiation model, has been shown to accurately
predict the movement of people within the contiguous United States. Using commuter data
from the 2000 US Census, we identify geographic regions of poor model fit, and
demonstrate that this radiation model formulation does not accurately predict
cross-border movement to Canada and Mexico. We propose a modified radiation model that
takes the borders into account, adjusting the probability that a worker commutes to a
foreign country. Our modifications to the radiation model significantly improve its ability
to predict international movement from the United States to Canada and Mexico. The
modifications particularly improve the fit of the model to commuter data for US counties
near international borders. The modified radiation model could be applied to simulate the
spread of pandemic influenza in North America.

2 Introduction
Accurately modeling the spread of pandemic influenza aids in the evaluation of
public health interventions and the geographic concentration of detection and prevention
efforts (1). The spatial spread of influenza has been shown to correlate closely with the
movement of people to and from their places of work (2). As commuting destination
depends on the availability of work, there is greater commuter flow between highly
populated areas, which typically have more job opportunities than more rural areas (2). At
the same time, the more distant a worker is from a job opportunity, the less willing he or
she will be to accept that job. Therefore, both the populations of two locations and the
distance between those locations affect the amount of movement between them.

The dominant model that captures this hierarchical population movement is the
gravity model, first proposed in 1946 (3). The gravity model, as shown below, states that
the number of people moving from location i to location j (Tij) is proportional to the
population of location i (mi) raised to a power α, multiplied by the population of location j
(mj) raised to a power β, over a function of the distance between the locations (rij) (4).

\[ T_{ij} = \frac{m_i^{\alpha} \, m_j^{\beta}}{f(r_{ij})} \]

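As a minimal sketch of the gravity formula, the function below evaluates the flow for one pair of locations. The power-law distance function f(r) = r**gamma and every numeric value are assumptions chosen for illustration; the text does not specify f, and in practice alpha, beta, and the distance function would be fitted to regional movement data.

```python
def gravity_flow(m_i, m_j, r_ij, alpha, beta, gamma):
    """Gravity-model flow T_ij = m_i**alpha * m_j**beta / f(r_ij).

    Assumption of this sketch: f(r) = r**gamma. Any proportionality
    constant is omitted.
    """
    return (m_i ** alpha) * (m_j ** beta) / (r_ij ** gamma)


# Hypothetical populations and distance (km), for illustration only.
flow = gravity_flow(m_i=100_000, m_j=50_000, r_ij=40.0,
                    alpha=1.0, beta=1.0, gamma=2.0)
print(flow)
```

Because alpha, beta, and gamma are free parameters, a fit obtained in one region does not automatically carry over to another, which is the first limitation discussed next.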
The gravity model has several limitations. First, the model requires previous
region-specific movement data to estimate the unknown parameters and distance function. These
estimated parameters limit the use of a gravity model that is fitted to one region in
describing movements in a different region. Second, the gravity model ignores the
populations of nearby locations when calculating movement. For instance, according to the
gravity model, the movement patterns of individuals traveling between two cities depend
only on the populations of the cities and the distance between them. However, one would
expect the number of travelers between two cities in a region of high population density to
be lower than the number of travelers between two equally sized and distant cities in a
region of low population density. This effect would be due to the relatively higher number
of opportunities available in a dense area.

Some existing models, such as the intervening opportunities model, resolve this
second limitation, but still have parameters that must be calibrated (5). A model recently
described in Simini et al. (2012), known as the radiation model, is parameter-free and takes
nearby population distributions into account. Instead of including the distance value rij
when calculating Tij, the radiation model employs a related value sij, defined as the
population within the circle centered at location i with radius rij, excluding the populations
of locations i and j (4). The radiation model is illustrated in Figure 1, where each circle
represents the population size of a county. Circles are colored light blue if that population
is included in sij. Counties whose centroid is outside of the circle of radius rij are not
included in sij and, thus, are not shaded. The radiation model equation is shown below. Ti
represents the total number of workers commuting from location i to every location in the
model. This value is generally taken from commuter data.

\[ T_{ij} = T_i \, \frac{m_i m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})}, \qquad T_i = \sum_{j \neq i} T_{ij} \]

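The radiation formula can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used in this study: distances are plain Euclidean distances between hypothetical planar centroids, locations exactly at distance rij are counted as inside the circle, and the three-location example data are invented.

```python
import numpy as np


def radiation_flows(pop, coords, total_out):
    """Predicted flows T[i, j] under the original radiation model.

    pop       : populations m_i, shape (n,)
    coords    : planar centroid coordinates, shape (n, 2)
                (assumption: Euclidean distance between centroids)
    total_out : total commuters T_i leaving each location, shape (n,)
    """
    pop = np.asarray(pop, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(pop)
    # Pairwise distances r_ij between centroids.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    flows = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # s_ij: population inside the circle of radius r_ij around i,
            # excluding the populations of i and j themselves.
            inside = dist[i] <= dist[i, j]
            inside[[i, j]] = False
            s_ij = pop[inside].sum()
            flows[i, j] = (total_out[i] * pop[i] * pop[j]
                           / ((pop[i] + s_ij) * (pop[i] + pop[j] + s_ij)))
    return flows


# Hypothetical three-location example on a line.
flows = radiation_flows(pop=[100, 50, 200],
                        coords=[(0, 0), (1, 0), (3, 0)],
                        total_out=[10, 5, 20])
print(flows)
```

In the example, the flow from location 0 to location 2 is damped by the 50 people living in between (s = 50), even though location 2 is the largest destination.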
Figure 1. Schematic of the radiation model. Figure depicts movement from a county with population mi
to a county with population mj. Dark blue circles represent county populations mi and mj, and light blue
circles represent county populations included in sij.
'
Simini et al. demonstrated the goodness of fit of the radiation model for modeling
commuter flow within the contiguous United States. We aim to use the model to predict the
international spread of pandemic influenza-like pathogens. Therefore, it is necessary to
extend the model to incorporate human movement across international borders. Current
models for international movement of commuters and international spread of pathogens
are heavily reliant on data from the airline network (6). These data are used to understand
the movement from one large hub to another. Early cases of the 2009 H1N1 influenza
pandemic in the US were detected in San Diego County and Imperial County in California
and Guadalupe County in Texas (7), as well as in New York City (8). This is a good
indication that although air travel connecting large hubs is critical to the spread of
pandemic influenza, cross-border movement at the land borders may also play a role. Here,
we aim to improve the formulation of the radiation model so it can incorporate the
movement of people for work, movement that may include cross-border travel by land or
by air.

There is reason to suspect that cross-border commuting patterns are inherently
different from the domestic commuting patterns modeled with the original radiation model.
Previous studies have shown that there is a real and substantial “border effect,” impeding
the flow of people and goods across international borders (9). We could therefore expect
the current radiation model, which lacks a concept of borders, to fail to accurately predict
international movements and fail to accurately model the spread of pandemic influenza on
a continental level.

In this study, we first analyze the fit of the original radiation model, then examine
the shortcomings of the original radiation model in cross-border movement, and finally
reformulate the radiation model to better account for border effects, in order to more
accurately simulate an influenza-like epidemic in the future.
'
3 Data
The population and commuter data used in this report came from three main
sources: the 2000 Mexican General Census of Housing and Population (10), the 2000
United States Census (11), and the 2001 Canada Census (12). We use data from the 2000
Census because the 2010 US census did not record the same county-to-county commuter
data. US Census county-to-county worker flow files from 2000 are widely used in spatial
models of pandemic influenza (2, 13).

The United States and Mexican censuses organize local communities into counties
and municipios, respectively. The Canadian census, however, organizes local communities
into both census subdivisions and larger census divisions, which are composed of several
census subdivisions. To select the level of organization for the Canadian data to use in this
study, we compared the population size distributions of Canadian divisions and subdivisions
to that of US counties.

Figure 2 shows the population size distributions of US counties (a), Canadian
divisions (b), and Canadian subdivisions (c). The figure demonstrates that Canadian
divisions are most similar to US counties in both mean population size and distribution.
Therefore, Canadian census divisions were chosen as our level of organization for Canada.

Figure 2. Comparison of population distribution of US counties (a), Canadian divisions (b),
Canadian subdivisions (c). The means are 89353, 104191, and 5359 for US counties, Canadian
divisions, and Canadian subdivisions, respectively. The standard deviations are 292350, 251202, and
45910 for US counties, Canadian divisions, and Canadian subdivisions, respectively.
'
Our source of commuter data was the 2000 United States Census county-to-county
worker flow files (14). To generate these data, the US Census asked participants to identify
the place they worked most frequently in the previous week. The county-to-county worker
flow data enumerate the number of people who commuted between every pair of counties
in the 50 US states and the District of Columbia in a given week. They also specify the
number of people who commuted abroad, reported as the aggregate number of commuters
to each foreign country.

4 Methods
4.1 Characterization of the radiation model
After recoding the radiation model in Python, we verified that it mirrored the model
described in Simini et al. As in Simini et al., we evaluated the predictive accuracy of the
model within the contiguous United States. In modeling commuter flows within the
contiguous US, we defined Ti to be the number of people commuting from county i to
anywhere in the contiguous US according to the county-to-county worker flow data. We
then ran the radiation model to predict the destination county for each of these workers
and compared the results with the census data.

4.2 Error in the radiation model across borders
As our ultimate goal is to model the international spread of pandemic influenza, we
studied the model’s predictions for movement across international boundaries. Specifically,
we analyzed the success of the model in predicting movements from counties in the US to
Mexico and Canada. In modeling commuter flows within the US, Mexico, and Canada, we
expanded Ti to include all people commuting from county i to anywhere in the continental
US, Mexico, and Canada according to the county-to-county worker flow data. We then ran
the radiation model to assign these commuters a specific destination county, census
division, or municipio.

For each commuter flow, we used the following formula to quantify the error in the
predictions of the radiation model.

\[ \mathrm{Error} = \log\left(\mathrm{Commuters}_{\mathrm{Data}} + 1\right) - \log\left(\mathrm{Commuters}_{\mathrm{Model}} + 1\right) \]

By logging both terms in this error measurement, we penalize small errors in
relatively small commuter flows more than the same errors in larger commuter flows. For
example, if the model predicts a commuter flow of 5 while the data shows a commuter flow
of 10, the error for this discrepancy is 0.26. Conversely, if the model predicts a commuter
flow of 1005 while the data shows a commuter flow of 1010 (the same difference), the
error is only 0.0022.
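
The two worked values above can be checked with a short script. This sketch assumes base-10 logarithms and takes the data term first; those two choices reproduce the quoted numbers, but they are inferred from the example rather than stated explicitly in the text. The function name is hypothetical.

```python
import math


def log_error(observed, predicted):
    """Log-scale error: log10(observed + 1) - log10(predicted + 1).

    Assumptions of this sketch: base-10 logs, data term first.
    Adding 1 keeps the measure defined when a flow is zero.
    """
    return math.log10(observed + 1) - math.log10(predicted + 1)


small_flow = log_error(observed=10, predicted=5)        # about 0.26
large_flow = log_error(observed=1010, predicted=1005)   # about 0.0022
print(round(small_flow, 2), round(large_flow, 4))
```

The same absolute miss of five commuters therefore counts roughly 120 times more heavily for the small flow than for the large one.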
Using this error measurement, we examined how well the radiation model predicted
movement from each US county to Mexico, Canada, and other counties in the US. While our
radiation model predicts movement from US counties to individual census divisions and
municipios in Canada and Mexico, the county-to-county worker flow data only reports total
movement to Canada and Mexico for each US county. Therefore, we aggregated the total
number of commuters to Canada and Mexico predicted by the model in order to compare it
to the data.
'
4.3 Modified radiation model
To improve the international movement predictions of the radiation model, we
modified the original model to include a concept of international borders. For our
purposes, we focused on the borders between the US and Mexico and the US and Canada.
However, our modified radiation model is applicable to any movement simulation
involving international borders.

In the original model, it is assumed that the population of each county reflects the
relative number of opportunities for employment in that county. It is also assumed that
workers from every other county in the model have the same number of opportunities for
employment in that county. These two assumptions are reasonable when modeling
commuter movement within a single country like the US because there are probably few
barriers that favor residents of one county over residents of another for employment.

When modeling commuter movement across international boundaries, however,
these assumptions begin to break down. Due to legal, cultural, and language barriers, the
employment opportunities for a foreign worker may be fewer than the opportunities
available for a domestic worker. To reflect the effect of residence country on the number of
employment opportunities available to a worker, our modified radiation model considers
the source and destination country when predicting the number of commuters between
two counties.

For our model, let I be the country of source county i and J be the country of
destination county j. As in the original model, the number of commuters between i and j is
dependent on the population of the intervening locations sij. Our model, however, partitions
sij into component terms by country, and weights each component differently. In our model
shown below, sAij is the population of country A that lies within the circle centered at i with
radius of length rij excluding the populations of i and j, where U, C, and M signify the US,
Canada, and Mexico, respectively. After partitioning sij by country, we weight each
component by γIA, which represents the fraction of work opportunities in country A that are
available to residents of country I. The equation below shows the modified radiation model
formulation, which reduces to the original model when all γ = 1.

\[ T_{ij} = T_i \, m_i \left[ \frac{1}{m_i + \gamma_{IU} s^{U}_{ij} + \gamma_{IC} s^{C}_{ij} + \gamma_{IM} s^{M}_{ij}} - \frac{1}{m_i + \gamma_{IU} s^{U}_{ij} + \gamma_{IC} s^{C}_{ij} + \gamma_{IM} s^{M}_{ij} + \gamma_{IJ} m_j} \right], \qquad T_i = \sum_{j \neq i} T_{ij} \]

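The border-weighted formulation can be sketched as follows. This is an illustration under assumptions rather than the study's own code: distances are Euclidean between hypothetical planar centroids, the weights are passed as a dictionary keyed by (source country, destination country), and any pair missing from the dictionary defaults to 1. The example data are invented.

```python
import numpy as np


def modified_radiation_flows(pop, coords, total_out, country, gamma):
    """Flows T[i, j] under the border-weighted radiation model.

    country : country code of each location, e.g. "U", "C", "M"
    gamma   : dict mapping (source country, destination country) to the
              weight gamma_IA; missing pairs default to 1 (an assumption
              of this sketch)
    """
    pop = np.asarray(pop, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(pop)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    flows = np.zeros((n, n))
    for i in range(n):
        # Apparent populations of all locations as seen from country I of i.
        weight = np.array([gamma.get((country[i], c), 1.0) for c in country])
        apparent = weight * pop
        for j in range(n):
            if i == j:
                continue
            # Weighted s_ij: apparent population within radius r_ij of i,
            # excluding i and j.
            inside = dist[i] <= dist[i, j]
            inside[[i, j]] = False
            base = pop[i] + apparent[inside].sum()
            flows[i, j] = total_out[i] * pop[i] * (
                1.0 / base - 1.0 / (base + apparent[j]))
    return flows


# Hypothetical example: two US locations and one Canadian location.
pop = [100, 50, 200]
coords = [(0, 0), (1, 0), (3, 0)]
total_out = [10, 5, 20]
country = ["U", "U", "C"]
no_border = modified_radiation_flows(pop, coords, total_out, country, {})
with_border = modified_radiation_flows(pop, coords, total_out, country,
                                       {("U", "C"): 0.5})
print(no_border[0, 2], with_border[0, 2])
```

With an empty weight dictionary every γ equals 1 and the result matches the original radiation formula; setting the US-to-Canada weight to 0.5 lowers the predicted flow to the Canadian location.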
A schematic of the effect of γ across the US-Canada border is illustrated in Figure 3.
The figure depicts movement from a county in the US to a county in Canada. The populations
of all counties in Canada are scaled by γUC to reflect the apparent populations of those
counties from the perspective of commuters from the US. These apparent populations
reflect the reduced number of job opportunities available in Canada to residents of the US.

Figure 3. Schematic of the modified radiation model. Figure depicts movement from a county with
population mi in the US to a county with population mj in Canada. The dark blue circle represents county
population mi, and the light blue circles represent county populations included in sUij. The dark red circle
represents population mj scaled by γUC, and the light red circles represent the populations included in sCij
scaled by γUC. Scaling all populations in Canada with γUC adjusts their actual populations to their apparent
populations (which are proportional to the number of job opportunities) from the perspective of workers in
the US.
$
4.3.1 Model fitting
To fit the modified radiation model to the commuter data, we estimated values for
γUU, γUC, and γUM. We assumed all work opportunities in the US were fully available to
residents of the US, so γUU equals 1. We estimated γUC and γUM by fitting the modified
radiation model using a multinomial likelihood function described in (13) and shown
below. In the likelihood function, θ represents the values for γUC and γUM, and p_{j|i}(θ)
represents the probability of a commuter travelling to county j given that the commuter
begins in county i according to the modified radiation model. N_total (the total number of
commuters in the model), N_i (the number of commuters originating in county i), N_ij (the
number of commuters traveling from county i to county j), and p_i (the fraction of
commuters originating in county i) are taken from commuter data.

\[ L = \sum_{i} N_i \ln(p_i) + \sum_{i,j} N_{ij} \ln\left(p_{j|i}(\theta)\right), \qquad N_i = \sum_{j \neq i} N_{ij}, \qquad p_i = \frac{N_i}{N_{\mathrm{total}}} \]

As in Truscott and Ferguson (2012), we constrain the model by setting the number
of workers in each county (Ti) to be the number of workers in the county-to-county worker
flow data. We maximized the multinomial likelihood using the bound-constrained variant of
the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS-B), a widely used
algorithm for parameter estimation that is implemented in the scipy package in Python (15). We
bounded each parameter to be non-negative because a county cannot have negative
population or negative job opportunities. Confidence intervals for each parameter estimate
were determined using the profile likelihood method (16). Using the parameter estimates,
we examined how well the modified radiation model predicted movement from each US
county to Mexico, Canada, and other counties in the US. We also looked at the ability of the
modified model to predict aggregate movement from each US border state to Mexico and
Canada.
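
The fitting step can be illustrated on a toy system. The sketch below is not the study's code and makes several assumptions: all populations, coordinates, and counts are synthetic; only one weight (γUC) is estimated; destination probabilities are obtained by normalising each origin's modified radiation flows to sum to 1; the term of the likelihood that does not depend on θ is dropped; and the lower bound is a tiny positive number instead of exactly zero so that the logarithm stays finite. It uses `scipy.optimize.minimize` with `method="L-BFGS-B"`.

```python
import numpy as np
from scipy.optimize import minimize

# Toy system (all values hypothetical): three US and two Canadian locations.
pop = np.array([100.0, 50.0, 200.0, 80.0, 120.0])
coords = np.array([[0, 0], [1, 0], [3, 0], [0, 2], [2, 2]], dtype=float)
is_canada = np.array([False, False, False, True, True])
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
us_origins = [0, 1, 2]


def dest_probs(i, gamma_uc):
    """p_{j|i}: modified radiation flows from US origin i, normalised."""
    # Apparent populations seen from the US (gamma_UU fixed at 1).
    apparent = np.where(is_canada, gamma_uc, 1.0) * pop
    p = np.zeros(len(pop))
    for j in range(len(pop)):
        if j == i:
            continue
        inside = dist[i] <= dist[i, j]
        inside[[i, j]] = False
        base = pop[i] + apparent[inside].sum()
        p[j] = pop[i] * (1.0 / base - 1.0 / (base + apparent[j]))
    return p / p.sum()


def neg_log_likelihood(theta, counts):
    """Negative of sum_{i,j} N_ij ln p_{j|i}(theta).

    The sum_i N_i ln(p_i) term is constant in theta and is omitted.
    """
    total = 0.0
    for row, i in enumerate(us_origins):
        p = dest_probs(i, theta[0])
        others = np.arange(len(pop)) != i
        total -= np.sum(counts[row, others] * np.log(p[others]))
    return total


# Synthetic "observed" counts: expected flows under a known gamma_UC.
true_gamma = 0.05
counts = np.array([10_000 * dest_probs(i, true_gamma) for i in us_origins])

fit = minimize(neg_log_likelihood, x0=[0.5], args=(counts,),
               method="L-BFGS-B", bounds=[(1e-9, None)])
gamma_hat = fit.x[0]
print(gamma_hat)
```

Because the synthetic counts are generated from the model itself, the estimate should land close to the value used to generate them; with real census data the fit would also be judged by the confidence intervals described above.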
'
5 Results
5.1 Characterization of the original radiation model
As shown in Simini et al., there is fairly good correlation between the predictions of
the radiation model and the county-to-county worker flow data within the contiguous US
(Figure 4). Additionally, the model accurately predicts the probability of a trip between two
counties given the distance between those counties (Figure 5).

Figure 4. Comparing the predicted number of travelers with observed number of travelers. Each
point represents an ordered pair of counties (i, j) in the contiguous US. A point’s x-value is the number of
commuters traveling from i to j in the county-to-county worker flow data. A point’s y-value is the number of
commuters traveling from i to j as predicted by the radiation model.
'
Figure 5. Probability of a trip between two counties in the contiguous US that are at distance r (in
km) from each other. The probability of a trip peaks at approximately 40 km, as shown in Simini et al.
'
A key feature of the radiation model is its consideration of the population density of
surrounding counties. If surrounding counties are densely populated, workers do not have
to travel as far to find work and therefore commute within a smaller radius. This
phenomenon is apparent in both the US Census data and the model predictions, as
demonstrated in Figure 6. The figure shows the output of the radiation model for Davis
County, Utah and Northampton County, Pennsylvania. While Davis and Northampton Counties
have similar population size and number of commuters, the model correctly forecasts very
different commuting patterns due to the different population densities of surrounding
counties. As Northampton County is in the densely populated Northeast, workers do not
have to commute as far to find jobs as workers in the sparsely populated Mountain West.

Figure 6. US Census county-to-county worker flow data and radiation model predictions for similar
counties. The populations of Davis and Northampton Counties are 267066 and 238994, respectively.
$
5.2 Error in the radiation model across borders
For each county in the US, we compared the predictions of the radiation model with
the number of commuters to Mexico, Canada, and the US in the county-to-county worker
flow data, as shown in Figure 7. The model over-predicts movement from counties colored
orange, and under-predicts movement from counties colored green. As the original
radiation model lacks a concept of borders, there is understandably a large difference
between the census data and the results of a continent-level simulation using the radiation
model. As evident from Figure 7, the largest errors in the cross-border predictions of the
radiation model occur along borders. In general, the model over-predicts movement to
Mexico from US counties near Mexico and over-predicts movement to Canada from US
counties near Canada. This over-prediction is indicated by the orange shading of counties
near the Mexican border in Figure 7a and the orange shading of counties near the Canadian
border in Figure 7b. This over-prediction of international commutes leads to an
under-prediction of domestic commutes from counties close to an international border, as
indicated by the green shading of counties near the Mexican and Canadian borders in
Figure 7c.

Figure 7. Error in cross-border predictions of unmodified radiation model. Each map presents the
error of unmodified radiation model in predicting commuter flow from each county. Figure 7a shows the
error in predictions of commuter flow from each county in the US across the border to Mexico. Figure 7b
shows the error in predictions of commuter flow from each county in the US across the border to Canada.
Figure 7c shows the error in predictions of commuter flow from each county in the US to rest of the US.
Orange represents an over-prediction, green represents an under-prediction, and white represents an
accurate prediction.
'
'
5.3 Modified radiation model
5.3.1 Parameter estimates
Parameter estimates for the modified radiation model are shown in Table 1. The
parameter estimate for Canada (γUC) is 0.0122, and the parameter estimate for Mexico (γUM)
is 0.0423. These values mean that relative to the US, only 1.2% and 4.2% of the
employment opportunities in Canada and Mexico, respectively, are available to residents of
the US. Looked at another way, the apparent populations of counties in Canada or Mexico
are 1.2% and 4.2% of their actual populations, when considering their relative
attractiveness to commuters from the US. Interestingly, job opportunities in Mexico are
more than three times as attractive as opportunities in Canada.

Parameter    Estimate    95% CI
γUC          0.0122      (0.0199, 0.0124)
γUM          0.0423      (0.0417, 0.0429)
Table 1. Parameter estimates with 95% confidence intervals for the modified radiation model.
'
5.3.2 Fit of the modified radiation model
Using these parameter estimates, we examined how well the modified radiation
model predicts movement from each US county to Mexico, Canada, and other counties in
the US. Figure 8 depicts the error in the predictions of the modified radiation model. The
model over-predicts movement from counties colored orange, and under-predicts movement
from counties colored green. Most of the error of the original radiation model that is
depicted in Figure 7 is corrected by our modifications to the model. While the original
model over-predicts movement to Mexico and Canada from most areas of the US, by
accounting for the effects of international borders, our model predicts movement patterns
that are much more comparable with the commuter data.

The modified model most markedly reduces the over-prediction of international
travel from US counties near international borders and population centers on the East and
West Coasts. Although our modifications to the radiation model result in the
under-prediction of movement from some counties (counties colored green), this error is
minimal compared to the correction of over-prediction.

Figure 8. Error in cross-border predictions of modified radiation model. Each map presents the error
of the modified radiation model in predicting commuter flow from each county. Figure 8a shows the error in
predictions of commuter flow from each county in the US across the border to Mexico. Figure 8b shows
the error in predictions of commuter flow from each county in the US across the border to Canada. Figure
8c shows the error in predictions of commuter flow from each county in the US to rest of the US. Orange
represents an over-prediction, green represents an under-prediction, and white represents an accurate
prediction.
'
We collate the county-level predictions to examine the fit of the model on a state level.
Figure 9 compares the number of commuters from US border states to Mexico in the US
census data (blue), the unmodified radiation model (yellow), and the modified radiation
model (green). The modified radiation model significantly outperforms the original
radiation model by accounting for barriers to movement. The continued over-prediction of
commuters from New Mexico could stem from the relatively lower number of border
crossings on the New Mexico border. Figure 10, similarly, compares the performance of the
original and modified radiation models for states that border Canada. Again, the modified
radiation model outperforms the original radiation model, often by several orders of
magnitude.

Figure 9. Travelers to Mexico from US states that border Mexico. The modified radiation model
outperforms the unmodified model in predicting international movement from border states. The total
number of commuters from each state (i.e. sum of each county in the state) to Mexico is shown. Blue is
the US Census data, yellow is the unmodified radiation model, and green is the modified radiation model.
'
Figure 10. Travelers to Canada from US states that border Canada. The modified radiation model
outperforms the unmodified model in predicting international movement from border states. The total
number of commuters from each state (i.e. sum of each county in the state) to Canada is shown. Blue is
the US Census data, yellow is the unmodified radiation model, and green is the modified radiation model.
'
5.4 Remaining error in model
While analyzing the original radiation model, we also found a notable error in its
prediction of movement within the US. For counties with large populations, the model
under-predicts movement to neighboring counties and over-predicts movement to more
distant counties. Figure 11 shows the error of the unmodified radiation model in predicting
commuter flow from three US counties to neighboring counties (Figure 11a-c). Next to each
map, for each highlighted county, the distance to all other US counties is plotted against the
error in the predicted commuter flow to those counties (Figure 11d-f). The map figures
show that more workers commute from large cities like Atlanta and Minneapolis to the
surrounding counties than the original radiation model predicts. This under-prediction is
shown by the areas of green close to the highlighted counties. The map figures also show
that fewer workers commute from large cities to more distant counties than the model
predicts. This over-prediction is shown by the areas of orange further from the highlighted
counties. The corresponding graphs demonstrate this relationship: slight under-prediction
at near distances and over-prediction at increased distance. The panel on the right (g) plots
population size against the correlation between error and distance for each US county. This
demonstrates that the distance-dependent error is less pronounced for counties with small
populations. This error is retained in our modified radiation model and will be subject to
future analysis.

Figure 11. Relationship between county population and errors for the unmodified radiation model
within the US. Each map presents the error of the unmodified radiation model in predicting commuter
flow from the highlighted county to neighboring counties (a-c). Orange represents an over-prediction,
green represents an under-prediction, and white represents an accurate prediction. The graph next to
each map plots the highlighted county’s distance to all other US counties against the error in the
predicted commuter flow to those counties (d-f). The graph on the right plots the population of a county
against that county’s distance vs. error correlation coefficient (g). The unmodified radiation model exhibits
a pattern of under-predicting movement from populous counties to neighboring counties and
over-predicting movement from populous counties to more distant counties.

6 Discussion
This thesis presents an examination of the radiation model, recently proposed to
model human movements between counties in the US. We explore the fit of the model, and
extend its predictions across international borders. We quantify the fit of the model, and
develop a new formulation to account for variation in job opportunities outside of the US,
available to residents of the US. Our modified radiation model preserves the simplicity of
the original radiation model while improving the fit to international commuter data. While
our modified model requires the estimation of additional parameters (i.e. γUC and γUM), we
have shown that these parameters can be simply estimated using commuter data. Our
estimated values for γUC and γUM were between 1 and 5%, indicating that similar
percentages of the job opportunities in Canada and Mexico are available to residents of the
US. By taking this effect into account, the modified radiation model corrected the tendency
of the original model to over-predict cross-border movement. The model fit is particularly
improved for counties near international borders.

With our more accurate model for cross-border movement, we aim to better
understand commuting patterns within North America. As we believe commuting patterns
are linked to the spread of influenza, a more complete understanding of where commuters
travel can lead to an improved simulation of an international influenza pandemic. We hope
to use our modified radiation model formulation to simulate the county-to-county spread
of the 2009 H1N1 pandemic across Mexico, the US, and Canada. Such a simulation could
prove useful in responding to future influenza-like pandemics.

This study was limited by the availability of quality commuter flow data. While
county-to-county commuter flow data was available for the US, commuter data that
includes international commutes was unavailable for Canada and Mexico. This lack of data
prevented the estimation of model parameters for commuter flows originating in Mexico
and Canada. Another limitation was the scope of the movement data. The data reports
movement for work, leaving out other types of movement, such as movement for
recreation. We hope to infer these unavailable data in further work on this system.

In the future, we could further improve the modified radiation model by
adjustments to account for the pattern of movement seen around counties with large
populations. This error was demonstrated in Figure 11. Such modifications may not have to
involve the addition of extra parameters because the pattern is strongly tied to distance
and population, quantities already present in the model.

7 Acknowledgements

I would like to thank and acknowledge Dr. Rosalind Eggo of the Meyers Lab for her
hours of guidance and collaboration on this project. Without her direction and teaching,
this project would not have been possible. I would also like to thank Dr. Ravi Srinivasan for
his contributions to the modified radiation model formulation.

8 References
1. Coburn, Brian J., Bradley G. Wagner, and Sally Blower. "Modeling influenza
   epidemics and pandemics: insights into the future of swine flu (H1N1)." BMC
   Medicine 7, no. 30 (2009). http://www.biomedcentral.com/1741-7015/7/30
   (accessed February 20, 2014).
2. Viboud, Cécile, Ottar N. Bjørnstad, David L. Smith, Lone Simonsen, Mark A. Miller,
   and Bryan T. Grenfell. "Synchrony, Waves, and Spatial Hierarchies in the Spread of
   Influenza." Science 312 (2006): 447-451.
3. Zipf, G. K. "The P1P2/D hypothesis: on the intercity movement of persons." Am.
   Sociol. Rev. 11 (1946): 677-686.
4. Simini, Filippo, Marta C. González, Amos Maritan, and Albert-László Barabási. "A
   universal model for mobility and migration patterns." Nature 484 (2012): 96-100.
5. Stouffer, Samuel A. "Intervening Opportunities: A Theory Relating Mobility and
   Distance." American Sociological Review 5, no. 6 (1940): 845.
6. Grais, Rebecca et al. "Assessing the impact of airline travel on the geographic spread
   of pandemic influenza." European Journal of Epidemiology 19, no. 4 (2003): 395.
7. Ginsberg, M. et al. "Update: swine influenza A (H1N1) infections--California and
   Texas, April 2009." Morbidity and Mortality Weekly Report 58, no. 16 (2009):
   435-437.
8. Lessler, Justin, Nicholas G. Reich, and Derek A. T. Cummings. "Outbreak of 2009
   Pandemic Influenza A (H1N1) at a New York City School." New England Journal of
   Medicine 361, no. 27 (2009): 2628-2636.
9. Helliwell, John F. "National Borders, Trade and Migration." Pacific Economic Review
   2, no. 3 (1997): 165-185.
10. Instituto Nacional de Estadística y Geografía. "Censo General de Población y
    Vivienda 2000."
    http://www.inegi.org.mx/est/contenidos/proyectos/ccpv/cpv2000/ (accessed
    September 1, 2013).
11. United States Census Bureau. "Census 2000 Gateway."
    https://www.census.gov/main/www/cen2000.html (accessed September 1, 2013).
12. Statistics Canada. "2001 Census of Canada."
    http://www12.statcan.ca/english/census01/ (accessed September 1, 2013).
13. Truscott, James, and Neil M. Ferguson. "Evaluating the Adequacy of Gravity Models
    as a Description of Human Mobility for Epidemic Modeling." PLoS Computational
    Biology 8, no. 10 (2012): 1-12.
14. United States Census Bureau. "County-To-County Worker Flow Files." United States
    Census 2000. https://www.census.gov/population/www/cen2000/commuting/
    (accessed April 13, 2014).
15. SciPy v0.13.0 Reference Guide.
    http://docs.scipy.org/doc/scipy-0.13.0/reference/generated/scipy.optimize.minimize.html
    (accessed April 12, 2014).
16. Patterson, David. "Profile Likelihood Confidence Intervals for GLM's." University of
    Montana Department of Mathematical Sciences.
    http://www.math.umt.edu/patterson/ProfileLikelihoodCI.pdf (accessed April 12,
    2014).