
EMPLOYING THE EU METHODOLOGY
TO DEFINE LABOUR MARKET AREAS
IN POLAND
Statistical Office in Bydgoszcz
Luxembourg, 2017.06.28
Schedule
o Data source for LMAs
o Defining LMAs in Poland
o Choosing the final set of parameters
o Fine tuning of LMAs
o Final LMA division for Poland
o Statistics by LMAs for years 2011-2014
o Problems
o Overall project experience – lessons learned
o Main challenges
o Changes to method or terminology used
o Presenting systematically data on LMAs
o Future plans regarding the LMAs
Data source for LMAs
In Poland, the data source for delineating Labour Market Areas (LMAs) is the 2011 Population Census. Data are
available at the gmina (LAU-2) level.
National Census of Population and Housing 2011 in Poland:
o decision to use the “mixed” method – direct interview and 28 administrative sources
o use of public administration registers and information systems as the census-data source
o internet self-enumeration
o the first general statistical survey carried out only with the use of electronic questionnaires
o census carried out using hand-held terminals, on the basis of electronic forms - paper forms completely dropped
o use of GIS (Geographic Information Systems) tools
Defining LMAs in Poland
o data from Population Census 2011 based on registers
o persons aged 15 years and more
o insurance code as the criterion of being employed or not
o excluding persons who are not employed, who work abroad, or whose place of work cannot be
determined from registers
o specifying for each employed person two LAU-2 codes: living_code and working_code
o for all farmers living_code=working_code
o for persons who did not declare travelling to work in tax registry living_code=working_code
o creating a matrix of commuting flows between living_code and working_code (around 278 000
links)
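The construction of the flow matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field order and the LAU-2 codes are hypothetical and do not come from the census files.

```python
from collections import Counter

# Hypothetical records: (living_code, working_code, is_farmer, declared_commuting).
persons = [
    ("0401011", "0461011", False, True),
    ("0401011", "0461011", False, True),
    ("0401011", "0401011", False, True),
    ("0401022", "0461011", True, True),    # farmer: treated as working at home
    ("0401022", "0461011", False, False),  # no commuting declared in the tax registry
]


def build_flow_matrix(records):
    """Count employed persons for every (living_code, working_code) pair."""
    flows = Counter()
    for living, working, is_farmer, declared in records:
        if is_farmer or not declared:
            working = living  # rule from the slide: living_code = working_code
        flows[(living, working)] += 1
    return flows


flows = build_flow_matrix(persons)
for (living, working), n in sorted(flows.items()):
    print(living, working, n)
```

Storing only the non-zero pairs keeps the structure sparse, which matters when about 278 000 links are spread over roughly 3 000 building blocks.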
Choosing the final set of parameters
Selecting possible values of parameters on the basis of area, population, density of
population and number of LAU-2s in Great Britain, Italy and Poland.
First tests performed for the following values of parameters:
• minSZ = {1 000, 2 000, 3 000, 3 500, 4 000, 5 000}
• tarSZ = {7 500, 10 000, 15 000, 16 000, 17 000, 18 000, 19 000, 20 000, 35 000}
• minSC = {0.5, 0.55, 0.6, 0.667}
• tarSC = {0.667, 0.7, 0.75, 0.85, 0.9}
Choosing the final set of parameters – basic data on Italy, Great Britain and Poland in 2011
                                              Italy       Great Britain  Poland
Population (persons)                          59,433,744  63,182,180     38,044,565
Population of 15 years & more (persons)       51,107,701  52,082,285     32,262,995
Economically active (persons)                 25,985,295  32,442,335     17,576,246
Employed (persons)                            23,017,840  30,008,635     15,443,421
Unemployed (persons)                          2,967,455   2,433,705      2,132,825
Area (km2)                                    302,073     248,528        312,679
Density of population (persons/km2)           197         254            122
Number of building blocks                     8,092       10,399         3,081
Minimal size of LMA (working residents)       1,000       3,500          4,000
Target size of LMA (working residents)        10,000      25,000         30,000
Minimal self-containment of LMA               0.6         0.667          0.667
Target self-containment of LMA                0.75        0.75           0.8
Number of LMAs                                611         228            339
Average population in an LMA (persons)        97,273      277,114        112,226
Average area of an LMA (km2)                  494         1,090          922
Average number of building blocks in an LMA   13          46             9.1
Source: own work on the basis of Eurostat, Europa.eu portal, Istat, INSEE, Office for National Statistics
Choosing the final set of parameters
o a single test with the EURO_script_Eurostat R program initially took about one week, and about 30 hours after a new version of the program was introduced
o the results were not satisfactory
o revision of the method for deciding whether persons travel to work (on the basis of the tax register), and thereby revision of the flow matrix
o decision not to use a minimal self-containment lower than 0.6 and to use a target self-containment equal to or higher than 0.75
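The self-containment thresholds are easier to read with a small worked example. The sketch below assumes the usual definitions (supply side: residents who also work in the area divided by all employed residents; demand side: the same numerator divided by all jobs located in the area). The flow dictionary and area codes are hypothetical.

```python
def self_containment(flows, area):
    """Return (supply_side, demand_side) self-containment of a set of gminas.

    flows maps (living_code, working_code) to a number of commuters.
    Assumes the area has at least one employed resident and one job.
    """
    internal = sum(n for (home, work), n in flows.items()
                   if home in area and work in area)
    residents = sum(n for (home, _), n in flows.items() if home in area)
    jobs = sum(n for (_, work), n in flows.items() if work in area)
    return internal / residents, internal / jobs


# Hypothetical flows between three gminas A, B and C.
flows = {
    ("A", "A"): 600, ("A", "B"): 150, ("A", "C"): 250,
    ("B", "A"): 50, ("B", "B"): 200,
    ("C", "A"): 100, ("C", "C"): 900,
}
supply, demand = self_containment(flows, {"A", "B"})
print(round(supply, 3), round(demand, 3))
```

In this example 1 000 of the 1 250 employed residents of the area {A, B} also work inside it, so the supply-side value is 0.8, which would clear a target self-containment of 0.75.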
Choosing the final set of parameters
o improvement of efficiency after receiving the LabourMarketAreas R package: a single test took about 30 minutes
o decision to rerun tests for some sets of parameters
o testing R-package (version 2.0) with 144 sets of parameters:
• minSZ = {2000, 3000, 4000, 5000}
• tarSZ = {15000, 20000, 25000, 30000}
• minSC = {0.6, 0.667, 0.7}
• tarSC = {0.75, 0.8, 0.85}
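The figure of 144 sets follows from combining every listed value with every other (4 × 4 × 3 × 3). A minimal Python sketch of that enumeration is given below; the dictionary keys simply reuse the parameter names from the slide, and the tests themselves were of course run in R.

```python
from itertools import product

min_sz = [2000, 3000, 4000, 5000]
tar_sz = [15000, 20000, 25000, 30000]
min_sc = [0.6, 0.667, 0.7]
tar_sc = [0.75, 0.8, 0.85]

# One dictionary per parameter set, covering the full Cartesian product.
parameter_sets = [
    {"minSZ": a, "tarSZ": b, "minSC": c, "tarSC": d}
    for a, b, c, d in product(min_sz, tar_sz, min_sc, tar_sc)
]
print(len(parameter_sets))
```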
Choosing the final set of parameters
o sensitivity analysis – checking and comparing sets of parameters
using functions StatClusterData, StatReserveList, CompareLMAsStat
o analysis of the results and maps focusing on LMAs covering area of
three voivodships and their capital cities: kujawsko-pomorskie
(Bydgoszcz), mazowieckie (Warsaw) and wielkopolskie (Poznan)
o analysis of non-contiguous LMAs
Choosing the final set of parameters – avoiding red and choosing green
minSZ  minSC  tarSZ  tarSC  NbClusters  NbClusterUniqueCom  NbClustersValidLess1  NbClustersNoCentralCom  Mean.SC_demand_side  Mean.SC_supply_side  Q_modularity
2000   0.6    15000  0.85   458         15                  3                     99                      0.884012651          0.799462149          0.812536787
5000   0.7    25000  0.85   297         1                   3                     29                      0.905749151          0.829071858          0.834588096
2000   0.6    30000  0.75   578         87                  2                     204                     0.88673945           0.795285419          0.811994642
3000   0.6    15000  0.75   541         28                  2                     152                     0.877838798          0.787791873          0.807899561
4000   0.6    20000  0.8    411         6                   2                     73                      0.887612354          0.802967279          0.815828962
4000   0.6    25000  0.8    395         6                   2                     64                      0.888498994          0.805808398          0.817310075
4000   0.6    30000  0.8    379         6                   2                     57                      0.891076593          0.809504207          0.818885466
5000   0.6    15000  0.8    416         5                   2                     72                      0.884710529          0.799443875          0.814297011
5000   0.6    20000  0.8    395         5                   2                     63                      0.887551542          0.803656985          0.816252121
5000   0.6    25000  0.8    378         5                   2                     53                      0.888613396          0.807015501          0.81772254
5000   0.6    30000  0.8    362         4                   2                     48                      0.891224305          0.810810902          0.819551946
2000   0.6    20000  0.85   421         13                  2                     81                      0.888891394          0.805554466          0.815827063
2000   0.6    25000  0.85   392         12                  2                     66                      0.890491566          0.809705013          0.817556207
2000   0.6    30000  0.85   367         9                   2                     56                      0.893683147          0.814748842          0.820458267
3000   0.6    15000  0.85   432         6                   2                     82                      0.885081731          0.799788029          0.813644712
3000   0.6    20000  0.85   395         6                   2                     65                      0.888504763          0.806104544          0.816869353
3000   0.6    25000  0.85   377         5                   2                     55                      0.88921568           0.809425422          0.817300291
3000   0.6    30000  0.85   355         4                   2                     50                      0.893571671          0.814438906          0.820842944
4000   0.6    15000  0.85   414         5                   2                     70                      0.884938021          0.800464766          0.814458133
4000   0.6    20000  0.85   378         4                   2                     55                      0.888972821          0.806982745          0.817540083
4000   0.6    25000  0.85   367         4                   2                     49                      0.889182826          0.809210347          0.817552741
4000   0.6    30000  0.85   343         4                   2                     43                      0.893537957          0.814648492          0.821336205
5000   0.6    15000  0.85   402         4                   2                     61                      0.885446286          0.801542683          0.814823836
5000   0.6    20000  0.85   372         4                   2                     53                      0.889017021          0.807150615          0.817679336
4000   0.667  30000  0.8    339         2                   1                     40                      0.900856575          0.822158784          0.82960862
5000   0.7    20000  0.85   306         1                   2                     30                      0.904710737          0.827328628          0.834115522
5000   0.7    30000  0.85   283         1                   2                     25                      0.907130651          0.832684872          0.836898716
2000   0.667  15000  0.85   389         10                  0                     65                      0.897677683          0.815997386          0.827606724
2000   0.667  20000  0.85   359         7                   0                     52                      0.899615088          0.820393371          0.828361146
3000   0.667  15000  0.85   370         2                   0                     50                      0.896970723          0.816249045          0.827872361
3000   0.667  20000  0.85   347         1                   0                     45                      0.90010782           0.820707797          0.828934171
4000   0.667  15000  0.85   358         1                   0                     45                      0.898127559          0.816768822          0.828427285
4000   0.667  20000  0.85   334         1                   0                     39                      0.900361058          0.821094934          0.829349143
5000   0.667  15000  0.85   352         1                   0                     43                      0.898067766          0.81678393           0.828656275
5000   0.667  20000  0.85   329         1                   0                     35                      0.899798503          0.820987274          0.828437912
Choosing the final set of parameters
Rank analysis (method introduced by Zdzisław Hellwig) of the sets of parameters according to the following
characteristics ([-] – negative correlation, [+] – positive correlation):
o number of communities in the reserve list [-]
o percentage of clusters with a unique community [-]
o percentage of clusters with validity smaller than 1 [-]
o percentage of clusters with no communities having a centrality measure greater than 1 [-]
o mean of the demand-side self-containment of the clusters in the partition [+]
o mean of the supply-side self-containment of the clusters in the partition [+]
o median of the percentage of internal flows of the LMA between different communities (excluding flows having
the same node as origin and destination) with regard to the total internal flows [+]
o median of the ratio between the number of links between communities inside the LMA (excluding links of a
community with itself) and the maximum number of possible links [+]
o Q_modularity index [+]
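One common formulation of Hellwig's approach builds a synthetic measure from the distance of each object to an ideal "pattern". The sketch below follows that textbook variant and is an assumption about the details: the slides do not state which variant or which weights were used, and the three candidate rows are hypothetical values, not results from the tests.

```python
from math import sqrt
from statistics import mean, pstdev


def hellwig_scores(rows, stimulant, weights=None):
    """Hellwig-style synthetic measure; a higher score means a better object.

    rows      : one tuple of characteristics per parameter set
    stimulant : True for [+] characteristics, False for [-] ones
    weights   : optional weight per characteristic (assumed equal by default)
    Assumes every characteristic varies across the rows (non-zero std dev).
    """
    weights = weights or [1.0] * len(stimulant)
    # Standardise every characteristic (column).
    z_cols = []
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)
        z_cols.append([(x - mu) / sigma for x in col])
    # Pattern object: best standardised value of each characteristic.
    pattern = [max(z) if s else min(z) for z, s in zip(z_cols, stimulant)]
    # Weighted Euclidean distance of every object from the pattern.
    dist = [
        sqrt(sum(w * (z - p) ** 2 for z, p, w in zip(z_row, pattern, weights)))
        for z_row in zip(*z_cols)
    ]
    d0 = mean(dist) + 2 * pstdev(dist)
    return [1 - d / d0 for d in dist]


# Hypothetical candidates: (% clusters with validity < 1, mean SC demand, Q).
candidates = [(0.7, 0.884, 0.813), (0.3, 0.901, 0.830), (0.5, 0.887, 0.812)]
scores = hellwig_scores(candidates, stimulant=[False, True, True])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Repeating the call with different `weights` lists gives one ranking per weighting scheme, which is how several rankings can be compared for stability.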
Choosing the final set of parameters
Four combinations of weights were created. In each of the rankings the following set of
parameters turned out to be in the first place:
minSZ = 4000
tarSZ = 30000
minSC = 0.667
tarSC = 0.8
Choosing the final set of parameters
Choosing a proper set of parameters is one of the hardest parts of
defining LMAs – there is no unequivocal method.
Different analyses may not give the same results.
In Poland, the sensitivity analysis and the rank analysis gave similar
results, whereas the analysis of voivodships gave a slightly different set
of parameters for every voivodship.
Choosing the final set of parameters
minSZ = 4 000
tarSZ = 30 000
minSC = 0.667
tarSC = 0.8
Fine tuning of LMAs
Fine tuning was performed manually because more LMAs needed
correction than the FineTuning function indicated
(e.g. LMAs with holes).
Fine tuning of LMAs – LMA 1613
o One special case – LMA 1613:
LMA 1613 consisted of two distant parts:
• part one – 9636 residents in 4 gminas
• part two – 7795 residents in 3 gminas.
The first idea was to split this LMA into two LMAs, but the SC of
part two was lower than minSC, so part two could not
become a valid LMA.
For part one, the X-equation was not met.
Fine tuning of LMAs – LMA 1613
It was decided to count the attracting gminas in each part (using the centrality index $C_k$)
and to dissolve the part with the lower count.
$$C_k = \frac{f_{*k} - f_{kk}}{f_{k*} - f_{kk}}$$
where $f_{*k}$ denotes flows from all gminas to gmina $k$, $f_{k*}$ flows from gmina $k$ to all gminas, and $f_{kk}$ flows within gmina $k$.
If $C_k > 1$, then gmina $k$ is attracting.
Part one had 2 attracting gminas and part two had only 1.
Based on this procedure, part one became the autonomous LMA 1613. Part
two was split into its gminas, and each of these gminas was assigned to one of the neighbouring LMAs
according to the cohesion measure.
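The centrality index is the ratio of incoming to outgoing commuters of a gmina, with flows inside the gmina excluded on both sides. A minimal sketch is shown below; the gmina labels and flows are hypothetical, and the handling of a gmina with no outgoing commuters is an assumption made only to avoid a division by zero.

```python
def centrality(flows, k):
    """Centrality index of gmina k: incoming over outgoing commuters.

    Flows within k are excluded from both sums, as in the formula.
    """
    inflow = sum(n for (home, work), n in flows.items()
                 if work == k and home != k)
    outflow = sum(n for (home, work), n in flows.items()
                  if home == k and work != k)
    if outflow == 0:  # assumption: no out-commuters at all
        return float("inf") if inflow else 0.0
    return inflow / outflow


# Hypothetical flows (living gmina, working gmina) -> commuters.
flows = {
    ("G1", "G1"): 500, ("G2", "G1"): 120, ("G3", "G1"): 80,
    ("G1", "G2"): 40, ("G2", "G2"): 300, ("G3", "G2"): 20,
    ("G1", "G3"): 10, ("G2", "G3"): 15, ("G3", "G3"): 200,
}
attracting = [g for g in ("G1", "G2", "G3") if centrality(flows, g) > 1]
print(attracting)
```

Counting the entries of `attracting` separately for each part of a non-contiguous LMA gives the comparison used in the LMA 1613 case.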
Final LMA division for Poland
number of LMAs                                                  339
mean self-containment                                           0.816
mean size                                                       41,818
mean number of gminas forming the LMA                           9.1
mean validity                                                   1.12
number of LMAs with validity < 1                                1
number of links between LMAs                                    46,167
number of LMAs with no gminas having a centrality measure > 1   41
mean SC_demand_side                                             0.902
std SC_demand_side                                              0.050
mean SC_supply_side                                             0.822
std SC_supply_side                                              0.055
Statistics by LMAs for years 2011-2014
Registered unemployed to working age population ratio by labour market areas
LMA                     2011     2012     2013     2014
POLAND                  8.01%    8.68%    8.84%    7.53%
1 BOLESŁAWIEC           7.22%    8.18%    7.83%    5.81%
9 DZIERŻONIÓW           9.05%    9.62%    9.42%    7.31%
16 GŁOGÓW               7.55%    8.23%    7.94%    7.38%
28 JAWOR                11.99%   12.88%   12.36%   9.40%
44 KAMIENNA GÓRA        10.97%   11.79%   10.56%   8.04%
50 KŁODZKO              12.58%   13.96%   14.28%   12.03%
78 LUBAŃ                11.85%   12.20%   11.77%   9.60%
87 LUBIN                6.14%    6.60%    6.69%    5.54%
104 MILICZ              9.24%    9.86%    10.22%   8.17%
106 OLEŚNICA            7.99%    8.73%    8.79%    6.53%
118 OŁAWA               7.40%    8.57%    9.23%    6.96%
145 ŚWIDNICA            6.76%    8.24%    7.61%    6.01%
176 WAŁBRZYCH           9.69%    10.71%   10.46%   8.37%
180 WOŁÓW               10.97%   12.44%   12.07%   9.83%
199 ZĄBKOWICE ŚLĄSKIE   11.51%   12.21%   12.05%   10.13%
206 ZGORZELEC           6.51%    7.38%    7.66%    6.23%
216 ZŁOTORYJA           13.40%   13.89%   14.64%   12.26%
Source: own work
[Map: Employed to working age population ratio in 2011]
[Map: Registered unemployed to working age population ratio in 2011]
Problems with data source
o links between gminas situated at opposite edges
of the country
o frequent cases of a small number of flows
between remote gminas
Problems with data source
Possible causes:
o in the insurance registry, people are recorded as working at the company headquarters rather than at
their actual place of work
o data errors (e.g. using names of localities instead of unique locality identifiers)
Possible solutions in future:
o analysing the distance between gminas and eliminating insignificant links between gminas situated
too far from each other
o introducing a threshold for the number of flows between gmina_live and gmina_work, dependent on
the number of residents, and eliminating links beneath the threshold
We intend to test these solutions after the next Population Census.
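Both proposed remedies amount to filtering the flow matrix before the algorithm is run. The sketch below combines them purely for illustration: the threshold values, the distance table and the function name are hypothetical, since the slides leave the actual rules to be tested after the next census.

```python
def filter_links(flows, residents, distance_km, max_km=150.0,
                 min_far_flow=10, min_share=0.005):
    """Drop commuting links that look insignificant.

    A link home -> work is dropped when
      * its flow is below min_share * residents[home] (size-dependent rule), or
      * the gminas are more than max_km apart and the flow is below min_far_flow.
    Flows inside a gmina (home == work) are always kept.
    """
    kept = {}
    for (home, work), n in flows.items():
        if home != work:
            too_small = n < min_share * residents[home]
            far_and_weak = (distance_km[(home, work)] > max_km
                            and n < min_far_flow)
            if too_small or far_and_weak:
                continue
        kept[(home, work)] = n
    return kept


# Hypothetical data: gmina Z lies at the other edge of the country.
flows = {("A", "A"): 900, ("A", "B"): 60, ("A", "Z"): 8,
         ("B", "A"): 2, ("B", "B"): 400}
residents = {"A": 1000, "B": 500}
distance_km = {("A", "B"): 12.0, ("A", "Z"): 480.0, ("B", "A"): 12.0}

kept = filter_links(flows, residents, distance_km)
print(sorted(kept))
```

The sketch simply discards the filtered commuters. Whether they should instead be reassigned, for example to the gmina of residence, is a separate design decision that the slides do not address.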
Other problems
o building blocks that are too big (average area of a gmina
about 101.5 km2)
o very diverse number of residents in gminas
(minimum = 164, maximum = 726 245)
o administrative islands
The decision was to accept such situations.
Other problems
o 3 different territorial identifiers for each
urban-rural gmina:
one for town
one for rural area
one for town and rural area together
Usually we treat urban-rural gminas as two
different gminas, but in some datasets they are
treated as one.
Overall project experience – lessons
learned
o the general rule: minimal self-containment equal to or higher than 0.6
and target self-containment equal to or higher than 0.75
o proposals of particular quality measures for choosing the final set of
parameters
o the solution: treating urban-rural gminas separately in the
algorithm and ensuring contiguity in the fine tuning process
o possibility of using the centrality index in the fine tuning process
Main challenges
o choosing the optimal set of parameter values – no unequivocal
method
o urban-rural gminas in defining LMAs
o problems with defining the actual place of work
o single cases of non-contiguous LMAs and LMAs with holes
Changes to method or terminology used
o unintuitive name of size parameter
o introducing a threshold of number of flows between
gmina_live and gmina_work dependent on number of
residents and eliminating links beneath the threshold
Presenting systematically data on LMAs
o Survey-based data concerning employment in enterprises of 10 persons and more are available
annually at LAU-2 level. A breakdown by NACE rev. 2 economic activity sections is possible at the
LAU-2 level, but due to statistical confidentiality it is recommended to group the NACE sections.
o The number of unemployed persons is available at LAU-2 level twice a year from the data of the
Ministry of Family, Labour and Social Policy.
o Since 2016, data concerning the employed and unemployed, with unique identifiers from the National
Insurance System, have been available four times a year at the LAU-2 level.
o Both the employed and the unemployed may be presented by gender, age and
residence. Additionally, the employed can be presented by NACE rev. 2 economic
activity sections, wages and size of the enterprise.
All data available at LAU-2 level can be obtained at LMA level.
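Moving from LAU-2 to LMA level is a plain aggregation over a gmina-to-LMA lookup. The sketch below illustrates it for the ratio of registered unemployed to working age population; the gmina codes, the counts and their assignment to LMAs are hypothetical.

```python
from collections import defaultdict


def aggregate_to_lma(values_by_gmina, lma_of):
    """Sum a LAU-2 level count over the gminas that form each LMA."""
    totals = defaultdict(int)
    for gmina, value in values_by_gmina.items():
        totals[lma_of[gmina]] += value
    return dict(totals)


# Hypothetical lookup and counts for three gminas in two LMAs.
lma_of = {"g1": 1, "g2": 1, "g3": 9}
unemployed = {"g1": 120, "g2": 80, "g3": 300}
working_age = {"g1": 1500, "g2": 1000, "g3": 3000}

unemployed_lma = aggregate_to_lma(unemployed, lma_of)
working_age_lma = aggregate_to_lma(working_age, lma_of)
ratio = {lma: unemployed_lma[lma] / working_age_lma[lma]
         for lma in unemployed_lma}
print(ratio)
```

Ratios are computed from the summed counts rather than by averaging gmina-level ratios, so large and small gminas are weighted by their actual populations.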
Future plans regarding the LMAs
o A publication concerning labour market areas in Poland.
o Defining labour market areas on the basis of the Population Census 2021 data.
o Comparison with the Labour Market Areas defined on the basis of the Population Census 2011
data.
o Considering elimination of insignificant links between distant gminas.
o Introduction and dissemination of the Labour Market Areas by occupational categories, gender,
age groups, earning groups, mode of travel to work, NACE rev. 2 economic activity sections and
others.
Thank you for your attention