Toward mapping land-use patterns from volunteered geographic

This article was downloaded by: [Universitaetsbibliothek Heidelberg]
On: 20 June 2013, At: 00:08
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
International Journal of Geographical
Information Science
Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/tgis20
Toward mapping land-use patterns from
volunteered geographic information
a
a
a
Jamal Jokar Arsanjani , Marco Helbich , Mohamed Bakillah ,
a
Julian Hagenauer & Alexander Zipf
a
a
Institute of Geography, GIScience Group, Heidelberg University ,
Heidelberg , Germany
Published online: 19 Jun 2013.
To cite this article: Jamal Jokar Arsanjani , Marco Helbich , Mohamed Bakillah , Julian
Hagenauer & Alexander Zipf (2013): Toward mapping land-use patterns from volunteered
geographic information, International Journal of Geographical Information Science,
DOI:10.1080/13658816.2013.800871
To link to this article: http://dx.doi.org/10.1080/13658816.2013.800871
PLEASE SCROLL DOWN FOR ARTICLE
Full terms and conditions of use: http://www.tandfonline.com/page/terms-andconditions
This article may be used for research, teaching, and private study purposes. Any
substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,
systematic supply, or distribution in any form to anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation
that the contents will be complete or accurate or up to date. The accuracy of any
instructions, formulae, and drug doses should be independently verified with primary
sources. The publisher shall not be liable for any loss, actions, claims, proceedings,
demand, or costs or damages whatsoever or howsoever caused arising directly or
indirectly in connection with or arising out of the use of this material.
International Journal of Geographical Information Science, 2013
http://dx.doi.org/10.1080/13658816.2013.800871
Toward mapping land-use patterns from volunteered geographic
information
Jamal Jokar Arsanjani*, Marco Helbich, Mohamed Bakillah, Julian Hagenauer and
Alexander Zipf
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
Institute of Geography, GIScience Group, Heidelberg University, Heidelberg, Germany
(Received 15 November 2012; final version received 21 April 2013)
A large number of applications have been launched to gather geo-located information
from the public. This article introduces an approach toward generating land-use patterns
from volunteered geographic information (VGI) without applying remote-sensing techniques and/or engaging official data. Hence, collaboratively collected OpenStreetMap
(OSM) data sets are employed to map land-use patterns in Vienna, Austria. Initially
the spatial pattern of the landscape was delineated and thereafter the most relevant land
type was assigned to each land parcel through a hierarchical GIS-based decision tree
approach. To evaluate the proposed approach, the results are compared with the Global
Monitoring for Environment and Security Urban Atlas (GMESUA) data. The results
are compared in two ways: first, the texture of the resulting land-use patterns is analyzed using texture-variability analysis. Second, the attributes assigned to each land
segment are evaluated. The achieved land-use map shows kappa indices of 91, 79, and
76% agreement for location in comparison with the GMESUA data set at three levels
of classification. Furthermore, the attributes of the two data sets match at 81, 67, and
65%. The results demonstrate that this approach opens a promising avenue to integrate
freely available VGI to map land-use patterns for environmental planning purposes.
Keywords: OpenStreetMap; land use; Global Monitoring for Environment and
Security Urban Atlas; hierarchical GIS-based decision tree approach; Vienna
1. Introduction
1.1. Land mapping
Land use (LU) and land cover (LC) maps are key terrestrial features (Ellis 2007), which are
the input of a variety of applications, namely, environmental studies (Evans et al. 2006),
spatial planning (Brown and Duh 2004), and urban management (Masoomi et al. 2013).
In fact, LU and LC are distinct in essence and represent dissimilar features; however,
they are not distinguished in some investigations (Ellis 2007, Wästfelt and Arnberg 2013).
As Ellis (2007) and de Sherbinin (2002) state, Land cover refers to the physical and biological cover over the surface, while LU illustrates human activities such as agriculture,
forestry and building construction that alter land surface.
Mostly, remote-sensing techniques on satellite images and field measurements have
been applied in mapping LU/LC patterns (e.g., Kandrika and Roy 2008; Pacifici et al.
2009, Saadat et al. 2011, Qi et al. 2012). Some efforts at producing LC at global,
*Corresponding author. Email: [email protected]
© 2013 Taylor & Francis
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
2
J. Jokar Arsanjani et al.
regional, and local scales have been made: examples of global scale and coarse resolution approaches include GLC-2000 (Fritz et al. 2003), MODIS (Friedl et al. 2002), and
GlobCover (Bicheron et al. 2008, Bontemps et al. 2011), among others. At a European
scale, the CORINE 2000 (Buettner et al. 2002) and Global Monitoring for Environment
and Security Urban Atlas (GMESUA; European Union 2011) provide LC and LU maps at
continental and municipal levels, respectively. High-resolution aerial and satellite images
(e.g., spot images) have been used to achieve high-accuracy maps (e.g., Kong et al. 2012).
The accuracy of LU and LC maps is of great importance due to several reasons: first,
how much land is covered by which land type. Second, in order to manage the natural
resources, the more accurate LU data can be attained, the better and more reliable the
world’s resources can be managed (Ellis 2007, Robinson and Brown 2009). Third, LUdependent models have been developed via inputting the current LU/LC maps under the
premise of having the most accurate LU/LC maps, while in comparative cases an approach
is verified and validated only because it represents a slightly higher (e.g., by 2%) kappa
index (Pontius and Petrova 2010, Jokar Arsanjani et al. 2013). This uncertainty is propagated through modeling and conversions, which can change the findings and conclusions
(Pontius et al. 2004). Fourth, proper environmental monitoring and ecological studies,
among others, require the most accurate LU maps (Folberth et al. 2012), so recommended
strategies and environmental decisions will have less bias caused by incorrect inputs.
In practice, LU maps are prepared by coupling LC maps with the actual usage of land
parcels (Cihlar and Jansen 2001, Jokar Arsanjani et al. 2011). Therefore, achieving a more
accurate LU data set requires in situ measurements obtained from local residents, land
managers, and evidence sources (GLP 2005). Hence, local residents can be considered as
a great source of information about their surrounding objects, which can be exploited for
determining the most proper LU types. However, gathering information from people is
time- and cost-consuming (Fritz et al. 2003). Thus, the people’s knowledge of the environment ought to be collected through an easier, more reasonable approach. In this research,
it is assumed that voluntary shared information in collaborative mapping projects could be
a potential data source for collecting individuals’ knowledge for determining LU classes.
1.2. Collaborative mapping projects as a potential data source
Due to the development of Web 2.0 technologies, a number of data repositories and web
mapping services (WMSs; e.g., Esri’s Basemaps, Google Earth, and Bing Maps) are accessible via application programming interfaces (APIs) to enable people to map geographical
objects by overlaying high-resolution image libraries. These libraries are released and
can be integrated into any online application. In general, collaborative mapping projects
(CMPs; Rouse et al. 2007) provide high-resolution base maps as WMSs to collect less
location-shifted objects (Ramm et al. 2011). Recently, a number of CMPs, namely, Geowiki (Fritz et al. 2012, Comber et al. 2013), Eye on Earth (European Union 2011), eBird
(Marris 2010), and Wikiloc (Castelein et al. 2010), have put this capability in the hands
of individuals who would like to share their knowledge of geographical objects voluntarily with the public (Turner 2006). This way of generating, sharing, and editing geodata is
known as volunteered geographic information (VGI; Goodchild 2007).
For instance, the Geo-wiki project recently launched by the International Institute for
Applied Systems Analysis (IIASA; Fritz et al. 2012) uses the Google Earth API to help
individuals to detect and correct drawbacks of croplands using three global LC maps (GLC2000, MODIS, and GlobCover) by volunteers. However, as projects like Geo-wiki are longterm projects and certainly last several years, it is questionable if and when the contributed
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
International Journal of Geographical Information Science
3
data will reach a sufficient quality standard. Furthermore, attracting the public’s attention
to such a project, which needs relevant expertise in distinguishing land types, seems to be
difficult and few contributions are expected. So, a faster and more effective approach is
urgently needed. A possible solution is to discover whether other active CMPs are able to
provide relevant information for this purpose.
Among CMPs providing VGI, OpenStreetMap (OSM) has been a pioneer project due
to attracting the most public attention and contribution, as indicated by Ramm et al. (2011),
who reported nearly 839,700 users in August 2012. An interesting point about OSM is the
high flexibility in importing data, as well as the availability and free access to the latest
contributions on a daily basis (Flanagin and Metzger 2008, Koukoletsos et al. 2012). As an
advantage of VGI, in general, and OSM, in particular, some data have never been collected before and are missing objects in the proprietary databases (Corcoran and Mooney
2012, Neis and Zipf 2012). For instance, the OSM’s total route length already exceeds
that of TomTom at 27% in eastern Germany (Neis et al. 2011). Based on Chilton’s (2009)
investigation studying the capital cities of the world by coverage, OSM is well ahead of
Google and TeleAtlas in Europe, and in Africa as well as in Asia where it is slightly ahead.
Moreover, a spatial statistical comparison by Helbich et al. (2012a) between official survey
data, TomTom, and OSM confirms the high positional accuracy of OSM, particularly in
urban areas. Based on these empirical results, VGI is a potential and valuable source for
generating LU maps.
To the best of our knowledge, no studies on deriving LU patterns from VGI have
been reported. Nevertheless, the first attempts at delineation of some specific land categories have been published. For instance, Hagenauer and Helbich (2012) used a genetic
algorithm and artificial neural networks to extract urban patterns from OSM data sets
within Europe, but this study is limited to urban patterns. In contrast to their investigation, all existing projected LU types in GMESUA are intended to be projected in this
study.
This research aims to investigate and develop a novel approach to delineate LU features from VGI, particularly the OSM project. The primary objective of this research is
to use individuals’ contributions to OSM in order to generate LU maps without using any
input from remote-sensing and commercial data. Hence, this research seeks to find out (a)
whether VGI is a potential data source for LU mapping and (b) how well LU patterns can
be mapped using VGI data compared to using GMESUA data.
This article is structured as follows: an overview of the utilized data sets and the chosen
study site is given in the next section. Section 3 presents the applied methods. Section 4
presents the achieved results, as well as a discussion on the attained outcomes. Section 5
addresses the implications and conclusions of this research.
2. Materials
2.1. OpenStreetMap data set
As shown in Figure 1, the OSM project organizes contributions according to several layers. In general, the amount of contributions relating to roads, points of interest (POIs)
and buildings is more than the other types of features. In addition to data completeness,
the contributed features suffer from a lack of logical consistency and thematic accuracy
(Hagenauer and Helbich 2012). Furthermore, contributions have dissimilar geometrical
accuracy and frequently overlap each other (i.e., some areas are given several attributes).
Therefore, using this layer for any applications generally causes substantial limitations.
4
J. Jokar Arsanjani et al.
OSM
OSM
Land use
Roads
OSM
OSM
Natural
OSM
Waterways
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
OSM
OSM
POIs
Buildings
Railways
Point features
Line features
Polygon features
Figure 1. Types of features collected by OpenStreetMap.
2.2. Global Monitoring for Environment and Security Urban Atlas
GMESUA provides pan-European comparable LU data for large urban zones with populations of more than 100,000. It is adapted to European needs and contains information that
can be derived chiefly from earth observation (EO) data supported by other reference data,
such as commercial-off-the-shelf (COTS) navigation data and topographic maps. It has a
minimum mapping unit of 0.25 ha and a minimum width of linear elements of 100 m with
±5 m positional accuracy (European Union 2011). Some supplementary data are used to
improve the accuracy of classification processes for all classes such as (a) COTS navigation
data like points of interest, LU, LC, and water areas; (b) Google Earth (only for interpretation and not for delineation); (c) local city maps for certain classes; (d) local zoning data
(e.g., cadastral data); (e) field checks (on-site visit); and (f) very high-resolution imagery
(better than 1 m ground resolution, e.g., aerial photographs) (Weber 2007). At the time
of writing this paper, this data set covers 305 urban regions within Europe. The thematic
accuracy for all classes is at least 80%. Twenty different LU categories are distinguished.
The LU segmentation is retrieved based on the generated LC maps from the EO data.
A diagram representing the classification scheme and hierarchical categories is shown in
Figure 2. For details, see the Urban Atlas mapping guide (European Union 2011).
2.3. Study area
An exemplary area within the central part of the city of Vienna, Austria, was selected.
The reason for selecting Vienna is that it has attracted a significant amount of contributions, which becomes evident from a query to OSMatrix (Roick et al. 2011). Also, the
selected area of interest (AOI) covers a diverse landscape so that several LU features can be
detected, including water bodies, agricultural areas, urban fabrics, and artificial surfaces.
International Journal of Geographical Information Science
Land
Water
Little / no human
influence,
agriculture, forestry
Human activity non-agricultural
2.
Agricultural
+
3. Forests
seminatural
areas +
wetlands
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
1. Articial surfaces
Urban areas with dominant
residential use or inner-city areas
with central business district and
residential use
Industrial, commercial, public, military and
private units or transport units are
predominant
Strong human inuence on soil
surface, buildings not dominant
Leisure and recreation
use dominates
1.1 Urban fabric
1.2 Industrial, commercial, public, military,
private and transport units
1.3 Mine, dump and construction
sites
1.4 Articial
nonagricultural
vegetated areas
Industrial,
Road and
commerci
rail
al, public,
Continuou Discontinu
network
Isolated
military
s urban ous urban
Port areas
and
structures
and
fabric
fabric
associated
private
land
units
Airports
Mineral
extraction Constructi
and dump on sites
sites
5
Land
without
current
use
Green
urban
areas
5. Water
Sports and
leisure
facilities
Figure 2. Classification scheme applied in the preparation of GMESUA.
Furthermore, the GMESUA data for the AOI has already been generated and released.
The AOI contains 12 major land types within a 32-km2 area. Thus, the chosen area is
representative and typical in order to test our assumptions and find proper answers for the
abovementioned question, which is shown in Figure 3.
Figure 3. Geographical extent of the study area in Vienna.
6
J. Jokar Arsanjani et al.
3. Methods
The steps outlined in Figure 4 are designated in order to achieve the objectives of
this research such as (a) data preprocessing, b) segmentation of land parcels, and (c)
determination of land segments’ functionality.
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
3.1. Data preprocessing
The OSM data of the AOI were downloaded from the official data provider Geofabrik on
17 August 2012. The GMESUA data of Vienna were also downloaded from the European
Environment Agency (EEA) website. To generate a LU map of the study area, two main
elements are required: (a) delineation of LU patterns spatially (i.e., land segmentation,
as discussed in Section 3.2) and (b) identifying attributes or functionality of each land
segment (Section 3.3). The former is extracted from the pattern of OSM features (Figure 5)
and the latter is retrieved from the descriptive information of OSM features, particularly
POIs, buildings, LU, and natural features, through an iterative process.
3.2. Segmentation of land
There are several ways to delineate LU features such as (a) in-field surveying, (b)
airborne/spaceborne imagery by applying image-processing techniques, and (c) cadastral data. The former is time-consuming, costly, and no longer common in large-scale
studies. The latter requires collecting and purchasing satellite images, as well as having
prior image-processing knowledge. It also asks for purchasing data. Furthermore, local
knowledge of objects and their metadata is required in order to assign a functionality
to land parcels. This solution is also time- and cost-consuming, thus an easier, cheaper,
quicker, and more accurate approach is necessary. In this research, the environmental
variables data sets contributed to the OSM project are overlaid and combined to partition the land into small segments and to delineate the boundary of each land feature.
In order to do so, polygon features (i.e., LU and natural features) and linear features (i.e.,
OSM
Data
pre_processing
Segmentation of
parcels
Decision tree
classification
Determination of
rules
Association of
attributes
GME SUA
Reclassification at
different hierarchies
NO
Decision
YES
Visualization of land use
map
Model evaluation
Figure 4. Workflow representing the implementation of the approach.
Reclassification
International Journal of Geographical Information Science
7
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
N
Urban atlas spatial pattern
Figure 5.
(right).
LU spatial pattern
Delineated spatial patterns of land-use features from GMESUA (left) and OpenStreetMap
roads, railways, and waterways) are intersected to draw the boundary of LU features, as
shown in Figure 5. A statistical evaluation of the patterns of these two maps is given
in Section 4.1.Nonetheless, a visual comparison of the two figures demonstrates a high
degree of similarity and even more detailed texture of the delineated segments from OSM.
Once the AOI is segmented, the attribute of each land segment needs to be determined, as
explained in the next section.
3.3. Determination of land attributes
In order to determine the functionality of LU features, a hierarchical GIS-based decision
tree approach is applied for assigning the most appropriate attribute to each parcel of land
(e.g., Lawrence et al. 2004). In this study, the applied categorization and decision rules
used by the GIS-based decision tree are based on the instructions recommended by the
Urban Atlas mapping guide (2011). A schematic view of the decision rules is shown in
Figure 5.
3.4. Hierarchical GIS-based decision tree approach
Several types of decision tree algorithm have been used for LU/LC classification and segmentation (e.g., Friedl et al. 2002; Brown de Colstoun et al. 2003; Lawrence et al. 2004).
The basic structure of the decision tree contains one root node, a number of internal nodes,
and finally a set of terminal nodes. The data are recursively divided down the decision
tree according to the predefined classification framework. At each node, a decision rule is
needed to classify a set of objects (Redo and Millington 2011). Of particular note is that
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
8
J. Jokar Arsanjani et al.
the present study does not utilize a statistical decision tree as proposed, for instance, by
Breiman et al. (1984). As different OSM data sets are shared by people, multidimensional
and multivariate logical rules (e.g., nearby objects and logical existence of a feature within
an area) concerning the location and attribute of objects are needed to associate them to
the most relevant LU types.
GMESUA follows a hierarchical framework to identifying LU categories, so a hierarchical GIS-based decision tree approach is applied to associate feature segments to the
most proper land type through a top-down scheme in an iterative process. As shown in
Figure 6, at the first decision node, two major classes (i.e., water and land, see Figure 1) are
discriminated based on the OSM–Waterways data set through employing a buffer function,
as well as integrating the contributed OSM–Natural features. Thereafter, the land features
are subsequently subdivided according into three major LU types (i.e., artificial surfaces
[100], agricultural + semi-natural areas + wetlands [200], and forests [300]) by means
of three main components: the location of POIs, metadata of POIs and building footprints,
and attributes.
Within the next decision node, the subdivisions of artificial surfaces are recognized.
Two main layers are combined to achieve this. The POIs, which are highly contributed
and can be classified according to their functionality as recommended by OSM (2011),
plus Polygon-in-Polygon analysis of building footprints per segment with respect to
their types facilitates labeling of land segments. POIs provide a wide variety of points,
namely, public-related, health-related, leisure-related, catering-related, accommodationrelated, shopping-related, money-related, tourism-related, and miscellaneous spots.
Point-in-Polygon analyses of the POIs, as well as their contributed attributes, are helpful in
identifying the functionality of land segments. In other words, the overlapped segments
with the POIs and buildings are given the attribute of these layers. This distinguishes
the urban fabric (110), industrial, commercial, public, military, private and transport
Land segments
Buffered waterways
+
natural features
Applying rules
Switched area
Feature
Land
Water
500
Switched segments
100
Natural features
200
300
POIs + buildings
142
140
130
Landuse
+
natural + POIs
POIs + buildings
141
134
133
120
Roads
+
railways
POIs + buildings
131
124
123
110
121
122
Buildings coverge
per segment
112
111
POIs + buildings
113
Figure 6. Schematic view of the hierarchical GIS-based decision tree applied through a top-down
procedure.
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
International Journal of Geographical Information Science
9
units (120), mine, dump and construction sites (130), and artificial nonvegetated areas
(140) classes.
For the last node of the decision tree, different rules are applied to distinguish the rest
of the subclasses according to the type of land. For instance, to identify the continuous
(111) and discontinuous urban fabrics (112), the area covered by buildings is calculated
to distinguish between the two types. If the building coverage per segment exceeds 80%,
it is labeled as continuous urban fabric while buildings with lower coverage values are
labeled as discontinuous urban fabric. As another example, buffer analysis of transport
networks based on road types classifies the road and rail network and associated land
(122) category. Furthermore, geographical proximity, association, dissolve, and elimination functions are applied to dissolve the sliver gaps and nearby segments to the most
relevant classes. For instance, small polygons were dissolved to the nearby polygon considering the longest common border of the two objects. This decision tree is employed
again in an iterative process to consider unidentified segments. Consequently, the final output map is post-processed and displayed in Figure 7. Nevertheless, a number of segments
are not labeled as any land type since no contributions were imported. They are labeled as
No class objects.
3.5. Suitability analysis
The evaluation of the proposed GIS-based decision tree approach is crucial and done by
means of a comparison against a reference data set with a high level of accuracy. For this
purpose, the GMESUA data set is the most optimal LU data available for comparison
in terms of scale. Two main steps are followed to achieve this comparison: (a) texturevariability analysis (Eastman 2009) of the generated LU map and (b) producing a confusion
matrix (Pontius and Petrova 2010) between the two data sets to realize the accuracy of the
generated LU map per category. For the texture-variability analysis, two indices are computed, namely, relative richness of variability and fractal dimension analysis. The relative
richness index depicts the diversity of LU classes by measuring the number of classes in
the defined neighborhood, while the fractal analysis calculates the fractal dimension of the
map features in a 3 × 3 neighborhood. The confusion matrix shows the number of correct
and incorrect classifications made by the approach compared with the actual classifications
in the reference data through computing kappa index of agreement (Pontius et al. 2004),
which statistically explains how well two thematic data sets match. The kappa value varies
between 0 and 100%, where 0% indicates that the level of agreement is equal to the agreement due to chance and 100% indicates perfect agreement. When comparing an alternative
reference map with a reference map, Kappano specifies the overall agreement. Klocation represents the extent to which the two maps match in terms of the location of each land class
(Pontius and Cheuk 2006).
4. Results and discussion
This section presents the achieved results and their evaluation. Figure 7 represents the
achieved LU maps at three different levels of classification.
4.1. Texture analysis
As mentioned earlier, the LU patterns are delineated by intersecting the OSM features
(Figure 5). A statistical comparison is carried out by texture analysis. Figure 8a
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
10
J. Jokar Arsanjani et al.
Figure 7. Extracted LU maps of GMESUA (top) and OpenStreetMap (down) at different levels
(1-2-3).
demonstrates that LU patterns are relatively richer than those of GMESUA. This analysis shows higher fractal dimensions for OSM-extracted features than for GMESUA. This
means that the contributed data represent higher fractal dimensions of land segments (see
Figure 8b and c).
Since the texture of the OSM-extracted LU map is validated, Section 4.2 compares the
attributes of the two LU data sets. This allows evaluating the results with a reference data
set via creating a confusion matrix.
4.2. Evaluation of results via confusion matrix
Next, a confusion matrix is generated to compare each LU class in the two data sets. Table 1
represents the confusion matrix and the coverage of each land type in the two data sets in
terms of hectares. These values are only for the third level of classification.
International Journal of Geographical Information Science
Fractal dimension of OSM-extracted map
Fractal dimension of GMESUA
2.00
2.06
2.12
2.18
2.24
2.30
2.36
2.42
2.48
2.54
2.60
2.66
2.72
2.78
2.84
2.90
2.96
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
11
2.00
2.06
2.12
2.18
2.24
2.30
2.36
2.42
2.47
2.53
2.59
2.65
2.71
2.77
2.83
2.89
2.95
Relative richness
Richer pattern
Figure 8. Texture analysis of the data sets: (a) relative richness, (b) fractal dimension of
OpenStreetMap, (c) fractal dimension of GMESUA.
As shown in Figure 6, LU types can be distinguished at different levels of classification. At level 1, the two main classes of land and water features are distinguished. The
second level differentiates between different artificial surfaces. The third level differentiates between the subcategories of artificial surfaces. As shown in Table 1, the least amount
of agreement between the two data sets is evident for port areas (123), construction sites
(133), and land without current use (134), where the overall coverage of them is 14, 54,
and 30 ha, respectively. Overall, this quantity is almost 3% of the whole study area.
Overall, 336 ha have not been classified and are labeled as No class features, representing 10% of the study area. The unclassified features are those which are mainly
labeled in the GMESUA as discontinuous urban fabrics (112) and green urban areas
(141). Table 2 shows the amount of agreement between the extracted LU map from
OSM and GMESUA at each level of classification of the GIS-based decision tree
approach.
According to Table 2, the approach outputs the LU classes at 90.6% rate of location of
classes and 81.5% matching the two maps. This rate is suitable for classifying artificial surfaces (100), agricultural + semi-natural areas + wetlands (200), forests (300), and water
(500). The second row indicates that the approach discriminates the artificial surfaces subcategories (see Figure 6) at 78.7 and 67.4% of location and overall accuracy, respectively.
However, as we enter the third level of classification, the accuracy of location and overall
accuracy decreases to 75.6 and 64.8%, respectively.
5. Conclusions and future work
The main aim of this research was to determine whether VGI could be exploited to map
LU features. Recent online collaborative projects such as OSM provide a new source
of geoinformation. The process of sharing information by individuals is a bottom-up
approach, which can motivate members of the public to share their information. The shared
information can be exploited for different purposes because the shared information is upto-date and newer than that held in official databases. The public information as expert
No class
19.7
127.46
0
21.35
15.5
0
6.09
8.62
94.76
38.83
4.49
0
336.8
0.00
Classes
111
112
113
121
122
123
133
134
141
142
200
500
Total
Commission
error %
35.11
3.83
0
0.44
2.92
0
0
0
0
0
0
0
42.3
83.00
111
8.09
256.57
0
29.94
18.82
0
9.5
2.92
34.77
16.64
0.59
0.78
378.62
67.76
112
0
0
0
0.58
0
0
0
0.02
0
0
0
0
0.6
0.00
113
22.95
22.51
0
499.39
23.3
11.3
14.58
12
46.35
15.48
8.6
1.09
677.55
73.71
121
7.58
37.01
0.17
60.05
244.1
2.6
7.95
3.52
47.05
10.95
1.13
2.03
424.14
57.55
122
0.22
0.64
0
10.97
0.05
0
9.66
1.74
0
0
3.9
0
27.18
35.54
133
Generated map
0
0
0
0
0.08
0
0.57
0
0
1.03
0
0
1.68
0.00
134
1.3
4.16
0.44
9.83
16.7
0.1
1.14
0.08
441.76
7.51
1.64
11.03
495.69
89.12
141
0
6.75
0
4.12
6.92
0
4.82
0.48
21.3
196.47
0
0.07
240.93
81.55
142
1.34
0.39
0
20.54
1.26
0
0.25
0.81
2.12
0.03
17.51
0
44.25
39.57
200
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
Table 1. Confusion matrix of land-use classes extracted from OpenStreetMap with GMESUA.
GMESUA map
0
0.33
0
2.11
3.2
0
0
0
25.37
3.93
0
384.9
419.84
91.68
500
96.29
459.65
0.61
659.32
332.85
14
54.56
30.19
713.48
290.87
37.86
399.9
3089.58
Total
36.46
55.82
0.00
75.74
73.34
0.00
1.04
0.00
61.92
67.55
46.25
96.25
1.37
Ommission
error %
12
J. Jokar Arsanjani et al.
International Journal of Geographical Information Science
Table 2.
The computed kappa indices for each level of classification.
Level
1
2
3
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
13
KappaLocation (%)
KappaNo (%)
90.64
78.69
75.58
81.47
67.38
64.79
knowledge can be imported for gathering a detailed database of our surrounding lands,
for example what they are being used for. In this study, a simple methodical approach is
tested and proposed for extracting LU features from individuals’ contributions to the OSM
project. Here, only individuals’ knowledge of their surrounding area is incorporated into
a hierarchical GIS-based decision tree approach to generate LU patterns without using
complex and costly techniques such as remote sensing, surveying, or other data collection
methods.
The proposed approach was carried out to produce LU maps by first delineating the
land segments and then assigning the land segments’ attributes through the collected information into the OSM. Based on the achieved results, the research question is now answered
and VGI can be a potential data source for mapping LU patterns. The evaluation process
of comparing the generated LU map with the GMESUA data at three levels verifies that
this approach can be applied to achieve LU maps at acceptable accuracies. It must be mentioned that as more contributions are gradually collected the accuracy of the LU maps
could increase with time.
Four main advantages of this approach must be highlighted: firstly, it imported no inputs
from remote-sensing or any administrative data and used only OSM data. Secondly, as
the OSM database is freely available to the public, the approach does not cost financially
nor require any time spent exploring land through fieldwork. Thirdly, a number of features were labeled incorrectly in the GMESUA, which can be determined by incorporating
the OSM data. Finally, the proposed approach is able to ease the process of updating LU
maps according to the latest information shared by people, while the GMESUA cannot
be updated frequently by authorities due to high financial costs and time requirements.
Nevertheless, the approach has some limitations as well, such as (a) it is based on information shared by individuals and the accuracy of the shared data influences the outcomes; (b)
some land segments remained unlabeled so that VGI could not help to determine their LU
type; and (c) the VGI domain has not been very well developed and therefore is a growing
field. However, gradually it will receive more attention from the public and new techniques
will be developed and adopted for VGI. Of course, the more people get involved, the more
accurate data can be shared with the public. For future work, integrating other sources of
VGI and CMPs such as Wikimapia and Flickr photos could be a practical task.
References
Bicheron, P., et al., 2008. GLOBCOVER: products description and validation report.
Bontemps, S., et al., 2011. GLOBCOVER 2009: products description and validation report.
Breiman, L., et al., 1984. Classification and regression trees. Boca Raton: CRS Press.
Brown, D.G. and Duh, J.-D., 2004. Spatial simulation for translating from land use to land cover.
International Journal of Geographical Information Science, 18 (1), 35–60.
Brown de Colstoun, E. C., et al., 2003. National Park vegetation mapping using multitemporal
Landsat 7 data and a decision tree classifier. Remote Sensing of Environment, 85 (3), 316–327.
Buettner, G., Feranec, J., and Jaffrain, G., 2002. Corine land cover update 2000. EEA. Technical
Report, vol. 89, 17 December 2002. Copenhagen.
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
14
J. Jokar Arsanjani et al.
Castelein, W., et al., 2010. A characterization of Volunteered Geographic Information. 13th AGILE
International Conference on Geographic Information Science 2010, 10–14 May, Guimarães,
Portugal.
Chilton, S., 2009. Crowdsourcing is radically changing the geodata landscape: case study of
OpenStreetMap. In 24th International cartographic conference – ICC 2009. Santiago, Chile
15–21 November 2009.
Cihlar, J. and Jansen, L.J.M., 2001. From land cover to land use: a methodology for efficient land use
mapping over large areas. The Professional Geographer, 53 (2), 275–289.
Comber, A., et al., 2013. Using control data to determine the reliability of volunteered geographic information about land cover. International Journal of Applied Earth Observation and
Geoinformation, 23, 37–48.
Corcoran, P. and Mooney, P., 2013. Characterising the metric and topological evolution of
OpenStreetMap network representations. European Physical Journal, Special Topics, 215 (1),
109–122.
de Sherbinin, A., 2002. A CIESIN Thematic Guide to Land Land-Use and Land Land-Cover Change
(LUCC). Center for International Earth Science Information Network (CIESIN) Columbia
University Palisades, NY, USA. Available from: http://sedac.ciesin.columbia.edu/binaries/web/
sedac/thematic-guides/ciesin_lucc_tg.pdf [Accessed 16 May 2013].
Eastman, J.R., 2009. IDRISI Taiga. Worcester, MA: Clark University.
Ellis, E., 2007. Land-use and land-cover change. In: R. Pontius and C.J. Cleveland, eds.
Encyclopedia of earth. Available from: http://www.eoearth.org/article/Land-use_and_landcover_change [Accessed 16 May 2013].
European Union, 2011. Mapping guide for a European Urban atlas. Available from: http://www.eea.
europa.eu/data-and-maps/data/urban-atlas [Accessed 16 May 2013].
Evans, T.P., Sun, W., and Kelley, H., 2006. Spatially explicit experiments for the exploration
of land-use decision-making dynamics. International Journal of Geographical Information
Science, 20 (9), 1013–1037.
Flanagin, A. and Metzger, M., 2008. The credibility of volunteered geographic information.
GeoJournal, 72, 137–148.
Folberth, C., et al., 2012. Impact of input data resolution and extent of harvested areas on crop
yield estimates in large-scale agricultural modeling for maize in the USA. Ecological Modelling,
235–236, 8–18.
Friedl, M.A., et al., 2002. Global land cover mapping from MODIS: algorithms and early results.
Remote Sensing of Environment, 83, 287–302.
Fritz, S., et al., 2003. Harmonisation, mosaicing and production of the global land
cover2000 database (beta version). Luxembourg: Office for Official Publications of the European
Communities. ISBN 92-894-6332-5.
Fritz, S., et al., 2012. Geo-Wiki: an online platform for improving global land cover. Environmental
Modelling & Software, 31, 110–123.
GLP, 2005. Global land project: science plan and implementation strategy. Stockholm, Sweden.
UNT Digital Library. Available from: http://digital.library.unt.edu/ark:/67531/metadc12009/
[Accessed 11 July 2012].
Goodchild, M.F., 2007. Citizens as sensors: the world of volunteered geography. GeoJournal, 69,
211–221.
Hagenauer, J. and Helbich, M., 2012. Mining urban land use patterns from volunteered geographic
information by means of genetic algorithms and artificial neural networks. International Journal
of Geographic Information Science, 26 (6), 963–982.
Jokar Arsanjani, J., et al., 2013. Integration of logistic regression, Markov chain and cellular automata
models to simulate urban expansion. International Journal of Applied Earth Observations and
Geoinformation, 21, 265–275.
Jokar Arsanjani, J., Kainz, W., and Mousivand, A.J., 2011. Tracking dynamic land-use change using
spatially explicit Markov Chain based on cellular automata: the case of Tehran. International
Journal of Image and Data Fusion, 2 (4), 329–345.
Kandrika, S. and Roy, P.S., 2008. Land use land cover classification of Orissa using multi-temporal
IRS-P6 awifs data: a decision tree approach. International Journal of Applied Earth Observation
and Geoinformation, 10 (2), 186–193.
Kong, F., et al., 2012. Simulating urban growth processes incorporating a potential model with spatial
metrics. Ecological Indicators, 20, 82–91.
Downloaded by [Universitaetsbibliothek Heidelberg] at 00:08 20 June 2013
International Journal of Geographical Information Science
15
Koukoletsos, T., Haklay, M., and Ellul, C., 2012. Assessing data completeness of VGI through an
automated matching procedure for linear data. Transactions in GIS, 16 (4), 477–498.
Lawrence, R., Bunn, A., Powell, S., & Zambon, M., 2004. Classification of remotely sensed imagery
using stochastic gradient boosting as a refinement of classification tree analysis. Remote sensing
of environment, 90 (3), 331–336.
Marris, E., 2010. Birds flock online. Nature, 10 August 2010. Available online at http://www.nature.
com/news/2010/100810/full/news.2010.395.html [Accessed 16 May 2013].
Masoomi, Z., Mesgari, M.S., and Hamrah, M., 2013. Allocation of urban land uses by MultiObjective Particle Swarm Optimization algorithm. International Journal of Geographical
Information Science, 27 (3), 542–566.
Neis, P., Zielstra, D., and Zipf, A., 2011. The street network evolution of crowdsourced maps:
OpenStreetMap in Germany 2007–2011. Future Internet, 4 (1), 1–21.
Neis, P. and Zipf, A., 2012. Analyzing the contributor activity of a volunteered geographic information project – the case of OpenStreetMap. ISPRS International Journal of Geo-Information, 1
(2), 146–165.
Pacifici, F., Chini, M., and Emery, W.J., 2009. A neural network approach using multi-scale textural metrics from very high-resolution panchromatic imagery for urban land-use classification.
Remote Sensing of Environment, 113 (6), 1276–1292.
Pontius, R.G. Jr., and Cheuk, M.L., 2006. A generalized cross-tabulation matrix to compare softclassified maps at multiple resolutions. International Journal of Geographical Information
Science, 20 (1), 1–30.
Pontius, R.G. Jr., Huffaker, D., and Denman, K., 2004. Useful techniques of validation for spatially
explicit land-change models. Ecological Modelling, 179 (4), 445–461.
Pontius, R.G. Jr., and Petrova, S.H., 2010. Assessing a predictive model of land change using
uncertain data. Environmental Modelling & Software, 25 (3), 299–309.
Qi, Z., et al., 2012. A novel algorithm for land use and land cover classification using RADARSAT2 polarimetric SAR data. Remote Sensing of Environment, 118, 21–39.
Ramm, F., Topf, J., and Chilton, S., 2011. OpenStreetMap: using and enhancing the free map of the
world. 3rd ed.. Cambridge: UIT Cambridge.
Redo, D.J. and Millington, A.C., 2011. A hybrid approach to mapping land-use modification and
land-cover transition from MODIS time-series data: a case study from the Bolivian seasonal
tropics. Remote Sensing of Environment, 115 (2), 353–372.
Robinson, D.T. and Brown, D.G., 2009. Evaluating the effects of land-use development policies
on ex-urban forest cover: an integrated agent-based GIS approach. International Journal of
Geographical Information Science, 23 (9), 1211–1232.
Roick, O., Hagenauer, J., and Zipf, A., 2011. OSMatrix – grid-based analysis and visualization of
OpenStreetMap. In proceedings of the State of the Map EU 2011, 15–17 July. Vienna, Austria.
Rouse, L.J., Bergeron, S.J., and Harris, T.M., 2007. Participating in the geospatial web: collaborative
mapping, social networks and participatory GIS. In: A. Scharl and K. Tochtermann, eds. The
geospatial web. London: Springer, 153–158.
Saadat, H., et al., 2011. Land use and land cover classification over a large area in Iran based on single
date analysis of satellite imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 66 (5),
608–619.
Turner, A.J., 2006. Introduction to neography. Sebastopol, CA: O’Reilly Media.
Wästfelt, A. and Arnberg, W., 2013. Local spatial context measurements used to explore the relationship between land cover and land use functions. International Journal of Applied Earth
Observation and Geoinformation, 23, 234–244.
Weber, J.-L., 2007. Implementation of land and ecosystem accounts at the European Environment
Agency. Ecological Economics, 61 (4), 695–707.