A Campaign to Collect Volunteered Geographic
Information on Land Cover and Human Impact
Christoph PERGER, Steffen FRITZ, Linda SEE, Christian SCHILL,
Marijn VAN DER VELDE, Ian MCCALLUM and Michael OBERSTEINER
Abstract
This paper outlines experiences with a recent campaign to collect volunteered geographic
information (VGI) on land cover and human impact using the Geo-Wiki crowdsourcing
tool (humanimpact.geo-wiki.org). A targeted campaign approach was adopted with the aim
of gathering data for the validation of a map of land availability for biofuels. At the same
time, experimentation was undertaken to assess the quality of information from the crowd.
Some initial results of the campaign are presented along with lessons learned and future
developments.
1 Introduction
Volunteered Geographic Information (VGI), whereby citizens act as independent sensors
collecting spatial data of an environmental or social nature (GOODCHILD 2008), is
becoming increasingly visible in a Web2.0- and neogeography-enabled world. When citizens
work together towards a common goal in a collective, bottom-up approach, the term
crowdsourcing is generally used to describe this activity (HOWE 2008, SCHURMANN 2009).
To date, crowdsourcing has been used successfully in the identification of galaxies in the
Galaxy Zoo application (TIMMER 2010), in identifying protein structures in the FoldIt game
(KHATIB et al. 2011), and in the development of an open access street map (RAMM et al.
2010), with many other applications currently available.
One area in which VGI and crowdsourcing have great potential is in using Google Earth
and citizens to validate global land cover maps. A number of studies have shown that the
three most recent global products (GLC-2000, MODIS v.5 and GlobCover) disagree on
what the land cover types on the Earth’s surface are at a number of locations, in particular
in the forest and cropland domains (FRITZ & SEE 2005, SEE & FRITZ 2006, FRITZ et al.
2011a). Global land cover information represents crucial inputs to economic land use
modelling and resource assessments, but as the major global land cover maps differ by as
much as 20% in the cropland domain, for example, accurate estimates of how much land is
currently under cultivation or available for biofuels will be subject to high uncertainties.
The choice of which land cover map to use can therefore impact upon the outcome of
analyses in which these products are used.
To encourage the remote sensing community and the wider public to become involved in
land cover validation and land cover improvement, the Geo-Wiki crowdsourcing system
was established (PERGER 2009). The VGI data collected through such a system will
eventually be part of the creation process of improved, hybrid land cover maps, but also has
Jekel, T., Car, A., Strobl, J. & Griesebner, G. (Eds.) (2012): GI_Forum 2012: Geovizualisation, Society and
Learning. © Herbert Wichmann Verlag, VDE VERLAG GMBH, Berlin/Offenbach. ISBN 978-3-87907-521-8.
the potential to be used in the training of new land cover products. Since the original
development in 2009, Geo-Wiki has expanded to become a modular system with a number
of different Geo-Wiki branches devoted to the validation of cropland, urban areas and
population, and biomass (FRITZ et al. 2009, 2011b, 2011c, 2012). The most recent addition
to the Geo-Wiki family of VGI collection tools is humanimpact.geo-wiki.org. This paper
describes how a targeted campaign was used to gather data on land cover and human
impact through this latest Geo-Wiki branch. The background to the campaign is described
and some initial results are presented. The lessons learned from the campaign are
summarised and future developments are discussed.
2 The Need for VGI on Land Cover and Human Impact
The idea of collecting information on the amount of human impact on the Earth's surface is
related to a recent debate on how much marginal agricultural land is available for biofuel
production globally. In a recent paper published by CAI et al. (2011), a map of land
availability for biofuels was derived (Figure 1), with estimates of 320 to 702 million ha of
land available if only marginal agricultural lands were considered, increasing to 1107 to
1141 million ha if grassland, savanna and marginal shrubland are included. A fuzzy logic
model was developed to create a land productivity layer based on a combination of the soil,
slope and climate at each 1 km pixel, which was then overlaid on a global land cover map
to determine the amount of marginal agricultural lands, grassland, savanna and marginal
shrubland that could be used for biofuel production.
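The overlay logic described above can be sketched in a few lines of code. This is an illustrative simplification, not the CAI et al. (2011) model itself: the use of a fuzzy minimum operator, the productivity threshold and all function and class names are assumptions made purely for the example.

```python
# Illustrative sketch of combining per-factor suitabilities at a 1 km pixel
# and masking the result by land cover class. All values are hypothetical.

def productivity(soil: float, slope: float, climate: float) -> float:
    """Combine per-factor suitabilities (each in 0..1) with a fuzzy AND (minimum)."""
    return min(soil, slope, climate)

# Land cover classes eligible for biofuel production in this sketch
ELIGIBLE_CLASSES = {"marginal agricultural", "grassland", "savanna",
                    "marginal shrubland"}

def available_for_biofuels(land_cover: str, soil: float, slope: float,
                           climate: float, threshold: float = 0.3) -> bool:
    """A pixel counts as available if its land cover class is eligible and
    its fuzzy productivity exceeds a (hypothetical) threshold."""
    return (land_cover in ELIGIBLE_CLASSES
            and productivity(soil, slope, climate) > threshold)

print(available_for_biofuels("grassland", 0.6, 0.8, 0.5))   # eligible class, productive
print(available_for_biofuels("tree cover", 0.9, 0.9, 0.9))  # ineligible class
```

Summing the areas of all pixels flagged as available would then yield estimates of the kind reported above, with the totals depending strongly on which land cover classes are included.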
Fig. 1: A map of land availability for biofuels (in %). Source: CAI et al. (2011)
However, this land availability map has not been validated using in-situ or ground-truth
data, and visual inspection of the map indicates a likely overestimation of available land,
particularly in parts of India, USA, South America and western Africa. Moreover, given the
high uncertainties in global datasets such as global land cover, this estimate of available
land must be rigorously evaluated and validated, especially since the biofuel lobby is
utilizing these estimates as truth. Humanimpact.geo-wiki.org was specifically designed to
help validate the map produced by CAI et al. (2011). The amount of human impact visible
in combination with the land cover type will determine if a particular area is available for
biofuels. By using crowdsourcing to determine these variables from Google Earth on a
systematic sample of points taken from both inside and outside the biofuel land availability
map, both this map and the estimates of land available for biofuels can be evaluated.
3 The Campaign on Human Impact
VGI and crowdsourcing require citizen participation to be both successful and sustained,
which is one of the most difficult aspects of data collection via the crowd. In the past,
citizen participation in Geo-Wiki was encouraged through competitions, which were run as
part of an existing remote sensing conference (IGARSS 2010) and by tapping into a group
of IIASA’s young summer scientists in 2010. Both of these competitions resulted in a
reasonable number of land cover validations, which have subsequently been used in the
validation of an African hybrid cropland map (FRITZ et al. 2011a). A campaign as a vehicle
for VGI collection has many advantages, including incentives to participate and a wide
network of potential contributors reachable through the remote sensing and GI
communities. The human impact campaign was initiated in September 2011; the tasks are
described in the next section.
3.1 The task
Users were randomly presented with MODIS-sized pixels of 500 m resolution (at the
equator) and were asked to answer three questions based on what they could see on
Google Earth. Three points were awarded for each question answered. Additional points
were awarded if users invited friends who then validated a minimum number of pixels.
The questions were:
1. What is the amount of human impact that can be seen in the pixel? A slider bar from 0
   to 100% allowed users to indicate this value, along with a confidence bar to indicate
   certainty in the estimate. Four values of confidence were available: sure, quite sure,
   less sure and unsure.
2. What is the land cover type? Users were provided with a simple legend of 10 land
   cover types selectable from a drop-down list. A confidence bar was provided to
   indicate certainty over the land cover chosen. If cultivated and managed, or a mosaic
   of cultivated and natural vegetation, was selected, users were then asked to indicate
   the field size, ranging from very small (and therefore less intensive cultivation) to
   large, intensively managed fields.
3. What is the amount of abandoned land visible in the pixel? A slider bar with values
   between 0 and 100% and a confidence bar were provided.
In addition, users were asked to enter the date of the image, which unfortunately cannot be
extracted automatically from Google Earth, and to indicate whether the image was high
resolution, since low resolution imagery makes it considerably more difficult to identify
human impact, land cover and/or abandoned land.
An example is shown in Figure 2.
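The information collected for each pixel can be summarised as a single record. The sketch below uses field names of our own choosing; the actual Geo-Wiki data model is not described in this paper, so the structure is an assumption for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values for the three confidence bars
CONFIDENCE_LEVELS = ("sure", "quite sure", "less sure", "unsure")

@dataclass
class Validation:
    pixel_id: int               # identifier of the MODIS-sized pixel
    human_impact_pct: int       # question 1: slider value, 0-100
    human_impact_conf: str      # one of CONFIDENCE_LEVELS
    land_cover: int             # question 2: legend code, 1-10
    land_cover_conf: str
    field_size: Optional[int]   # only asked for cultivated/mosaic types
    abandoned_pct: int          # question 3: slider value, 0-100
    abandoned_conf: str
    image_date: Optional[str]   # entered manually by the user, if legible
    high_resolution: bool       # user's judgement of the imagery

# One hypothetical validation of a cultivated pixel
v = Validation(42, 80, "sure", 4, "quite sure", 2, 0, "sure",
               "2011-06-15", True)
print(v.land_cover)
```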
Fig. 2: Validation of a pixel on humanimpact.geo-wiki.org

3.2 The competition
The competition was advertised to all currently registered Geo-Wiki users and through
relevant mailing lists and contacts. The competition began on 7 Sep 2011 and ended two
months later on 7 Nov 2011. Users were provided with a training manual that explained the
task and gave examples of different land cover types, field sizes and abandoned land. A
Facebook group was set up to provide support during the competition. The enticement for
participation was Amazon vouchers for the top three validators (in terms of both quantity
and quality of responses), as well as co-authorship for the top ten competitors on a
scientific paper validating the map of land availability for biofuels.
Once the competition was finished, participants were scored on two equally weighted
criteria:
1. Total number of points validated
2. Accuracy, based on comparison of the participants' answers to 300 known control
   points in terms of human impact (within ±15%); land cover type (using a scoring
   confusion matrix, as explained below); field size (within one category of the control
   points); and abandoned land (within ±20%, reflecting the greater difficulty of
   identifying this type of land).
Land cover types can be difficult to identify definitively from Google Earth. Therefore,
some allowance is made for situations where users confuse two particular land cover types
that are not as easily distinguishable, e.g. cultivated and managed compared to a mosaic of
cultivation and natural vegetation, or tree and shrub cover, which also can be difficult to
separate. Table 1 provides the scores that were given when land cover types provided by
the participants were compared to the land cover types of the control points.
Table 1: Scoring confusion matrix for assessing the accuracy of land cover types, with
values between 0 and 1 (1 being the score for the most accurate). Rows and columns are
the land cover types from 1 to 10.

        1     2     3     4     5     6     7     8     9     10
  1     1     0.7   0.3   0.7   0     0     0     0     0     0
  2     0.7   1     0.7   0.7   0     0.5   0     0     0     0
  3     0.3   0.7   1     0.7   0     0.5   0     0.3   0     0
  4     0.7   0.7   0.7   1     0.8   0.7   0     0     0     0
  5     0     0     0     0.8   1     1     0     0     0     0
  6     0     0.5   0.5   0.7   1     1     0     0     0     0.7
  7     0     0     0     0     0     0     1     0     0     0
  8     0     0     0.3   0     0     0     0     1     0     0
  9     0     0     0     0     0     0     0     0     1     0
 10     0     0     0     0     0     0.7   0     0     0     1

Note: 1=Tree cover; 2=Shrub cover; 3=Herbaceous vegetation; 4=Cultivated and managed; 5=Mosaic of type 4
and natural vegetation; 6=Regularly flooded; 7=Built up/urban; 8=Snow/ice; 9=Barren; 10=Water
Thus, a confusion between the land cover type ‘Cultivated and managed’ and ‘Mosaic of
cultivated/managed and natural vegetation’ would still score 0.8 rather than being
completely incorrect since these land cover types are more easily confused than two more
distinctive types such as ‘Tree cover’ and ‘Barren’.
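Table 1 can be applied directly as a lookup. The sketch below encodes the matrix with rows indexed by the participant's answer and columns by the control point's type (the matrix is symmetric, so the orientation does not affect the score); the function name is our own.

```python
# Scoring confusion matrix from Table 1, keyed by legend codes 1-10.
SCORES = {
    1:  [1,   0.7, 0.3, 0.7, 0,   0,   0, 0,   0, 0],
    2:  [0.7, 1,   0.7, 0.7, 0,   0.5, 0, 0,   0, 0],
    3:  [0.3, 0.7, 1,   0.7, 0,   0.5, 0, 0.3, 0, 0],
    4:  [0.7, 0.7, 0.7, 1,   0.8, 0.7, 0, 0,   0, 0],
    5:  [0,   0,   0,   0.8, 1,   1,   0, 0,   0, 0],
    6:  [0,   0.5, 0.5, 0.7, 1,   1,   0, 0,   0, 0.7],
    7:  [0,   0,   0,   0,   0,   0,   1, 0,   0, 0],
    8:  [0,   0,   0.3, 0,   0,   0,   0, 1,   0, 0],
    9:  [0,   0,   0,   0,   0,   0,   0, 0,   1, 0],
    10: [0,   0,   0,   0,   0,   0.7, 0, 0,   0, 1],
}

def land_cover_score(answer: int, control: int) -> float:
    """Score a participant's land cover answer against a control point."""
    return SCORES[answer][control - 1]

print(land_cover_score(4, 5))  # cultivated vs mosaic: partial credit of 0.8
print(land_cover_score(1, 9))  # tree cover vs barren: 0, easily distinguished
```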
4 Initial Results of the Campaign
More than 60 people took part in the competition with approximately 55,000 points
validated in total. Table 2 summarises the number of points validated by the top ten
competitors along with a score for quality based on their answers compared to the control
points, which were used to determine their final ranking. These calculations resulted in
changes to the ranking: two competitors who had validated more points dropped out of the
top ten, and two others entered due to their higher accuracy. The users were informed of
this possibility on a number of occasions so that they understood that quality is of utmost
importance.
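How the two equally weighted criteria were combined into a single ranking is not spelled out here; the sketch below assumes a simple min-max normalisation of each criterion followed by 50/50 weighting, purely for illustration.

```python
# Hypothetical ranking scheme: normalise each criterion to 0..1 over the
# field of competitors, then weight quantity and quality equally.

def rank(competitors):
    """competitors: list of (name, points_validated, quality_score) tuples.
    Returns names ordered from best to worst combined score."""
    pts = [c[1] for c in competitors]
    qual = [c[2] for c in competitors]

    def norm(x, xs):
        # Min-max normalisation; 0 if all values are identical
        return (x - min(xs)) / (max(xs) - min(xs)) if max(xs) > min(xs) else 0.0

    scored = [(0.5 * norm(p, pts) + 0.5 * norm(q, qual), name)
              for name, p, q in competitors]
    return [name for _, name in sorted(scored, reverse=True)]

# With the first three competitors from Table 2, the high-accuracy third
# competitor overtakes the second despite far fewer validations:
print(rank([("A", 8150, 76.82), ("B", 7012, 68.07), ("C", 3780, 84.29)]))
```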
Table 2: Number of validations from the competition by the top 10 competitors

Competitor  Points validated  Score for Quality  Occupation
 1          8150              76.82              Postgraduate student
 2          7012              68.07              Remote sensing expert
 3          3780              84.29              Postgraduate student
 4          4533              73.27              Remote sensing expert
 5          4788              64.34              Academic
 6          2461              81.69              Engineer
 7          2588              78.72              University researcher
 8          2856              75.67              Postgraduate student
 9          1712              83.54              Postgraduate student
10          1452              81.99              Postgraduate student
Table 2 also lists the occupations of the competitors, many of whom were postgraduate
students undertaking a masters or PhD in IT, remote sensing, agriculture or the spatial
sciences, while the other participants also had professional or academic backgrounds. The
prize of co-authorship on a scientific paper is more likely to appeal to an audience within
higher education or research, so the backgrounds of these individuals are not surprising. At
present the system is therefore more of an expert-sourcing system than a true
crowdsourcing one. Increasing the appeal of Geo-Wiki to a wider, non-expert audience is
one of the ultimate goals.
The next stage will involve analysing the dataset for quality so that rules can be devised to
separate the validations that are usable for assessing the map of land availability for
biofuels from those of insufficient quality. For example, an analysis
of human impact (expressed as a percentage of the pixel) for the different land cover types
from the competitors compared to the first 100 control points shows reasonable
correspondence between them, with expected patterns observed, i.e. types such as urban,
cultivated and mosaic have higher human impact while the remaining types are all lower. A
larger spread of answers can be seen for certain land cover types, which indicates a greater
uncertainty in characterising human impact in the pixels of these types. These uncertainties
will be reflected in the rules developed for quality control.
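A first cut at such filtering rules could reuse the tolerance bands from the competition scoring. The function below is a hypothetical sketch of that idea, not the rule set that will actually be derived from the quality analysis.

```python
# Hypothetical quality filter: keep a validation only if it falls within
# the scoring tolerances relative to a known control point.

def usable(answer: dict, control: dict) -> bool:
    """answer/control: dicts with keys 'human_impact' and 'abandoned'
    (percentages, 0-100) and 'land_cover' (legend code, 1-10)."""
    return (abs(answer["human_impact"] - control["human_impact"]) <= 15
            and answer["land_cover"] == control["land_cover"]
            and abs(answer["abandoned"] - control["abandoned"]) <= 20)

# Within all three tolerance bands -> retained for map assessment
print(usable({"human_impact": 70, "land_cover": 4, "abandoned": 0},
             {"human_impact": 80, "land_cover": 4, "abandoned": 10}))
```

In practice such rules would likely also weight contributions by the reported confidence levels and by the land-cover-specific uncertainties noted above, rather than applying hard thresholds alone.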
Fig. 3: Average human impact (%) associated with different land cover types for the first
100 control points compared to the responses of the competitors
Assessment of quality is ongoing and will be the subject of a research paper in the future.
5 Lessons Learned and Future Improvements
Although the campaign was a success with such a large number of points collected over a
two month period, a number of issues arose during the competition which will inform
future campaigns. These include:
1. The land cover types were not comprehensive enough; in particular, the mosaic type
   covered only a mosaic of cultivated land and natural vegetation, while users
   encountered other types of mosaic that were difficult to classify using a single, pure
   land cover type. The ability to specify the percentage of pure land cover types in a
   sub-pixel analysis would have provided much more information and avoided some
   inevitable misclassification problems. This feature has subsequently been
   implemented using an overlaid grid as a visual aid for sub-pixel percentage
   estimation.
2. Some users mentioned that additional information would be very useful, e.g.
   geological maps, where large areas of granite would help to differentiate barren areas
   from herbaceous vegetation. Plans are already in place to add information from
   external sources such as Flickr, Picasa, YouTube and Twitter in the future (SCHILL et
   al. 2012).
3. The Facebook site was an attempt to build a community during the competition, but it
   was not as heavily used as we would have liked. An alternative is to embed social
   networking tools directly within Geo-Wiki, e.g. facilities to view other people's
   validations and comment on their correctness, or to allow individuals to work together
   in a type of collaborative learning environment (VOINOV & BOUSQUET 2010).
These various innovations are planned for the next Geo-Wiki campaign, which will focus
on validating land cover types in those areas where the main global land cover products
currently disagree, and be integrated into the development of an improved, hybrid global
land cover map.
6 Conclusions
The crowdsourcing competition recently undertaken using humanimpact.geo-wiki.org was
a very successful example of a campaign that resulted in valuable VGI, both for validating
a map of land availability for biofuels and for the future development of improved global
land cover maps. Campaigns have their advantages, but building a long-term, sustainable
community is the ultimate goal of Geo-Wiki. Some of the planned future innovations will
move Geo-Wiki in this direction, in particular the collaborative and social elements. The
quality of VGI is also a subject that requires much more research in order to develop
mechanisms and rules that retain only the robust and usable contributions from the crowd.
Building these mechanisms is possible in this campaign because of the 300 control points
that were validated by experts. Once an initial set of rules is developed, the map of land
availability for biofuels will be validated.
Acknowledgements
This research was supported by the European Community’s Framework Programme via the
Project EuroGEOSS (No. 226487) and by the Austrian Research Funding Agency (FFG)
via the Project LandSpotting (No. 828332).
References
CAI, X., ZHANG, X. & WANG, D. (2011), Land availability for biofuel production.
Environmental Science & Technology, 45 (1), 334-339.
FRITZ, S. & SEE, L. (2005), Comparison of land cover maps using fuzzy agreement.
International Journal of GIS, 19(7), 787-807.
FRITZ, S., MCCALLUM, I., SCHILL, C., PERGER, C., GRILLMAYER, R., ACHARD, F., KRAXNER, F. & OBERSTEINER, M. (2009), Geo-Wiki.Org: The use of crowd-sourcing to
improve global land cover. Remote Sensing, 1 (3), 345-354.
FRITZ, S., SEE, L., MCCALLUM, I., SCHILL, C., OBERSTEINER, M., VAN DER VELDE, M.,
BOETTCHER, H., HAVLIK, P. & ACHARD, F. (2011a), Highlighting continued uncertainty
in global land cover maps to the user community. Environmental Research Letters, 6,
044005.
FRITZ, S., SEE, L., MCCALLUM, I., SCHILL, C., PERGER, C. & OBERSTEINER, M. (2011b),
Building a crowd-sourcing tool for the validation of urban extent and gridded
population. Lecture Notes in Computer Science, 6783, 39-50.
FRITZ, S., SCHEPASCHENKO, D., MCCALLUM, I., PERGER, C., SCHILL, C., OBERSTEINER, M.,
BACCINI, A., GALLAUN, H., KINDERMANN, G., KRAXNER, F., SAATCHI, S., SANTORO, M.,
SEE, L., SCHMULLIUS, C. & SHVIDENKO, A. (2011c), Observing terrestrial biomass
globally: http://Biomass.Geo-wiki.org. AGU Fall Meeting 2011. San Francisco, 5-9 Dec
2011.
FRITZ, S., MCCALLUM, I., SCHILL, C., PERGER, C., SEE, L., SCHEPASCHENKO, D., VAN DER
VELDE, M., KRAXNER, F. & OBERSTEINER, M. (2012), Geo-Wiki: An online platform for
improving global land cover. Environmental Modelling and Software, 31, 110-123.
GOODCHILD, M. F. (2008), Commentary: whither VGI? GeoJournal, 72, 239-244.
HOWE, J. (2008), Crowdsourcing: Why the power of the crowd is driving the future of
business. Crown Business, New York.
KHATIB, F., DIMAIO, F., FOLDIT CONTENDERS GROUP, FOLDIT VOID CRUSHERS GROUP,
COOPER, S., KAZMIERCZYK, M., GILSKI, M., KRZYWDA, S., ZABRANSKA, H., PICHOVA, I.,
THOMPSON, J., POPOVIC, Z., JASKOLSKI, M. & BAKER, D. (2011), Crystal structure of a
monomeric retroviral protease solved by protein folding game players. Nature Structural
& Molecular Biology, 18, 1175-1177.
PERGER, C. (2009), Crowdsourcing to improve the world’s land cover data.
Unpublished Masters Thesis. University of Applied Sciences, Wiener Neustadt.
http://www.geo-wiki.org/login.php?menu=media. Date accessed: 01/02/2012.
RAMM, F., TOPF, J. & CHILTON, S. (2010), OpenStreetMap: Using and enhancing the free
map of the world. UIT Cambridge, 386 pp.
SCHURMANN, N. (2009), The new Brave New World: geography, GIS, and the emergence of
ubiquitous mapping and data. Environment and Planning D: Society and Space, 27,
571-580.
SCHILL, C., DIAZ, L., PERGER, C., FRITZ, S., MCCALLUM, I., NATIVI, S., MCINERNEY, D.,
SEE, L. & CRAGLIA, M. (2012), Web 2 tools to improve global land Cover: Linking the
EUROGEOSS Broker and Geo-wiki. EuroGEOSS 2012 Conference. Madrid, 23-27 Jan
2012.
SEE, L. & FRITZ, S. (2006), Towards a global hybrid land cover map for the year 2000.
IEEE Transactions on Geosciences and Remote Sensing, 44 (7), 1740-1746.
TIMMER, J. (2010), Galaxy Zoo shows how well crowdsourced citizen science works.
http://arstechnica.com/science/news/2010/10/galaxy-zoo-shows-how-well-crowdsourced-citizen-science-works.ars. Date accessed: 31/01/2012.
VOINOV, A. & BOUSQUET, F. (2010), Modelling with stakeholders. Environmental
Modelling and Software, 25(11), 1268-1281.