METHODOLOGICAL GUIDELINE TO PRODUCE A FUTURE DEFORESTATION MODEL FOR PALM OIL EXPANSION IN PAPUA NEW GUINEA USING THE IDRISI LAND CHANGE MODELER

By: Giancarlo Raschio, Freddie Alei
November, 2016

Table of Contents

1 Data and Methods Used
2 Preparation of Hansen data for the deforestation model
  2.1 First stage
  2.2 Second stage
  2.3 Third stage
  2.4 Fourth stage
3 Import of raster files to Idrisi
4 Import of vector files to Idrisi
5 Generation of Factor Maps
6 Calibration
  6.1 Change Analysis
  6.2 Transition Potentials
  6.3 Change Prediction
  6.4 Results
7 References
8 Annexes
  8.1 Annex 1: Complete process flow for preparing the Hansen dataset to generate Forest/Non-forest maps for the years 2000 and 2014
  8.2 Annex 2: Validation

PROJECTION OF THE QUANTITY AND LOCATION OF FUTURE DEFORESTATION

This report presents the methodology and results used to locate, in space and time, the baseline deforestation expected to occur within the STUDY AREA during the project crediting period.

1 Data and Methods Used

To project deforestation into the future it is necessary to have remote sensing data on land cover for at least two points in time. In our case we used the publicly and freely available Hansen dataset (Hansen et al. 2013). Of course, the objective for PNGFA is to use its own classified remote sensing imagery for the analysis of future deforestation. The Hansen dataset provides the following data:

i) Forest cover in the year 2000 (tcover): this data refers to a land-cover classification with two classes: tree cover and tree-cover loss.[1]
ii) Annual tree-cover loss for each year between 2000 and 2014 (lossyear): this data refers to the tree cover that has been lost in each of the 14 years between 2000 and 2014.
iii) A data mask layer (dmask): this data presents information about features that are neither tree cover nor tree-cover loss; in this case, rivers and other water bodies.
iv) A data layer of tree-cover gain between 2000 and 2014 (gain): this data refers to tree cover that regenerated between 2000 and 2014. However, it should be taken into account that in the case of PNG such gain or "regeneration" is the result of farming cycles; areas that appeared as regenerated tree cover in 2014 were plantations already present in 2000 that grew their crops. Therefore, it is important to consider this layer to avoid overestimating baseline deforestation.
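Before any processing it is worth confirming that the downloaded layers actually line up cell-for-cell. Below is a minimal sketch, assuming the Hansen tiles have already been mosaicked into the four single-band GeoTIFF files named in the script (the file names are placeholders, not part of the Hansen distribution) and that the rasterio library is available; it only checks that the layers share the same coordinate system, grid size, and geotransform.

```python
# Minimal sketch (assumed file names): verify that the four Hansen layers
# share the same CRS, grid shape, and geotransform before modeling.
import rasterio

layers = {
    "tcover":   "hansen_treecover2000.tif",   # % tree cover in 2000
    "lossyear": "hansen_lossyear.tif",        # year of tree-cover loss
    "dmask":    "hansen_datamask.tif",        # no data / land / water
    "gain":     "hansen_gain.tif",            # tree-cover gain 2000-2014
}

profiles = {}
for name, path in layers.items():
    with rasterio.open(path) as src:
        profiles[name] = (src.crs, src.width, src.height, src.transform)

reference = profiles["tcover"]
for name, profile in profiles.items():
    if profile == reference:
        print(f"Layer '{name}' matches the reference grid.")
    else:
        print(f"Layer '{name}' does not match the reference grid; resample it first.")
```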
The objective was to first generate forest and non-forest classified images for the years 2000 and 2014. Of course, the deforestation modeling process can be replicated and updated using a different set of images.

Besides the aforementioned data, we also used a layer with official data on agricultural and forestry plantations in PNG (Qa,Qf), which was provided by PNGFA. This data layer is key because it allowed us to identify which areas were already plantations in the year 2000.

The software we used was the module "Land Change Modeler" (LCM) of Idrisi Selva.[2] For modeling, the method of Similarity-Weighted Instance-based Machine Learning (SimWeight) was used. The transition evaluated is Forest to Non-Forest. The software calculated the deforestation rate from the classified images using a Markov matrix. Once the deforestation rate was calculated, the model estimated the quantity and location of future deforestation.

[1] Tree-cover loss is not the same as deforestation. What a forest is depends on the definition adopted by each country, whereas the Hansen dataset identifies tree-cover loss as a stand-replacement disturbance, independent of any national forest definition.
[2] http://clarklabs.org/applications/upload/IDRISI_Focus_Paper_REDD.pdf

First, we need to create a validation model that will test a set of driver variables that are assumed to explain the change from Forest to Non-Forest in the study area or selected provinces. Land-cover maps from two points in time were created for this purpose: 2000 and 2014. The process will assess forest loss between 2000 and 2014 and then use the calculated deforestation rate to predict future deforestation. Through this step, we test and identify the driver variables needed to better match the predicted map to the map of reality.

2 Preparation of Hansen data for the deforestation model

To prepare the data from the Hansen dataset we used the Model Builder tool in ArcGIS 10.1. The objective is to obtain as results: i) a forest and non-forest cover map for 2000, and; ii) a forest and non-forest cover map for 2014. The overall model to process the Hansen datasets (Fig. 1) has been divided into four (4) stages for didactic purposes (see Annex 1 for a larger figure).

Figure 1: Process flow in Model Builder to prepare the Hansen datasets and produce one Forest/Non-forest map for the year 2000 and one for the year 2014

IMPORTANT: Before initiating any data analysis make sure that all raster and vector layers are in the same desired projection and coordinate system. For raster files, make sure all have the same desired number of columns and rows and the same cell size, otherwise the software won't allow you to start the modeling process. The number of columns and rows as well as the cell size can be standardized by resampling the raster files in either ArcGIS or Idrisi beforehand.
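As noted above, rasters that differ in columns, rows, or cell size must be resampled onto a common grid before modeling. The sketch below shows one way to do this outside the ArcGIS/Idrisi interface, assuming rasterio is available; the input and output file names are placeholders, and nearest-neighbour resampling is used because the inputs are categorical rasters whose class values must be preserved.

```python
# Minimal sketch (assumed file names): resample a categorical raster onto the
# grid of a reference raster so both share columns, rows, and cell size.
import rasterio
from rasterio.warp import reproject, Resampling

with rasterio.open("hansen_treecover2000.tif") as ref, \
     rasterio.open("qaqf_plantations.tif") as src:
    profile = ref.profile.copy()
    profile.update(dtype=src.dtypes[0], count=1)

    with rasterio.open("qaqf_plantations_aligned.tif", "w", **profile) as dst:
        reproject(
            source=rasterio.band(src, 1),
            destination=rasterio.band(dst, 1),
            src_transform=src.transform, src_crs=src.crs,
            dst_transform=ref.transform, dst_crs=ref.crs,
            resampling=Resampling.nearest,  # keep class values intact
        )
```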
2.1 First stage

First, we need to use three raster files: gain, tcover, and the layer with data on existing agricultural and forestry plantations from PNGFA (Qa,Qf). These three layers need to be reclassified to the adequate raster values and then combined through a weighted sum (Fig. 2).

Figure 2: Overview of stage one, "Forest and Non-forest in 2000 without gain cover"

The raster "gain" should be reclassified by turning all values larger than zero into ten (Fig. 3).

Figure 3: Reclassification of the "gain" raster

The raster "tcover" should be reclassified by selecting a threshold that defines what percentage of tree cover is considered forest and what is considered non-forest. In our case, we tried different thresholds, ran the whole model, and then compared the forest area in our resulting "Map 3" raster (the map of forest/non-forest in 2014) to the actual forest area in the study area or province according to the official data from Papua New Guinea's Forest Authority (PNGFA 2013) (Fig. 4). The idea is to input an initial threshold percentage (see red square in Fig. 4) so that anything above it is classified as forest and everything below it as non-forest. Then, we run the model, calculate the forest areas, and compare them with the official forest area for the study province according to PNGFA. We must be as close as possible to the official data and, to be conservative, it is better to have slightly less forest than the official source. This is because the less initial forest we have (more non-forest), the less future non-forest we might have (thus lower projected deforestation, which is a conservative assumption). This step was necessary for three reasons: i) the "tcover" raster from the Hansen dataset presents tree-cover values as percentages (from 0 to 100); ii) using the official forest definition of PNG would generate a forest area much larger than the official records, and; iii) there was no official classification of forest in 2000.

Figure 4: Reclassification of the "tcover" raster

The raster "QaQf" only contained values for the plantation areas. For the model we cannot have NoData values within our study area or province, so it was necessary to allocate a value to the background of the study area or province. As a preliminary step, the QaQf raster therefore had to be converted to a binary raster of 0 and 1 values. This was done via a reclassification (Reclassify tool of ArcGIS) with the following settings:

    Original Value    New Value
    NoData            0
    1 - 151           1

Also, a working environment was set up using the "Environments" option of the "Reclassify" tool. We selected "Processing Extent" and set the extent to be the same as the "fcover" raster. Then, we selected "Raster Analysis" and set the extent to be the same as the "fcover" raster. Then click OK, and OK again, to run the reclassification.

Once we had the binary QaQf raster, we used it in Model Builder and indicated that all values equal to zero should remain zero and that all values higher than zero should be converted to three (Fig. 5).

Figure 5: Reclassification of the plantation binary raster (QaQf_bi)

What the Weighted Sum tool does is sum the values of the three overlapping rasters. We assigned an equal weight of "1" to all rasters. So, we are adding the following rasters:

    GAIN     Description
    0        Background
    10       Gain

    TCOVER   Description
    1        Forest in 2000
    2        Non-forest in 2000

    QaQf     Description
    0        Background
    3        Existing Plantations in 2000

As a result, the "Weighted_t1" raster will have the following values, which should be reclassified (Fig. 6) to new values (Table 1):

Table 1: Meaning of the values generated via the Weighted Sum tool

    Value    Meaning                                      New Class
    1        Forest on background                         Forest
    2        Non-forest on background                     Non-forest
    4        Forest on existing plantation                Non-forest
    5        Non-forest on existing plantation            Non-forest
    11       Forest on Gain                               Non-forest
    12       Non-forest on Gain                           Non-forest
    14       Forest on Gain on existing plantation        Non-forest
    15       Non-forest on Gain on existing plantation    Non-forest

Figure 6: Reclassification of the weighted sum results

As a result of this first stage we have the "Forest2000" raster, which has two classes, (1) forest and (2) non-forest, and accounts for the presence of "gain" and "plantations".
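The Model Builder steps of Stage 1 can also be reproduced directly with array operations. The following is a simplified NumPy sketch, assuming the aligned layers have been read into 2-D arrays (gain, tcover, qaqf) and that a tree-cover threshold has already been chosen as described above; it is illustrative only and is not the ArcGIS model itself.

```python
# Minimal NumPy sketch of Stage 1 (illustrative; assumes aligned 2-D arrays).
import numpy as np

def stage1_forest2000(gain, tcover, qaqf, threshold):
    # Reclassify gain: any gain (> 0) becomes 10, background stays 0.
    gain_r = np.where(gain > 0, 10, 0)

    # Reclassify tcover: >= threshold -> forest (1), below -> non-forest (2).
    tcover_r = np.where(tcover >= threshold, 1, 2)

    # Reclassify plantations: binary 0/1 raster becomes 0 (background) / 3.
    qaqf_r = np.where(qaqf > 0, 3, 0)

    # Weighted Sum with equal weights of 1.
    weighted = gain_r + tcover_r + qaqf_r

    # Reclassify per Table 1: only value 1 remains forest; all other
    # combinations (plantation, gain, or both) become non-forest (2).
    return np.where(weighted == 1, 1, 2)
```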
2.2 Second stage

In this stage we generate a raster with three classes: (1) forest, (2) non-forest, and (3) rivers/water, which we call "Map1". Map1 represents the forest in 2000, the initial year of the analysis period (Fig. 7).

Figure 7: Overview of stage two

For this stage we used two inputs: i) the "dmask" raster (refer to Section 1), and; ii) the "Forest2000" raster that we generated as a result of Stage 1. The dmask raster was used to identify rivers/water and to overlay these on the Forest2000 raster, which only contains data on forest/non-forest. We reclassified the dmask raster so that its initial class "2" would become final class "3" (Fig. 8).

Figure 8: Reclassification of dmask

What the Weighted Sum tool does is sum the values of the two overlapping rasters. We assigned an equal weight of "1" to both rasters. So, we are adding the following rasters:

    RIVERS       Description
    0            Background
    3            Rivers/Water

    FOREST2000   Description
    1            Forest in 2000
    2            Non-forest in 2000

As a result, we had an intermediate raster called "t3" with the following values, which should then be reclassified (Fig. 9) to new values (Table 2):

Figure 9: Reclassification of the "t3" intermediate raster

Table 2: Meaning of the values generated via the Weighted Sum tool

    Value    Meaning                       New Class
    1        Forest on background          Forest
    2        Non-forest on background      Non-forest
    4        Forest on rivers/water        Rivers/water
    5        Non-forest on rivers/water    Non-forest

Finally, as a result of this stage we have Map1, the Forest/Non-forest map for 2000 with 3 classes. We can then decide to convert the raster to polygons to calculate the areas.

2.3 Third stage

In this stage we generated a map of forest-cover loss between the years 2000 and 2014 (Fig. 10). For this stage we used two inputs: i) Map1 (the Forest/Non-forest map for 2000 with 3 classes) from Stage 2, and; ii) the lossyear raster (see Section 1). First we reclassified the lossyear raster so that tree-cover loss for the year 2000 (class 1) was converted to background data (class 0), while all the other classes, representing tree-cover loss between 2001 and 2013, were reclassified to class 3 (Fig. 11).

Figure 10: Sub-process flow of Stage 3

Figure 11: Reclassification of the lossyear raster

What the Weighted Sum tool does is sum the values of the two overlapping rasters. We assigned an equal weight of "1" to both rasters. So, we are adding the following rasters:

    LOSSYEAR             Description
    0                    Background
    3                    New Non-forest 2001-2013

    FOREST2000 (MAP1)    Description
    1                    Forest in 2000
    2                    Non-forest in 2000
    3                    Rivers/Water

As a result, we had an intermediate raster called "t5" with the following values, which should then be reclassified (Fig. 12) to new values (Table 3):

Table 3: Meaning of the values generated via the Weighted Sum tool

    Value    Meaning                             New Class
    1        Forest on background                Forest
    2        Non-forest on background            Non-forest
    3        Rivers/water on background          Rivers/water
    4        Forest2000 on new non-forest        New Non-forest
    5        Non-forest2000 on new non-forest    New Non-forest

Figure 12: Reclassification of the "t5" intermediate raster

Finally, as a result of this stage we have Map2, the map of forest-cover loss between 2000 and 2014 with 4 classes. We can then decide to convert the raster to polygons to calculate the areas.
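Instead of converting the rasters to polygons, the class areas can also be tallied directly from pixel counts. Below is a minimal sketch, assuming a 2-D NumPy class array and an approximately 30 m cell size (so about 0.09 ha per cell); the exact cell area should be taken from the projected raster actually used.

```python
# Minimal sketch: area per class from a classified raster (assumed inputs).
import numpy as np

def class_areas_ha(class_array, cell_size_m=30.0):
    cell_ha = (cell_size_m * cell_size_m) / 10_000.0  # m2 -> hectares
    values, counts = np.unique(class_array, return_counts=True)
    return {int(v): float(c) * cell_ha for v, c in zip(values, counts)}

# Example usage with Map2 classes: 1 forest, 2 non-forest, 3 rivers/water,
# 4 new non-forest (assumed array name):
# areas = class_areas_ha(map2_array)
```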
2.4 Fourth stage

In this final stage we generated the forest map for 2014, which we called Map3 (Fig. 13). For this stage we used two inputs: i) the intermediate "t5" raster from Stage 3 (see Section 2.3), and; ii) the dmask raster (see Section 1).

Figure 13: Overview of Stage 4

In this case we used the "dmask" raster to generate a mask within which the reclassification of the intermediate raster t5 takes place. We reclassified classes 1 and 2 to a new class 1, and value "0" to NoData (Fig. 14).

Figure 14: Reclassification of the dmask raster to generate a mask of the study area

Then, we reclassified the intermediate raster "t5" so that all non-forest classes (2 and 4) become a single non-forest class (2). Forest (1) and Rivers/Water (3) keep the same class values (Fig. 15).

Figure 15: Reclassification of the intermediate raster "t5" to have only one non-forest class

Finally, as a result of this stage we have Map3, the Forest/Non-forest map for 2014 with 3 classes. We can then decide to convert the raster to polygons to calculate the areas.

3 Import of raster files to Idrisi

The raster files generated in Section 2 must be converted to a format that can be imported into Idrisi to become inputs to the Land Change Modeler (LCM). Rasters in ArcGIS must be exported from GRID format to TIFF format. Then, in Idrisi, the TIFF files can be imported to Idrisi format (Fig. 16).

Figure 16: Importing TIFF files into Idrisi

IMPORTANT: Once imported, you need to set the categories for the Hansen classified images that you'll be working with, in our case Forest Cover 2000 and Forest Cover 2014. To do this, select the raster file, go to its metadata, and open the "Categories" menu. Here you'll set a name for each of the three categories in the forest-cover raster file: 1) Forest; 2) Non-forest; 3) Rivers (Fig. 17). Do the same for both forest-cover raster images. This way, the software will be able to identify the changes in forest cover by name later in the process.

Figure 17: Setting category names for the forest-cover raster images

4 Import of vector files to Idrisi

Vector files must be imported from shapefile format to Idrisi format before they can be used in the LCM. In the main menu go to File > Import > Software Specific Formats > ESRI formats > SHAPEIDR (Fig. 18).

Figure 18: Finding the tool to import shapefiles to Idrisi

A new menu will open; here, select the vector file you want to import, the name for the imported Idrisi vector file, and the Reference System for the Idrisi vector file, and click OK (Fig. 19).

Figure 19: Importing shapefiles to Idrisi

Keep in mind that you will need at least a vector file of roads (primary and/or secondary). If you have separate vector files for primary and secondary roads, you'll also need to generate a vector file that combines both primary and secondary roads (you'll need this combined file as noted in Section 6).
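One way to build the combined roads file outside ArcGIS is sketched below, assuming GeoPandas is available and that the primary and secondary road shapefiles share the same coordinate system; the file names are placeholders.

```python
# Minimal sketch (assumed file names): combine primary and secondary roads
# into a single shapefile for use in the LCM project parameters.
import geopandas as gpd
import pandas as pd

primary = gpd.read_file("roads_primary.shp")
secondary = gpd.read_file("roads_secondary.shp")

all_roads = gpd.GeoDataFrame(
    pd.concat([primary, secondary], ignore_index=True),
    crs=primary.crs,
)
all_roads.to_file("roads_all.shp")
```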
Finally, you'll have to convert all your vector files into raster files using the "RasterVector" tool. In the tool, select a conversion option (depending on the type of vector file you're converting), select the vector file to be converted, and enter the name for the raster file to be generated. In "Operation type" keep the default option, and click OK (Fig. 20).

Figure 20: RasterVector menu

Once you click OK, a warning message will appear; click Yes. Then, the "Image Initialization" menu will open. This menu is used to define the spatial parameters of the raster file you are creating from a vector file. If you already have a raster file with the cell size you want (in our case the forest cover 2000 and 2014 rasters generated from the Hansen dataset), keep the default option "Copy spatial parameters from another image". Keep the "Output image" option as default, in "Image to copy parameters from" select the raster file you'll use to extract the cell size, and leave the rest as default. Once you click OK, the process of converting a vector to a raster is completed (Fig. 21).

Figure 21: Image Initialization menu

5 Generation of Factor Maps

IMPORTANT: Before initiating any data analysis make sure that all raster and vector layers are in the same desired projection and coordinate system. For raster files, make sure all have the same desired number of columns and rows and the same cell size, otherwise the software won't allow you to start the modeling process. The number of columns and rows as well as the cell size can be standardized by resampling the raster files in either ArcGIS or Idrisi beforehand.

Once all raster and vector files were imported to Idrisi, we proceeded to generate Factor Maps. Factor Maps (FM) represent the variables used to explain deforestation in the analysis period, which are also used to project future deforestation. In our case we selected eight (8) variables, thus we had eight FMs:

1. Distance to non-forest in 2000 (non-forest in the initial year)
2. Distance to rivers
3. Digital Elevation Model of 30 meters (DEM30)
4. Distance to Census 2011 points
5. Distance to major towns
6. Distance to primary roads
7. Distance to secondary roads
8. Distance to Special Agricultural Business Leases (SABL) areas

To generate the distance maps for the seven distance variables (we do not generate a distance map for the DEM; we use it as it is) we used the "Distance" tool in Idrisi. The feature image is the variable we are working with and the output image is the name under which the new "distance to variable" file will be saved (Fig. 22).

Figure 22: Distance tool in Idrisi

Finally, we must multiply each distance raster by a mask of the study area. This mask is a raster file with values 0 and 1: value 1 is assigned to the study area and value 0 to the NoData areas (this raster mask can be created using the "Reclass" tool in Idrisi or ArcGIS). To multiply each distance raster by the raster mask we used the "Overlay" tool in Idrisi, making sure to select "First*Second" in the Overlay options (Fig. 23).

Figure 23: Overlay tool used to apply the study-area mask to the distance raster files

Data for FM 1 and 2 came from our preparation of the Hansen data explained in Section 2. Data for FM 3 through 8 came from PNGFA.
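The same "distance to feature, then clip to the study area" operation can be illustrated with SciPy's Euclidean distance transform. This is a sketch under the assumption that the feature raster (e.g., rivers or roads rasterized onto the study grid) and the 0/1 study-area mask are available as 2-D arrays and that the cell size is roughly 30 m; it mimics, but is not identical to, Idrisi's Distance and Overlay (First*Second) tools.

```python
# Minimal sketch: distance-to-feature surface clipped to the study-area mask.
import numpy as np
from scipy.ndimage import distance_transform_edt

def distance_factor_map(feature_raster, study_mask, cell_size_m=30.0):
    # distance_transform_edt measures distance from non-zero cells to the
    # nearest zero cell, so pass True where there is NO feature.
    distance_px = distance_transform_edt(feature_raster == 0)
    distance_m = distance_px * cell_size_m

    # Equivalent of Overlay "First * Second": zero out everything outside
    # the study area (mask values are 0 outside, 1 inside).
    return distance_m * study_mask
```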
Spatial variables and Distance Maps (Factor Maps) for the study area are presented below:

Figure 24: Non-forest in 2000 and distance to non-forest
Figure 25: Rivers and distance to rivers
Figure 26: Digital Elevation Model (DEM)
Figure 27: 2011 census points and distance to census points
Figure 28: Major towns and distance to major towns
Figure 29: Primary roads and distance to primary roads
Figure 30: Secondary roads and distance to secondary roads
Figure 31: SABL areas and distance to SABL areas

6 Calibration

With the raster distance files created we are ready to start the modeling process. Go to the main menu and select Modeling > Environmental/Simulation Models > Land Change Modeler (Fig. 32).

Figure 32: Route to open the Land Change Modeler tool

The Land Change Modeler tool consists of several tabs, each with sub-sections. We'll go through each of the applicable tabs and sub-sections.

6.1 Change Analysis

The first tab is "Change Analysis" and it is here that you need to select a name for your modeling project (Fig. 33). Then, you will select the initial and final land-cover images for your analysis. In our case, these were forest cover 2000 and forest cover 2014. The tool automatically identifies the year of each of the raster files (Fig. 33). Check the box "REDD project", then select the end date for the modeling period and the interval for the model projections. In our case we projected deforestation from 2014 to 2044 in 5-year intervals (Fig. 33). Then you need to select the vector file with all the roads (primary and secondary), and the DEM raster file. The selection of a palette for the colors of the future projected deforestation is optional (Fig. 33).

Figure 33: First tab of the LCM

Figure 34: Once all files are input into the LCM project parameters, click Continue

Finally, click "Continue" (Fig. 34) and the software will calculate the gains and losses in the analysis period (2000-2014), at which point it will show a gains and losses bar chart (Fig. 35).

Figure 35: Gains and losses between 2000 and 2014

Note: if the raster files do not have the same cell size, or if category names have not been assigned to the forest-cover raster files, a warning message will appear and you won't be able to continue with the process until such issues are solved.

6.2 Transition Potentials

Next you'll go to the "Transition Potentials" tab (Fig. 36). The first sub-section indicates which transitions will be evaluated. In our case, we only have one transition, forest to non-forest, so that is the only one that appears in this sub-section. We don't make any changes in this sub-section and proceed to the next one.

Figure 36: Tab "Transition Potentials"

We go directly to the sub-section "Test and Selection of Site and Driver Variables" (Fig. 37). In this sub-section we can test the Cramer's V index. This index represents the association between a variable and non-forest expansion in the assessment period and takes values from 0 to 1 (the closer to 1, the stronger the association). This step is optional, but it is helpful for discriminating among several variables when we are deciding which ones to include in our model. If we decide to use it, we select the variable to evaluate and then click "Test Explanatory Power". If we decide that a variable should be used in our model, we can add it directly by clicking "Add to Model". If you decide not to test the Cramer's V coefficient, you can add variables manually in the next sub-section. Results are shown in a table and, in this case, we should look at the Cramer's V coefficient for non-forest (Fig. 38), which is the transition we are interested in.

Figure 37: Sub-section "Test and Selection of Site and Driver Variables"

Figure 38: Results of evaluating the variable "distance to non-forest in 2000" for the Cramer's V coefficient
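As a rough cross-check of the "Test Explanatory Power" step, Cramer's V can also be computed outside Idrisi from a contingency table between the change map and a binned driver variable. The following sketch assumes a binary change array (1 = changed from forest to non-forest, 0 = unchanged) and a continuous driver array restricted to the study area; it computes the same statistic with a generic quantile binning, so the value will not match Idrisi's output exactly.

```python
# Minimal sketch: Cramer's V between a binned driver variable and the
# forest-to-non-forest change map (assumed 2-D arrays of equal shape).
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(change, driver, n_bins=10):
    change = change.ravel().astype(int)
    breaks = np.quantile(driver, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(driver.ravel(), breaks)        # bin index 0 .. n_bins-1

    # Contingency table: rows = driver bins, columns = no change / change.
    table = np.zeros((n_bins, 2))
    np.add.at(table, (bins, change), 1)
    table = table[table.sum(axis=1) > 0]               # drop empty bins

    chi2 = chi2_contingency(table)[0]
    n = table.sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))
```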
In the next sub-section, "Transition Sub-Model Structure", you'll select the variables for the deforestation model (the variables might already appear in this sub-section if you added them during the assessment of the Cramer's V coefficient) (Fig. 39).

Figure 39: Sub-section "Transition Sub-Model Structure"

For each variable you must indicate its role, i.e. whether it is "Static" or "Dynamic". In our case we selected as "Dynamic" variables: non-forest in 2000, distance to primary roads, and distance to secondary roads. All the other variables remained "Static". Then, we select the "Basis layer type" for the non-forest 2000, distance to primary roads, and distance to secondary roads variables. We click in this field for the non-forest 2000 variable and a pop-up menu appears, in which we select "non-forest", then click "Insert", and then OK (Fig. 40). In the case of distance to roads (primary and secondary) we select whether the variable represents primary or secondary roads in each case, then click "Insert", and then OK (Fig. 41).

Figure 40: Pop-up menu for the basis layer type of the non-forest 2000 variable

Figure 41: Pop-up menu for the basis layer type of the distance to roads (primary and then secondary) variables

Once all variables have been input in the "Transition Sub-Model Structure" sub-section we are ready to move on to the next sub-section, "Run Transition Sub-Model" (Fig. 42). It is in this sub-section that we first calculate the relevance weights of each variable and then generate a sub-model for the transition under assessment, in this case forest to non-forest. There are three options for the modeling approach. We selected Similarity-Weighted Instance-based Machine Learning (SimWeight), which is the most appropriate when assessing only one transition (forest to non-forest). SimWeight uses a slightly modified variant of the algorithm described by Sangermano et al. (2010), a similarity-weighted k-nearest neighbor procedure (Idrisi help).

Figure 42: Sub-section "Run Transition Sub-Model"

We leave all values at their defaults and click "Calculate Relevance Weights". This process takes a few minutes and, as a result, we will have a chart with the relevance weight of each selected variable for predicting the change from forest to non-forest in the assessment period (Fig. 43). The relevance weight chart is an indication of each variable's importance in discriminating change. For each variable, it compares the standard deviation of the variable inside areas that have changed (Forest to Non-Forest) to the standard deviation across the entire study area. For a variable to be important, it should have a smaller standard deviation in the change area than across the entire study area. The graph can be used as a guide to assess the utility of the variables, as well as to indicate whether more variables may need to be identified and included in the model.

Figure 43: Relevance weight of each of the selected variables for the sub-model
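The intuition behind the relevance weight chart can be reproduced roughly as follows. This is only an illustration of the standard-deviation comparison described above, under the assumption that each driver is a 2-D array and the change and study-area masks are binary arrays; it should not be taken as LCM's exact internal formula.

```python
# Rough illustration (not LCM's exact computation): compare the spread of
# each driver inside changed pixels against its spread over the study area.
import numpy as np

def relevance_indication(drivers, change, study_mask):
    """drivers: dict name -> 2-D array; change, study_mask: binary 2-D arrays."""
    changed = (change == 1) & (study_mask == 1)
    inside = study_mask == 1
    scores = {}
    for name, arr in drivers.items():
        sd_change = arr[changed].std()
        sd_all = arr[inside].std()
        # A smaller spread inside the change area suggests a more useful
        # driver; report 1 - sd_change/sd_all so higher values mean more relevant.
        scores[name] = 1.0 - sd_change / sd_all
    return scores
```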
Once the relevance weight of the variables has been calculated, we proceed to run the sub-model. You'll notice that the button we previously used to "Calculate Relevance Weights" has now changed into "Run Sub-Model". Click "Run Sub-Model" and let the software calculate the sub-model for the assessment period (Fig. 44). This operation takes from several minutes to several hours depending on the size of the study area (in our case it took about 5 hours). The result of the sub-model calculation is a soft prediction, or deforestation risk map, for the assessment period (Fig. 45).

Figure 44: Option to run the sub-model or soft prediction

Figure 45: Soft-prediction map or deforestation risk map for the assessment period

Once the sub-model has been calculated we can proceed to run the actual future deforestation model, as explained in the next section.

6.3 Change Prediction

We can now go to the next tab, "Change Prediction" (Fig. 46). In our case we didn't account for the expansion of roads, so we didn't use the sub-section "Dynamic Road Development". We go straight to the sub-section "Change Allocation". Because we didn't include any dynamic road development, changes in infrastructure, or zones of constraint/incentives, we left all the options in the "Optional Components" box unchecked. We leave all other options at their defaults and click "Run Model" (Fig. 46). At this point the software will start developing the future deforestation model based on the variables we have chosen and for the period and intervals selected. This process takes significant time because the software calculates a soft prediction for each transition period and, based on this, generates a hard prediction, or spatial distribution of deforestation, for each selected year.

Figure 46: Sub-section "Change Allocation"

6.4 Results

Two basic models of change are provided: a hard prediction model and a soft prediction model (Fig. 47). The hard prediction model is based on a competitive land allocation model similar to a multi-objective decision process. The soft prediction yields a map of vulnerability to change for the selected set of transitions. Hard and soft prediction maps are generated for each year, so the output is a series of predicted land-cover maps for the study area.

Figure 47: Soft-prediction (above) and hard-prediction (below) models

The resulting hard-prediction maps are raster files from which we can calculate the areas of non-forest increase (deforestation). These raster files can be processed in Idrisi or exported to ArcGIS.
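Since the hard-prediction outputs are ordinary classified rasters, the projected non-forest area per interval can be tabulated directly. Below is a minimal sketch, assuming the hard-prediction maps have been exported as GeoTIFFs with the placeholder names shown, that class 2 is non-forest as in the maps above, and that cells are roughly 30 m; the interval years simply mirror the 2014-2044, 5-year setup described earlier.

```python
# Minimal sketch (assumed file names and class codes): non-forest area per
# predicted year from the exported hard-prediction rasters.
import numpy as np
import rasterio

CELL_HA = 30.0 * 30.0 / 10_000.0      # assumed ~30 m cells -> hectares
years = [2019, 2024, 2029, 2034, 2039, 2044]

for year in years:
    with rasterio.open(f"hard_prediction_{year}.tif") as src:
        data = src.read(1)
    nonforest_ha = np.count_nonzero(data == 2) * CELL_HA
    print(f"{year}: predicted non-forest area = {nonforest_ha:,.0f} ha")
```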
"Derivation of a yearly transition probability matrix for land-use dynamics and its applications", Landscape Ecology, 25, 561–572. 45 8 Annexes 8.1 Annex 1: Complete process flow for preparing Hansen Dataset to generate Forest/Non-forest maps for the years 2000 and 2014 8.2 Annex 2: Validation A commonly used step in deforestation modeling is model validation. This allows the analyst to have an idea on the accuracy of the model. In our case, the scope and time of the assignment as well as the availability of data did not allow for a validation of the deforestation models. However, it is suggested that PNGFA should validate all the deforestation models it generates as new data becomes available. Model validation is needed to determine which of the deforestation risks maps is the most accurate in order to confirm the quality of the model output. To confirm a model output, it is required both a “Calibration” and a “Validation” stage. For example, imagine we have data for three points in time: year 1996, 2004, and 2008. In this case, there are two historical periods (1996-2004 and 2004-2008) that have shown a similar deforestation trends. Data from the most recent period (2004-2008) can be used as the “validation” data set and those from the previous period (1996-2004) as the “calibration” data set. With data from the calibration period (1996-2004), we prepare a Risk Map and a Prediction Map of the deforestation for the validation period (2004-2008). Then, predicted deforestation for 2008 will be overlaid with locations that were actually deforested in 2008 (land cover map for 2008). It is necessary to select the Prediction Map which best fits with the real map and that best reproduce actual deforestation in the validation period. In this step, the hard prediction 2008 will be used to validate the model, given that the actual land cover map for 2008 is already known. The output map is a 3-way cross-tabulation between the projected or “predicted” 2008 map and the actual 2008 map (the “reality” map). Area calculation for both predicted and actual map of 2008 should be similar because we used the actual rate of change in the period 2004-2008. What we want to do in this case is to validate whether the projected locations of change are similar to those of actual changes in this period. One of the assessments techniques that can be used is the “Figure of Merit” (FOM) that confirms the model prediction in statistical manner. This FOM is a ratio of the intersection of the observed change and the predicted change to the union of the observed change and the predicted change and ranges from 0 (where there is no overlap between observed and predicted change) to 1.0 (where there is perfect overlap between observed and predicted change) Results show three data for analysis: - False Alarms (C), which are areas where we predicted a change from forest to nonforest but there was no change; - Misses (A), which are areas where a change was not predicted but pixels actually changed from forest to non-forest; and - Hits (B), which are pixels predicted to change to non-forest areas did change. The Figure of Merit (FOM) is calculated from these values, as is presented below: FOM = B / (A+B+C) 48