Rowe

Ground Classification and Below Ground Response Assessment of Forested Regions using Full-Waveform LiDAR

Jonathan Rowe
Advisor: Dr. Jan van Aardt
Committee: Paul Romanczyk, Dave Kelbe, Bob Kremens
Senior Project, Chester F. Carlson Center for Imaging Science
College of Science
Rochester Institute of Technology
Fall 2013

ABSTRACT

To gain a better understanding of how forested ecosystems function, research has been performed to characterize and model specific features within the forest using imaging techniques. Since the 1960s, light detection and ranging (LiDAR) has been used to generate three-dimensional models of scenes by emitting a laser pulse and recording the time it takes for the pulse to interact with a scene and return to the instrument. These times are converted to discrete ranges, and instruments can often record multiple returned ranges per laser pulse. More recently, a new type of LiDAR instrument has been used in research that digitizes the reflected energy at a much finer scale, allowing it to record the entire waveform of reflected laser energy from the scene. Since the technology is so new, its potential is still not fully understood in the scientific community. In this paper, we focus on this new technology as a means to locate and model the ground of a forested region based on a 100m by 100m study area within the Harvard Forest in Massachusetts. The ground layer, called a digital elevation model (DEM), is important in LiDAR forest research because many other models, such as understory and canopy height models, require accurate DEM estimation. It became apparent that this task is not as easy with waveform LiDAR as with discrete LiDAR because, in some cases, the waveform energy distribution extends below the ground level. Although discrete LiDAR can sometimes contain below ground hits, intensity and range thresholds tend to keep the issue from becoming a serious difficulty in estimating the DEM.
In addition to generating a DEM derived from waveform products provided by the National Ecological Observatory Network (NEON), we focus on how to characterize these below ground responses and understand where they are coming from. Our reference data include a discrete-derived DEM from the same NEON sensor and a discrete-derived DEM generated by NASA’s G-LiHT system. Results show that our proposed method produced a DEM with root mean square deviation metrics of 7.41m and 293.7m when compared with the NEON and G-LiHT DEMs, respectively. Assessment of the below ground responses reveals that the strongest relationship between below and above ground hits occurred within the first meter of elevation above ground, yielding an R² value of 0.2557. Overall, however, our method did not produce an acceptably accurate model to determine the cause of the below ground echoes.

1. INTRODUCTION

With the dramatic changes in the Earth’s climate and the increased demand for timber and other wood supplies, the need for characterization and modeling of our planet’s forests and vegetation is at an all-time high. To safely and efficiently assess our human impact in these areas, we must first be able to model the quality of vegetation of the forest of interest. Many parameters can be used to describe the health and biomass of the trees, such as spectral characteristics and physical attributes, but these parameters are not easily modeled. Foresters gain knowledge of a forest’s biomass by taking field measurements, which include ground-based measurements of trunks and leaves through use of hand-held instruments, ground-based remote sensing, and airborne remote sensing. Two-dimensional images of the Earth’s surface collected by airborne and spaceborne sensors are valuable sources of data for understanding these vegetation parameters (Singh, 1989).
The most common form of remote sensing used for this purpose is multispectral imaging, which is acquired by a sensor with multiple bands that detects energy reflected from a scene in the visible, near infrared, and short wave infrared regions of the electromagnetic spectrum. The problem with these systems is that even though high spatial resolutions can be interpreted, the resulting images are two-dimensional and thus cannot account for elevation characteristics of the vegetation. These elevation characteristics can consist of canopy height measurements, differentiation between canopy and understory, and specific elements such as the general location of leaves, branches, etc. Many of these limitations of two-dimensional imaging techniques may be compensated for by using three-dimensional measurement techniques, which allow for separation of objects found at different heights above the Earth’s surface. This height information is crucial to generating separate models for ground and canopy surfaces, as well as other height-dependent classification products (Wulder, 1998). One of the most popular remote sensing techniques used to capture three-dimensional measurement data is light detection and ranging (LiDAR). LiDAR is an active form of remote sensing, meaning that it provides its own source of energy, rather than using separate sources of illumination, such as the sun, for two-dimensional imaging methods. The energy provided by LiDAR instruments is in the form of a laser pulse. This pulse is emitted towards a target, and with each interaction a portion of the initial pulse energy is reflected back to the sensor, generating a waveform of energy characteristic to the specific scene being scanned. Early LiDAR systems were only capable of digitizing 1-4 responses, which resulted in discrete samples. For every one of these responses, the time between sending and receiving the laser pulse is recorded. 
Using the speed of light, the range between the sensor and the point of reflection is calculated. Along with the time recorded for each pulse, the scan angle of the sensor, the attitude of the airborne platform (pitch, roll, yaw, and heading), and the global position of the instrument itself are stored. Combining all of these factors, the location of every discrete return can be precisely mapped on the Earth’s surface (Wehr and Lohr, 1999).

By the 1990s, LiDAR instruments had developed such that the entire waveform of back-scattered energy could be recorded by the sensor. These waveforms have the potential to provide much more information about the scene content than discrete sampling, especially with vegetative scenes. Most discrete responses in vegetation correspond to either canopy or ground, with only an occasional extra response in between, but the full waveform of a response accounts for all the interactions in the emitted laser pulse’s path (Mallet and Bretar, 2009).

Waveform LiDAR processing is similar to discrete processing in that many models and products are derived in reference to the ground. Therefore, classifying the ground proves to be one of the most important tasks. The most common LiDAR-derived product containing ground classification is called a digital elevation model (DEM), which can be represented as either a raster or a vector-based triangular irregular network (Wagner, et al., 2004). DEMs are used in many fields of study for purposes such as surveying, flood mapping, line-of-sight analysis, urban planning, cartography, land use modeling, and many others. In vegetation analysis, the DEM is the basis for deriving canopy height models and understory segmentation. With discrete LiDAR, ground classification is usually performed by storing the final returns of all the laser scans and removing outliers, that is, the returns where the laser pulse never made it to the ground (Lewis and Hancock, 2007).
With full-waveform LiDAR this process proves to be more difficult. This is due to observed phenomena where the waveform consists of responses below the ground level. This occurs because of multiple scattering of the energy from the initial laser pulse and the extra time it takes for the scattered energy to return to the sensor. Therefore, the challenge arises of how to account for the belowground multiple-scattering echoes (Pirotti, 2011). The purpose of this research is to study these belowground echoes to try and gain an understanding of why they exist and how to remove them in order to perform a more accurate ground assessment of the waveform. In addition, I will address the relationship between the multiple-scattering responses and the content just above the ground surface, such as brush, as well as the canopy. 2. LITERATURE REVIEW Many studies have been performed on the assessment of full-waveform LiDAR and its applications to measuring vegetation. One of the most important steps in the processing pipeline comes first, namely signal deconvolution. When a raw waveform is recovered at the sensor, it appears stretched and does not provide an accurate indication of features in the scene. This is due to a combination of the receiver impulse response, the sensor’s variable outgoing pulse signal, a fixed time gate for detection, and system noise. In theory, the resolution of the waveform signal can be recovered by deconvolving the system response from the measured signal (Wagner 2006). Many approaches have been taken to perform this deconvolution. L.B. Lucy developed one of the most popular, known as the Richardson-Lucy algorithm, in 1974. This is an iterative procedure based on Bayes’ statistical theorem (Richardson, 1972; Lucy, 1974). Harsdorf and Reuter (2000) stated that this approach resulted in the most dependable results when compared to Fourier transform and non-negative least squares methods (Harsdorf and Reuter, 2000). Wu et al. 
tested the Richardson-Lucy algorithm against the Wiener (Wiener, 1949) and non-negative least squares methods and arrived at the same conclusion (Wu et al., 2009; Wu 2011; Wu et al., 2012). Wagner et al. also advocate for the Richardson-Lucy algorithm for all of their waveform pre-processing (Wagner, Ullrich, Melzer, et al., 2004; Wagner et al., 2006; Wagner et al., 2007; Wagner et al., 2008).

A majority of digital elevation model extraction methods from full-waveform LiDAR involve digitizing the waveforms to discrete samples (Sithole and Vosselman, 2004; Hollaus et al., 2006; Wagner et al., 2008). One trait that all discrete DEM generation algorithms share is that they rely only upon geometric criteria, such as elevation relationships between neighboring points, to eliminate non-ground hits (Pfeifer 2004). With waveform data, characteristics of individual Gaussian-like echoes within the waveform can be evaluated to assess whether or not each refers to a ground location. Doneus and Briese (2006) proposed a two-step approach to evaluating these individual echoes for ground assessment. They first eliminate all last echo points with a significantly greater echo width than the echo width of the system waveform. Next, standard filtering procedures, outlined by Kraus and Pfeifer (1998) and Briese et al. (2002), are performed to remove non-ground points (Doneus et al., 2008). These filters divide the study area into grid-like regions and extract the lowest elevation response for each region. Afterwards, standard geometric criteria are used to remove points with large elevation differences from neighboring points (Kraus and Pfeifer, 1998; Briese 2002).

Many studies do not account for the underground responses in their extraction of the DEM. There have been no studies completely dedicated to the understanding of the below ground echo phenomenon. Researchers consider this phenomenon to be unimportant in the extraction of waveform-derived products.
The most common approach taken to deal with the below ground echoes is to assume that they are due to system noise. A noise threshold is selected based on a fraction of the largest magnitude of all the peaks in the waveform. Every response peak below this threshold is assumed to be noise and excluded from the analysis (Asner et al., 2007; Wagner et al., 2008; Kronseder et al., 2012). The estimate of 1/10th was suggested by Chhatkuli et al. (2012), whose study shows that ground penetration increased by 50% in autumn and 20% in winter (Chhatkuli et al., 2012). As mentioned, another approach is to remove all last echoes with a greater echo width and standard deviation than the echo width of the entire system waveform (Doneus and Briese 2006). Although these methods do a decent job of removing below ground responses from the waveform, they do not account for why the echoes exist, or how we can go about understanding them.

3. MATERIALS

3.1 Study Area

The study area is centered on the Boston University/University of Massachusetts Boston Harvard Forest hardwood site. The Harvard Forest hardwood site is 100m by 100m with a center of (42.53100° N, 72.18210° W). The site, in association with Harvard University and the Long Term Ecological Research Network, is considered a wildland “core” site, which statistically represents unmanaged wildlife conditions across NEON’s 30-year history. It consists of dense hardwood trees with thick canopy coverage. Both ground truth and airborne measurements were taken of the area in the summer of 2012.

3.2 Field Measurements

Field data for this research were collected from five reference points within the 100m by 100m plot by a team composed of researchers from the University of Massachusetts Boston, University of Massachusetts Lowell, and Boston University.
For each of the sites, data for the surrounding trees were collected, including tree identification number, range and bearing from a reference location, species, whether or not an occlusion of the trunk is present, and classification of the crown prominence. Diameter at breast height (DBH) and height were also measured for select trees. DBH and tree height are valuable inputs to models that predict parameters such as biomass.

3.3 NEON LiDAR Data

NEON’s Airborne Observation Platform (AOP) provided an airborne full-waveform LiDAR survey of the study area. An Optech Gemini instrument was used to capture the LiDAR waveforms at a wavelength of 1064nm. An approximately 200m × 200m subset of the NEON prototype waveform data was extracted about the site center; 110,367 waveforms were processed for the scene. Each waveform consists of 500 bands with 1ns (0.15m) vertical spacing. The time gate of the 500 bands for each waveform was variable, i.e., the bands began recording at the first response of the waveform. Since a variable time gate was used, a reference file was supplied for each waveform containing important location information such as the northing, easting, and height of the first return, the outgoing pulse reference bin location, and the first return’s reference bin location in the return waveform. All northing and easting information was stored in Universal Transverse Mercator (UTM) coordinates, the common coordinate system used with LiDAR data. All data were converted to MATLAB-accepted formats for processing.

NEON also generated a discrete LiDAR point cloud of the study area. These data were collected using the Optech Gemini instrument without the waveform digitizer in use. ENVI LiDAR software was used to extract the DEM from the point cloud. This product is used as our primary reference data because it was collected using the same sensor as the waveform data.
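As an illustration of the 1ns (0.15m) bin spacing and the first-return reference described in this section, the following is a minimal Python sketch of how a waveform bin index might be converted to an elevation. It is not the processing code used in this project. The function names, the example elevation of 350.0m, and the simplifying assumption of a nadir-pointing pulse (scan angle ignored) are all hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
BIN_DURATION_S = 1e-9  # 1 ns sampling interval, as stated for the NEON waveforms


def bin_spacing_m(bin_duration_s=BIN_DURATION_S):
    """Vertical distance covered by one bin.

    The pulse travels to the target and back, so one bin of elapsed
    time corresponds to half of the distance light travels in that time.
    """
    return SPEED_OF_LIGHT_M_PER_S * bin_duration_s / 2.0


def bin_to_elevation(bin_index, first_return_bin, first_return_elevation_m,
                     spacing_m=None):
    """Elevation of a waveform bin relative to the georeferenced first return.

    Assumption (hypothetical simplification): the pulse is treated as
    vertical, so each later bin lies one bin spacing lower than the last.
    """
    if spacing_m is None:
        spacing_m = bin_spacing_m()
    return first_return_elevation_m - (bin_index - first_return_bin) * spacing_m


if __name__ == "__main__":
    # Hypothetical waveform: first return at bin 37 with an elevation of 350.0 m.
    print(round(bin_spacing_m(), 4))
    print(round(bin_to_elevation(143, 37, 350.0, spacing_m=0.15), 2))
```

A full implementation would also need to account for the scan angle and platform attitude when projecting each bin along the pulse direction.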
3.4 G-LiHT LiDAR Data

Our second set of reference data for validating our implementations was generated by the National Aeronautics and Space Administration’s Goddard LiDAR, Hyperspectral, and Thermal Imager (G-LiHT). Although G-LiHT does provide additional imaging information such as imaging spectroscopy and thermal imaging, we only use the digital elevation model derived from its discrete LiDAR component. The G-LiHT LiDAR data were collected with a Riegl LD321-A40 system, which has a pulse frequency of 10kHz and can process up to five returns per laser shot at 905nm (Cook et al., 2013). The DEM is a rasterized surface interpolated to one meter.

4. WAVEFORM PRE-PROCESSING

Before specific analysis can be performed on the individual waveforms, some pre-processing needs to occur. Raw waveforms are not smooth functions: they contain noise and are subject to intensity thresholding. The LiDAR sensor applies a detection threshold to the reflected waveform to filter out solar background light and detector noise (M. Hofton 2000). As a result, only portions of the waveform whose corresponding intensity was above the detection threshold are stored in the raw waveform. This results in functions such as the one shown in blue in Figure 1.

4.1 Waveform Deconvolution

The deconvolution approach is important because it can remove the system effect from the signal. The raw waveform was deconvolved with the outgoing pulse shape and estimated system impulse response to estimate fine detail at the 30cm scale, which corresponds to the Nyquist limit for one-nanosecond (0.15m) sampling. A one-dimensional implementation of the Richardson-Lucy algorithm was used as the main deconvolution method (Richardson, 1972; Fish et al., 1995; Cawse-Nicholson, 2013). The Wiener deconvolution algorithm was also tested by Wu et al. (2011), but its application to one-dimensional data did not recover input responses as accurately as the Richardson-Lucy approach (Wu et al., 2011).
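To make the iterative nature of the Richardson-Lucy update concrete, the following is a minimal one-dimensional Python sketch. It is not the implementation used in this project: it assumes a known, odd-length system response (here a hypothetical three-tap kernel) rather than the measured outgoing pulse and estimated impulse response, and the names `convolve_same` and `richardson_lucy_1d` are illustrative only.

```python
def convolve_same(signal, kernel):
    """Zero-padded convolution returning an output the same length as signal.

    The kernel is treated as centered, so an odd-length kernel is assumed.
    """
    n, k = len(signal), len(kernel)
    half = k // 2
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(k):
            idx = i + half - j
            if 0 <= idx < n:
                acc += signal[idx] * kernel[j]
        out[i] = acc
    return out


def richardson_lucy_1d(observed, psf, iterations=50, eps=1e-12):
    """Richardson-Lucy deconvolution of a non-negative 1-D signal.

    Each iteration blurs the current estimate, compares it with the
    observation as a ratio, and back-projects that ratio through the
    mirrored kernel to obtain a multiplicative correction.
    """
    total = sum(psf)
    psf = [p / total for p in psf]          # normalize the system response
    psf_mirror = psf[::-1]
    start = max(sum(observed) / len(observed), eps)
    estimate = [start] * len(observed)      # flat, strictly positive start
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)
        ratio = [o / max(b, eps) for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, psf_mirror)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


if __name__ == "__main__":
    # Hypothetical scene: a strong return at bin 10 and a weaker one at bin 20.
    truth = [0.0] * 31
    truth[10], truth[20] = 1.0, 0.5
    kernel = [0.25, 0.5, 0.25]              # assumed system response
    raw = convolve_same(truth, kernel)
    restored = richardson_lucy_1d(raw, kernel)
    print(restored.index(max(restored)))
```

Because the update is multiplicative, bins that are exactly zero in the thresholded raw waveform remain zero in the estimate, which is one reason the choice of detection threshold matters for the deconvolved result.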
Figure 1 displays the result of this deconvolution.

Figure 1: Normalized representation of a raw (blue) and deconvolved (green) waveform. The horizontal line represents the detection threshold that the system implemented to remove noise and solar background light from the raw signal. All returned intensity values that fall below this threshold are set to zero. The deconvolved waveform was generated using the Richardson-Lucy deconvolution approach to estimate an impulse response and retrieve the original input signal from the laser response.

As we can see from the deconvolved waveform, it is much easier to estimate time locations where the laser interacted with the scene and reflected some of its energy. At this point in the processing chain, we have enough information to begin to estimate specific features in the scene. For example, the separation between the peaks at time bins 37 and 143 could correspond to the distance between the canopy top and the ground, with some small responses in between for below-canopy tree features, and some responses below ground for multiple-scattered responses. However, these estimations are still ambiguous due to multiple geometric and radiometric issues, such as leaf and tree structure and reflectance and transmittance, and more processing must be performed to gain a better understanding of the scene content.

4.2 Waveform Decomposition

To get a better understanding of what these specific peaks correspond to, we must isolate them from each other within the waveform itself. Doing so will provide vital information about the specific waveform responses, such as peak amplitude, standard deviation, and location (M. Hofton 2000). These statistics are extracted through a process called waveform decomposition. Waveform decomposition assumes that the energy distribution at every peak location can be modeled by Gaussian functions.
The iterative process starts from a Gaussian with arbitrary parameters and loops through different parameter values until the best match, as measured by the root mean square error, is found (Wagner et al., 2007). As a result, a new array of stored points was generated, each point containing peak amplitude, standard deviation, and time bin location properties. This process essentially discretizes all of the waveforms, producing data similar to discrete LiDAR. The difference is that the discretized waveforms can contain more points per laser pulse and each point contains statistics about the laser energy interaction at the scene.

5. DIGITAL ELEVATION MODEL GENERATION

5.1 Methods

The discretized waveforms can be plotted as a geographically referenced point cloud using the time bin value for each peak and the geographic reference data provided for each waveform by NEON. This referenced point cloud was used for DEM generation. Figure 3 displays this point cloud.

Figure 3: 40m by 40m subset of the geographically referenced point cloud of the discretized waveforms, focused at the center of the study area at (42.53100° N, 72.18210° W). Point sizes are displayed according to their corresponding normalized intensity.

Figure 3 provides an excellent visual tool showing a region of the discretized point cloud. As the plot shows, most of the emitted laser energy was reflected off the tree canopy, leaving few ground points. Furthermore, by displaying the point sizes proportional to their respective intensities, it can be seen that below ground multiple-scattering echoes have very small amplitudes (Persson et al., 2005). Using this plot, we can make assumptions for two processing steps that can be used to remove data irrelevant to our DEM extraction. These irrelevant data include the below ground and canopy responses. First, an intensity threshold can be applied to remove the below ground responses.
For the purpose of this implementation, a value corresponding to 1/10th the maximum intensity in the waveform was used as the threshold. Second, an elevation threshold can be applied to remove the majority of above ground points. This threshold was selected by taking the global threshold of the histogram of elevation values (Otsu, 1975). The result of these two thresholding operations was a point cloud containing only points from the ground and those just above the ground. A gridded sampling was used to divide the study area into smaller regions in order to get a representative estimate of ground points. For each grid cell, the minimum point coordinate of the lowest point was stored. This was most representative of ground because the below ground responses have already been removed through thresholding (Doneus et al., 2008). The size of the grid spacing was important to the ground extraction because the grid size must be small enough to acquire a significant amount of resulting ground points, but large enough to account for non-ground points (de Berg et al., 2008). For example, if a grid size of 2m by 2m was used and no ground points lie within that grid, a point above ground may be assigned to ground for that location. After evaluating many grid sizes, a final grid spacing of 10m by 10m was selected. This grid size did produce cells that did not contain ground points, as we can see from Figures 4a and 4b, but it contained the fewest non-ground hits of all grid sizes ranging from 1m to 20m. Using a grid size of 10m reduced the number of points in our cloud to 441, shown in Figure 4a. There was still a chance, however, that the minimum point of a grid was not ground. This potential issue was addressed using a triangulation method. The built-in MATLAB function “DelaunayTri” was used to perform the Delaunay triangulation. The algorithm takes three neighboring points and forms a circumcircle around them. 
If no other points lie within that circle, the points are connected to form a triangular facet. If a point does lie within the circumcircle, the specific three points are not triangulated (Lee, 1991; Kelbe, 2013). The result was a surface of triangular facets that connect the points in the subsampled ground point cloud, shown in Figure 4b.

Figure 4a. Subsampled point cloud containing one point in every 10m by 10m spacing. Since the below ground responses were removed and the minimum point in each grid cell was selected, it was assumed that these points approximate the ground.

Figure 4b. Delaunay triangulated grid of the ground points selected in Figure 4a. It was assumed that the outlying points above the average ground surface do not correspond to ground.

As shown in Figure 4b, some of the minimum points taken from the grid cells lie above the average surface height. The most likely cause of this was that the 10m by 10m region did not have a point corresponding to ground. Now that these outliers have been identified, a simple angular threshold can be used to remove them. Using simple trigonometry, the angle (θ) between each pair of points along the triangulation lines was calculated using equations 1 and 2.

\theta = \cos^{-1}\left(\frac{\Delta z}{\Delta d}\right)    (1)

\Delta d = \sqrt{\Delta x^{2} + \Delta y^{2}}    (2)

where Δd is the distance in northing and easting between two points and Δx, Δy, Δz are the differences in easting, northing, and elevation, respectively. An angular threshold was determined by referring to the reference data. Since the G-LiHT DEM is already rasterized to one-meter resolution, these points were used to determine a maximum slope. The maximum slope was calculated using equations 1 and 2. In this case, Δx and Δy are each one meter. Results of these computations imply that a maximum angle of 10° should be used for the angle threshold in removing outlier points. Figure 5 displays the resulting triangular network after the outlying points are removed.

Figure 5. Delaunay triangulated network of ground points after outlying points with an angle greater than 10 degrees from their neighboring hits were removed.

After the outlying points were removed, the points were translated onto a raster grid and interpolated. MATLAB’s built-in “TriScatteredInterp” function was used to interpolate the rasterized points to a grid spacing of 0.5m.

5.2 Results

The digital elevation model generated using the above method did not yield similar results to our G-LiHT reference data, but was reasonably close to our NEON discrete reference data. Since our ground was sampled at wide 10m by 10m intervals, there was much more variability between ground points, resulting in a less smooth surface. Figure 6a displays the digital elevation model generated from the waveform LiDAR. Figures 6b and 6c display the DEMs generated from the G-LiHT and NEON data sets, respectively. The differences between the waveform DEM and the two reference DEMs are shown in Figures 6d and 6e, respectively.

Figure 6a: Digital elevation model generated through the waveform-based method described in Section 5.1. The variability was attributed to the irregular 10m by 10m sample spacing.

Figure 6b: The digital elevation model generated by G-LiHT’s Riegl LD321-A40 discrete LiDAR component. G-LiHT’s DEM was given to us “as is.”

Figure 6c: The digital elevation model generated by NEON’s Optech Gemini discrete LiDAR component. This DEM, given a discrete point cloud provided by NEON, was derived using ENVI LiDAR’s ground extraction algorithms.

Figure 6d. Difference of the waveform-derived DEM and the reference G-LiHT DEM, where the waveform DEM was subtracted from the G-LiHT DEM.

Figure 6e. Difference of the waveform-derived DEM and the reference NEON discrete DEM, where the waveform DEM was subtracted from the discrete DEM.
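The difference surfaces in Figures 6d and 6e can be summarized by a root mean square deviation (RMSD) and a mean elevation difference. The following is a minimal Python sketch of those two statistics, assuming both DEMs have already been resampled to the same grid and are stored as nested lists with NaN marking empty cells. The function name and the sample values are hypothetical, and this is not the code used to produce the reported results.

```python
import math


def dem_difference_stats(reference_dem, waveform_dem):
    """Return (rmsd, mean_difference) for two co-registered DEM grids.

    Differences are taken as reference minus waveform, matching the sign
    convention of the difference figures. Cells where either grid is NaN
    are skipped.
    """
    diffs = []
    for ref_row, wav_row in zip(reference_dem, waveform_dem):
        for ref, wav in zip(ref_row, wav_row):
            if math.isnan(ref) or math.isnan(wav):
                continue
            diffs.append(ref - wav)
    if not diffs:
        raise ValueError("no overlapping valid cells to compare")
    mean_difference = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rmsd, mean_difference


if __name__ == "__main__":
    # Hypothetical 2 x 2 grids of elevations in meters.
    reference = [[10.0, 12.0], [11.0, float("nan")]]
    waveform = [[9.0, 10.0], [11.0, 5.0]]
    rmsd, mean_diff = dem_difference_stats(reference, waveform)
    print(round(rmsd, 3), round(mean_diff, 3))
```

Reporting both values is useful because a constant vertical offset between two models inflates the RMSD and the mean difference together, whereas surface roughness inflates the RMSD alone.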
Calibration inconsistencies between NEON’s and G-LiHT’s processing pipelines account for offsets in elevation between the two models; however, this does not account for differences in the ranges of elevation between the three models. Most likely, these range shifts are due to time shifts produced during the deconvolution step. Elevation ranges and standard deviations are calculated for each DEM, and the root mean square deviation (RMSD) and mean elevation difference between the reference DEMs and the waveform-derived DEM are highlighted in Table 1.

Table 1: Quantitative comparison of the waveform-derived DEM, NEON’s discrete DEM derived using ENVI LiDAR, and the discrete-derived DEM using G-LiHT’s data.

DEM                Elevation Range (m)   Standard Deviation (m)   RMSD (m)   Mean Elevation Difference (m)
Waveform-derived   15.48                 3.19                     ---        ---
G-LiHT Discrete    10.68                 2.703                    293.70     24.41
NEON Discrete      18.01                 3.20                     7.31       5.02

Statistically speaking, the waveform-generated DEM was very different from the G-LiHT DEM. This was due to two main causes. First, it was unknown where in the reflected energy response G-LiHT’s system quantifies the discrete responses. For example, when the waveforms were discretized, the location of the amplitude’s peak was used as the location of the discrete point. Some LiDAR processing takes the full-width half-max point of the leading edge to discretize a response. This can result in varying differences between the two point locations because reflected energy distributions have very different shapes due to different interactions with scene content. Second, in some of the 10m by 10m grid cells generated for identifying ground points, the point actually representative of ground had a relative amplitude less than 1/10th of the largest amplitude in the waveform. In this case, the algorithm selects a point that is actually above the ground. One way that we can justify ground point locations is by understanding how below ground responses are formed.

6. BELOW GROUND ECHO ASSESSMENT

6.1 Methods

Now that a digital elevation model has been generated, we can use its surface to assess the laser energy below ground. Our assumption was that these energy returns are a result of content in the scene above ground. A voxel approach was taken to compare the below ground energy distribution to the above ground energy distribution. The implementation generates voxels in three-dimensional space directly below and above the DEM. Each voxel is 10m by 10m in surface area, which is wide enough to capture enough below ground points to accurately represent their characteristics. Within each voxel, the means, the standard deviations, and the sums of the amplitudes of the discretized points were calculated. We seek to show that these statistics below the DEM correlate with those above the DEM, thereby accounting for the multiple-scattering phenomenon.

The below ground voxels are compared to different elevation levels above the ground. Above ground voxels range in height from one meter to 30 meters, in one-meter increments. In other words, 30 voxel comparisons are made for each 10m by 10m spacing in the study area. The number of points, the mean amplitudes, and the standard deviations of the points within the voxels are plotted alongside the corresponding statistics of the below ground voxel. Relationships between the below and above ground voxels are assessed through the R² value of a linear regression.

6.2 Results

Assessment of below ground responses yielded interesting results. It was assumed that this phenomenon was caused by multiple scattering of the laser pulse within the vegetation structure above ground. Comparisons of below ground data were performed on data above the ground at heights from one meter to 30 meters, in one-meter increments. Figure 7 provides an example of the process using two above ground voxels. The first, labeled understory, accounts for all points up to 5m above ground level.
The canopy region refers to a voxel that accounts for all points more than 5m above ground level.

Figure 7a. Relationship of the number of below ground responses to the number of understory responses (blue) and canopy responses (green). Linear fitting lines used to calculate the R² value are overlaid on the data.

Figure 7b. Relationship of the mean amplitude of below ground responses to the mean amplitude of understory responses (blue) and canopy responses (green). Linear fitting lines used to calculate the R² value are overlaid on the data.

Figure 7c. Relationship of the standard deviation of below ground responses to the standard deviation of understory responses (blue) and canopy responses (green). Linear fitting lines used to calculate the R² value are overlaid on the data.

We can qualitatively infer from Figure 7 that the below ground points have very little correlation with the above ground points. If they did, the corresponding statistics would form a linear relationship. Instead the points are scattered, implying very little relationship. This relationship is measured quantitatively by calculating the R² value of the linear fit. As mentioned, the R² value was then measured for a variety of above ground voxel sizes. The results of these statistics are shown in Figure 8.

Figure 8: Comparison of R² values for the number of points, mean amplitude, and standard deviation of the amplitudes for varying voxel elevations above ground level.

Figure 8 provides valuable information about the occurrences of below ground echoes. Calculating the R² values with voxels of varying height above ground provided information as to where the multiple scattering is occurring. R² values calculated above the voxel location make it easier to determine how valid the correlations between the above ground points and below ground points are.
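As a sketch of how each R² value in this comparison might be obtained, the following Python function computes the coefficient of determination of a simple linear fit between one per-cell statistic above ground and the matching statistic below ground. The function name and the sample counts are hypothetical, and this is not the code used to produce Figure 8.

```python
def linear_r_squared(above, below):
    """R^2 of a least-squares line fitted to paired values.

    For a one-variable linear fit, R^2 equals the squared Pearson
    correlation: Sxy^2 / (Sxx * Syy).
    """
    if len(above) != len(below) or len(above) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    n = len(above)
    mean_x = sum(above) / n
    mean_y = sum(below) / n
    sxx = sum((x - mean_x) ** 2 for x in above)
    syy = sum((y - mean_y) ** 2 for y in below)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(above, below))
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # no variation in one variable, so no linear relationship
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical point counts per 10 m x 10 m cell.
    points_in_first_meter = [4, 9, 2, 7, 5]
    points_below_ground = [1, 3, 0, 1, 2]
    print(round(linear_r_squared(points_in_first_meter, points_below_ground), 3))
```

Repeating this calculation for each above ground voxel height, and for each of the three statistics, would yield one curve per statistic of the kind plotted in Figure 8.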
For elevation levels where the R² value is much greater below the threshold than above it, it is reasonable to assume that interactions below the threshold produce the below ground echoes. This is the case for the number of points at the low elevation thresholds. The regression between the number of points in the first meter above the ground and the number of below ground points gave an R² value of 0.26, the largest of all relationships tested. All of the points lying above this one meter threshold, when compared to the below ground points, gave an R² of 0.01. Although the R² for the number of points decreases as the voxel height increases over the next 10 voxel sizes, it remains much larger than the corresponding above-threshold R² value. The strongest statement we can make is therefore that the below ground echoes are related far more to the number of points just above ground level than to the points at any other elevation or to any other statistic describing the points. Standard deviations of the amplitudes of points residing in the voxels below 5m had the second strongest relationship to the below ground points. This is consistent with the study by Doneus and Briese (2006), in which below ground hits were filtered out based on the widths of the individual waveform echoes. The R² values for the mean amplitude remained fairly consistent for voxel heights lower than 19m, implying that the below ground points cannot be accurately modeled by the mean and standard deviation of the points up to 19m above ground. For above ground voxels with heights above 19m, there is a stronger relationship between the points within the voxel and their below ground counterparts. At the 26m to 28m thresholds, the number of points above the voxel actually yielded greater R² values than the number of points within it.
This means that at these thresholds the number of points above the threshold influences the number of below ground points more than the number of points between the ground and the threshold does. The relationships for the standard deviations and means of point amplitudes in voxels covering the canopy regions strengthen as well. At voxel thresholds of 20m to 24m, the mean amplitude above the threshold has a stronger relationship to the below ground means than anywhere else in the elevation range, but with a maximum R² value of 0.05 the relationship is still not strong enough to attribute the below ground returns to multiple scattering in this elevation region. Based on the results shown in Figure 8, the number of points and the mean of point amplitudes within the voxel at lower voxel elevation thresholds provide the best estimate of the location of the multiple scattering phenomenon.
7. CONCLUSIONS
The proposed method does not serve as a complete algorithm for ground extraction using waveform LiDAR data, because it does not achieve the quantitative precision of the NEON and G-LiHT discrete DEMs. This lack of precision is best reflected in the quantitative comparison shown in Table 1. The inability of this approach to separate below ground and above ground returns from actual ground hits resulted in the algorithm selecting points above the ground as ground points, which extended the range of elevation in the model. In addition, there were many instances where the signal simply did not reach the ground, resulting in the selection of above ground points independent of a below ground counterpart. Other factors that may have influenced these results are differences in the processing chains between the data provided by NEON and G-LiHT. Since G-LiHT’s processing pipeline is unknown, it is not possible to compare the time shifts resulting from potentially different deconvolution methods.
Since our ground truth data were a discrete DEM generated by a separate organization, it was difficult to be certain of our accuracy in comparison. This was due to the unknown variables in the ground truth DEM’s processing chain. Our assessment of underground responses shows that these occurrences are likely due to multiple scattering of photons emitted by the laser in the scene, which delays their return to the sensor and results in a larger range. Calculating R² values for below ground points compared to their above ground counterparts provided valuable information in understanding where this scattering occurs. Figure 8 shows that the strongest relationships are between the number of points and the standard deviation of amplitudes of the points below ground and within 4-5m above the ground’s surface, with maximum R² values of 0.26 and 0.18, respectively. Since the relationships between the below ground points and the points above these voxels are drastically weaker, we can assume that the points within the 1-5m voxels themselves are the driving forces behind these below ground echoes. There is little relationship between the points below ground and the points between 5m and 19m above ground. This is expected because this elevation range contains less content than both the ground level brush and the thick canopy. In the canopy region of 19m to 28m, stronger relationships occurred for the number of points and the standard deviation of those points, as well as the mean amplitude values of the points within that voxel. However, with the largest R² value in this region being 0.11 (number of points within the voxel at 22m), it makes for a very inaccurate model. In conclusion, our data show that the below ground responses have the strongest relationship with the number of points and their standard deviations less than 4m above ground, but with such a low R² the relationship is still not strong enough to generate an accurate model of the below ground echoes.
REFERENCES
Briese, C., Pfeifer, N., and Dorninger, P. (2002). “Applications of the robust interpolation for DTM determination”. In: International Archives of Photogrammetry Remote Sensing and Spatial Information Sciences 34.3/A, pp. 55–61.
Cawse-Nicholson, K. (2013). Deconvolution and Decomposition MATLAB code. Rochester Institute of Technology.
Chhatkuli, S., Mano, K., Kogure, T., Tachibana, K., and Shimamura, H. (2012). “Full waveform lidar exploitation technique and its evaluation in the mixed forest hilly region”. In: ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XXXIX-B7, pp. 505–509. doi: 10.5194/isprsarchives-XXXIX-B7-505-2012.
Cook, B. D., Nelson, R. F., Middleton, E. M., Morton, D. C., McCorkel, J. T., Masek, J. G., Ranson, K. J., Ly, V., and Montesano, P. M. (2013). “NASA Goddard’s LiDAR, Hyperspectral and Thermal (G-LiHT) Airborne Imager”. In: Remote Sensing 5.8. doi: 10.3390/rs5084045.
De Berg, M., Cheong, O., van Kreveld, M., and Overmars, M. (2008). Computational geometry. 3rd ed. Springer. ISBN: 3540779736.
Doneus, M. and Briese, C. (2006). “Digital Terrain Modelling for Archaeological Interpretation within Forested Areas using Full-Waveform Laserscanning”. In: VAST: International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage. The Eurographics Association, pp. 155–162.
Doneus, M., Briese, C., Fera, M., and Janner, M. (2008). “Archaeological prospection of forested areas using full-waveform airborne laser scanning”. In: Journal of Archaeological Science 35.4, pp. 882–893. doi: 10.1016/j.jas.2007.06.013.
Fish, D. A., Brinicombe, A. M., Pike, E. R., and Walker, J. G. (1995). “Blind deconvolution by means of the Richardson–Lucy algorithm”. In: Journal of the Optical Society of America 12.1, pp. 58–65. doi: 10.1364/JOSAA.12.000058.
Harsdorf, S. and Reuter, R. (2000). “Stable deconvolution of noisy lidar signals”. In: tc 10, p. 1.
Hofton, M. A., Minster, J. B., and Blair, J. B. (2000). “Decomposition of laser altimeter waveforms”. In: Geoscience and Remote Sensing, IEEE Transactions on 38.4, pp. 1989–1996. doi: 10.1109/36.851780.
Hollaus, M., Wagner, W., Eberhöfer, C., and Karel, W. (2006). “Accuracy of large-scale canopy heights derived from LiDAR data under operational constraints in a complex alpine environment”. In: ISPRS Journal of Photogrammetry and Remote Sensing 60.5, pp. 323–338. doi: 10.1016/j.isprsjprs.2006.05.002.
Kelbe, D. (2013). Digital Elevation Model MATLAB code. Ed. by D. Kelbe and J. Rowe. Rochester Institute of Technology.
Kraus, K. and Pfeifer, N. (1998). “Determination of terrain models in wooded areas with airborne laser scanner data”. In: ISPRS Journal of Photogrammetry and Remote Sensing 53.4, pp. 193–203. doi: 10.1016/S0924-2716(98)00009-4.
Lee, J. (1991). “Comparison of existing methods for building triangular irregular network, models of terrain from grid digital elevation models”. In: International Journal of Geographical Information System 5.3, pp. 267–285. doi: 10.1080/02693799108927855.
Lewis, P. and Hancock, S. (2007). “LiDAR for vegetation applications”. In: UCL, Gower St, London, UK.
Lucy, L. (1974). “An iterative technique for the rectification of observed distributions”. In: The Astronomical Journal 79, p. 745.
Mallet, C. and Bretar, F. (2009). “Full-waveform topographic lidar: State-of-the-art”. In: ISPRS Journal of Photogrammetry and Remote Sensing 64.1, pp. 1–16. doi: 10.1016/j.isprsjprs.2008.09.007.
Otsu, N. (1975). “A threshold selection method from gray-level histograms”. In: Automatica 11.285-296, pp. 23–27.
Persson, Å., Söderman, U., Töpel, J., and Ahlberg, S. (2005). “Visualization and analysis of full-waveform airborne laser scanner data”. In: International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 36.3/W19, pp. 103–108.
Pfeifer, N., Gorte, B., and Elberink, S. O. (2004). “Influences of vegetation on laser altimetry–analysis and correction approaches”. In: International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 36.part 8, W2.
Pirotti, F. (2011). “Analysis of full-waveform LiDAR data for forestry applications: a review of investigations and methods”. In: iForest-Biogeosciences and Forestry 4.1, p. 100. doi: 10.3832/ifor0562-004.
Richardson, W. H. (1972). “Bayesian-Based Iterative Method of Image Restoration”. In: Journal of the Optical Society of America 62.1, p. 55.
Singh, A. (1989). “Review Article Digital change detection techniques using remotely-sensed data”. In: International Journal of Remote Sensing 10.6, pp. 989–1003. doi: 10.1080/01431168908903939.
Sithole, G. and Vosselman, G. (2004). “Experimental comparison of filter algorithms for bare-Earth extraction from airborne laser scanning point clouds”. In: ISPRS Journal of Photogrammetry and Remote Sensing 59.1, pp. 85–101. doi: 10.1016/j.isprsjprs.2004.05.004.
Wagner, W., Ullrich, A., Melzer, T., Briese, C., and Kraus, K. (2004). “From single-pulse to full-waveform airborne laser scanners: potential and practical challenges”. In: International Archives of Photogrammetry and Remote Sensing 35.B3, pp. 201–206.
Wagner, W., Hollaus, M., Briese, C., and Ducic, V. (2008). “3D vegetation mapping using small-footprint full-waveform airborne laser scanners”. In: International Journal of Remote Sensing 29.5, pp. 1433–1452. doi: 10.1080/01431160701736398.
Wagner, W., Roncat, A., Melzer, T., and Ullrich, A. (2007). “Waveform analysis techniques in airborne laser scanning”. In: International Archives of Photogrammetry and Remote Sensing 36.3, pp. 413–418.
Wagner, W., Ullrich, A., Ducic, V., Melzer, T., and Studnicka, N. (2006). “Gaussian decomposition and calibration of a novel small-footprint full-waveform digitising airborne laser scanner”. In: ISPRS Journal of Photogrammetry and Remote Sensing 60.2, pp. 100–112. doi: 10.1016/j.isprsjprs.2005.12.001.
Wehr, A. and Lohr, U. (1999). “Airborne laser scanning–an introduction and overview”. In: ISPRS Journal of Photogrammetry and Remote Sensing 54.2, pp. 68–82. doi: 10.1016/S0924-2716(99)00011-8.
Wiener, N. (1949). Extrapolation Interpolation and Smoothing of Stationary Time Series. MIT Press, Cambridge, MA.
Wu, J., Van Aardt, J., Asner, G., Kennedy-Bowdoin, T., Knapp, D., Erasmus, B., Mathieu, R., Wessels, K., and Smit, I. (2009). “Lidar waveform-based woody and foliar biomass estimation in savanna environments”. In: Silvilaser 2009–9th International Conference on Lidar Applications for Assessing Forest Ecosystems.
Wu, J., van Aardt, J., and Asner, G. P. (2011). “A comparison of signal deconvolution algorithms based on small-footprint lidar waveform simulation”. In: Geoscience and Remote Sensing, IEEE Transactions on 49.6, pp. 2402–2414. doi: 10.1109/TGRS.2010.2103080.
Wu, J., van Aardt, J., McGlinchy, J., and Asner, G. P. (2012). “A Robust Signal Preprocessing Chain for Small-Footprint Waveform LiDAR”. In: Geoscience and Remote Sensing, IEEE Transactions on 50.8, pp. 3242–3255. doi: 10.1109/TGRS.2011.2178420.
Wulder, M. (1998). “Optical remote-sensing techniques for the assessment of forest inventory and biophysical parameters”. In: Progress in Physical Geography 22.4, pp. 449–476. doi: 10.1177/030913339802200402.