A robust statistically based approach to estimating the probability of contamination occurring between sampling locations
Peter Beck | Principal Environmental Scientist

Current Site Assessment Approach
• Site History: Collect data on site history to identify sources of impact. Collect targeted samples at locations of concern.
• Select Target Size and Design Pattern: Select the target shape and size of concern, and design an unbiased sampling pattern to establish absence at 95% confidence.
• Statistical Evaluation of Concentration Data: Assess the unbiased concentration data using univariate statistical tools.
• Judgment Based Decision: Interpret the results from the two separate approaches to assess contaminant distribution and site condition.

The Trouble with the Hot Spot
• Group trial in data interpretation
• Participants selected sample locations
• Dotted line represents actual hot spot
• Solid line represents linear interpolation
• Dashed line represents nearest clean sample interpretation
• Note the high degree of variability and uncertainty

What Does the Hot Spot Mean?

Interpreting Concentration Data: Lead, Consistent Bin Range
[Figure: Lead histogram, consistent scale. Frequency and cumulative % plotted against bin, with bins from 10 to 7930 in equal steps of 180.]

Interpreting Concentration Data: Lead, Variable Bin Range
[Figure: Lead histogram, variable scale. Frequency and cumulative % plotted against bin, with bin widths that vary across the concentration range.]

Limitations of Univariate Statistics
• Assumes samples were collected in an unbiased manner
• Dissociates location and concentration (i.e. assumes no relationship between the two)
• Ignores sample location as a factor
• Normal and log-normal distributions are not applicable in many situations
• Log-normal distribution can be unstable
• Non-parametric methods overcome distribution issues but still do not consider location
• Cannot provide confidence in spatial data interpretation

Why is Spatial Relationship Important?

Spatial Geostatistics
• Based on the approach used in the mining industry
• Allows for spatial relationship between the samples
• Unaffected by sampling bias
• The variogram (or semivariogram) is the fundamental assessment tool
• The variogram presents the random variance, the spatial variance and the range of influence of samples
• Data evaluation is done by kriging
• Probability plots are developed by indicator kriging

The Variogram
$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} [X_i - X_{i+h}]^2$

One Dimensional Example
Values X = 1, 2, 3, ..., 10 at distances 0, 1, 2, ..., 9.
h = 1: $(X_i - X_{i+h})^2$ = (1-2)^2 = 1, (2-3)^2 = 1, ..., (9-10)^2 = 1; n = 9, so $\gamma(1)$ = (9 x 1) / (2 x 9) = 0.5
h = 2: $(X_i - X_{i+h})^2$ = (1-3)^2 = 4, (2-4)^2 = 4, ..., (8-10)^2 = 4; n = 8, so $\gamma(2)$ = (8 x 4) / (2 x 8) = 2

Variogram Types and Examples

Kriging and Indicator Kriging
Kriging
• Mathematical technique for assigning the best linear moving average concentration over a defined area
• Considered the best method of estimating concentration distribution because it:
  • Avoids systematic bias
  • Minimises the error of estimation (kriging error)
• Requires development of, and data input from, the variogram
Indicator Kriging
• Assign a value of 1 to "clean" samples and a value of 0 to "dirty" samples
• Results in a probability plot of presence and absence of contamination

So Why Consider Spatial Geostatistics?
• Provides a linkage between concentration, variance (micro + macro) and location
• Separates random and spatial components of variance
• Micro-scale (sample scale) variance is always in the Nugget Effect (random variance)
• Macro-scale (spatial) variance is either spatially related, random (Nugget Effect) or a combination of the two
• Assists in establishing when sufficient samples have been collected to characterise a site
• Provides a robust method for predicting uncertainty in impact distribution, remediation volumes and cost

Case Study 1: Application to 19ha Parkland
• Parkland and sporting ovals in Armidale, NSW. Impact by fill from a gasworks site was suspected.
• Investigations commenced in 2000 with a limited sampling program, and the initial results were used to inform a staged geostatistical assessment process.

Initial Assessment Stage Results
Variogram for PAH concentration.
The results were used to develop confidence regions for the initial assessment area using indicator kriging, and then to select additional sampling locations.

Second Assessment Stage
Variogram for PAH concentration. The results were used to revise confidence regions for the second stage assessment area using indicator kriging, and then to select additional sampling locations.

Third Assessment Stage
Note there was little change in the variogram between stage 2 and stage 3 sampling. Thus further sampling would offer limited benefit.

Final Results

More Recent Advances
• The effect of different approaches and levels of skill in developing the variogram on the reliability of the spatial interpretation is an important consideration
• Using different variograms and assessing the effects on interpretation can help clarify the reliability of the spatial interpretation
• Using only primary sample results can lead to overestimation of statistical confidence and, in the case of spatial geostatistics, overestimation of the confidence in the spatial interpretation
• The QA/QC data collected can be used to factor in sample scale variance, which is often caused by heterogeneity
• Including the blind duplicate and split samples incorporates the sample scale variance into the variogram development and accounts for it in the spatial interpretation
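
One way to picture how blind duplicate and split results can enter the variogram: a duplicate pair is two measurements at effectively zero separation, so half the mean squared difference of such pairs gives an estimate of the sample scale (nugget) variance. The sketch below assumes that reading and is not the presenter's workflow or any package's API. The function names and the QA/QC concentrations are hypothetical and invented for illustration; the series 1 to 10 is the one dimensional example from the variogram slides.

```python
def semivariogram_1d(values, lag):
    """Experimental semivariance gamma(h) for equally spaced 1-D data.

    Implements gamma(h) = (1 / 2n) * sum((X_i - X_{i+h})^2) over the n
    pairs separated by `lag` positions.
    """
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    n = len(pairs)
    return sum((a - b) ** 2 for a, b in pairs) / (2 * n)


def duplicate_nugget(pairs):
    """Estimate sample scale variance from (primary, duplicate) pairs.

    Assumption: each pair is treated as two samples at zero separation,
    so the same half mean squared difference formula applies.
    """
    return sum((p - d) ** 2 for p, d in pairs) / (2 * len(pairs))


# One dimensional example from the slides: values 1..10 at unit spacing.
values = list(range(1, 11))
gamma_1 = semivariogram_1d(values, 1)
gamma_2 = semivariogram_1d(values, 2)

# Hypothetical primary/duplicate results in mg/kg (illustration only).
qa_pairs = [(100.0, 120.0), (50.0, 40.0), (300.0, 330.0)]
nugget_estimate = duplicate_nugget(qa_pairs)

print(gamma_1, gamma_2, nugget_estimate)
```

Comparing the nugget estimated this way with the nugget fitted to primary data alone gives a rough check on whether the fitted variogram understates random variance.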

Case Study 2: Effects of Variogram and Variance
• A 2ha site was intensively assessed to facilitate in-situ waste classification
• A total of 303 primary samples were analysed, of which 108 (~36%) exceeded adopted criteria for one or more contaminants
• The effects of different variogram interpretations were assessed by having different assessors, with different time budgets, develop the variograms
• Lead QA/QC data was used to assess the effect of random variance on the variogram and confidence mapping
[Figure: Site map, eastings 303780 to 304000 and northings 5815340 to 5815540.]

Variograms: Cd Data
Cadmium data was assessed for spatial relationship and variance. The effects of varying aspects of the variogram on the data interpretation were examined, including:
• Increasing variance
• Reducing random variance
• Decreasing the lag distance and range of influence
• Using a different model
[Figure: Three cadmium variograms (Column D: Cadmium, Direction: 0.0, Tolerance: 40.0), variogram plotted against lag distance from 0 to 110, each paired with a site map.]
Key variogram data:
Transformed   Sill   Nugget   Range   Model
None          700    50       35      Spherical
None          800    1        35      Spherical
None          770    1        15      Gaussian

Variograms: Pb Data
Lead data was assessed for spatial relationship and variance. The effects of varying aspects of the variogram on the data interpretation were examined, including:
• Reducing random variance
• Decreasing the lag distance and range of influence
• Log transforming the data before variogram development
[Figure: Three lead variograms, variogram plotted against lag distance from 0 to 110, each paired with a site map.]
Key variogram data:
Data                   Direction   Tolerance   Transformed   Sill      Nugget   Range   Model
Column G: Lead         0.0         40.0        None          1100000   1        17      Rational Quadratic
Column D: Lead Log     -15.0       20.0        Log           2.5       0.25     25      Rational Quadratic
Column D: Lead Log     -15.0       20.0        Log           2.5       0.25     20      Rational Quadratic
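
To make the sill, nugget, range and model parameters concrete, the sketch below evaluates common textbook forms of the spherical, Gaussian and rational quadratic models. It rests on assumptions: the sill is read as the total sill (nugget plus spatial component), and the formulas are generic ones, so the software that produced the slide figures may parameterise range and sill differently. The function names are hypothetical.

```python
import math


def spherical(h, sill, nugget, rng):
    """Spherical model: reaches the sill exactly at h = rng."""
    if h <= 0:
        return 0.0  # gamma(0) is zero by definition; nugget applies for h > 0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def gaussian(h, sill, nugget, rng):
    """Gaussian model: approaches the sill asymptotically."""
    if h <= 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-((h / rng) ** 2)))


def rational_quadratic(h, sill, nugget, rng):
    """Rational quadratic model: approaches the sill asymptotically."""
    if h <= 0:
        return 0.0
    r2 = (h / rng) ** 2
    return nugget + (sill - nugget) * r2 / (1.0 + r2)


# Parameters quoted on the slides (interpretation of "Sill" is assumed).
cd_spherical_half_range = spherical(17.5, sill=800, nugget=1, rng=35)
cd_gaussian_at_range = gaussian(15, sill=770, nugget=1, rng=15)
pb_log_at_range = rational_quadratic(25, sill=2.5, nugget=0.25, rng=25)

print(cd_spherical_half_range, cd_gaussian_at_range, pb_log_at_range)
```

Under these forms only the spherical model reaches its sill at the stated range; the Gaussian and rational quadratic models are still below the sill at that distance, which is one reason the same nominal range can imply different ranges of influence under different models.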

Comparison of Results
[Figure: Comparison of results for Cd and Pb.]

Inclusion of QA/QC Samples: Lead Data
The variogram shows that the overall random variance component increased from <1% to about 10%.
[Figure: Two lead variograms (Direction: 0.0, Tolerance: 40.0), variogram plotted against lag distance from 0 to 110. Panels: "Primary Data Only" and "Primary, Blind Duplicate and Split Data".]

Confidence Mapping

Adjusting Indicator Krige
• Assigning 0 or 1 to samples is deterministic and does not take measurement uncertainty into account
• Using a probabilistic approach to assigning the indicator krige value can take this measurement uncertainty into account
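
A minimal sketch of the difference between the two assignments, under stated assumptions: C is taken to be the decision criterion and U the measurement uncertainty, the indicator follows the deck's convention of 1 for "clean" and 0 for "dirty", and the probabilistic value is assumed to fall linearly from 1 at C-U to 0 at C+U. The linear ramp, the function names and the numbers are hypothetical; the presenter's exact scheme is not given in the text.

```python
def deterministic_indicator(conc, criterion):
    """Return 1.0 for a 'clean' sample and 0.0 for a 'dirty' one.

    Assumption: a result exactly at the criterion counts as clean.
    """
    return 1.0 if conc <= criterion else 0.0


def probabilistic_indicator(conc, criterion, uncertainty):
    """Indicator value between 0 and 1 that reflects measurement uncertainty.

    Assumed linear ramp: 1.0 at or below C - U, 0.0 at or above C + U,
    and 0.5 when the result sits exactly on the criterion.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    lower = criterion - uncertainty
    upper = criterion + uncertainty
    if conc <= lower:
        return 1.0
    if conc >= upper:
        return 0.0
    return (upper - conc) / (upper - lower)


# Hypothetical criterion and uncertainty in mg/kg (illustration only).
C, U = 300.0, 60.0
results = [240.0, 270.0, 300.0, 330.0, 360.0]
hard = [deterministic_indicator(x, C) for x in results]
soft = [probabilistic_indicator(x, C, U) for x in results]

print(hard)
print(soft)
```

The deterministic values jump from 1 to 0 at the criterion, while the probabilistic values step through intermediate levels, so results close to the criterion carry less weight towards either "clean" or "dirty" when the indicator values are kriged.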
[Figure: Indicator values 0, 0.25, 0.5, 0.75 and 1 set against the decision criteria C-U, C and C+U. Deterministic approach classes: Uncontaminated, Contaminated. Probabilistic approach classes: Uncontaminated, Possibly Contaminated, Maybe Contaminated, Probably Contaminated, Contaminated.]

Results of Robust Statistical Assessment
[Figure: Zinc (mg/kg) results. Variogram (Column C: Zinc (mg/kg), Direction: 5.0, Tolerance: 50.0) plotted against lag distance from 0 to 140, and a probability distribution map over eastings 351300 to 351550 and northings 6274400 to 6274800. Concentration legend classes: 5 to 50, 50 to 100, 100 to 200, 200 to 2000, 2000 to 7000, 7000 to 17500, 17500 to 500000. Indicator krig value classes: 0 to 0.25, 0.25 to 0.5, 0.5 to 0.75, 0.75 to 1.001.]

Conclusions
• Current site characterisation practice collects spatially distributed data, but interpretation of the extent of impact and of uncertainty is generally based on judgment
• Univariate and bivariate statistical tools are available to assess uncertainty
• Univariate approaches generally rely on the assumption of random distribution and unbiased sample collection, which is rarely met
• Bivariate approaches link concentration, variance and location, allowing estimation of the random and spatial components of variance and development of probability distributions for the presence or absence of contamination
• Results are generally robust even when the variograms used are diverse
• Inclusion of QA/QC samples in the variogram development results in more realistic probability distributions

www.ghd.com