Full text

ESTIMATION AND PREDICTION OF THE OUTDOOR 222Rn AND 220Rn
PROGENY CONCENTRATIONS USING METEOROLOGICAL VARIABLES*
E. SIMION1,2, I. MIHALCEA2, V. CUCULEANU3,4, F. SIMION1,3
1
National Environmental Protection Agency, Radioactivity Laboratory, Splaiul Independentei 294,
RO-060031, Bucuresti, Romania
2
Faculty of Chemistry, University of Bucuresti, 4-12 Regina Elisabeta Av., 030017-Bucuresti,
Romania
3
Faculty of Physics, University of Bucuresti, Magurele, RO-077125, Bucuresti, Romania
4
Academy of Romanian Scientists, 54 Splaiul Independentei, RO-050094, Bucharest, Romania
Received November 15, 2012
The present paper presents the study of the multiple linear regression model for the
estimation and prediction of the time series of radon and thoron progeny concentrations
in atmosphere. Radon and thoron progeny data measured in Environmental
Radioactivity Monitoring Station Botoşani, part of the National Environmental
Radioactivity Survey Network are modeled, at different time scales, by making use of
the multiple linear regression with meteorological parameters as independent variables.
The collinearity and multicollinearity of independent variables have been analyzed. The
estimations were checked by using the regression statistics: multiple correlation
coefficient, coefficient of determination, F-test values, level of significance (p-value)
and residuals. The predictions performances have been analyzed by means of the
residuals, coefficient of determination and the relative error specific to each time
interval from prediction period.
Key words: radon, thoron, progeny concentration, multiple linear regression, statistics.
1. INTRODUCTION
Inert radioactive gases radon (222Rn) and thoron (220Rn) are produced as
decay products of 226Ra and 224Ra from the decay series of 238U and 234Th that are
present in all terrestrial materials.
Due to exhalation process the atmospheric concentrations of 222Rn, 220Rn and
their daughters enter the atmosphere where they are affected by the meteorological
parameters specific to the planetary boundary layer: air temperature (Ta),
atmospheric pressure (Pa), wind speed (V), precipitations (Pr) and relative
humidity (Urel) [1, 2, 3].
*
Paper presented at the First East European Radon Symposium – FERAS 2012, September 2–5,
2012, Cluj-Napoca, Romania.
Rom. Journ. Phys., Vol. 58, Supplement, P. S262–S272, Bucharest, 2013
2
Estimation and prediction of outdoor 222Rn and 220Rn progeny concentrations
S263
In order to perform estimations and predictions for 222Rn and 222Rn progeny
concentrations, the meteorological parameters have been used as independent
variables in the multiple linear regression model [4].
The following meteorological variables were used as independent variables
of the regression model: air temperature, relative humidity, atmospheric pressure,
precipitation and wind speed. In order to obtain reliable results, the collinearity and
multicollinearity of predictors have to be analyzed. Study of the collinearity is
based on the analysis of correlation matrix of the predictor variables. In case of
multicollinearity, two or more predictor variables are highly linearly correlated.
The regression model performances for the estimation of the time series
values are quantified by the following regression statistics: multiple correlation
coefficient (R), coefficient of determination (R2), p-value (significance level), F
(test), sum squared residuals (SSR). In case of prediction, the model performance
was quantified by means of correlation coefficient R, p-value and SSR [7, 8].
The processed data regarding the radon and thoron progeny concentrations
were measured in the year 1996 by the Environmental Radioactivity Monitoring
Station Botoșani (ERMS), Romania. The meteorological data were measured in the
same area. The choice of this year is justified by the fact that does not exist missing
data for both radon and thoron progeny concentrations and corresponding
meteorological data.
2. DATA AND METHOD
2.1. MEASURING METHOD
The measuring method of radon and thoron progeny activity concentrations is
based on the aerosol aspiration on glass fiber filters with high filtration efficiency
(96-99%), using high volume aerosol samplers with aspiration head located at 2
meters above the ground. The filter activity has been measured using a low
background total beta counter and a 90(Sr/Y) reference standard for determination
of equipment detection efficiency [5, 6].
The counter background was between 2.5 and 6 counts per minute. The
sampling time interval was 5 hours. The filters have been measured immediately
after sampling (3 min. after), for 1000 seconds and after 20 hours, for 3000 seconds [14].
The first two measurements provide filter activities necessary for the
determination of the radon and thoron progeny concentrations and the last
measurement indicates the presence of artificial radioactivity in the atmosphere.
In 1996, ERMS Botoșani had a 4 aspiration program: 02:00–07:00
(aspiration A1), 08:00–13:00 (aspiration A2), 14:00–19:00 (aspiration A3) and
20:00–01:00 (aspiration A4). The measured data were monthly reported to the
National Reference Radioactivity Laboratory from National Environmental
Protection Agency-Bucharest.
S264
E. Simion et al.
3
2.2. RADON AND THORON PROGENY CONCENTRATIONS
The data used for this study are averaged daily values obtained from 4
aspirations, for both 222Rn and 220Rn progeny. The time series of daily data were
used to develop the multiple linear regression model for both estimation and
prediction of the progeny concentrations.
1.4
Precipitations (L)
100
Relative humidity (%)
1.2
90
1.0
0.8
80
0.6
0.4
0.2
(months)
70
1 2 3 4 5 6 7 8 9 10 11 12
1010
Atmospheric pressure (hPa)
(months)
1 2 3 4 5 6 7 8 9 10 11 12
20
Air Temperature (0C)
15
1005
10
1000
5
0
995
(months)
990
4.0
1 2 3 4 5 6 7 8 9 10 11 12
Wind speed (m/ s)
3.5
3.0
2.5
2.0
1.5
(months)
1 2 3 4 5 6 7 8 9 10 11 12
-5
-10
10
9
8
7
6
5
4
0.25
0.20
0.15
0.10
0.05
0.00
(months)
1 2 3 4 5 6 7 8 9 10 11 12
3
Progeny concentrations (Bq/ m )
222
220
Rn
Rn
(months)
1 2 3 4 5 6 7 8 9 10 11 12
Fig. 1 – 222Rn and 220Rn progeny concentrations and meteorological variables in 1996.
In the Figure 1, the monthly average values of 222Rn and 220Rn progeny
concentrations and meteorological variables for the year 1996 are presented.
Estimation and prediction of outdoor 222Rn and 220Rn progeny concentrations
4
S265
As can be seen in Figure 1, both 222Rn and 220Rn progeny are related with
meteorological data. In case of 222Rn progeny maximum concentrations were in
winter time - January due to less prounounced quantities of precipitations. Also,
when were large amounts of precipitations in September, concentration of 222Rn
and 220Rn progeny has decreased [12, 4, 15].
The air temperature is the parameter with the greatest influence in the
determination of progeny concentrations.
In case of thoron progeny, due to the small half-live (56s), there were no
significant changes in concentrations in winter, the rest of concentrations being
similar to that of radon.
2.3. ESTIMATION AND PREDICTION MODEL
In order to find the linear relationship between a dependent variable and
several independent (or predictor) variables by using the multiple linear regression
model it was studied the performances of estimations and predictions of radon and
thoron progeny concentrations considering meteorological data [10].
Multiple linear regression model can be used for estimations and predictions,
with statistical significance, only if the independent variables are not in a relation
of collinearity (correlation coefficients close to 0.99 and larger) and if two or more
predictor variables are not in a relation of multicollinearity - Variance Inflation
Factor (VIF) must be <10 [9, 13, 11].
Collinearity and multicollinearity analysis was studied on meteorological
data for 4 month of the year, each month beeing representative for each season,
data that was used to estimate and predict radon and thoron progeny concentrations.
Table 1
Correlation matrices for January, April, July and October
Month
January
Variables
Pr
Pr
1
Urel
Pa
Ta
V
Urel
Pa
July
Ta
V
Pr
Urel
Pa
Ta
V
1
0.343
0.172
1
1
p≤0.058
p≤0.353
0.575
0.409
0.297
0.122
1
1
p≤0.001 p≤0.022
p≤0.103 p≤0.51
0.205
0.484
0.46
0.068
0.312
0.395
1
1
p≤0.266 p≤0.005 p≤0.009
p≤0.713 p≤0.086 p≤0.027
0.096
0.178
0.08
0.527
0.024
0.095
0.053
0.297
1
p≤0.606 p≤0.337 p≤0.666 p≤0.002
p≤0.897 p≤0.607 p≤0.774 p≤0.104
1
S266
E. Simion et al.
5
Table 1 (continued)
Month
April
Variables
Pr
Pr
1
Urel
Pa
Ta
V
Urel
Pa
October
Ta
V
v
Pr
Urel
Pa
Ta
V
1
0.462
0.392
1
1
p≤0.009
p≤0.028
0.265
0.094
0.23
0.085
1
1
p≤0.156 p≤0.617
p≤0.213 p≤0.647
0.392
0.492
0.134
0.235
0.393
0.365
1
1
p≤0.032 p≤0.008 p≤0.479
p≤0.201 p≤0.028 p≤0.042
0.038
0.654
0.075
0.104
0.546
0.393
0.038
0.043
1
p≤0.839 p≤0.001 p≤0.669 p≤0.583
p≤0.001 p≤0.028 p≤0.836 p≤0.815
1
The correlation coefficients (R) and corresponding level of significance for the
independent variables from Table 1, reveal a high correlation between relative
hummidity and air temperature (0.312 ÷ 0.492) with high level of significance
(0.005 ÷ 0.086) and between precipitations and atmospheric pressure (0.230 ÷
0.575) with p-value between (0.001 ÷ 0.213). From the results exposed in Table 1
can be seen that the independent variables can not generate a relation of
collinearity.
The multicollinearity was determined by calculating the VIF for the data sets
used in the regression analysis for estimations and predictions of 222Rn and 220Rn
progeny concentrations. According to the values of VIF presented in Table 2, for
all months that was investigated showed that multicollinearity is not present for
none of the considered predictors.
Table 2
Variance Inflation Factors for estimation and prediction variables
Variables
222
Rn progeny multicollinearity data
January
April
July
October
220
Rn progeny multicollinearity data
January
April
July
October
Variance inflation factors
Pr
1.74
1.41
1.13
1.91
1.63
1.39
1.16
1.89
Urel
1.45
2.43
1.60
1.43
2.03
2.08
1.47
1.37
Pa
2.20
1.13
1.50
1.03
1.97
1.09
1.48
1.38
Ta
4.05
2.26
3.31
1.49
2.16
2.89
3.02
2.31
V
2.31
1.27
1.15
1.52
2.29
1.20
1.16
1.51
6
Estimation and prediction of outdoor 222Rn and 220Rn progeny concentrations
S267
The estimation of progeny concentrations is made by using the analytical
expression of the multiple linear regression with the corresponding meteorological
variables from the time period on which the regression is defined.
Multiple linear regression model assumes a linear relationship between the
progeny concentration as dependent variable and the meteorological data as
predictor variables [4]:
y(t)= β0 + β1 x1(t) + β2 x2 (t)......+ βn xn (t)
(1)
where: y(t) is the dependent variable associated to predictor variables {xi (t)}i=1,2,..n
β0 is the regression constant and β1….. βn are the regression coefficients.
The performances of regression model are quantified by the following
statistics:
• R – Multiple correlation coefficient. R is measuring the degree of
correlation between 222Rn or 222Rn progeny concentrations and the ensemble of
meteorological data.
• R2 – Squared multiple correlation coefficient. R2 is quantifying the
proportion of the variation of dependent variable that is explained by the
independent variables.
• F-Test – This is a global test of significance for the ensemble of
coefficients. Large values of the F-test provide evidence against the null hypothesis.
• p-value – The p-value characterizes the significance level. A p-value <0.05
indicates that the null hypothesis may be rejected.
• SSR – Sum Squared Residual quantifies the deviation between the progeny
concentrations estimated or predicted by the multiple regression and the measured
values of the respective concentrations.
The prediction is performed using analytical expression of regression
obtained for a previous time period by keeping the regression coefficients constant
and by introducing the meteorological data from the prediction interval.
3. ESTIMATION AND PREDICTION OF 222Rn AND 220Rn PROGENY
The estimation of progeny concentrations is made by using the analytical
expression of the multiple linear regression with the corresponding meteorological
variables from the time period on which the regression is defined.
Multiple linear regression model assumes a linear relationship between the
progeny concentration as dependent variable and the meteorological data as
predictor variables.
E. Simion et al.
3
16
8
0.16
0
222
220
0.08
4
5
6
7
8
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
April
10
0.00
0.9
8
6
0.6
4
220
0.3
3
3
222
3
2
Rn progeny [Bq/m ]
1
Rn progeny [Bq/m ]
3
0.24
January
24
Rn progeny [Bq/m ]
7
Rn progeny [Bq/m ]
S268
0.0
3
Rn progeny [Bq/m ]
0.8
July
0.6
0.4
0.2
220
15
12
9
6
3
222
3
Rn progeny [Bq/m ]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.0
3
Rn progeny [Bq/m ]
0.9
October
12
8
0.6
4
220
0.3
222
3
Rn progeny [Bq/m ]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
222
Rn measured
220
222
Rn estimated
Fig. 2 – Estimation of
222
Rn and
Rn measured
220
Rn estimated
220
Rn progeny concentrations.
From the Figure 2 containing representative months for each season, it can be
clearly seen, that the estimation values generated by the multiple regression model,
are very close to the measured values. This means that the considered predictor
variables may describe well the radon and thoron concentration dynamics in the
respective months.
Making use of the dedicated software [12] and Excel, the regression
coefficients are determined by the least square method. In the Table 3 the
Estimation and prediction of outdoor 222Rn and 220Rn progeny concentrations
8
S269
regression statistics for the estimation of the 222Rn and 220Rn progeny
concentrations for a representative month of each season, are presented.
Table 3
Month
Regression statistics for 222Rn and 220Rn progeny estimation
Jan
‘96
Apr
‘96
Jul
‘96
Oct
‘96
Regression statistics for 222Rn progeny
Regression statistics for 220Rn progeny
Multiple
R
R2
pvalue
F-test
SSR
Multiple
R
R2
pvalue
F-test
SSR
0.730
0.534
1.2e-3
5.730
298.63
0.731
0.534
1.1e-3
5.753
0.004
0.559
0.313
8.9e-2
2.188
54.139
0.690
0.476
5.6e-3
4.375
0.175
0.809
0.655
3.6e-5
9.495
89.407
0.781
0.611
1.4e-4
7.867
0.107
0.434
0.188
3.6e-1
1.162
121.93
0.772
0.521
1.5e-3
5.453
0.134
One may notice that the multiple correlation coefficients has the greatest
values in July with R values equal to 0.809 with very high level of significance
(p<0.01) for both radon and thoron progeny concentrations. This proves that the
corresponding predictor variables are describing quite well the variability of radon
progeny in this month.
The prediction is achieved by using the analytical expression of regression
obtained for a previous time period, with predictor variables from the time interval
on which the prediction is done [4]. The predictions of the progeny concentrations
[y] N+1 on the time interval N+1, were performed by using the following analytical
expression:
[y] N+1 = β0 + β1[x1]N+1 + β2 [x2]N+1…. + βn [xn ]N+1
(2)
where: [x1]N+1, [x2]N+1 ...[xn]N+1, are the values of independent variables in the time
interval N+1.
From the Figure 3 where the prediction values generated by the multiple
regression model are compared with measured values showed better results
obtained for the first part of the representative months it can be clearly seen
because of the changing of meteorological conditions from the prediction period
considering the fact that the equation constants are calculated considering the
meteorological data from the previous month. This means that the considered
model for estimation and prediction may describe well the radon and thoron
progeny concentrations dynamics in the respective months.
0.24
February
24
16
0.16
8
0
220
0.08
222
0.00
4 5
6 7
4
6
8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
8
10
12
14
16
18
20
22
24
26
28
30
May
6
0.9
4
0.6
2
220
0.3
222
3
Rn progeny [Bq/m ]
3
2
3
1 2
8
Rn progeny [Bq/m ]
3
Rn progeny [Bq/m ]
32
9
3
E. Simion et al.
Rn progeny [Bq/m ]
S270
0.0
3
Rn progeny [Bq/m ]
0.8
August
0.6
0.4
0.2
220
15
12
9
6
3
222
3
Rn progeny [Bq/m ]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.0
0.9
8
4
0.6
0.3
220
3
Rn progeny [Bq/m ]
1.2
November
12
222
3
Rn progeny [Bq/m ]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
222
222
220
Rn measured
Rn measured
Rn predicted
Fig. 3 – Prediction of 222Rn and 220Rn progeny concentrations.
220
Rn predicted
As for criteria for prediction performances, we have used the SSR and
correlation coefficient (R) between predicted and observed progeny concentrations,
the corresponding level of significance (p-value) and the relative error specific to
each predicted concentration value.
10
Estimation and prediction of outdoor 222Rn and 220Rn progeny concentrations
S271
Table 4
Regression statistics for
222
Rn and 220Rn progeny prediction
222
220
Rn progeny
Month
Prediction characteristics
R
p-value
-4
February
0.562
9.2e
May
0.527
2.1e-3
0.443
1.2e
-2
3.4e
-1
August
November
Rn progeny
0.176
SSR
Prediction characteristics
R
p-value
SSR
-6
0.013
205.189
0.731
5.0e
44.473
0.357
4.8e-2
0.673
0.788
9.7e
-8
0.250
4.5e
-6
0.574
104.39
175.54
0.717
The performances of preditions presented in Table 4 indicates that both radon
and thoron progeny concentrations are being well simulated by the multiple linear
regression model except the case of radon in November (R = 0,176), the
explanation consisting in a high variation of meteorological conditions for the
month of the prediction period – November and from the time period used to
determine the equation for prediction – October.
4. CONCLUSIONS
There have been studied the statistical and physical aspects of the modeling
of the 222Rn and 220Rn progeny concentrations using multiple linear regression with
meteorological variables as predictors.
Because the multiple regressions was used for estimations of January, April,
July and October in Figure 2, and the same months were used to predict the
progeny concentrations in the next months, in Figure 3 are given the time
distributions of the predicted air concentrations for February, May, August and
November.
The regression statistics from Table 3 shows that for the radon and thoron
progeny concentration estimation, the greatest values of multiple R with high level
of significance, have been obtained in July due to the fact that radon having the
lifetime comparable with the ventilation time of entire planetary boundary layer is
very sensitive to the air temperature which induce the vertical transport of the air
masses.
The analysis of the progeny concentration prediction has shown that
limitations of the prediction procedure, is more pronounced in case of the 220Rn
progeny concentrations prediction, because thoron progeny concentrations are
much lower than those of radon progeny. Thoron progeny concentrations are more
S272
E. Simion et al.
11
sensitive to the variations of the dependent and independent variables used in the
estimation and prediction procedures.
The prediction perfomances described in Table 4 showed that 220Rn progeny
concentrations are much more sensitive to the variability of the independent
variables in comparison with 222Rn progeny concentrations due to a much lower air
concentrations.
This paper sustain the results and conclusions presented in a previous one [4],
"Modeling the 222Rn and 220Rn progeny concentrations in atmosphere using
multiple linear regression with meteorological variables as predictors", which used
data from another location of România, Bacău.
Acknowledgments. Our colleagues from Environmental Radioactivity Monitoring Station
Botoșani are acknowledged for their effort in sample collection.
REFERENCES
1. C.R. Hosler, Meteorlogical effects on atmospheric concentrations of radon (Rn222), RaB(Pb214),
and RaC(Bi 214) near the ground, Monthly Weather Review, 94(2), 89–99 (1966).
2. M. Wilkening, Radon in atmospheric studies: a review, Second Special Symposium, Bhabha
Atomic Research Center Bombay, CONF-810153-3 DE93 003345, India, 1981
3. J. Porstendorfer, G.Butterweck, A.Reineking, Diurnal variation of the concentrations of Radon and
its short-lived daughters in the atmosphere near the ground, Atmos. Environ. 25A (3/4), 709–713
(1991).
4. Simion F., Cuculeanu V., Simion E., Geicu A., Modeling the 222Rn and 220Rn progeny
concetrations in atmosphere using multiple linear regression with meteorological variables as
predictors, Romanian Reports in Physics, nr. 2 (2013).
5. C.A. Baciu, Radon and thoron progeny concentration variability in relation to meteorological
conditions at Bucharest (Romania), J. Environ. Radioact. 83(2), 171–189 (2005)
6. V. Cuculeanu, F. Simion, E. Simion, A. Geicu, Dynamics, deterministic nature and correlations of
outdoor 222Rn and 220Rn progeny concentrations measured at Bacău, Romania. J. Environ.
Radioact., ELSEVIER, nr. 102, 703–712 (2011)
7. J.B. Cromwell, M.J. Hannan, W.C. Labys, M. Terreza, Multivariate tests for time series models.
SAGE Publications, Inc (1994).
8. M.C. Hubbard, W.G.Cobourn, Development of a regression model to forecast ground-level ozone
concentration in Louisville, KY. Atmos.Environ. 32, 14/15, 2637–2647, Elsevier. (1998).
9. D.S. Wilks, Statistical methods in atmospheric sciences: an introduction. Academic Press (1995).
10. E. Kovac-Andric, J. Brana, V. Gvozdic, Impact of meteorological factor on ozone concentrations
modeled by time series analysis and multivariate statistical methods. Ecological Informatics,
4, 117–122 (2009).
11. P.A. Rogerson, A Statistical Method for the Detection of Geographic Clustering, 2001.
12. P. Wessa. Free Statistics Software, Office for Research Development and Education, version
1.1.23-r7, URL http://www.wessa.net/ (2012).
13. J. M. Chambers, Linear models, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole
(1992).
14. Sima, O., 1978, „Radioactivitatea naturala a aerosolilor” Studii si cercetari de meteorologie,
Nr. 1: 361–378.
15. Simion E., Mihalcea I., Simion F., Păcurar C., Evaluation model of atmospheric natural
radioactivity considering meteorological variables, Revista de chimie, vol. 63, 2012.