Evolutionary data division methods for developing artificial neural network
models in geotechnical engineering
by
M A Shahin
H R Maier
M B Jaksa
Department of Civil & Environmental Engineering
The University of Adelaide
Research Report No. R 171
November, 2000
ABSTRACT
In recent years, artificial neural networks (ANNs) have been applied to many
geotechnical engineering problems and have demonstrated some degree of success.
In the majority of these applications, data division is carried out on an arbitrary
basis. However, the way the data are divided can have a significant effect on model
performance. In this report, the relationship between the statistical properties of
the training, testing and validation sets and model performance, as well as the
effect of the proportion of data used for training, testing and validation on model
performance, are investigated for the case study of predicting the settlement of shallow foundations
on cohesionless soils. In addition, a novel approach for data division, which is
based on a self-organising map, is introduced and evaluated for the above case
study. The results obtained indicate that the statistical properties of the data in the
training, testing and validation sets need to be taken into account to ensure that
optimal model performance is achieved. The data division method introduced in
this report is found to negate the need to choose which proportion of the data to use
for training, testing and validation, and to ensure that each of the subsets is
representative of the available data.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
1. INTRODUCTION
2. DATA DIVISION
3. CASE STUDY
4. RESULTS AND DISCUSSION
5. CONCLUSIONS
6. REFERENCES
APPENDIX A – NOTATION
APPENDIX B – INPUT AND OUTPUT STATISTICS FOR DIFFERENT DATA PROPORTIONS USING DATA DIVISION METHOD TO ENSURE STATISTICAL CONSISTENCY
APPENDIX C – NULL HYPOTHESIS TESTS FOR DIFFERENT DATA PROPORTIONS USING DATA DIVISION METHOD TO ENSURE STATISTICAL CONSISTENCY
LIST OF FIGURES
Figure 1: Typical structure of self-organising map

LIST OF TABLES
Table 1: Proportions of data used for training, testing and validation
Table 2: Database used for ANN model development
Table 3: Input and output statistics obtained using random data division
Table 4: Input and output statistics obtained using data division method to ensure statistical consistency
Table 5: Null hypothesis tests for random data division
Table 6: Null hypothesis tests for data division to ensure statistical consistency
Table 7: Performance of ANN models using data subsets obtained using different approaches to data division
Table 8: Performance of ANN models for different data proportions using statistical data division approach
Table 9: Input and output statistics obtained using SOM
Table 10: Null hypothesis tests for SOM data division
1. INTRODUCTION
Artificial neural networks (ANNs) (Fausett 1994; Flood and Kartam 1994; Hecht-Nielsen 1990; Maren et al. 1990; Zurada 1992) are a form of artificial intelligence,
which, in their architecture, attempt to simulate the biological structure of the human
brain and nervous system. ANNs have been applied extensively to many prediction
tasks, as they have the ability to model the nonlinear relationship between a set of
input variables and the corresponding outputs.
ANNs are similar to conventional statistical models in the sense that model
parameters (e.g. connection weights) are adjusted in a model calibration phase called
“training” so as to minimise the error between model outputs and the corresponding
measured values for a particular data set (the training set). As the number of
connection weights in ANN models is generally large, ANNs can overfit the training
data, especially if the training data are noisy. In order to avoid overfitting, various
rules of thumb have been proposed to restrict the ratio of the number of connection
weights to the number of data samples in the training set (see Maier and Dandy
2000b). Alternatively, the cross-validation technique (Stone 1974) can be used, in
which the training data are divided into training and testing subsets. The training
subset is used to adjust the connection weights, whereas the testing subset is used to
check the performance of the network at various stages of learning. Training is
stopped once the error in the testing set increases. Generally, cross-validation is
considered to be the most effective method to ensure overfitting does not occur
(Smith 1993). Regardless of which of the above methods is used, the performance
of the trained model is generally evaluated on an independent validation set.
Consequently, if cross-validation is used, the available data need to be divided into
training, testing and validation subsets.
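By way of illustration, the following minimal sketch shows how cross-validation-style early stopping operates: model parameters are updated on the training subset, the error on the testing subset is monitored, and training stops once the testing error keeps rising, with an independent validation subset reserved for the final assessment. The data and the simple polynomial model are entirely hypothetical; this is not the ANN software used later in this report.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 60)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, x.size)

# A deliberately flexible polynomial model, so it is prone to overfitting.
X = np.vander(x, 12, increasing=True)

# Divide the data: validation held out for final assessment,
# testing used only to decide when to stop training.
idx = rng.permutation(x.size)
val, rest = idx[:6], idx[6:]
test, train = rest[:6], rest[6:]

w = np.zeros(X.shape[1])
best_w, best_err, wait = w.copy(), np.inf, 0
for epoch in range(20000):
    # One gradient-descent step on the training subset.
    grad = X[train].T @ (X[train] @ w - y[train]) / train.size
    w -= 0.1 * grad
    # Monitor the error on the testing subset.
    test_err = np.mean((X[test] @ w - y[test]) ** 2)
    if test_err < best_err:
        best_w, best_err, wait = w.copy(), test_err, 0
    else:
        wait += 1
        if wait > 200:          # testing error keeps rising: stop training
            break

# The independent validation subset gives the final performance estimate.
val_err = np.mean((X[val] @ best_w - y[val]) ** 2)
print(f"test MSE {best_err:.4f}, validation MSE {val_err:.4f}")
```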
In recent times, ANNs have been applied to many geotechnical engineering
problems and have demonstrated some degree of success. For example, ANNs have
been used in pile bearing capacity prediction (Lee and Lee 1996), stress-strain
modelling of sands (Ellis et al. 1995), interpretation of site investigation (Zhou and
Wu 1994) and seismic liquefaction assessment (Goh 1994). A comprehensive list of
the applications of ANNs in geotechnical engineering is given by Shahin et al.
(2000). In the majority of these applications, the data are divided into the subsets
needed to develop ANN models (e.g. training, testing and validation) on an arbitrary
basis. However, recent studies have shown that the way the data are divided can
have a significant impact on the results obtained (Tokar and Johnson 1999). For
example, if the validation data fall outside the range of the data used for training, the
results obtained using the validation data can be worse than those obtained using the
training data. In addition, if cross-validation is used as the stopping criterion, the
results obtained using the testing set have to be representative of those obtained using
the training set, as the testing set is used to decide when to stop training.
Consequently, it has been suggested that the statistical properties (e.g. mean,
standard deviation) of the various data subsets (e.g. training, testing and validation)
need to be considered to ensure that each subset represents the same population
(Masters 1993). If this is not the case, it may be difficult to judge the validity of
ANN models (Maier and Dandy 2000b).
In this report, the effect of various data division methods on the performance of
ANN models is investigated. The objectives of the report are:
1. To investigate the relationship between the statistical properties of the data
subsets used to develop ANN models and model performance;
2. To investigate the relationship between the proportion of the data used in each of
the subsets used to develop ANN models and model performance; and
3. To introduce and evaluate a novel method for dividing data into the subsets
needed to develop ANN models.
2. DATA DIVISION
In order to meet the objectives set out above, three approaches to data division are
investigated. Firstly, the available data are randomly divided into their respective
subsets: ten percent of the available data are used for validation and, of the
remaining data, ninety percent are used for training and ten percent for testing. Secondly, a
trial and error process is used to divide the data in such a way that the statistical
properties of the data in each of the subsets are as close to each other as possible,
and thus represent the same population. The statistical parameters used include the
mean, standard deviation, maximum, minimum and range. In order to investigate
the relationship between the proportion of the data used for training, testing and
validation and model performance, the proportion of the available data used in each
of the subsets is varied, as shown in Table 1. Thirdly, a novel approach proposed in
this work is used, in which a self-organising map (SOM) (Kohonen 1982) clusters
similar data records together. One sample is then chosen from each cluster for each
of the required subsets.
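For the first, random approach, a minimal sketch is given below; the helper function and the default 10%/90-10 proportions are illustrative assumptions, not the code used in this study.

```python
import numpy as np

def divide_randomly(data, val_frac=0.10, train_frac_of_rest=0.90, seed=0):
    """Randomly divide records into training, testing and validation subsets.

    val_frac is taken from all records; train_frac_of_rest then splits the
    remainder between training and testing (cf. the proportions in Table 1)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_val = int(val_frac * len(data))
    n_train = int(train_frac_of_rest * (len(data) - n_val))
    val = idx[:n_val]
    train = idx[n_val:n_val + n_train]
    test = idx[n_val + n_train:]
    return data[train], data[test], data[val]

records = np.arange(260)                 # stand-in for the 260 case records
train, test, val = divide_randomly(records)
print(len(train), len(test), len(val))   # 210, 24, 26
```

Applied to the 260 records of the case study considered later, this split yields 210 training, 24 testing and 26 validation records.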
Self-organising maps belong to the genre of unsupervised neural networks. The
typical structure of SOMs consists of two layers: an input layer and a Kohonen layer
(Figure 1). The processing elements in the Kohonen layer are arranged in a one- or
two-dimensional array. The input from each node in the input layer (xi) is fully
connected to the Kohonen layer through connection weights (wij). At the beginning
of the self-organising process, these weights are initialised randomly. The input
pattern is then presented to the network without the desired output, and a matching
value is calculated at each node in the Kohonen layer. This value is typically the Euclidean
distance (Dj) between the weights of each node and the corresponding values of the
input pattern, as shown in Equation 1.
Table 1. Proportions of data used for training, testing and validation

Validation set (%)   Training set (%)*   Testing set (%)*
       10                  90                  10
       10                  70                  30
       10                  50                  50
       20                  90                  10
       20                  70                  30
       20                  50                  50
       30                  90                  10
       30                  70                  30
       30                  50                  50
* Percentages of the remaining data.

Figure 1. Typical structure of self-organising map

D_j = \sum_i (w_{ij} - x_i)^2                                        (1)
The node that has the minimum Euclidean distance is declared the winner. The
weights of the winning node and its neighbouring nodes, in terms of topology, are
then updated to match the input pattern more closely. The process is repeated by
presenting a new input pattern to the model. The connection weights are adjusted
until they remain unchanged. The result is a topological map in which similar
patterns are clustered together. A full description of the self-organising map process
is given in Kohonen (1997).
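A minimal sketch of this process is given below; the 10 x 10 grid matches the layer size used later in this report, while the Gaussian neighbourhood and the linearly decaying learning parameters are illustrative assumptions (the software used in this study applies its own defaults), and the data are stand-ins.

```python
import numpy as np

def train_som(X, rows=10, cols=10, epochs=10000, lr0=0.5, radius0=5.0, seed=0):
    """Minimal SOM: the winner is found with the squared Euclidean distance
    of Equation 1 and a Gaussian neighbourhood is pulled toward the input."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(X.min(0), X.max(0), (rows, cols, X.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        x = X[rng.integers(len(X))]                # present one input pattern
        D = ((W - x) ** 2).sum(axis=2)             # Equation 1 at every node
        winner = np.unravel_index(D.argmin(), D.shape)
        h = np.exp(-((grid - np.array(winner)) ** 2).sum(axis=-1)
                   / (2.0 * radius ** 2))
        W += lr * h[..., None] * (x - W)  # update winner and its neighbours
    return W

def assign_clusters(X, W):
    """Index of the winning node (cluster) for every record."""
    D = ((X[:, None, None, :] - W[None]) ** 2).sum(axis=-1)
    return D.reshape(len(X), -1).argmin(axis=1)

X = np.random.default_rng(1).normal(size=(260, 5))  # stand-in records
clusters = assign_clusters(X, train_som(X))
```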
The proposed method has a number of advantages, including:
1. There is no need to decide which proportion of the available data to use for
training, testing and validation.
2. The number of training samples is reduced, thus increasing computational
efficiency during training.
3. The statistical properties of the resulting training, testing and validation data are
similar, provided that the clusters are small enough.
4. Information is provided about whether “outliers” (not necessarily in the statistical
sense) exist in the data set. For example, if a cluster contains only one data
sample, this sample should be included in the training set. If it were to be
included in the validation set, the trained ANN model could not be expected to
perform well, as the validation data fall outside the range of the training data.
A potential disadvantage of this method is that the parameters that control the
learning process in the SOM have to be selected, potentially affecting the results
obtained.
3. CASE STUDY
In this research, the three approaches for data division discussed above are
applied to the case study of predicting the settlement of shallow foundations on
cohesionless soils. The data are obtained from the literature and include field
measurements of settlement of shallow foundations, as well as the corresponding
information regarding footings and soils. The data cover a wide range of footing
dimensions and cohesionless soil types and properties. The database comprises a
total of 260 individual cases. Information regarding the database is shown in
Table 2.
The steps for developing ANN models outlined by Maier and Dandy (2000a; b) are
used as a guide in this research. These include the division and pre-processing of
the available data, the determination of appropriate model inputs and network
architecture, optimization of the connection weights (training) and model validation.
Table 2. Database used for ANN model development

Reference                      No. of cases
Vargas, 1961                         2
Levy and Morton, 1974               37
Burland and Burbidge, 1985          99
Jeyapalan and Boehm, 1986           26
Papadopoulos, 1992                  55
Wahls, 1997                         31
Maugeri et al., 1998                 6
Briaud and Gibbens, 1999             4
Total                              260
The PC-based software package NeuralWorks Predict Release 2.1 (NeuralWare, Inc.
1997) is used to simulate artificial neural network operation. The default values
suggested in the software package are adopted unless stated otherwise.
Use of the default parameters is considered reasonable as this best reflects an actual
modelling situation (Maier and Dandy 1999). Details of the default parameters are
given elsewhere (Shahin et al. 2000) and discussed in NeuralWare, Inc. (1997).
The model inputs used include footing width (B), footing length (L), applied
pressure (q) and the average blow count per 300 mm obtained using a standard
penetration test over the depth of influence of the foundation (N), as these variables
are considered to have the greatest effect on the settlement of shallow foundations
on cohesionless soils (Burland and Burbidge 1985). The model output is foundation
settlement (Sm).
The available data are divided using the three methods discussed previously. The
PC-based software package NeuFrame Version 1.1 (Neural Computer Sciences
1995) is used to cluster the data using a SOM (Figure 1). A two-dimensional array
of nodes (10 × 10) is used in the Kohonen layer and the default parameters suggested
in the software package are adopted (Neural Computer Sciences 1995). Training is
continued for 10,000 iterations, as the connection weights remain stable after this
point. From each cluster, three samples are selected: one for training, one for testing
and one for validation. If a cluster contains two records, one is assigned to the
training set and the other to the testing set. If a cluster contains only one record, it is
assigned to the training set. As a result, a total of 98 records are used for training, 46
for testing and 26 for validation, reducing the number of data records used from 260 to 170.
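The cluster-to-subset assignment rule just described can be sketched as follows; how individual records are picked within a cluster is not specified here, so this hypothetical helper simply takes them in order and leaves any surplus cluster members unused, consistent with the reduction from 260 to 170 records.

```python
from collections import defaultdict

def split_by_cluster(record_ids, cluster_ids):
    """One record per subset from each cluster: >= 3 records -> training,
    testing and validation; 2 records -> training and testing;
    1 record -> training only (an 'outlier' cluster)."""
    groups = defaultdict(list)
    for rec, c in zip(record_ids, cluster_ids):
        groups[c].append(rec)
    train, test, val = [], [], []
    for members in groups.values():
        if len(members) >= 3:
            train.append(members[0])
            test.append(members[1])
            val.append(members[2])
            # surplus cluster members are left unused (260 records -> 170)
        elif len(members) == 2:
            train.append(members[0])
            test.append(members[1])
        else:
            train.append(members[0])   # a lone record must go to training
    return train, test, val

# usage with the SOM sketch above:
# train, test, val = split_by_cluster(range(len(X)), clusters)
```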
To examine how representative the training, testing and validation sets are of each
other, t- and F-tests are carried out. The t-test examines the null hypothesis of no
difference in the means of two data sets, and the F-test examines the null hypothesis
of no difference in the standard deviations of the two sets. For a given level of
significance, test statistics can be calculated to test the null hypotheses of the t- and
F-tests, respectively. Traditionally, a level of significance equal to 0.05 is selected
(Levine et al. 1999), and this level of significance is used in this work. This means
that there is a confidence level of 95% that the training, testing and validation sets
are statistically consistent. A detailed description of these tests can be found in
Levine et al. (1999).
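The following sketch reproduces the two tests using standard formulas, assuming a conventional pooled two-sample t-test and a variance-ratio F-test; it is an illustration with made-up subsets, not the procedure actually coded for this study.

```python
import numpy as np
from scipy import stats

def consistency_tests(a, b, alpha=0.05):
    """t-test on the means and F-test on the standard deviations of two
    subsets; 'reject' means the subsets likely differ at the alpha level."""
    t_stat, t_p = stats.ttest_ind(a, b)             # H0: mu1 = mu2
    f_stat = np.var(a, ddof=1) / np.var(b, ddof=1)  # H0: sigma1 = sigma2
    df_a, df_b = len(a) - 1, len(b) - 1
    f_lower = stats.f.ppf(alpha / 2, df_a, df_b)    # lower critical value
    f_upper = stats.f.ppf(1 - alpha / 2, df_a, df_b)
    return {"t": t_stat, "t reject": t_p < alpha,
            "F": f_stat, "F reject": not (f_lower < f_stat < f_upper)}

rng = np.random.default_rng(0)
training = rng.normal(25.0, 13.0, 210)   # illustrative subsets, not the
testing = rng.normal(25.0, 13.0, 24)     # report's actual data
print(consistency_tests(training, testing))
```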
Back-propagation neural networks are used to predict settlement, as they have a high
capability for data mapping (Hecht-Nielsen 1990) and are currently the most widely
used neural network type. A discussion of the topology and algorithm details of
back-propagation neural networks is beyond the scope of this report and can be
found in many publications (e.g. Fausett 1994; Hertz et al. 1991; Picton 1994;
Ripley 1996; Zurada 1992). In the software package used, the optimal network
geometry is found with the aid of Cascade Learning (Fahlman and Lebiere 1990), a
constructive algorithm in which hidden layer nodes are added as training progresses
until there is no further improvement in model performance.
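The constructive idea can be illustrated with the simplified sketch below. Unlike true cascade learning, this sketch retrains the whole network from scratch at each size instead of freezing previously trained nodes; the data, library and parameters are illustrative assumptions only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # stand-ins for inputs B, L, q, N
y = X @ np.array([2.0, 0.5, 1.5, -1.0]) + rng.normal(0.0, 0.1, 200)
X_tr, y_tr, X_te, y_te = X[:160], y[:160], X[160:], y[160:]

best_err, best_model = np.inf, None
for hidden in range(1, 11):              # grow the hidden layer node by node
    model = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=2000,
                         random_state=0).fit(X_tr, y_tr)
    err = np.mean((model.predict(X_te) - y_te) ** 2)
    if err < best_err - 1e-6:            # still improving: keep growing
        best_err, best_model = err, model
    else:
        break                            # no further improvement: stop
print(f"selected {best_model.hidden_layer_sizes} hidden node(s)")
```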
The coefficient of correlation, r, the root mean square error, RMSE, and the mean
absolute error, MAE, are the criteria used to evaluate the performance of the trained
model in this work. The correlation coefficient is used as it is the most popular error
measure in geotechnical engineering (e.g. Abu Kiefa 1998; Goh 1995; Shi 2000).
The RMSE and MAE are used as they are desirable error measures when the data
are smooth or continuous (Twomey and Smith 1997), which is the case in this
research. In addition, the RMSE has the advantage that large errors receive much
greater attention than small ones (Hecht-Nielsen 1990).
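For reference, the three criteria can be computed as follows (a minimal sketch with illustrative numbers, not results from this study).

```python
import numpy as np

def performance(measured, predicted):
    """Correlation coefficient r, RMSE and MAE, the three criteria used here."""
    r = np.corrcoef(measured, predicted)[0, 1]
    rmse = np.sqrt(np.mean((predicted - measured) ** 2))
    mae = np.mean(np.abs(predicted - measured))
    return r, rmse, mae

measured = np.array([10.0, 25.0, 3.0, 40.0])    # settlements in mm (illustrative)
predicted = np.array([12.0, 22.0, 5.0, 35.0])
print("r = %.3f, RMSE = %.2f mm, MAE = %.2f mm" % performance(measured, predicted))
```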
4. RESULTS AND DISCUSSION
The statistics of the training, testing and validation sets obtained when the data are
divided in a purely random fashion and where the statistics of the subsets are taken
into account, are shown in Tables 3 and 4. It should be noted that the statistics in
Tables 3 and 4 are obtained using 26 data samples for validation, 210 data samples
for training, and 24 data samples for testing. It can be seen that when the data are
divided in a purely random manner (Table 3), there are some inconsistencies in the
statistics between the various data subsets. This is confirmed by the results of t- and
F-tests (Table 5), which show that the data in the three subsets generally do not
belong to the same population. However, it should be noted that this is not
necessarily the case when the data are divided in a random manner, as there are
many different possible ways in which the data can be divided into training, testing
7
and validation subsets. The results in Table 4 show that when the data are divided in
a way that takes into account the statistical properties of the various subsets, the
statistics are in much better agreement. This is confirmed by the outcomes of the tand F-tests (Table 6), which indicate that the training, testing and validation sets are
generally representative of each other.
Table 3. Input and output statistics obtained using random data division

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    8.5       9.7        55        0.3      54.7
  Testing set                     4.0       5.2        14        0.3      13.7
  Validation set                  1.8       1.0         4.5      0.3       4.2
Footing length, L (m)
  Training set                   18.0      24.4       200        0.3     199.7
  Testing set                     7.6       9.6        23.9      0.3      23.6
  Validation set                  3.0       3.5        14.2      0.3      13.9
Applied pressure, q (kPa)
  Training set                  223.4     146.6       850        6.3     843.7
  Testing set                   236.3     163.3       500       16.2     483.7
  Validation set                251.4     163.7       697       25       672
Average blow count, N
  Training set                   24.6      13.4        60        4        56
  Testing set                    26.9      11.8        50        8        42
  Validation set                 31.8      15.7        60        6        54
Measured settlement, Sm (mm)
  Training set                   18.7      23.2       143        0.2     142.8
  Testing set                     2.5       2.0        11        0.6      10.4
  Validation set                  5.7      17.2        90        0.5      89.5
Table 4. Input and output statistics obtained using data division method to ensure statistical consistency

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    7.4       9.1        55        0.3      54.7
  Testing set                     8.6       8.8        30.2      0.88     29.3
  Validation set                  6.2      10.2        53        0.3      52.7
Footing length, L (m)
  Training set                   15.9      24.0       200        0.3     199.7
  Testing set                    13.7      15.2        72.9      0.88     72.0
  Validation set                 14.1      17.0        72.9      0.3      72.6
Applied pressure, q (kPa)
  Training set                  226.3     151.4       850        6.3     843.7
  Testing set                   224.3     126.1       500       71.8     428.2
  Validation set                239.9     159.7       507.5     25       482.5
Average blow count, N
  Training set                   25.3      13.6        60        4        56
  Testing set                    27.7      14.4        58        8        50
  Validation set                 25.7      14.0        50        4        46
Measured settlement, Sm (mm)
  Training set                   15.8      22.4       143        0.2     142.8
  Testing set                    19.4      24.3       100        1.8      98.2
  Validation set                 13.5      19.5        97        0.85     96.15
Table 5. Null hypothesis tests for random data division

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing         2.23     -1.97/1.97        Reject        3.47     0.58/2.01        Reject
  Validation      3.51     -1.97/1.97        Reject       94.09     0.59/1.94        Reject
L
  Testing         2.06     -1.97/1.97        Reject        5.44     0.58/2.01        Reject
  Validation      3.12     -1.97/1.97        Reject       46.07     0.59/1.94        Reject
q
  Testing        -0.40     -1.97/1.97        Accept        0.80     0.58/2.01        Accept
  Validation     -0.90     -1.97/1.97        Accept        0.80     0.59/1.94        Accept
N
  Testing        -0.80     -1.97/1.97        Accept        1.28     0.58/2.01        Accept
  Validation     -2.53     -1.97/1.97        Reject        0.72     0.59/1.94        Accept
Sm
  Testing         3.41     -1.97/1.97        Reject      134.5      0.58/2.01        Reject
  Validation      2.76     -1.97/1.97        Reject        1.81     0.59/1.94        Accept
Table 6. Null hypothesis tests for data division to ensure statistical consistency

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing        -0.62     -1.97/1.97        Accept       1.06      0.58/2.01        Accept
  Validation      0.63     -1.97/1.97        Accept       0.79      0.59/1.94        Accept
L
  Testing         0.43     -1.97/1.97        Accept       2.49      0.58/2.01        Reject
  Validation      0.37     -1.97/1.97        Accept       1.99      0.59/1.94        Reject
q
  Testing         0.06     -1.97/1.97        Accept       1.44      0.58/2.01        Accept
  Validation     -0.42     -1.97/1.97        Accept       0.89      0.59/1.94        Accept
N
  Testing        -0.81     -1.97/1.97        Accept       0.89      0.58/2.01        Accept
  Validation     -0.14     -1.97/1.97        Accept       0.94      0.59/1.94        Accept
Sm
  Testing        -0.73     -1.97/1.97        Accept       0.84      0.58/2.01        Accept
  Validation      0.50     -1.97/1.97        Accept       1.31      0.59/1.94        Accept
The performance of the models developed using the data sets whose statistics are
shown in Tables 3 and 4 is shown in Table 7 (columns 2 and 3). It can be seen that
there is a direct relationship between the consistency of the statistics between
training, testing and validation sets and consistency in model performance. When
the training, testing and validation data are not representative of each other, there
can be large discrepancies in the model performance obtained using the training,
testing and validation sets. Consequently, the results obtained using the validation
set may not be truly representative of the performance of the trained model.
Similarly, if the results obtained using the testing set are not representative of those
obtained using the training set, training may be ceased at a sub-optimal time.
The model performances obtained when different proportions of the available data
are used for training, testing and validation, in conjunction with the data division
method that takes into account the statistical properties of the data, are shown in
Table 8. The statistics and null hypothesis tests for each data proportion used in this
method are shown in Appendices B and C, respectively. A code is used to
distinguish between the various proportions of the available data used for training,
testing and validation. The code consists of three numbers: the first number is the
percentage of the data used in the validation set, whereas the second two numbers,
separated by a hyphen, are the percentages that divide the remaining data into
training and testing sets, respectively. For example, the code 20/70-30 denotes 20%
of the data for validation, with 70% of the remainder (56% of all data) for training
and 30% (24% of all data) for testing. It can be seen from Table 8 that there is no
clear relationship between the proportion of data used for training, testing and
validation and model performance.
Table 7. Performance of ANN models using data subsets obtained using different approaches to data division

Performance measures             Random     Statistical    SOM
and data sets                    division   division
Training set
  Correlation coefficient, r      0.906       0.964        0.945
  RMSE (mm)                       9.82        6.07         7.52
  MAE (mm)                        5.91        4.28         4.22
Testing set
  Correlation coefficient, r      0.766       0.980        0.934
  RMSE (mm)                       5.74        5.43         5.75
  MAE (mm)                        5.43        4.67         4.05
Validation set
  Correlation coefficient, r      0.366       0.934        0.956
  RMSE (mm)                      16.42        6.77         5.14
  MAE (mm)                        7.39        5.10         3.20
Table 8. Performance of ANN models for different data proportions using statistical data division approach

Data proportions        Correlation      RMSE      MAE
and sets                coefficient, r   (mm)      (mm)
10/90-10
  Training set             0.964          6.07     4.28
  Testing set              0.980          5.43     4.67
  Validation set           0.934          6.77     5.10
10/70-30
  Training set             0.902          9.26     5.84
  Testing set              0.832         14.22     8.79
  Validation set           0.864         13.84     6.66
10/50-50
  Training set             0.956          6.83     4.56
  Testing set              0.861         11.96     6.96
  Validation set           0.713         13.41     7.37
20/90-10
  Training set             0.903          9.31     5.28
  Testing set              0.931         12.29     8.05
  Validation set           0.804         13.83     7.34
20/70-30
  Training set             0.952          6.75     4.21
  Testing set              0.878         12.15     6.28
  Validation set           0.901         10.18     6.80
20/50-50
  Training set             0.872         11.62     7.11
  Testing set              0.792         12.64     7.09
  Validation set           0.824         13.27     8.35
30/90-10
  Training set             0.871         11.27     6.47
  Testing set              0.930         10.39     7.02
  Validation set           0.838         13.07     8.14
30/70-30
  Training set             0.957          6.11     4.39
  Testing set              0.834         15.56     8.59
  Validation set           0.811         15.43     8.76
30/50-50
  Training set             0.929          7.16     4.99
  Testing set              0.854         13.59     7.60
  Validation set           0.855         12.79     7.63
The best result is obtained when 10% of the data are used for
validation and the remaining data are divided into 90% for training and 10% for
testing. The results in Table 8 also indicate that there can be significant variation in
the results obtained, depending on which proportion of the data is used for training,
testing and validation, even if the statistical properties of the data subsets are taken
into account. This may be due to the difficulties in obtaining representative data sets
for some of the proportions investigated for the particular data set used.
The difficulties associated with deciding which proportion of the available data to
use for training, testing and validation can be overcome by using a SOM to obtain
appropriate data subsets. By using this method, there is no need to choose which
proportion of the data to use for training, testing and validation, as one sample from
each cluster is used for each subset. In addition, the statistics of the data in each of
the subsets are very close to each other (Table 9), and may be considered to be
representative of each other (Table 10). The success of this approach is illustrated in
Table 7, which compares the results obtained using the three different approaches to
data division investigated. It can be seen that the results obtained when the SOM
data division method is used are very close to the results obtained when the
statistically consistent data division method is used and significantly better than the
results obtained when the purely random data division method is used. It should be
noted that the results presented for the data division method that takes into account
the statistical properties are for the proportion of training, testing and validation
data that gives the best results. Consequently, it appears as though the SOM method
is a suitable approach for dividing data into training, testing and validation subsets.
Table 9. Input and output statistics obtained using SOM

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    9.8      10.7        55        0.3      54.7
  Testing set                     6.4       6.8        24.3      0.5      23.8
  Validation set                  5.8       7.6        30.2      0.5      29.7
Footing length, L (m)
  Training set                   23.6      29.8       200        0.3     199.7
  Testing set                    16.6      25.5       134.5      0.6     133.9
  Validation set                 13.5      24.0       121.9      0.9     121
Applied pressure, q (kPa)
  Training set                  196.0     134.0       850        6.3     843.7
  Testing set                   220.4     122.4       584       35       549
  Validation set                251.6     127.3       666       92       574
Average blow count, N
  Training set                   26.8      14.0        60        4        56
  Testing set                    25.8      13.3        60        5        55
  Validation set                 21.7      10.6        42        6        36
Measured settlement, Sm (mm)
  Training set                   18.7      20.1       143        0.2     142.8
  Testing set                    13.8      15.6       100        2.4      97.6
  Validation set                 16.1      17.1        91.6      2.8      88.8
Table 10. Null hypothesis tests for SOM data division

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing         1.97     -1.97/1.97        Accept       2.47      0.61/1.69        Reject
  Validation      1.78     -1.97/1.97        Accept       1.98      0.56/1.99        Accept
L
  Testing         1.37     -1.97/1.97        Accept       1.37      0.61/1.69        Accept
  Validation      1.59     -1.97/1.97        Accept       1.54      0.56/1.99        Accept
q
  Testing        -1.04     -1.97/1.97        Accept       1.19      0.61/1.69        Accept
  Validation     -1.89     -1.97/1.97        Accept       1.10      0.56/1.99        Accept
N
  Testing         0.40     -1.97/1.97        Accept       1.10      0.61/1.69        Accept
  Validation      1.72     -1.97/1.97        Accept       1.74      0.56/1.99        Accept
Sm
  Testing         1.45     -1.97/1.97        Accept       1.68      0.61/1.69        Accept
  Validation      0.60     -1.97/1.97        Accept       1.38      0.56/1.99        Accept
5. CONCLUSIONS
The results obtained indicate that the way the available data are divided into
training, testing, and validation sets can have a significant effect on model
performance. There is a direct relationship between the consistency of the statistics
between training, testing, and validation sets and the consistency in model
performance. Consequently, the statistical properties of the various data subsets
should be taken into account as part of any data division procedure to ensure that
optimal model performance is achieved. The proportion of the data used for
training, testing, and validation also appears to have an effect on model
performance. However, there appears to be no clear relationship between the
proportion of the data used in each of the subsets and model performance, although
in the trials conducted, the optimal model performance is obtained when 10% of the
data are used for validation and 90% of the remaining data are used for training and
10% for testing.
In this report, a novel method for dividing the available data into training, testing and
validation sets is proposed. The method involves clustering the data using a SOM
and then choosing one sample from each cluster for each of the required data
subsets. The method negates the need to choose which proportion of the data to use
for training, testing and validation, and ensures that each of the subsets is
representative of the available data. The method works well for the case study
considered, producing results that are comparable with the best results obtained
when the data are divided taking into account the statistical properties of the subsets.
6. REFERENCES
Abu Kiefa, M. A. (1998). “General regression neural networks for driven piles in
cohesionless soils.” J. Geotech. & Geoenv. Engrg., ASCE, 124(12), 1177-1185.
Briaud, J. L., and Gibbens, R. (1999). “Behaviour of five large spread footings in
sand.” J. Geotech. & Geoenv. Engrg., 125(9), 787-796.
Burland, J. B., and Burbidge, M. C. (1985). “Settlement of foundations on sand and
gravel.” Proc., Institution of Civil Engineers, Part I, 78(6), 1325-1381.
Ellis, G. W., Yao, C., Zhao, R., and Penumadu, D. (1995). “Stress-strain modelling
of sands using artificial neural networks.” J. Geotech. Engrg., ASCE, 121(5),
429-435.
Fahlman, S. E., and Lebiere, C. (1990). “The cascade-correlation learning
architecture.” Advances in Neural Information Processing Systems 2, D. S.
Touretzky, ed., Morgan Kaufmann, San Mateo, California.
Fausett, L. V. (1994). Fundamentals of neural networks: Architectures, algorithms,
and applications, Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
Flood, I., and Kartam, N. (1994). “Neural networks in civil engineering. I: Principles
and understanding.” J. Computing in Civil Engineering, ASCE, 8(2), 131-148.
Goh, A. T. C. (1994). “Seismic liquefaction potential assessed by neural network.”
J. Geotech. & Geoenv. Engrg., ASCE, 120(9), 1467-1480.
Goh, A. T. C. (1995). “Back-propagation neural networks for modeling complex
systems.” Artificial Intelligence in Engineering, 9, 143-151.
Hecht-Nielsen, R. (1990). Neurocomputing, Addison-Wesley Publishing Company.
Hertz, J. A., Krogh, A., and Palmer, R. G. (1991). Introduction to the theory of
neural computation, Addison-Wesley Publishing Company, Redwood City,
California.
Jeyapalan, J. K., and Boehm, R. (1986). “Procedures for predicting settlements in
sands.” Settlement of Shallow Foundations on Cohesionless Soils: Design and
Performance, Seattle, Washington, 1-22.
Kohonen, T. (1982). “Self-organized formation of topologically correct feature
maps.” Biological Cybernetics, 43, 59-69.
Kohonen, T. (1997). Self-organizing maps, Springer-Verlag, Berlin Heidelberg.
Lee, I., and Lee, J. (1996). “Prediction of pile bearing capacity using artificial neural
networks.” Computers and Geotechnics, 18(3), 189-200.
Levine, D. M., Berenson, M. L., and Stephan, D. (1999). Statistics for managers
using Microsoft Excel, Prentice-Hall, Inc., New Jersey.
Levy, J. F., and Morton, K. (1974). “Loading tests and settlement observations on
granular soils.” Settlement of Structures, Cambridge, 43-52.
Maier, H. R., and Dandy, G. C. (1999). “Empirical comparison of various methods
for training feedforward neural networks for salinity forecasting.” Water
Resources Research, 35(8), 2591-2596.
Maier, H. R., and Dandy, G. C. (2000a). “Application of artificial neural networks to
forecasting of surface water quality variables: issues, applications and
challenges.” Artificial Neural Networks in Hydrology, R. S. Govindaraju and
A. R. Rao, eds., Kluwer, Dordrecht, The Netherlands.
Maier, H. R., and Dandy, G. C. (2000b). “Neural networks for the prediction and
forecasting of water resources variables: a review of modelling issues and
applications.” Environmental Modelling & Software, 15(2000), 101-124.
Maren, A., Harston, C., and Pap, R. (1990). Handbook of neural computing
applications, Academic Press, Inc., San Diego, California.
Masters, T. (1993). Practical neural network recipes in C++, Academic Press, Inc.,
San Diego, California.
Maugeri, M., Castelli, F., Massimino, M. R., and Verona, G. (1998). “Observed and
computed settlements of two shallow foundations on sand.” J. Geotech. &
Geoenv. Engrg., ASCE, 124(7), 595-605.
Neural Computer Sciences. (1995). NeuFrame Version 4.0.
NeuralWare, Inc. (1997). NeuralWorks Predict Release 2.1, Pittsburgh.
Papadopoulos, B. P. (1992). “Settlements of shallow foundations on cohesionless
soils.” J. Geotech. Engrg., ASCE, 118(3), 377-393.
Picton, P. D. (1994). Introduction to neural networks, The MacMillan Press Ltd.
Ripley, B. D. (1996). Pattern recognition and neural networks, Cambridge
University Press.
Shahin, M. A., Maier, H. R., and Jaksa, M. B. (2000). “Predicting settlements of
shallow foundations using neural networks.” J. Geotech. & Geoenviron.
Engrg., ASCE, Submitted for review.
Shi, J. J. (2000). “Reducing prediction error by transforming input data for neural
networks.” J. Computing in Civil Engrg., ASCE, 14(2), 109-116.
Smith, M. (1993). Neural networks for statistical modelling, Van Nostrand
Reinhold, New York.
Stone, M. (1974). “Cross-validatory choice and assessment of statistical
predictions.” J. Royal Statistical Society, B 36, 111-147.
Tokar, S. A., and Johnson, P. A. (1999). “Rainfall-runoff modeling using artificial
neural networks.” J. Hydrologic Engrg., 4(3), 232-239.
Twomey, J. M., and Smith, A. E. (1997). “Validation and verification.”
Artificial neural networks for civil engineers: Fundamentals and applications,
N. Kartam, I. Flood, and J. H. Garrett, eds., ASCE, New York, 44-64.
Vargas, M. (1961). "Foundations of tall buildings on sand in Sao Paulo." Proc., 5th
Int. Conf. on Soil Mech. & Found. Engrg., Paris, 1, 841-843.
Wahls, H. E. (1997). "Settlement analysis for shallow foundations on sand." Proc.,
Third Int. Geotech. Engrg. Conf., Cairo, 7-28.
Zhou, Y., and Wu, X. (1994). “Use of neural networks in the analysis and
interpretation of site investigation data.” Computers and Geotechnics, 16, 105-122.
Zurada, J. M. (1992). Introduction to artificial neural systems, West Publishing
Company, St. Paul.
APPENDIX A – NOTATION
The following symbols are used in this report:
ANNs = artificial neural networks;
B    = footing width;
Dj   = Euclidean distance;
L    = footing length;
MAE  = mean absolute error;
N    = average blow count/300 mm from standard penetration test;
q    = applied pressure;
r    = correlation coefficient;
RMSE = root mean square error;
Sm   = measured settlement;
SOM  = self-organising map;
wij  = the connection weight between nodes i and j;
xi   = the input from node i, i = 0, 1, …, n;
µ1   = mean of the first data set;
µ2   = mean of the second data set;
σ1   = standard deviation of the first data set; and
σ2   = standard deviation of the second data set.
APPENDIX B – INPUT AND OUTPUT STATISTICS FOR DIFFERENT
DATA PROPORTIONS USING DATA DIVISION METHOD TO ENSURE
STATISTICAL CONSISTENCY
10/70-30

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    7.1       9.0        55        0.3      54.7
  Testing set                     8.5       9.0        41.2      0.3      40.9
  Validation set                  6.5      10.1        53        0.3      52.7
Footing length, L (m)
  Training set                   15.1      23.3       200        0.3     199.7
  Testing set                    16.2      22.5       134.5      0.3     134.2
  Validation set                 16.4      20.2        72.9      0.3      72.6
Applied pressure, q (kPa)
  Training set                  234.8     158.4       850        6.3     843.7
  Testing set                   208.2     137.0       697       18.32    678.68
  Validation set                233.8     159.4       507.5     25       482.5
Average blow count, N
  Training set                   25.8      14.0        60        4        56
  Testing set                    24.8      13.0        58        5        53
  Validation set                 25.9      13.7        50        4        46
Measured settlement, Sm (mm)
  Training set                   14.9      21.1       143        0.2     142.8
  Testing set                    19.3      25.6       131        0.6     130.4
  Validation set                 13.2      19.1        97        0.85     96.15
10/50-50

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    7.4       9.4        55        0.3      54.7
  Testing set                     7.6       8.6        41.2      0.3      40.9
  Validation set                  6.4      10.1        53        0.3      52.7
Footing length, L (m)
  Training set                   15.5      25.3       200        0.3     199.7
  Testing set                    15.3      20.6       134.5      0.3     134.2
  Validation set                 16.4      20.2        72.9      0.3      72.6
Applied pressure, q (kPa)
  Training set                  238.8     164.8       850        6.3     843.7
  Testing set                   214.6     130.5       697       18.3     678.6
  Validation set                233.8     159.4       507.5     25       482.5
Average blow count, N
  Training set                   26.7      14.5        60        4        56
  Testing set                    24.3      12.8        58        5        53
  Validation set                 25.9      13.7        50        4        46
Measured settlement, Sm (mm)
  Training set                   15.2      21.8       143        0.2     142.8
  Testing set                    17.3      23.4       131        0.6     130.4
  Validation set                 13.2      19.1        97        0.8      96.1
20/90-10

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    7.3       9.3        55        0.3      54.7
  Testing set                     6.1       6.7        29.7      0.3      29.4
  Validation set                  8.3       9.4        41.2      0.3      40.9
Footing length, L (m)
  Training set                   15.6      23.9       200        0.3     199.7
  Testing set                    10.9      11.1        43.0      0.3      42.7
  Validation set                 17.0      22.0        84.4      0.3      84.1
Applied pressure, q (kPa)
  Training set                  231.4     149.0       850        6.3     843.7
  Testing set                   236.4     176.0       584       16.2     567.7
  Validation set                209.4     141.9       584       25       559
Average blow count, N
  Training set                   25.7      13.9        60        4        56
  Testing set                    24.2      13.8        50        4        46
  Validation set                 25.5      12.9        51        5        46
Measured settlement, Sm (mm)
  Training set                   16.3      21.5       143        0.2     142.8
  Testing set                    16.6      26.4        97        1        96
  Validation set                 14.4      23.4       120        0.6     119.4
20/70-30

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    7.2       9.3        55        0.3      54.7
  Testing set                     7.1       8.4        36        0.3      35.7
  Validation set                  8.3       9.4        41.2      0.3      40.9
Footing length, L (m)
  Training set                   15.6      24.3       200        0.3     199.7
  Testing set                    14.1      19.6       121.9      0.3     121.6
  Validation set                 17.0      22.0        84.4      0.3      84.1
Applied pressure, q (kPa)
  Training set                  222.8     142.4       850        6.3     843.7
  Testing set                   253.0     169.8       666       16.2     649.7
  Validation set                209.4     141.9       584       25       559
Average blow count, N
  Training set                   25.1      13.7        60        4        56
  Testing set                    26.6      14.3        60        4        56
  Validation set                 25.5      12.9        51        5        46
Measured settlement, Sm (mm)
  Training set                   16.2      20.8       143        0.2     142.8
  Testing set                    16.4      24.8       131        0.5     130.5
  Validation set                 14.4      23.4       120        0.6     119.4
20/50-50

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    7.3      10.2        55        0.3      54.7
  Testing set                     7.0       7.8        36        0.3      35.7
  Validation set                  8.3       9.4        41.2      0.3      40.9
Footing length, L (m)
  Training set                   15.5      26.2       200        0.3     199.7
  Testing set                    14.8      19.2       121.9      0.3     121.6
  Validation set                 17.0      22.0        84.4      0.3      84.1
Applied pressure, q (kPa)
  Training set                  230.7     147.7       850        6.3     843.7
  Testing set                   233.1     155.8       666       16.2     649.7
  Validation set                209.4     141.9       584       25       559
Average blow count, N
  Training set                   25.8      14.3        60        4        56
  Testing set                    25.4      13.5        60        4        56
  Validation set                 25.5      12.9        51        5        46
Measured settlement, Sm (mm)
  Training set                   17.7      23.2       143        0.2     142.8
  Testing set                    14.9      20.8       131        0.5     130.5
  Validation set                 14.4      23.4       120        0.6     119.4
30/90-10

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    6.7       8.7        55        0.3      54.7
  Testing set                    10.0      11.3        36        0.3      35.7
  Validation set                  8.2       9.4        53        0.3      52.7
Footing length, L (m)
  Training set                   14.8      24.3       200        0.3     199.7
  Testing set                    20.3      21.8        77.4      0.3      77.1
  Validation set                 16.0      19.4        84.4      0.3      84.1
Applied pressure, q (kPa)
  Training set                  230.5     150.0       850        6.3     843.7
  Testing set                   233.2     158.8       500       16.2     483.7
  Validation set                219.6     148.2       697       25       672
Average blow count, N
  Training set                   25.2      13.5        60        4        56
  Testing set                    26.1      11.5        50       10        40
  Validation set                 26.1      14.5        60        4        56
Measured settlement, Sm (mm)
  Training set                   16.2      23.0       143        0.2     142.8
  Testing set                    18.0      28.3       131        1       130
  Validation set                 14.9      19.2        97        1.3      95.7
30/70-30

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    6.5       8.6        55        0.3      54.7
  Testing set                     7.9       9.6        37.8      0.3      37.5
  Validation set                  8.4       9.5        53        0.3      52.7
Footing length, L (m)
  Training set                   15.4      26.7       200        0.3     199.7
  Testing set                    15.7      17.0        77.4      0.3      77.1
  Validation set                 15.6      19.2        84.4      0.3      84.1
Applied pressure, q (kPa)
  Training set                  239.2     154.4       850        6.3     843.7
  Testing set                   213.5     140.4       584       16.2     567.7
  Validation set                218.2     148.3       697       25       672
Average blow count, N
  Training set                   25.1      13.6        60        4        56
  Testing set                    26.0      12.6        51        5        46
  Validation set                 26.0      14.6        60        4        56
Measured settlement, Sm (mm)
  Training set                   15.6      20.8       143        0.5     142.5
  Testing set                    16.6      26.6       131        0.2     130.8
  Validation set                 16.0      21.5       100        1.3      98.7
30/50-50

Model variables and data sets    Mean    Std. Dev.   Maximum   Minimum   Range
Footing width, B (m)
  Training set                    6.9       9.2        55        0.3      54.7
  Testing set                     7.1       8.6        37.8      0.3      37.5
  Validation set                  8.4       9.5        53        0.3      52.7
Footing length, L (m)
  Training set                   16.3      27.6       200        0.3     199.7
  Testing set                    14.7      20.1       134.5      0.3     134.2
  Validation set                 15.6      19.2        84.4      0.3      84.1
Applied pressure, q (kPa)
  Training set                  231.0     146.5       850        6.3     843.7
  Testing set                   231.8     154.9       666       16.2     649.7
  Validation set                218.2     148.3       697       25       672
Average blow count, N
  Training set                   26.8      14.0        60        4        56
  Testing set                    23.9      12.3        51        5        46
  Validation set                 26.0      14.6        60        4        56
Measured settlement, Sm (mm)
  Training set                   14.3      19.0       143        0.2     142.8
  Testing set                    17.5      25.8       131        0.5     130.5
  Validation set                 16.0      21.5       100        1.3      98.7
APPENDIX C – NULL HYPOTHESIS TESTS FOR DIFFERENT DATA
PROPORTIONS USING DATA DIVISION METHOD TO ENSURE
STATISTICAL CONSISTENCY
10/70-30

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing        -0.72     -1.97/1.97        Accept       1.00      0.58/2.01        Accept
  Validation      0.31     -1.97/1.97        Accept       0.79      0.59/1.94        Accept
L
  Testing        -0.21     -1.97/1.97        Accept       1.07      0.58/2.01        Accept
  Validation     -0.27     -1.97/1.97        Accept       1.33      0.59/1.94        Accept
q
  Testing         0.81     -1.97/1.97        Accept       1.25      0.58/2.01        Accept
  Validation      0.03     -1.97/1.97        Accept       0.92      0.59/1.94        Accept
N
  Testing         0.33     -1.97/1.97        Accept       1.15      0.58/2.01        Accept
  Validation     -0.03     -1.97/1.97        Accept       1.04      0.59/1.94        Accept
Sm
  Testing        -0.94     -1.97/1.97        Accept       0.67      0.58/2.01        Accept
  Validation      0.39     -1.97/1.97        Accept       1.22      0.59/1.94        Accept
10/50-50

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing        -0.09     -1.97/1.97        Accept       1.19      0.58/2.01        Accept
  Validation      0.50     -1.97/1.97        Accept       0.86      0.59/1.94        Accept
L
  Testing         0.03     -1.97/1.97        Accept       1.50      0.58/2.01        Accept
  Validation     -0.17     -1.97/1.97        Accept       1.56      0.59/1.94        Accept
q
  Testing         0.69     -1.97/1.97        Accept       1.56      0.58/2.01        Accept
  Validation      0.14     -1.97/1.97        Accept       1.06      0.59/1.94        Accept
N
  Testing         0.77     -1.97/1.97        Accept       1.28      0.58/2.01        Accept
  Validation      0.26     -1.97/1.97        Accept       1.12      0.59/1.94        Accept
Sm
  Testing        -0.44     -1.97/1.97        Accept       0.86      0.58/2.01        Accept
  Validation      0.44     -1.97/1.97        Accept       1.30      0.59/1.94        Accept
20/90-10

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing         0.61     -1.97/1.97        Accept       1.92      0.58/2.01        Accept
  Validation     -0.51     -1.97/1.97        Accept       0.97      0.59/1.94        Accept
L
  Testing         0.95     -1.97/1.97        Accept       4.63      0.58/2.01        Reject
  Validation     -0.28     -1.97/1.97        Accept       1.18      0.59/1.94        Accept
q
  Testing        -0.15     -1.97/1.97        Accept       0.71      0.58/2.01        Accept
  Validation      0.71     -1.97/1.97        Accept       1.10      0.59/1.94        Accept
N
  Testing         0.50     -1.97/1.97        Accept       1.01      0.58/2.01        Accept
  Validation      0.06     -1.97/1.97        Accept       1.16      0.59/1.94        Accept
Sm
  Testing        -0.06     -1.97/1.97        Accept       0.66      0.58/2.01        Accept
  Validation      0.42     -1.97/1.97        Accept       0.84      0.59/1.94        Accept
20/70-30

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing         0.05     -1.97/1.97        Accept       1.22      0.58/2.01        Accept
  Validation     -0.56     -1.97/1.97        Accept       0.97      0.59/1.94        Accept
L
  Testing         0.48     -1.97/1.97        Accept       1.53      0.58/2.01        Accept
  Validation     -0.07     -1.97/1.97        Accept       1.22      0.59/1.94        Accept
q
  Testing        -0.96     -1.97/1.97        Accept       0.70      0.58/2.01        Accept
  Validation     -0.45     -1.97/1.97        Accept       1.00      0.59/1.94        Accept
N
  Testing        -0.50     -1.97/1.97        Accept       0.91      0.58/2.01        Accept
  Validation     -0.14     -1.97/1.97        Accept       1.12      0.59/1.94        Accept
Sm
  Testing        -0.04     -1.97/1.97        Accept       0.70      0.58/2.01        Accept
  Validation     -0.41     -1.97/1.97        Accept       0.79      0.59/1.94        Accept
20/50-50

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing         0.13     -1.97/1.97        Accept       1.71      0.58/2.01        Accept
  Validation     -0.47     -1.97/1.97        Accept       1.17      0.59/1.94        Accept
L
  Testing         0.12     -1.97/1.97        Accept       1.86      0.58/2.01        Accept
  Validation     -0.27     -1.97/1.97        Accept       1.41      0.59/1.94        Accept
q
  Testing        -0.07     -1.97/1.97        Accept       0.89      0.58/2.01        Accept
  Validation      0.69     -1.97/1.97        Accept       1.08      0.59/1.94        Accept
N
  Testing         0.11     -1.97/1.97        Accept       1.12      0.58/2.01        Accept
  Validation      0.10     -1.97/1.97        Accept       1.22      0.59/1.94        Accept
Sm
  Testing         0.56     -1.97/1.97        Accept       1.24      0.58/2.01        Accept
  Validation      0.68     -1.97/1.97        Accept       0.98      0.59/1.94        Accept
30/90-10

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing        -1.71     -1.97/1.97        Accept       0.59      0.58/2.01        Accept
  Validation     -0.83     -1.97/1.97        Accept       0.85      0.59/1.94        Accept
L
  Testing        -1.06     -1.97/1.97        Accept       1.24      0.58/2.01        Accept
  Validation     -0.24     -1.97/1.97        Accept       1.56      0.59/1.94        Accept
q
  Testing        -0.08     -1.97/1.97        Accept       0.89      0.58/2.01        Accept
  Validation      0.34     -1.97/1.97        Accept       1.02      0.59/1.94        Accept
N
  Testing        -0.31     -1.97/1.97        Accept       1.37      0.58/2.01        Accept
  Validation     -0.31     -1.97/1.97        Accept       0.86      0.59/1.94        Accept
Sm
  Testing        -0.35     -1.97/1.97        Accept       0.66      0.58/2.01        Accept
  Validation      0.27     -1.97/1.97        Accept       1.43      0.59/1.94        Accept
30/70-30

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing        -0.74     -1.97/1.97        Accept       0.80      0.58/2.01        Accept
  Validation     -1.05     -1.97/1.97        Accept       0.81      0.59/1.94        Accept
L
  Testing        -0.05     -1.97/1.97        Accept       2.46      0.58/2.01        Accept
  Validation     -0.03     -1.97/1.97        Accept       1.93      0.59/1.94        Accept
q
  Testing         0.77     -1.97/1.97        Accept       1.20      0.58/2.01        Accept
  Validation      0.65     -1.97/1.97        Accept       1.08      0.59/1.94        Accept
N
  Testing        -0.30     -1.97/1.97        Accept       1.16      0.58/2.01        Accept
  Validation     -0.31     -1.97/1.97        Accept       0.86      0.59/1.94        Accept
Sm
  Testing        -0.21     -1.97/1.97        Accept       0.61      0.58/2.01        Accept
  Validation     -0.09     -1.97/1.97        Accept       0.93      0.59/1.94        Accept
30/50-50

Variable and     t-value   Critical values   t-test      F-value   Critical values   F-test
data sets                  (lower/upper)     H0: µ1=µ2             (lower/upper)     H0: σ1=σ2
B
  Testing        -0.10     -1.97/1.97        Accept       1.14      0.58/2.01        Accept
  Validation     -0.78     -1.97/1.97        Accept       0.93      0.59/1.94        Accept
L
  Testing         0.27     -1.97/1.97        Accept       1.88      0.58/2.01        Accept
  Validation      0.12     -1.97/1.97        Accept       2.06      0.59/1.94        Reject
q
  Testing        -0.02     -1.97/1.97        Accept       0.89      0.58/2.01        Accept
  Validation      0.41     -1.97/1.97        Accept       0.97      0.59/1.94        Accept
N
  Testing         0.97     -1.97/1.97        Accept       1.29      0.58/2.01        Accept
  Validation      0.27     -1.97/1.97        Accept       0.91      0.59/1.94        Accept
Sm
  Testing        -0.75     -1.97/1.97        Accept       0.54      0.58/2.01        Reject
  Validation     -0.42     -1.97/1.97        Accept       0.78      0.59/1.94        Accept