Indian Journal of Engineering & Materials Sciences
Vol. 17, June 2010, pp. 179-185

Modeling slump of concrete using the group method data handling algorithm

Li Chen* & Tai-Sheng Wang
Department of Civil Engineering and Engineering Informatics, Chung Hua University, Hsinchu, Taiwan 30067, R.O.C.

Received 1 December 2008; accepted 15 April 2010

This paper presents the group method data handling (GMDH) algorithm and applies it to estimate the slump of high-performance concrete (HPC). HPC is a highly complex material whose behaviour is difficult to model, especially its slump, which is a nonlinear function of the content of all concrete ingredients: cement, fly ash, blast furnace slag, water, superplasticizer, and coarse and fine aggregate. Slump estimation is therefore set as a function of the content of these seven concrete ingredients and four additional important ratios. The GMDH algorithm is a heuristic, self-organizing method that builds gradually more complicated models, which suits the complicated multi-variable HPC slump estimation. The model establishes the input-output relationship of a complex system using a multilayered perceptron-type structure that is similar to a feed-forward multilayer artificial neural network (ANN), but it expresses relationships using more explicit functions than an ANN. Moreover, GMDH is able to select significant variables and combine them properly and automatically. The results show that GMDH obtains an explicit mathematical equation through its learning procedure and outperforms both traditional multiple linear regression analysis (RA) and ANN, with lower estimating errors for predicting the HPC slump.

Keywords: Group method data handling, Slump, High-performance concrete, Regression analysis

Workability is one of the key properties of concrete that must be satisfied [1]; it consists of at least two main components, consistency and cohesiveness.
The slump test is a fairly good method for measuring the consistency, or flow characteristic, of a concrete mixture. The slump is obtained by measuring the drop of the top of the slumped fresh concrete. However, these tests, if carried out on site by site workers, may give inadequate results due to a lack of professional knowledge and proper training [2]. HPC emphasizes such characteristics as high strength, high workability with good consistency, dimensional stability and durability [3]. Nowadays, HPC can be made with about four to ten different components, which makes it a highly complex material whose behaviour, especially the slump, is difficult to model. In addition to the three basic ingredients of conventional concrete, i.e., Portland cement, fine and coarse aggregates, and water, the making of HPC needs to incorporate supplementary cementitious materials, such as fly ash and blast furnace slag, and chemical admixtures, such as superplasticizer [4]. In other words, the number of properties to be adjusted has also increased, which results in wasted materials, labor and time. Furthermore, in the laboratory, technical personnel must try several mix proportions to obtain the desired concrete strength with suitable workability [5]. Slump models based on laboratory data are not adequate to include the many factors that need to be considered when designing HPC mixes. Therefore, it becomes more difficult to estimate the slump of concrete made with the complex materials described above.

The traditional approach to modeling the effects of these parameters on the slump of concrete starts with an assumed form of analytical equation and is followed by a regression analysis using experimental data to determine the unknown coefficients in the equation [6]. Unfortunately, rational and easy-to-use equations that accurately predict the slump are not yet available in design codes.

*Corresponding author (E-mail: [email protected]).
In recent years, artificial neural networks (ANNs) have shown exceptional performance as regression tools [7]. They are highly nonlinear and can capture complex interactions among the input and output variables of a system without any prior knowledge about the nature of these interactions. The main advantage of ANNs is that one does not have to assume a model form explicitly, which is a prerequisite in the parametric approach [8]. There are many recent applications of neural networks to civil engineering materials [3,5,7-16].

The group method of data handling (GMDH) algorithm, presented by Ivakhnenko [17-19], is another useful data-processing method for identifying complex systems. The main idea is that gradually more complicated models are generated based on the evaluation of their performance on a set of multi-input-single-output data pairs [20]. Like the ANN, it has an advantage over traditional statistical methods because it is distribution-free, i.e., no prior knowledge is needed about the statistical distribution of the data. In other words, the properties of the system need not be known; GMDH can generate and compare all possible combinations between input and output variables automatically. However, this method has received very little attention in the concrete mixture literature despite its successful use in broad areas such as education, economic systems, weather modeling, manufacturing, pattern recognition and physiological experiments [19,21-26]. In this study, the GMDH algorithm is therefore presented and used to estimate the slump of concrete. The results are compared with those obtained from regression analysis and an ANN.

With the GMDH algorithm, a model can be represented as a set of neurons in which different pairs of neurons in each layer are connected through a quadratic polynomial and thus produce new neurons in the next layer. Such a representation can be used to map inputs to outputs.
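A single neuron of this kind can be sketched in a few lines of Python. The sketch below is illustrative only: it assumes NumPy, the function names (`quad_features`, `fit_neuron`, `predict_neuron`) are hypothetical, and the data are synthetic rather than the concrete data of this study. It builds the six terms of a quadratic polynomial in two inputs and estimates the coefficients by ordinary least squares.

```python
import numpy as np

def quad_features(xj, xk):
    """Design matrix with columns [1, xj, xk, xj^2, xk^2, xj*xk]."""
    xj = np.asarray(xj, dtype=float)
    xk = np.asarray(xk, dtype=float)
    return np.column_stack([np.ones_like(xj), xj, xk, xj**2, xk**2, xj * xk])

def fit_neuron(xj, xk, y):
    """Least-squares estimate of the six coefficients a0..a5."""
    coeffs, *_ = np.linalg.lstsq(quad_features(xj, xk),
                                 np.asarray(y, dtype=float), rcond=None)
    return coeffs

def predict_neuron(coeffs, xj, xk):
    """Output of the neuron for the given pair of inputs."""
    return quad_features(xj, xk) @ coeffs

# Synthetic, noise-free check: the fit should recover the known coefficients.
rng = np.random.default_rng(0)
xj = rng.uniform(0.1, 0.9, 50)
xk = rng.uniform(0.1, 0.9, 50)
true_coeffs = np.array([0.4, 0.0, 1.2, -0.9, -1.0, 0.85])  # made-up values
y = quad_features(xj, xk) @ true_coeffs
est_coeffs = fit_neuron(xj, xk, y)
```

`np.linalg.lstsq` is used here instead of forming the normal equations explicitly, because it remains stable when the polynomial terms are nearly collinear.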
GMDH Algorithms

GMDH is a heuristic self-organization method that models the input-output relationship of a complex system using a multilayered network structure of Rosenblatt's perceptron type, which is similar to a feed-forward multilayer neural network. Each element in the network implements a nonlinear equation of two inputs, and its coefficients are determined by regression analysis. Self-selection thresholds are applied at each layer in the network to delete those useless elements that cannot estimate the correct output. Only those elements whose performance indices exceed the threshold are allowed to pass to succeeding layers, where more complex combinations are formed. These steps are repeated until the convergence criterion is satisfied or a predetermined number of layers is reached. In general, the advantageous characteristics of the GMDH algorithm for modeling or problem solving can be summarized as follows: (i) a small training set of data is required; (ii) the computational burden is reduced; (iii) the procedure automatically filters out input properties that provide little information about the location and shape of the decision hypersurface; and (iv) a multilayer structure is a computationally feasible way to implement multinomials of very high degree [27].

The formal definition of the identification problem is to find a function $\hat{f}$ that can be used approximately in place of the actual function $f$ in order to predict the output $\hat{Y}$ for a given input vector $x = (x_1, x_2, \ldots, x_n)$ as close as possible to its actual output $Y$. Therefore, assume that the output variable $Y$ is a function of the input variables $(x_1, x_2, \ldots, x_n)$, as in the following equation:

$$Y = f(x_1, x_2, \ldots, x_n) \tag{1}$$

The Kolmogorov-Gabor polynomial [28-33]

$$\hat{Y} = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ijk} x_i x_j x_k + \cdots \tag{2}$$

can simulate the input-output relationship perfectly and has been widely used as a complete description of the system model. By combining so-called partial polynomials of two variables in multiple layers, the GMDH algorithm can readily solve this identification problem. The main process is summarized in the following sequence.

Step 1. Select input variables. The $n$ useful input variables are chosen. In the case of slump estimation, the components of concrete may be chosen. The model is set as Eq. (1), $Y = f(x_1, x_2, \ldots, x_n)$, where $Y$ and $x_i$ represent the vectors of the output and the $i$th input, respectively.

Step 2. Divide the original data into a training set and a testing set.

Step 3. Construct new intermediate variables. In this step, all of the independent variables are taken two at a time to construct the partial polynomial equation

$$\hat{Y}_i = f(x_j, x_k) = a_{0i} + a_{1i} x_j + a_{2i} x_k + a_{3i} x_j^2 + a_{4i} x_k^2 + a_{5i} x_j x_k, \quad i = 1, \ldots, q;\ j = 1, \ldots, n;\ k = 1, \ldots, n-1;\ q = \frac{n(n-1)}{2} \tag{3}$$

The method of least squares is used to estimate the coefficients so that the equation best fits the observed slump of concrete, $Y$. With $X$ denoting the matrix whose rows hold the six polynomial terms of Eq. (3) for each observation, the coefficient vector $A = [a_{0i}, \ldots, a_{5i}]^T$ is calculated as

$$XA = Y \;\Rightarrow\; (X^T X) A = X^T Y \;\Rightarrow\; A = (X^T X)^{-1} X^T Y,$$

where $X^T$ is the transpose of $X$ and $(X^T X)^{-1} X^T$ is the pseudo-inverse of $X$ if $X^T X$ is non-singular.

Step 4. Select the new variables. Evaluate the total RMSE of each node constructed in the preceding step. Only the $n$ best nodes in each layer are kept and allowed to pass to the succeeding layer. These new variables can be interpreted as improved variables that have better predictive power than the previous generation.

Step 5. Truncate the multilayered iterative computation. Compare the best result of the present layer with that of the preceding layer; if the improvement does not exceed the defined threshold, or the maximum number of layers has been reached, the stopping criterion is satisfied; otherwise go to Step 3.

Step 6. Compute the predicted value.
The prediction model is obtained from the intermediate variables remaining in the final layer.

Modeling the Slump of Concrete

System models

The properties of concrete are mainly influenced by the mix proportion. This system identification problem may be viewed as a search for a proper model that maps the input values of the ingredients onto an output value, the slump of HPC, using the GMDH described in this paper. Seven ingredients are used to produce the HPC: (i) cement (C, kg/m³); (ii) fly ash (FL, kg/m³); (iii) blast furnace slag (SL, kg/m³); (iv) water (W, kg/m³); (v) superplasticizer (SP, kg/m³); (vi) coarse aggregate (CA, kg/m³); and (vii) fine aggregate (FA, kg/m³). Table 1 presents the general properties of the concrete evaluated in this study. In addition to the seven components, four ratios are included as input features, defined as follows.

Water-to-cement ratio: W/C = (W + SP) / C
Water-to-binder ratio: W/B = (W + SP) / (C + FL + SL)
Water-to-solid ratio: W/S = (W + SP) / (C + FL + SL + CA + FA)
Total aggregate-to-binder ratio: TA/B = (CA + FA) / (C + FL + SL)

Therefore, in this approach, the slump of concrete is a nonlinear function of the eleven input variables described above.

Data set

Experimental data from Chen [35,36] and Lien [37] were used to construct the slump model. The fresh concrete was assessed by the slump test. To collect training and testing data systematically, the mix proportions were chosen using a design of mixture experiments. In this study, the experiments were designed according to a simplex-centroid design (SCD) [3]. In all, 100 concrete samples from the above investigations were evaluated, each described by an input vector of eleven values (seven components and four ratios) and one output value, the slump (from 0 to 30 cm).

Modeling procedures

All data were grouped in two sets, called the training (calibration) set and the testing (validation) set.
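Steps 3 to 5 of the GMDH procedure, applied to data divided as described, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: it assumes NumPy, all function names are hypothetical, the demonstration data are synthetic, and the threshold `tol` is an arbitrary value. The source does not state on which data the RMSE of Step 4 is computed, so the sketch ranks nodes on the fitting data by default and optionally accepts a separate selection set.

```python
from itertools import combinations
import numpy as np

def design(u, v):
    """Six terms of the partial polynomial of Eq. (3) for inputs u and v."""
    return np.column_stack([np.ones_like(u), u, v, u**2, v**2, u * v])

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def propagate(layer, Z):
    """Outputs of all nodes kept in one layer, one column per node."""
    return np.column_stack([design(Z[:, j], Z[:, k]) @ c for j, k, c in layer])

def gmdh_fit(X, y, X_sel=None, y_sel=None, keep=None, max_layers=20, tol=1e-4):
    """Build layers of quadratic nodes; returns (layers, best RMSE)."""
    if X_sel is None:                  # assumption: rank nodes on the fitting data
        X_sel, y_sel = X, y
    keep = keep or X.shape[1]          # Step 4: keep n best nodes per layer
    layers, best_prev = [], np.inf
    Z, Z_sel = X, X_sel
    for _ in range(max_layers):
        candidates = []
        # Step 3: one least-squares fit per pair of current variables
        for j, k in combinations(range(Z.shape[1]), 2):
            coeffs, *_ = np.linalg.lstsq(design(Z[:, j], Z[:, k]), y, rcond=None)
            err = rmse(y_sel, design(Z_sel[:, j], Z_sel[:, k]) @ coeffs)
            candidates.append((err, j, k, coeffs))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0])
        candidates = candidates[:keep]
        best = candidates[0][0]
        # Step 5: stop when the improvement does not exceed the threshold
        if best_prev - best <= tol:
            break
        layer = [(j, k, coeffs) for _, j, k, coeffs in candidates]
        layers.append(layer)
        Z, Z_sel = propagate(layer, Z), propagate(layer, Z_sel)
        best_prev = best
    return layers, best_prev

def gmdh_predict(layers, X):
    """Step 6: the best node of the final layer is stored first."""
    Z = X
    for layer in layers:
        Z = propagate(layer, Z)
    return Z[:, 0]

# Synthetic demonstration with a random 75/25 split (not the HPC data).
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 0.9, size=(100, 4))
y = 0.5 + X[:, 0] * X[:, 1] - X[:, 2] ** 2
idx = rng.permutation(100)
train, test = idx[:75], idx[75:]
layers, train_rmse = gmdh_fit(X[train], y[train], max_layers=5)
test_rmse = rmse(y[test], gmdh_predict(layers, X[test]))
```

Because each layer stores only the column indices and six coefficients of its nodes, the fitted model stays an explicit nested polynomial that can be written out by hand, which is the transparency advantage claimed for GMDH over an ANN.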
When the training process had been completed, the constructed model was used to predict the output values for the data in the testing set, which the process had never seen during the training stage. The 100 HPC records were randomly split into two groups: (i) the training set, used for training the model, which consists of seventy-five records; and (ii) the testing set, used to measure the performance of the model, which consists of twenty-five records.

Table 1—Statistical properties of components

Components                               Minimum    Maximum
Cement (kg/m³)                           137.0      374.0
Fly ash (kg/m³)                          0.0        193.0
Blast furnace slag (kg/m³)               0.0        260.0
Water (kg/m³)                            160.0      240.0
Superplasticizer (kg/m³)                 4.4        19.0
Coarse aggregate (kg/m³)                 708.0      1049.9
Fine aggregate (kg/m³)                   640.6      902.0
Water-to-cement ratio (W/C)              0.5        1.736
Water-to-binder ratio (W/B)              0.3        0.678
Water-to-solid ratio (W/S)               0.075      0.125
Total aggregate-to-binder ratio (TA/B)   2.363      5.562
Slump (cm)                               0.0        29.0

Results and Discussion

First, all eleven input variables are scaled to the range 0.1 to 0.9; then the GMDH algorithm is applied to the slump estimation. The number of input variables of the first layer is set to eleven, and the number of input variables of each of the remaining layers is also limited to eleven. During the training stage, the GMDH model is built up through twenty layers with eleven nodes in each layer. The convergence diagram is shown in Fig. 1. At the final (20th) layer, the root mean square error (RMSE) equals 3.07 cm. To examine the mechanism in detail, the best-fitting function between the eleven input features and the slump of concrete generated by GMDH with only two layers is shown in Fig. 2 and Eq. (4).
$$\begin{aligned}
Y ={}& -0.153 + 0.504\, f(x_{FL}, x_W) + 1.657\, f(x_W, x_{W/B}) - 2.82\, f(x_{FL}, x_W)^2 \\
& - 4.312\, f(x_W, x_{W/B})^2 + 5.884\, f(x_{FL}, x_W)\, f(x_W, x_{W/B}) \\
f(x_{FL}, x_W) ={}& 0.411 + 0.013\, x_{FL} + 1.199\, x_W - 0.909\, x_{FL}^2 - 1.009\, x_W^2 + 0.857\, x_{FL} x_W \\
f(x_W, x_{W/B}) ={}& 0.019 + 0.896\, x_W + 1.723\, x_{W/B} - 0.726\, x_W^2 - 1.919\, x_{W/B}^2 + 0.424\, x_W x_{W/B}
\end{aligned} \tag{4}$$

where Y is the slump of HPC. In Fig. 2, the nodes in grey represent the optimal solution of this problem, which consists of three nodes in the input, two nodes in the first layer, and only one node in the second layer. This shows that only three input variables were retained: fly ash (FL), water (W) and the water-to-binder ratio (W/B). The RMSE at this stage (two layers) equals 4.55 cm, as also shown in Fig. 1. These two components and one ratio are very significant variables for modeling the slump of HPC.

Fig. 1—Convergence diagram of slump estimation by GMDH

Fig. 2—Structure of GMDH with two layers

Comparison with multiple linear regression analysis (RA)

In the conventional material modeling process, regression analysis (RA) is an important tool for building a model. Because the proper form of the functions is not known, only the simplest linear type was considered. The established form including all eleven variables is given by:

$$\begin{aligned}
Y ={}& 466.41 - 0.4536\,C - 0.459\,FL - 0.4663\,SL + 3.4929\,W + 3.1385\,SP - 0.1761\,CA - 0.1791\,FA \\
& + 2.6744\,(W/C) - 199.8158\,(W/B) - 5625.2125\,(W/S) - 4.6645\,(TA/B)
\end{aligned} \tag{5}$$

Comparison with artificial neural network (ANN)

The artificial neural network with the back-propagation algorithm, called the back-propagation network (BPN), is probably one of the most widely used models for estimation. The same data were used in the training and testing stages to compare the performance of GMDH with that of BPN. In the BPN trained with the gradient descent algorithm, several network parameters are set by trial and error.
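Returning to the two-layer GMDH model, Eq. (4) is explicit enough to be evaluated directly, which is what makes it more transparent than the BPN. The sketch below is illustrative only: the function and constant names are hypothetical, and the coefficients are copied from Eq. (4). It assumes the inputs are the values of FL, W and W/B after scaling to the range 0.1 to 0.9; the source does not state whether the output Y is in scaled units or in centimetres, so no unit conversion is attempted.

```python
def quad(c, u, v):
    """Quadratic node: a0 + a1*u + a2*v + a3*u^2 + a4*v^2 + a5*u*v."""
    a0, a1, a2, a3, a4, a5 = c
    return a0 + a1 * u + a2 * v + a3 * u * u + a4 * v * v + a5 * u * v

# Coefficients copied from Eq. (4).
C_FL_W = (0.411, 0.013, 1.199, -0.909, -1.009, 0.857)   # f(x_FL, x_W)
C_W_WB = (0.019, 0.896, 1.723, -0.726, -1.919, 0.424)   # f(x_W, x_W/B)
C_OUT = (-0.153, 0.504, 1.657, -2.82, -4.312, 5.884)    # output node

def slump_two_layer(x_fl, x_w, x_wb):
    """Evaluate the two-layer model of Eq. (4) on scaled inputs."""
    z1 = quad(C_FL_W, x_fl, x_w)    # first-layer node on fly ash and water
    z2 = quad(C_W_WB, x_w, x_wb)    # first-layer node on water and W/B
    return quad(C_OUT, z1, z2)      # second-layer (output) node

# Example call at the midpoint of the scaled range (assumed input values).
y_mid = slump_two_layer(0.5, 0.5, 0.5)
```

The same `quad` helper serves all three nodes because every node in the network has the six-term form of Eq. (3); only the coefficients and the pair of inputs differ.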
The BPN uses two hidden layers with eight nodes in each layer, and training is terminated after 1000 iterations.

The root mean square error (RMSE) and the coefficient of determination (R²) were used to evaluate the performance of the three models; the results are summarized in Table 2. The results of GMDH (RMSE = 3.07 cm for the training set; 4.54 cm for the testing set) are clearly better than those of RA (RMSE = 4.96 cm for the training set; 8.82 cm for the testing set). The RMSE of BPN equals 3.20 cm for the training set, which is slightly worse than that of GMDH, but 7.46 cm for the testing set, which is much worse than that of GMDH.

Table 2—The results of three models at two stages

Criteria                           GMDH       RA        BPN
R² for the training set            0.8582*    0.6214    0.8380
RMSE for the training set (cm)     3.0687*    4.9619    3.2018
R² for the testing set             0.7396*    0.2002    0.5366
RMSE for the testing set (cm)      4.5357*    8.8242    7.4636

*represents the best result of these three models.

GMDH shows a significant correlation, with R² = 0.8582 for the training set and 0.7396 for the testing set, which correspond to correlation coefficients R of about 0.93 and 0.86. By contrast, the coefficient of determination R² for the testing set is 0.2002 for RA and 0.5366 for BPN; both indicate low correlation. Figures 3-5 show the scatter diagrams of predicted slump values versus values observed in the laboratory for the three models at the training stage. Figures 6-8 show the corresponding scatter diagrams at the testing stage. The predicted values of GMDH are much closer to the ideal line than those of the other two methods. The model obtained by GMDH also predicts the experimental results more accurately for both the training and the testing data over the range of concrete slump in this study. In contrast with GMDH, the predictions of the slump models derived with RA or BPN become much less accurate when the testing set is used instead of the training set as the basis for evaluation.

Fig. 3—Scatter plots of GMDH for the training set
Fig. 4—Scatter plots of RA for the training set
Fig. 5—Scatter plots of BPN for the training set
Fig. 6—Scatter plots of GMDH for the testing set
Fig. 7—Scatter plots of RA for the testing set
Fig. 8—Scatter plots of BPN for the testing set

Conclusions

The main contribution of this paper is to present a self-organization method, the group method data handling (GMDH) algorithm, which shows potential for predicting the slump of concrete. This model can deal easily with nonlinear problems through a multilayer network relating seven components, namely (i) cement (C, kg/m³); (ii) fly ash (FL, kg/m³); (iii) blast furnace slag (SL, kg/m³); (iv) water (W, kg/m³); (v) superplasticizer (SP, kg/m³); (vi) coarse aggregate (CA, kg/m³); and (vii) fine aggregate (FA, kg/m³), and four ratios to the slump of high-performance concrete (HPC). The highly nonlinear equation obtained using GMDH with two layers helps to reveal the mixture mechanisms in a transparent way; it contains only three significant input variables: fly ash (FL), water (W), and the water-to-binder ratio (W/B). The results also show that the GMDH presented in this paper is a very efficient and robust system identification model. Compared with the traditional multiple regression analysis (RA) and the back-propagation network (BPN), the performance of GMDH with twenty layers is much better than that of RA and slightly better than that of BPN.

References

1 Mehta P K & Monteiro P J M, Concrete: structure, properties, and materials (Englewood Cliffs: Prentice Hall Inc.), 1993.
2 Bai J, Wild S, Ware J A & Sabir B B, Adv Eng Software, 34(11-12) (2003) 663.
3 Yeh I C, Cem Concr Compos, 29 (2007) 474.
4 Yeh I C, J Comput Civil Eng, 20(3) (2006) 217.
5 Öztaş A, Pala M, Özbay E, Kanca E, Çağlar N & Bhatti M A, Constr Build Mater, 20 (2006) 769.
6 Mansour M Y, Eng Struct, 26 (2004) 781-799.
7 Yeh I C, J Mater Civil Eng, 18(4) (2006b) 597.
8 Ji T, Lin T & Lin X, Cem Concr Res, 36 (2006) 1399-1408.
9 Yeh I C, Chen I C, Ko T Z, Peng C C, Gan C C & Chen J W, J Technol, 17(4) (2002) 583.
10 Haj-Ali R M, Kurtis K E & Akshay R, ACI Mater J, 98(1) (2001) 36.
11 Nehdi M, Djebbar Y & Khan A, ACI Mater J, 98(5) (2001) 402.
12 Nehdi M, El-Chabib H & El-Naggar M H, ACI Mater J, 98(5) (2001b) 394.
13 Peng J, Li Z & Ma B, J Mater Civil Eng, 14(4) (2002) 327.
14 El-Chabib H, Nehdi M & Sonebi M, ACI Mater J, 100(2) (2003) 165.
15 Kim S, Kim J & Lee C B, Fuzzy decision support system to the prediction of ozone concentrations, paper presented at IEEE Conf, Pusan, Korea, 2001.
16 Stegemann J A & Buenfeld N R, J Environ Eng-ASCE, 130(5) (2004) 508.
17 Ivakhnenko A G, Sov J Autom Inf Sci, 6 (1970) 207.
18 Ivakhnenko A G, Fateyeva Ye N & Ivakhnenko N A, Sov J Autom Inf Sci, 22(1) (1989) 1.
19 Ivakhnenko A G, Sov J Autom Inf Sci, 22(2) (1989) 1.
20 Ivakhnenko G A, Syst Anal Model Simul, 20 (1995) 107.
21 Kalantary F, Ardalan H & Nariman-Zadeh N, Eng Geol, (2008) (accepted).
22 Baker B D & Richards C E, Econom Edu Rev, 18(4) (1999) 405.
23 Pavel N & Miroslav S, Syst Anal Model Simul, 43(10) (2003) 1415.
24 Sarycheva L, Syst Anal Model Simul, 43(10) (2003) 1409.
25 Kim J I, Kim D K, Feng M Q & Yazdani F, J Mater Civil Eng, 16(3) (2004) 257.
26 Ivakhnenko A G, Ivakhnenko G A & Mulle J A, Pattern Recognition Image Anal, 3(4) (1993) 415.
27 Hwang S L, Yau Y J, Lin Y T, Chen J H, Huang T H, Yenn T C & Hsu C C, Safety Sci, 46(7) (2008) 1115.
28 Chang F J & Hwang Y Y, Hydrol Process, 13 (1999) 123.
29 Ivakhnenko A G, SMC-1, (1971) 364.
30 Farlow S J, Self-organizing methods in modelling: GMDH type algorithms, (Marcel Dekker Inc., New York), 1984.
31 Madala H R & Ivakhnenko A G, Learning Algorithms for Complex Systems Modeling, (CRC Press Inc, Boca Raton), 1994.
32 Iba H, deGaris H & Sato T, Evol Comput, 3(4) (1996) 417.
33 Sanchez E, Shibata T & Zadeh L A, Genetic algorithms and fuzzy logic systems: soft computing perspectives, (World Scientific, Riveredge, NJ), 1997.
34 Nariman-Zadeh N, Darvizeh A & Ahmad-Zadeh G R, J Eng Manuf, 217 (2003) 779.
35 Chen I C, Optimization the Mixture Design of High-performance Concrete by Neural Networks, Master's thesis, Dept. of Civil Eng. and Eng. Informatics, Chung Hua University, Hsin Chu, Taiwan (in Chinese), 2001.
36 Chen J W, Modeling and Comparison the Workability of High-performance Concrete by Regression Analysis and Neural Networks, Master's thesis, Dept. of Civil Eng. and Eng. Informatics, Chung Hua University, Hsin Chu, Taiwan (in Chinese), 2002.
37 Lien L C, Applications of Genetic Algorithms in Reinforced Learning, Master's thesis, Dept. of Civil Eng. and Eng. Informatics, Chung Hua University, Hsin Chu, Taiwan (in Chinese), 2005.