DATA BASED MODELLING OF THE MEAN WAVE PERIOD IN THE ADRIATIC SEA

Marko Katalinić, Faculty of Maritime Studies, Ruđera Boškovića 37, 21 000 Split
Luka Mudronja, Faculty of Maritime Studies, Ruđera Boškovića 37, 21 000 Split
Petar Matić, Faculty of Maritime Studies, Ruđera Boškovića 37, 21 000 Split

Abstract: This paper investigates the applicability of data based modelling methods to mean wave period modelling at a single point in the Adriatic Sea, while examining the influence of different input variables and their respective time series. For that purpose, regression analysis and an artificial neural network were used. A total of 22 years of data with a 6 h step size, i.e. 33604 data samples acquired from a satellite-calibrated numerical model, were used to form the models. The available data set was divided into two subsets: 20-year data, i.e. 30684 data samples, used to calibrate the models, and 2-year data, i.e. 2920 data samples, used to test the model performance. Simulations were performed in Matlab, and the results confirm the efficiency of both modelling approaches, with the artificial neural network providing more accurate results than the traditional statistical models. Furthermore, the advantage of the neural network was more prominent in the case of multiple input variables.

KEY WORDS: sea wave modelling, Adriatic Sea, data based modelling, mean wave period, Matlab.

1. INTRODUCTION

In this paper, two data based approaches were investigated for mean wave period modelling in the Adriatic Sea, an artificial neural network (ANN) and regression analysis, while examining the influence of different input variables and their time series. The mean wave period (MWP) is one of the parameters that describe the sea state and represents the mean of all wave periods in a time series.

In general terms, a model can be formed based on the mathematical formulation of the physical processes that occur in the system being modelled; such a model is called a physical model. A model can also be formed based on the measured values of the input and output variables of the system. That model is called an experimental or statistical model, and although it does not give insight into the physical properties of the system, it better describes the input-output behaviour of the system (Matić et al., 2015). Two modelling approaches are often used for that purpose, regression analysis and artificial neural networks (ANNs). While regression based models require a strict mathematical form of the model to be defined, a neural network uses a more flexible structure to adjust to the data. ANNs can be regarded as an alternative to traditional regression based statistical models, and are expected to provide better modelling results, especially in complex cases where multiple variables influence the output. So far, ANNs have been successfully applied to a large number of different computational problems such as pattern recognition, classification, function approximation, modelling and prediction.

Sea wave modelling represents a challenge due to its complexity, where different parameters influence the mathematical description of the sea surface. In the field of wave modelling for the Adriatic Sea, Tonko Tabain made a major contribution by defining the Tabain wave spectrum (Tabain, 1997). Several methods can be applied to sea wave modelling, ranging from empirical ones to the most sophisticated third-generation numerical models. Both ANNs and regression analysis have already been used in sea wave modelling, as described in (Zamani et al., 2009), (Peres et al., 2015) and (Haddadpour, Etemad-Shahidi & Kamranzad, 2014).

2. MODELLING METHODS

In data based modelling, a modelling method should provide a way to determine the values of the model parameters so that the model response fits the available data (Rawlings et al., 2001). In this paper two modelling methods were compared, artificial neural networks and regression analysis.
Regression analysis requires an assumption about the functional relationship between the input and output variables, where a model can use one or more (k) input, i.e. independent, variables (x_i) to explain the behaviour of the output, i.e. dependent, variable (y_i). If the functional relationship is assumed to be linear, the model can be described with expression (1). In that case, the model uses k+1 parameters (β) that need to be estimated in order to fit the function to the available data, for which the least squares estimate (LSE) method is often used (Rawlings et al., 2001).

$$ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} \qquad (1) $$

However, physical processes are inherently non-linear and time invariant. Therefore, only simple cases can be modelled with linear functions. More realistic models use nonlinear relationships between the dependent and independent variables, which can generally be expressed with (2), where f_N stands for a nonlinear function.

$$ y_i = f_N(x_{1i}, \dots, x_{ki}) \qquad (2) $$

Higher-degree polynomials, exponential, logarithmic and trigonometric functions can all be used as f_N. Due to the complexity of other solutions, a quadratic function is often used, and even then the problem becomes too complex for more than two input variables.

The decision on the functional relationship between the input and output variables, and the related model complexity problem, can be avoided by using artificial neural networks (ANNs). As briefly described in (Matić et al., 2015), an ANN is an artificial structure that consists of a number of interconnected artificial neurons. Depending on the type of neuron and the way the neurons are connected, different ANNs have been developed over the last 50 years. However, a static feed-forward ANN called the Multi-Layer Perceptron (MLP) is probably the most commonly used network architecture. An MLP has neurons organized in layers, with i inputs, h hidden neurons and o output neurons, and uses nonlinear sigmoidal activation functions in the hidden layer, which enables it to approximate nonlinear functions, i.e. to model non-linear processes. Although the optimal number of layers has often been a subject of research, it has been proven that a two-layer structure is sufficient to approximate any practical function, given enough neurons in the hidden layer (Cybenko, 1989). Therefore, a two-layer MLP, known as a universal approximator and shown in Figure 1, was used in this research to form a neural model of the mean wave period (MWP).

Figure 1. MLP network, a universal approximator

To form a neural model of the system, the ANN needs to be trained, which is done based on the examples presented to the network during the training (or learning) process. During that process an algorithm modifies the network parameters (w) in order to minimize the error between the network output and the desired value for the given input values. ANNs model the process in a way similar to how the human brain "gets to know the system": by learning from experience. However, this is also the way statistical models work; they also "learn" from examples of input-output data pairs. The knowledge both have about the system is stored in the adjustable parameters of the model. The difference is that an ANN uses a flexible neural structure to adjust to the data, which enables it to approximate any simple or composite function, unlike statistical models, where the exact function needs to be defined in advance.

In this paper, the Bayesian regularization (BR) algorithm from Matlab's Neural Network Toolbox was used to train the neural models. BR is a useful tool for determining a sufficient number of hidden neurons, and it improves on the Levenberg-Marquardt (LM) algorithm, which has already been shown to be the fastest and most appropriate algorithm for training networks containing up to a few hundred adjustable parameters (Beale et al., 2010). The number of hidden neurons determines the quality of the neural model, and is therefore an unavoidable subject of research in the model development process.

3. CASE STUDY – MEAN WAVE PERIOD MODELLING OF THE ADRIATIC SEA WAVE

The available data were obtained from Fugro OCEANOR, a company for environmental monitoring, for use in the DATAS (Damaged Tanker in The Adriatic Sea) project funded by the Croatian Science Foundation. The collected data are based on a numerical hindcast calibrated with satellite altimetry mapping in the period between January 1992 and January 2016, with a time step of 6 hours, at 40 points in the Adriatic Sea. The calculations in this paper deal with data at a single point in the Adriatic Sea (42.00 N, 17.00 E). The chosen point lies on a frequently used merchant ship route from Otranto (the SE entrance to the Adriatic Sea) towards ports in the northern part (Rijeka, Venezia, Koper). The total available data included 12 variables, of which only 4 were used in this study, as presented in Table 1.

Table 1. Data used for modelling purposes

Variable name                    Abbreviation   Unit
Significant wave height          SWH            m (meters)
Mean wave period                 MWP            s (seconds)
Wind speed at 10 m height        WSP            m/s (meters/second)
Wind direction at 10 m height    WDIR           ° (degrees)

3.1. Data analysis

A correlation analysis was performed to examine the influence of the potential input variables. The results, presented in Table 2, provide an insight into the dependence between the input and output variables.

Table 2. Correlation analysis for potential input variables for the chosen point in the Adriatic Sea

          WSP_t       WDIR_t      SWH_t       MWP_{t-1}
MWP_t     0.421069    -0.09684    0.691128    0.839201

Based on the results shown in Table 2, it can be assumed that MWP_t is strongly influenced by SWH_t and by its own value from the previous calculation step, MWP_{t-1}, while WSP_t has some influence on MWP_t and WDIR_t has practically none. Therefore, time series of the variables SWH, WSP and MWP are investigated as potential inputs to the MWP_t model through a set of model based experiments.

3.2. Model formulation

From the set of 22-year data available, 20-year data were used for calibrating the models and 2-year data for testing them, which, given the 6 h sample rate, resulted in 30684 samples for calibration and 2920 samples for testing the model performance. In order to define the optimal input variables for modelling the current value of the mean wave period (MWP_t), a set of potential input variables was evaluated through a series of experiments, as described in Table 3.

Table 3. Experiment setup for optimal input variables detection

i    Formulation
1    MWP_t = f(MWP_{t-k1})
2    MWP_t = f(MWP_{t-k1}, SWH_{t-k2})
3    MWP_t = f(MWP_{t-k1}, SWH_{t-k2}, WSP_{t-k3})
4    MWP_t = f(SWH_{t-k2}, WSP_{t-k3})
5    MWP_t = f(SWH_{t-k2})

First, the variable with the highest score in the correlation test was examined as a single input variable, and then the other variables were included. Experiment 1 was therefore used to determine the optimal number of MWP time series members used as inputs (k1). In experiment 2, the k1 inputs were expanded with the SWH time series, and the optimal number of its time series members, k2, was determined. In experiment 3, the input set was further expanded with the WSP time series in order to determine the optimal number k3. The maximal number of inputs in all experiments was therefore Ni_max = k1+k2+k3, which was reduced in experiments 4 and 5, as defined in Table 3, to examine the impact of the variables added later to the set of inputs. For each experiment i ∈ [1, 5] two models were formed, a neural network model (NMi) and a regression model (RMi).

3.3. Model evaluation

Both graphical and numerical methods can be used to evaluate model performance. Graphical methods enable a visual comparison of the model response with the actual values, while numerical methods quantify the deviation of the model response from the actual values.
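The numerical measures used in this paper (RMSE, MAE, CE, PBIAS, RSR and PI) are named in the text but not written out as formulas. The sketch below shows one possible way to compute them in plain Python. It assumes the commonly used definitions: CE as the Nash-Sutcliffe efficiency, PBIAS with the sign convention 100·Σ(observed − simulated)/Σ(observed), RSR as RMSE divided by the standard deviation of the observations, and PI as a one-step persistence index that compares the model with a "no change" forecast. The function names and the toy data are illustrative only, and the exact definitions used to produce the published results may differ.

```python
import math


def _sse(obs, sim):
    """Sum of squared errors between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim))


def _sst(obs):
    """Sum of squared deviations of the observations from their mean."""
    mean_obs = sum(obs) / len(obs)
    return sum((o - mean_obs) ** 2 for o in obs)


def rmse(obs, sim):
    """Root mean squared error (absolute measure)."""
    return math.sqrt(_sse(obs, sim) / len(obs))


def mae(obs, sim):
    """Mean absolute error (absolute measure)."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def ce(obs, sim):
    """Nash-Sutcliffe coefficient of efficiency (assumed definition)."""
    return 1.0 - _sse(obs, sim) / _sst(obs)


def pbias(obs, sim):
    """Percent bias; positive values mean the model underestimates (assumed sign)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    return math.sqrt(_sse(obs, sim) / _sst(obs))


def persistence_index(obs, sim):
    """One-step persistence index: 1 minus model SSE over naive-forecast SSE.

    The naive forecast is obs[t-1], so the first sample is skipped.
    """
    steps = range(1, len(obs))
    sse_model = sum((obs[t] - sim[t]) ** 2 for t in steps)
    sse_naive = sum((obs[t] - obs[t - 1]) ** 2 for t in steps)
    return 1.0 - sse_model / sse_naive


# Toy data (not from the paper): the "model" simply repeats the previous value.
observed = [2.0, 4.0, 3.0, 5.0]
simulated = [2.0, 2.0, 4.0, 3.0]

for name, func in [("RMSE", rmse), ("MAE", mae), ("CE", ce),
                   ("PBIAS", pbias), ("RSR", rsr), ("PI", persistence_index)]:
    print(f"{name}: {func(observed, simulated):.3f}")
```

With these definitions, a model that only repeats the previous observation scores a PI of zero, which is why PI is a useful check for the lag effect discussed in the results: a positive PI means the model does better than simple persistence.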
LeGates and McCabe (1999) proposed a set of different statistical measures to evaluate the quality of a model, and recommended that the set should include at least one relative and one absolute measure. In (Moriasi et al., 2007), models are classified based on the values of the CE, RSR and PBIAS quality measures. In (Gupta et al., 1999) the application of the Persistence Index (PI) is also recommended. Therefore, to evaluate model quality in this paper, both graphical methods and the following numerical measures are used. Absolute: root mean squared error (RMSE) and mean absolute error (MAE). Relative: Nash-Sutcliffe coefficient of efficiency (NSC, or CE), percent bias (PBIAS), RMSE to standard deviation ratio (RSR) and persistence index (PI).

Furthermore, in order to compare the modelling approaches, i.e. neural networks and regression analysis, a model comparison coefficient (MCCPI) was created in this paper, as described with expression (3).

$$ MCC_{PI} = \frac{PI_{max} - PI_{RM}}{PI_{max} - PI_{NM}} - 1 \qquad (3) $$

Since PI has been identified as the most sensitive criterion among the numerical measures used in this paper, MCCPI uses the PI values of the RM and the NM to calculate a value which ranges from −∞ to +∞. Positive values indicate better performance of the NM and negative values indicate better performance of the RM, expressed in percentages, while an MCCPI value close to zero indicates that neither model has an advantage.

4. RESULTS AND DISCUSSION

As defined in Table 3, experiments 1 to 5 were performed to determine the optimal set of input variables for modelling MWP, and the results are presented in Table 4. The best performance of the models from experiment 1 is obtained when two time series members are used as inputs, i.e. MWP_{t-1} and MWP_{t-2}. Therefore, the neural model NM1 has two input neurons, and it performs best with 40 hidden neurons to calculate MWP_t. The optimal RM1 model is described with expression (4), where y stands for MWP.
$$ y_t = \pm 0.0477\,y_{t-1}^2 \pm 0.5613\,y_{t-1} \pm 0.0312\,y_{t-2}^2 \pm 0.1473\,y_{t-2} \pm 0.8462 \qquad (4) $$

In expressions (4) to (8) the signs of the individual terms are not legible in the source text and are therefore shown as ±; the coefficient magnitudes and the order of the terms are as given.

The models from experiment 2 perform best when three time series members of SWH are used as input variables, i.e. SWH_t, SWH_{t-1} and SWH_{t-2}, in addition to MWP_{t-1} and MWP_{t-2}, determined in experiment 1. The optimal set of input variables was determined through a model based experiment using the artificial neural network, and the same variables were used to form a regression model, described with (5), where x^{(1)} stands for SWH. The neural network used in the experiment has 40 neurons in the hidden layer, a number that was also determined through the experiments performed.

$$
\begin{aligned}
y_t ={} & \pm 0.0271\,y_{t-1}^2 \pm 0.5189\,y_{t-1} \pm 0.013\,y_{t-2}^2 \pm 0.1495\,y_{t-2} \\
 & \pm 0.032\,\big(x^{(1)}_{t}\big)^2 \pm 0.4524\,x^{(1)}_{t} \pm 0.224\,\big(x^{(1)}_{t-1}\big)^2 \pm 1.0952\,x^{(1)}_{t-1} \\
 & \pm 0.1311\,\big(x^{(1)}_{t-2}\big)^2 \pm 1.0294\,x^{(1)}_{t-2} \pm 0.6407
\end{aligned}
\qquad (5)
$$

In order to improve model performance, another input variable, WSP, was included in experiment 3. The experiments performed on the neural model suggested that 4 time series members should be included in the set of input variables (WSP_t, WSP_{t-1}, WSP_{t-2}, WSP_{t-3}) in addition to the already established members of the input set. All neural models were formed using 40 neurons in the hidden layer. The same variables were used to form a regression model, described with (6), where x^{(2)} stands for WSP.

$$
\begin{aligned}
y_t ={} & \pm 0.0316\,y_{t-1}^2 \pm 0.5118\,y_{t-1} \pm 0.0119\,y_{t-2}^2 \pm 0.1371\,y_{t-2} \\
 & \pm 0.0928\,\big(x^{(1)}_{t}\big)^2 \pm 1.9301\,x^{(1)}_{t} \pm 0.1719\,\big(x^{(1)}_{t-1}\big)^2 \pm 0.0035\,x^{(1)}_{t-1} \pm 0.1403\,\big(x^{(1)}_{t-2}\big)^2 \pm 1.0878\,x^{(1)}_{t-2} \\
 & \pm 0.0028\,\big(x^{(2)}_{t}\big)^2 \pm 0.1286\,x^{(2)}_{t} \pm 0.0041\,\big(x^{(2)}_{t-1}\big)^2 \pm 0.011\,x^{(2)}_{t-1} \pm 0.0002\,\big(x^{(2)}_{t-2}\big)^2 \pm 0.024\,x^{(2)}_{t-2} \\
 & \pm 0.0002\,\big(x^{(2)}_{t-3}\big)^2 \pm 0.0173\,x^{(2)}_{t-3} \pm 0.7487
\end{aligned}
\qquad (6)
$$

For the purpose of experiment 4, the MWP time series was excluded from the input set of variables. The neural network was trained using 40 neurons in the hidden layer, and the regression model was formed based on expression (7).

$$
\begin{aligned}
y_t ={} & \pm 0.0717\,\big(x^{(1)}_{t}\big)^2 \pm 1.7557\,x^{(1)}_{t} \pm 0.216\,\big(x^{(1)}_{t-1}\big)^2 \pm 1.4855\,x^{(1)}_{t-1} \pm 0.0275\,\big(x^{(1)}_{t-2}\big)^2 \pm 0.1383\,x^{(1)}_{t-2} \\
 & \pm 0.0012\,\big(x^{(2)}_{t}\big)^2 \pm 0.1443\,x^{(2)}_{t} \pm 0.0011\,\big(x^{(2)}_{t-1}\big)^2 \pm 0.078\,x^{(2)}_{t-1} \pm 0.0015\,\big(x^{(2)}_{t-2}\big)^2 \pm 0.0565\,x^{(2)}_{t-2} \\
 & \pm 0.0025\,\big(x^{(2)}_{t-3}\big)^2 \pm 0.041\,x^{(2)}_{t-3} \pm 2.6518
\end{aligned}
\qquad (7)
$$

In experiment 5, the input set of variables was further reduced by removing the variable WSP. The neural network was trained using 40 neurons in the hidden layer, and the regression model was formed based on expression (8).

$$
\begin{aligned}
y_t ={} & \pm 0.1295\,\big(x^{(1)}_{t}\big)^2 \pm 0.0192\,x^{(1)}_{t} \pm 0.2088\,\big(x^{(1)}_{t-1}\big)^2 \pm 1.4863\,x^{(1)}_{t-1} \\
 & \pm 0.0122\,\big(x^{(1)}_{t-2}\big)^2 \pm 0.13\,x^{(1)}_{t-2} \pm 2.4289
\end{aligned}
\qquad (8)
$$

Table 4. Numerical evaluation of neural and regression MWP models

Model       NM1      RM1      NM2      RM2      NM3      RM3      NM4      RM4      NM5      RM5
Absolute
  RMSE      0.648    0.684    0.483    0.582    0.394    0.544    0.676    0.776    0.808    0.835
  MAE       0.419    0.438    0.292    0.325    0.208    0.27     0.341    0.418    0.498    0.532
Relative
  PI        0.232    0.144    0.573    0.378    0.716    0.459    0.162    -0.105   -0.198   -0.277
  CE        0.766    0.739    0.87     0.811    0.914    0.835    0.745    0.664    0.636    0.611
  PBIAS     -0.562   -0.95    -1.535   -2.241   -1.604   -1.963   -6.4     -8.202   -7.442   -8.482
  RSR       0.571    0.624    0.388    0.497    0.309    0.456    0.658    0.854    0.893    0.966

Based on the numerical evaluation presented in Table 4, NM1 and RM1 can be regarded as fair representatives of the system, with NM1 evaluated slightly better by all of the numerical measures used. However, Figure 2 shows that both models suffer from a lag effect, which is a serious deficiency.

Figure 2. Models responses comparison to actual values of MWP from experiment 1

NM2 and RM2 are evaluated better than the models from experiment 1, and NM2 is evaluated better than RM2 by all numerical measures. Based on the graphical evaluation shown in Figure 3, NM2 and RM2 fit the actual MWP data quite accurately and mostly without the lag effect, although in some cases the lag effect can still be noted.

Figure 3. Models responses comparison to actual values of MWP from experiment 2

Numerical and graphical evaluation indicate that both NM3 and RM3 benefit from the inclusion of the WSP time series in the set of model inputs, with NM3 producing the better results.
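The input set used in experiment 3 (two preceding MWP values, SWH at lags 0 to 2 and WSP at lags 0 to 3) can be assembled from the three time series as a lagged input matrix. The following is a minimal sketch in plain Python, not the Matlab code used for the simulations; the function name `lagged_inputs`, its arguments and the toy series are hypothetical and serve only to illustrate the indexing.

```python
def lagged_inputs(mwp, swh, wsp, k1=2, k2=3, k3=4):
    """Build input rows and targets for MWP_t from lagged series.

    Each row holds, in this order:
      MWP at lags 1..k1, SWH at lags 0..k2-1, WSP at lags 0..k3-1.
    The target is MWP_t. All series must share the same 6 h time axis.
    """
    if not (len(mwp) == len(swh) == len(wsp)):
        raise ValueError("all series must have the same length")

    # First index for which every required lag is available.
    start = max(k1, k2 - 1, k3 - 1)

    rows, targets = [], []
    for t in range(start, len(mwp)):
        row = [mwp[t - lag] for lag in range(1, k1 + 1)]   # MWP_{t-1}, MWP_{t-2}
        row += [swh[t - lag] for lag in range(k2)]          # SWH_t .. SWH_{t-2}
        row += [wsp[t - lag] for lag in range(k3)]          # WSP_t .. WSP_{t-3}
        rows.append(row)
        targets.append(mwp[t])
    return rows, targets


# Toy series (illustrative values, not data from the paper).
mwp = [4.0, 4.2, 4.5, 4.4, 4.1, 3.9]
swh = [1.0, 1.1, 1.3, 1.2, 1.0, 0.9]
wsp = [5.0, 6.0, 7.0, 6.5, 5.5, 5.0]

inputs, targets = lagged_inputs(mwp, swh, wsp)
print(len(inputs), "rows with", len(inputs[0]), "inputs each")
```

Setting `k1=0` would drop the MWP lags, as in experiment 4, and setting `k3=0` as well would leave only the SWH inputs of experiment 5. The first three samples are lost because WSP_{t-3} is not available for them.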
As Figure 4 shows, the prediction lag remaining after experiment 2 is almost completely removed in experiment 3.

Figure 4. Models responses comparison to actual values of MWP from experiment 3

To examine the impact of the variables added later to the set of inputs, MWP, as the most influential variable, was excluded from the set of input variables in experiment 4. It can also be assumed that the MWP time series at the input is responsible for the lag effect, and experiment 4 was used to investigate that assumption.

Figure 5. Models responses comparison to actual values of MWP from experiment 4

Although the graphical evaluation presented in Figure 5 shows no signs of the lag effect, the numerical evaluation presented in Table 4 shows that NM4 and RM4 are less accurate models of the system. Therefore, it can be concluded that MWP modelling without the lag effect is possible using only the SWH and WSP time series at the input, but that excluding the MWP time series downgrades the overall quality of the models. In a similar way, excluding the WSP variable from the set of input variables also decreases model performance, as can be seen from the numerical evaluation presented in Table 4. Furthermore, NM5 and RM5 are evaluated as unsatisfactory models of the system by the PI and RSR quality measures. On the other hand, the graphical evaluation in Figure 6 shows no sign of the lag effect and a solid fit of the model responses to the actual data.

Figure 6. Models responses comparison to actual values of MWP from experiment 5

Based on the simulation results it can be concluded that MWP can be modelled with data based modelling methods, i.e. regression analysis and artificial neural networks. For the case study presented in this paper, the best results are obtained when an ANN with 9 input variables is used, namely: MWP_{t-1}, MWP_{t-2}, SWH_t, SWH_{t-1}, SWH_{t-2}, WSP_t, WSP_{t-1}, WSP_{t-2}, WSP_{t-3}.
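The comparison coefficient of expression (3) can be evaluated directly from the PI values in Table 4. The sketch below does this in plain Python. It assumes PI_max = 1, the value a persistence index takes for a perfect model; the paper does not state PI_max explicitly, so this is an assumption. The function name `mccpi` is illustrative.

```python
def mccpi(pi_rm, pi_nm, pi_max=1.0):
    """Model comparison coefficient from expression (3).

    pi_rm  -- persistence index of the regression model
    pi_nm  -- persistence index of the neural model
    pi_max -- best attainable PI (assumed to be 1.0 here)
    """
    if pi_nm >= pi_max:
        raise ValueError("pi_nm must be smaller than pi_max")
    return (pi_max - pi_rm) / (pi_max - pi_nm) - 1.0


# PI values copied from Table 4 as (PI of NM, PI of RM) per experiment.
PI_TABLE_4 = {
    1: (0.232, 0.144),
    2: (0.573, 0.378),
    3: (0.716, 0.459),
    4: (0.162, -0.105),
    5: (-0.198, -0.277),
}

for experiment, (pi_nm, pi_rm) in PI_TABLE_4.items():
    print(f"experiment {experiment}: MCCPI = {mccpi(pi_rm, pi_nm):.3f}")
```

For the rounded PI entries of Table 4 this gives roughly 0.11, 0.46, 0.90, 0.32 and 0.07 for experiments 1 to 5, which is close to, but not identical with, the published MCCPI values. The coefficient is zero when both models have the same PI and grows as the neural model approaches PI_max.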
The comparison of the models based on the MCCPI criterion is presented in Table 5 and in Figure 7. Table 5 shows the MCCPI values for experiments 1 to 5 and the number of inputs (Ni) that the models use.

Table 5. NM and RM model performance comparison based on the MCCPI

Experiment   MCCPI   Ni
1            0.12    2
2            0.46    5
3            0.91    9
4            0.32    7
5            0.10    4

The MCCPI values presented in Table 5 suggest that NM1 is 12% more accurate than RM1, where both models use 2 input variables, that NM2 is 46% more accurate than RM2, where both models use 5 input variables, and so on. Based on the MCCPI it can be concluded that the NM is better than the RM in general, and that the advantage of the NM over the RM increases with the number of input variables.

Figure 7. MCCPI values for the experiments 1 to 5

5. CONCLUSION

The purpose of this paper was to investigate the possibility of modelling the mean wave period (MWP) of Adriatic Sea waves using data based modelling methods, i.e. artificial neural networks (ANN) and regression analysis. Based on the results presented in section 4, it can be concluded that data based modelling can be successfully applied to mean wave period modelling of the Adriatic Sea waves, with artificial neural networks producing better results than regression models. Furthermore, the advantage of the neural model over the regression model increased with the number of input variables used. The advantage of the neural network also lies in the simplicity of model formulation, which makes the experiments easier to perform. Therefore, the neural model served as the base model for investigating the optimal number of time series members for each input variable, i.e. MWP, SWH and WSP, and the regression model was only used to validate the neural model performance on the same set of inputs.
For the case study, the best results are obtained when 9 input variables are used, namely: MWP_{t-1}, MWP_{t-2}, SWH_t, SWH_{t-1}, SWH_{t-2}, WSP_t, WSP_{t-1}, WSP_{t-2}, WSP_{t-3}; i.e. two preceding mean wave periods, the current and two preceding significant wave heights, and the current and three preceding wind speeds.

The investigation of the input time series leads to an interesting conclusion. By showing the significance of including a longer time series of the WSP, the experiments revealed the importance of wind duration information for MWP modelling, which is indirectly included through the WSP time series.

Challenges for further research include the investigation of prediction ability and the determination of the prediction horizon. Furthermore, modelling and prediction of other sea state variables are of special interest, as is the extension of the experiments to all of the 40 available points in the Adriatic Sea. The model resulting from that research should provide reliable information for ship response modelling, with the final goal of optimizing routes in heavy seas.

Generally, wave period information, i.e. the corresponding wave frequency and wave length in deep water, is important when designing ships and/or offshore structures in order to avoid resonant rolling or pitching motions. This is done by choosing overall dimensions that do not coincide with the dominant wave lengths from the various directions that will be encountered in service. In this way the natural response frequency of a ship or an offshore structure can be "moved away" from the dominant wave excitation frequencies, thus minimizing undesirable response. The model proposed in this paper serves as a starting point for developing an efficient, simple, real-time decision making tool that could be used for navigation during bad weather in the Adriatic Sea, based on easily measurable data, i.e. wind speed and direction.
Acknowledgments

This work has been supported in part by the Croatian Science Foundation under the project 8658-DATAS at the Faculty of Mechanical Engineering and Naval Architecture, Zagreb University. The company Fugro OCEANOR provided an academic license for using sea state data of the Adriatic Sea under the project DATAS. The authors Marko Katalinić and Luka Mudronja are PhD students at the Faculty of Mechanical Engineering and Naval Architecture and participate in the DATAS project.

REFERENCES

1. Beale, M.H., Hagan, M.T. and Demuth, H.B., 2010. Neural Network Toolbox 7, User's Guide. MathWorks.
2. Cybenko, G., 1989. Approximation by Superpositions of a Sigmoidal Function. Math. Control Signals Systems, 2, pp. 303-314.
3. Gupta, H.V., Sorooshian, S. and Yapo, P.O., 1999. Status of automatic calibration for hydrologic models: Comparison with multilevel expert calibration. Journal of Hydrologic Engineering, 4(2), pp. 135-143.
4. Haddadpour, S., Etemad-Shahidi, A. and Kamranzad, B., 2014. Wave energy forecasting using artificial neural networks in the Caspian Sea. ICE-Maritime Engineering, DOI: 10.1680/maen.13.00004.
5. Katalinić, M., Ćorak, M. and Parunov, J., 2015. Analysis of wave heights and wind speeds in the Adriatic Sea. In: Guedes Soares, C. and Santos, T. (eds.), Maritime Technology and Engineering. London: Taylor & Francis Group, pp. 1389-1394.
6. Matić, P., Golub Medvešek, I. and Perić, T., 2015. System Identification in Difficult Operating Conditions Using Artificial Neural Networks. Transactions on Maritime Science, 4(02), pp. 105-112.
7. Moriasi, D.N., Arnold, J.G., Van Liew, M.W., Bingner, R.L., Harmel, R.D. and Veith, T.L., 2007. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Transactions of the ASABE, 50(3), pp. 885-900.
8. Peres, D.J., Iuppa, C., Cavallaro, L., Cancelliere, A. and Foti, E., 2015. Significant wave height record extension by neural networks and reanalysis wind data. Ocean Modelling, 94, pp. 128-140.
9. Rawlings, J.O., Pantula, S.G. and Dickey, D.A., 2001. Applied regression analysis: a research tool. Springer Science & Business Media.
10. Tabain, T., 1997. Standard wind wave spectrum for the Adriatic Sea revisited. Brodogradnja, (45), pp. 303-313.
11. Zamani, A., Azimian, A., Heemink, A. and Solomatine, D., 2009. Wave height prediction at the Caspian Sea using a data-driven model and ensemble-based data assimilation methods. Journal of Hydroinformatics, 11.2, pp. 154-164 (doi: 10.2166/hydro.2009.043).