Constructing Multilayer Feedforward Neural Networks to Approximate Nonlinear Functions - Examples and Justifications

Jin-Song Pei, School of Civil Engineering and Environmental Science, University of Oklahoma, Norman, OK 73019
Eric C. Mai, School of Civil Engineering and Environmental Science, Honors College, University of Oklahoma, Norman, OK 73019

ABSTRACT

The paper reports on the continued development of a heuristic methodology for designing multilayer feedforward neural networks to model nonlinear functions in engineering mechanics applications. In this study and the previous studies [10, 12, 13, 15, 16] that it builds upon, the authors do not presume to provide a universal method to approximate any arbitrary function; rather, the focus is on developing a procedure that benefits applications in the specific domain of engineering mechanics. This goal is fulfilled by utilizing the governing physics and mathematics of nonlinear functions and the strength of the sigmoidal basis function. A clear procedure for initializing neural networks to model various nonlinear functions commonly seen in engineering mechanics is provided to answer questions regarding neural network architecture and the initial values of weights and biases. Training examples and mathematical insights are presented to demonstrate the rationality and efficiency of the proposed methodology. Future work is also identified.

1 OVERVIEW

The motivations and technical challenges of this study were presented by the authors at IMAC XXIV [12]. The ultimate goal of the authors is to develop a set of detailed guidelines with theoretical justifications for applying data-driven techniques such as neural networks to engineering applications based on (1) the mathematical and physical insights of the problem to be modeled and (2) the capabilities of neural networks in terms of a clear formulation of a linear sum of sigmoidal functions.
The benefits of such an effort are many and include a more constructive approach for neural network initialization, more reliable training performance, and training results with more validity than could be obtained otherwise. To validate and fully develop the proposed neural network initialization methodology, a collection of ten types of nonlinear functions appearing in [1, 18] and presented in Fig. 1 is selected as target functions, and an initialization procedure is to be developed in this study. These nonlinearities represent typical functions encountered in aerospace, mechanical and structural engineering applications.

Figure 1: Ten nonlinear functions commonly seen in engineering mechanics applications and the recommended multilayer feedforward neural network architectures (i.e., prototypes) used to train them: I. Linear; II. Cubic stiffness and more (Prototypes 1a, 1b, 1c); III. Bilinear stiffness and more; IV. Multi-slope (Prototype 2); V. Fractional power; VI. Softening cubic and more (Prototype 3); VII. Clearance (dead space); VIII. Hard saturation (Prototype 1b+1c); IX. Saturation; X. Stiction (Prototype 1b+(-2)). Note that the indicated relationships are not exhaustive.

Although not entirely arbitrary, the functions to be approximated in this study are not limited to nonlinear restoring forces as previously studied [10, 13, 15, 16]. Here, the focus is on approximating basic nonlinear functions that are widely encountered in engineering mechanics applications, such as those seen in stress-strain, moment-curvature, and load-displacement relationships, as well as time histories. This study is focused on memoryless and monotonic functions. Nonlinearities with memory are not treated in this study since they require different types of neural networks (e.g., recurrent neural networks, or multilayer feedforward neural networks with high-dimensional inputs, e.g., [7, 17]).
This study will lay a solid foundation for future studies on these other types of neural networks to build upon. Monotonic nonlinearities are also the focus of this study. Several existing studies, e.g., [3, 9], have analyzed strategies for time-history-like nonlinearities with obvious peaks and valleys; however, there is a gap in the literature on how to approximate ubiquitous monotonic nonlinearities using multilayer feedforward neural networks.

2 PROPOSED INITIALIZATION PROCEDURE, PROTOTYPES, AND VARIANTS

The proposed initialization methodology was briefly introduced in [12]. A self-contained procedure has been developed by the authors in [11], of which this paper offers a condensed introduction. The central aim of this domain-specific neural network initialization methodology is to transform an otherwise ambiguous trial-and-error-based procedure into a clearly defined, near-deterministic procedure that can be easily understood and executed. For a function approximation problem as depicted in Fig. 2(a), this study proposes that three cohesive initialization stages (Stage I: selecting prototypes; Stage II: selecting variants; Stage III: deciding transformations), as outlined in Fig. 2(b), be implemented. This is recommended as a typical initialization procedure when using a feedforward neural network with one hidden layer to approximate a nonlinear function. In detail, the number of hidden nodes follows from the outcome of Stage I, while the values of the weights and biases (i.e., the values of IW, b and LW as shown in Fig. 2(a)) are found through a progressive and iterative procedure consisting of Stages I to III.
Figure 2: (a) Universal approximator and notation used in this study. Note that the terms IW, b and LW follow the notation convention used in the Matlab Neural Network Toolbox [2]. (b) Flow chart illustrating the proposed prototype-based initialization procedure: examine the data to determine its dominating features; Stage I: select the prototype (Prototype 1, 2, 3, 1b+1c, or 1+(-2)) that best corresponds to these features and decide the number of hidden nodes; Stage II: select a variant of the prototype; Stage III: Step 1, decide proportioning and translation if necessary, and Step 2, decide scaling if necessary; then carry out batch-mode training, returning to adjust the transformation, the variant, or the prototype as needed.

For the ten types of nonlinear functions specified in Fig. 1, it is recommended that only three fundamental prototypes be utilized, either individually or combinatorially, for neural network initialization. This finding reveals the versatility and efficiency of the proposed initialization. The key elements of this proposed methodology, prototypes and their variants, are predetermined neural networks that are not obtained from an inverse formulation, i.e., by training on any data set. Instead, they are constructed in advance from a forward formulation (based on either the algebraic or geometric capabilities of linear sums of sigmoidal functions) to capture dominating features of the nonlinear function to be approximated in the specified applications. The construction of Prototype 2, for example, was illustrated graphically in Fig. 1 of [12]. In the previous work [10, 13, 14, 16] that this study is built upon, some prototypes were obtained using various linear sums of a few terms of sigmoidal functions through either algebraic derivations or geometric visualizations.
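In code, the universal approximator of Fig. 2(a) is a short computation per layer. The following minimal sketch (our own variable names; it follows the paper's sign convention p = wx - b from Section 4, rather than the Matlab Toolbox's p = wx + b) evaluates a one-hidden-layer network:

```python
import numpy as np

def sigmoid(p):
    return 1.0 / (1.0 + np.exp(-p))

def feedforward(x, IW, b1, LW, b2):
    """One-hidden-layer universal approximator: y(x) = LW . sigmoid(IW*x - b1) + b2."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    h = sigmoid(np.outer(x, IW) - b1)  # hidden-node outputs, shape (len(x), nh)
    return h @ LW + b2                 # linear sum over hidden nodes

# A sigmoid pair with opposite biases: sigmoid(-2) + sigmoid(2) = 1 at x = 0
y0 = feedforward(0.0, IW=np.array([1.0, 1.0]), b1=np.array([2.0, -2.0]),
                 LW=np.array([1.0, 1.0]), b2=0.0)
```

This sigmoid-pair identity is the mechanism Section 4 later uses to reproduce a constant term exactly.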
[11] further depicts how all three prototypes can be obtained, explains why there are numerous variants for each prototype, and gives the values of IW, b and LW for the selected variants. Fig. 3 illustrates three possible variants of each proposed prototype for a normalized input. With the prototypes and their variants prepared in advance, training a neural network to approximate a specific function within the scope for which these forward exercises are formulated can begin with a matching, selecting and tuning procedure of initialization, reflected respectively in the proposed Stages I, II and III as shown in Fig. 2(b).

Figure 3: Three variant examples (Variants a, b and c) within each of the proposed three main prototypes, plotted over normalized input and output ranges of [-1, 1].

The concept of prototypes and their variants is generic and thus should not be restricted to normalized input and output ranges. In principle, one could determine the values of IW, b and LW for arbitrary input and output ranges. This flexibility, however, could cause confusion and inconsistency and needs to be handled with care for the sake of clarity in implementing the proposed methodology. The approach adopted in this study is therefore to (1) define prototypes and their variants entirely over normalized input (x) and output (y(x)) ranges as shown in Fig. 3, and (2) utilize a separate stage, Stage III, to further transform a selected prototype or variant for a non-normalized input-output situation. In detail, one adjusts the values of the weights and biases obtained from Stages I and II according to the input and output ranges of the training data set, as detailed in [11]. This procedure largely reduces subjective judgement, in sharp contrast with commonly seen random initialization schemes.
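To make the forward construction concrete, the sketch below assembles a two-node, Prototype 2-like variant over the normalized range. The numerical values of IW, b1, LW and b2 here are hypothetical illustrations (the actual variant tables are given in [11]); the point is that the output bias cancels the constant term, leaving an odd, hardening-type initial shape before any training:

```python
import numpy as np

def sigmoid(p):
    return 1.0 / (1.0 + np.exp(-p))

# Hypothetical variant values over the normalized range x in [-1, 1];
# the actual IW, b and LW tables for each variant are tabulated in [11].
IW = np.array([3.0, 3.0])    # input weights
b1 = np.array([2.0, -2.0])   # opposite hidden biases (+b and -b)
LW = np.array([2.0, 2.0])    # layer weights k
b2 = -2.0                    # output bias -k cancels the constant term

def variant(x):
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return sigmoid(np.outer(x, IW) - b1) @ LW + b2

x = np.linspace(-1.0, 1.0, 21)
y = variant(x)
```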
3 TRAINING EXAMPLES

In this section, several training examples given previously in [12] are revisited. No errata or retraining is needed; the only purpose is to better illustrate how to precisely follow the procedure defined in Fig. 2(b) and utilize the prototypes and their variants defined over normalized input and output as shown in Fig. 3. Also note that the same presentation format of training examples as seen in [12] is adopted in this section. Since the Nguyen-Widrow initialization algorithm does not specify the required number of hidden nodes, this critical piece of information is borrowed from the proposed initialization methodology whenever the Nguyen-Widrow initialization is used.

3.1 Direct Adoption of Prototypes

The proposed three variants of Prototype 1 are used directly in Fig. 4. No transformations of these variants are needed since the input is already normalized, and the output has the same order of magnitude as that of the variants presented in Fig. 3.

Figure 4: An example of using three variants of Prototype 1 (with three hidden nodes) to train a fractional power function, y = x^(1/3). The panels compare the initial neural networks, the trained neural networks, and the training performance (MSE versus epoch) for the Nguyen-Widrow initialization and for the proposed Prototype 1 variants 1a, 1b and 1c. The target function is in magenta, while the curves in blue with different line thicknesses show four random options using the Nguyen-Widrow initialization [8]. Note that some of the training stopped prematurely.

In Fig. 3 of [12], which was presented at IMAC XXIV, however, a direct adoption of the three variants of Prototype 2 (as defined in this study) is prevented due to the increased input range of [-10, 10].
To carry out Step 1 under Stage III, the values of the weights IW can be scaled down by a factor of 10 (except for the two terms corresponding to the constant term; see [11] for more details). For non-normalized inputs in general, x̄ = C_x x, and one can proportion the derived prototypes and their variants by "stretching" or "squeezing" the function approximated by the initial neural network along the x-direction in inverse relation to the non-normalized input. Quantitatively, the transformed value of IW, w̄, is obtained from w x - b = w̄ x̄ - b, i.e., w̄ = w / C_x. In addition to these training exercises, target functions such as linear (Nonlinearity Type I), a sine wave over [-π/2, π/2] (Type VI), and hard saturation (Type VIII), shown in Fig. 1, have been trained successfully using the proposed initialization methodology. In all these exercises, the proposed prototypes and their variants are either adopted directly or after proportioning in Step 1 of Stage III.

3.2 Combining Prototypes

The first examples of the usefulness of Prototype 3, as shown previously in Figs. 4 and 5 of [12], exhibit a decomposition idea that can be used to handle more complex functions. In Fig. 5(a), the same swept sine waveform from Figs. 4 and 5 of [12] is spatially partitioned into three individual components/cycles, each of which can be approximated independently using Prototype 3 after some detailed treatment under the Stage III transformation. In particular, the center of each cycle needs to be captured in the initialization through translation (i.e., adjusting the value of the bias, b), while the non-normalized input range needs to be taken into account through proportioning (i.e., scaling the value of the weights, IW). Both translation and proportioning take place during Step 1 of Stage III. An illustration of a neural network with six hidden nodes has been presented in Fig. 5(a); the training results were previously presented in Fig. 5 of [12].
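Both Step 1 operations reduce to a change of variables that leaves the sigmoid argument unchanged. The sketch below (with illustrative values of w, b and of the component center and half-width, none taken from [11]) verifies that proportioning the weight and translating the bias exactly reproduce a normalized prototype term on a non-normalized input range:

```python
import numpy as np

def sigmoid(p):
    return 1.0 / (1.0 + np.exp(-p))

# Illustrative prototype term defined over the normalized input x in [-1, 1]
w, b = 4.0, 0.5

# Target component centered at c with half-width s, i.e., xbar = c + s*x
c, s = 5.0, 2.5
w_bar = w / s            # proportioning: scale IW in inverse relation to the range
b_bar = b + w_bar * c    # translation: shift the bias to the new center

x = np.linspace(-1.0, 1.0, 50)   # normalized input
xbar = c + s * x                 # non-normalized input over [2.5, 7.5]
# w_bar*xbar - b_bar == w*x - b, so the prototype shape is exactly reproduced
```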
Figure 5: Decomposing (a) a swept sine and (b) a multi-slope function into a summation of components that can be approximated directly with the proposed prototypes.

A multi-slope nonlinearity is approximated to further illustrate the use of a combinatorial prototype. The decomposition idea is illustrated in Fig. 5(b). A step-by-step evolution of several initialization options obtained at Steps 1 and 2 and their training results are presented in Fig. 6(b). Note that multiple options exist for the training of this nonlinearity. The presented training results are not exhaustive; one can utilize the proposed Stage III to generate and refine further options. Also note that the legend uses the nomenclature defined in [11]. The idea of decomposition is very useful in (1) handling numerous types of nonlinearities that are more complex than those which can be approximated directly by individual prototypes, and (2) generalizing the solution from one-variable to two-variable functions, especially when dealing with two uncoupled variables. For example, a softening Duffing oscillator from [4] is selected, where the force-state mapping can be applied in the formulation and displacement and velocity are uncoupled. Fig. 7 presents the training results using both the Nguyen-Widrow and the proposed initialization, each with two options. It can be seen that the proposed initialization is more successful than the Nguyen-Widrow algorithm in approximating this function even when only five nodes are used.

3.3 Approximating Piece-Wise Unsymmetrical Functions

Although the proposed prototypes are derived to approximate symmetrical and smooth nonlinearities, e.g., Fig.
1 in [12], these prototypes have shown the ability to be trained and to converge well to piecewise unsymmetrical nonlinearities over the specified input range, as revealed in Fig. 8 of [12]. Approximating these nonlinearities is of great practical significance. First, they represent experimental phenomena often encountered in the practice of engineering mechanics, such as concrete in compression and clearance (or dead space) joint behavior. Second, these situations involve a C^1 discontinuity, where (1) polynomial fitting normally cannot perform as efficiently and (2) the Fourier series exhibits nonuniform convergence (the so-called Gibbs phenomenon).

Figure 6: An example of combining Prototypes 1b and 2a to train a multi-slope function. The target function is in magenta, while the curves in blue with different line thicknesses show four random options using the Nguyen-Widrow initialization [8]. Note that both Steps 1 and 2 were used to generate possible options for the initialization.

Figure 7: Training results of a softening Duffing nonlinearity in [4] based on two options using the Nguyen-Widrow algorithm and two other options using the proposed initialization methodology. All four trainings use neural networks with five hidden nodes. The target function is in black.

An idealized function typical for concrete in compression, a parabola joined to a horizontal line at its vertex, is also approximated. Fig. 8 shows the training results using both the Nguyen-Widrow algorithm and the proposed initialization methodology. It can be seen that the joint is offset both horizontally and vertically. The values of the weights and biases, derived from the proposed Stage III transformation, are detailed in [11]. As in Fig. 6, multiple options for the initialization exist following the proposed methodology; those presented are merely some possibilities.
Figure 8: An example of using Prototype 2, Variant a (with four hidden nodes) to approximate an idealized piecewise unsymmetrical nonlinearity with an offset that is typical for concrete in compression. The three proposed options use Prototype 2, Variant a, transformed at Steps 2 and 1 of Stage III. The target function is in magenta, while the curves in blue with different line thicknesses show four random options using the Nguyen-Widrow initialization [8]. Note that Steps 2 and 1 were gone through individually to generate three possible options for the initialization.

4 JUSTIFICATIONS AND MATHEMATICAL INSIGHTS

A qualitative justification for applying a prototype-based approach for greater success in neural network training can be found in the balance between global and local search involved in training. Ideally, training neural networks in function approximation should belong to the "global search" category, finding global minima of error functions. However, currently employed training techniques are normally only "local search" tools [7]. Selecting a good initial point for neural network training is therefore critical, since the training process will normally result in trained values that remain in the neighborhood of their initial values. If domain knowledge or any other insight into the function to be approximated can be used to inform the neural network initialization, then the training is more likely to converge to the global minimum (instead of merely a local minimum), making the trained neural network more accurate and meaningful.
This is the guiding philosophy of the neural network initialization methodology proposed in this paper and its associated previous work [10, 13, 14, 16]. In addition to the graphical illustration in Fig. 3, a quantitative exercise is presented in this study to offer some insights into the construction of Prototypes 2 and 3. For the convenience of discussion, the sigmoidal function S(p) = 1/(1 + e^{-p}) is denoted equivalently as σ(w, x, b) = 1/(1 + e^{-(wx-b)}), where p = wx - b, for one-variable function approximation. The notation h represents the output of a hidden node. The superscripts in <> denote the prototype ID, and the subscripts refer to the serial number of the hidden nodes.

Prototype 2: Summing two defined sigmoidal terms

To understand Prototype 2, an explanation can be provided that is similar to the derivation of the approximation of a cubic power in [10, 16]. Two sigmoidal functions are chosen, σ₁^{<2>} = σ(w, x, b) and σ₂^{<2>} = σ(w, x, -b), where b ≠ 0. The Taylor series expansions of both functions at the origin x = 0 up to the third power can be written as:

\sigma_1^{<2>} = \frac{1}{1+e^{b}} + \frac{w e^{b}}{(1+e^{b})^{2}} x + \frac{1}{2!}\frac{w^{2} e^{b}(-1+e^{b})}{(1+e^{b})^{3}} x^{2} + \frac{1}{3!}\frac{w^{3} e^{b}(1-4e^{b}+e^{2b})}{(1+e^{b})^{4}} x^{3} + \cdots

\sigma_2^{<2>} = \frac{1}{1+e^{-b}} + \frac{w e^{-b}}{(1+e^{-b})^{2}} x + \frac{1}{2!}\frac{w^{2} e^{-b}(-1+e^{-b})}{(1+e^{-b})^{3}} x^{2} + \frac{1}{3!}\frac{w^{3} e^{-b}(1-4e^{-b}+e^{-2b})}{(1+e^{-b})^{4}} x^{3} + \cdots

The sum of the above two functions leaves one with the following (referring to [10, 16] for the vanishing terms):

\sigma_1^{<2>} + \sigma_2^{<2>} = 1 + \frac{2 w e^{b}}{(1+e^{b})^{2}} x + \frac{1}{3}\frac{w^{3} e^{b}(1-4e^{b}+e^{2b})}{(1+e^{b})^{4}} x^{3} + \cdots   (1)

Prototype 2 and its variants can be considered as h₁^{<2>} + h₂^{<2>} = k^{<2>} σ₁^{<2>} + k^{<2>} σ₂^{<2>} - k^{<2>}, where k^{<2>} is equal to LW. The constant term -k^{<2>} can be approximated with no error using two defined sigmoidal terms [10, 16]. Based on Eq. (1), it can be seen that Prototype 2 and its variants mimic sums of odd-power terms of x, i.e., hardening types of nonlinearities.
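Eq. (1) is easy to check numerically. The sketch below (arbitrary illustrative values of w and b) confirms that the even-power terms of the sigmoid pair cancel and that the pair tracks the constant-plus-odd-power expansion near the origin:

```python
import numpy as np

def sigmoid(p):
    return 1.0 / (1.0 + np.exp(-p))

w, b = 1.0, 2.0
x = np.linspace(-0.5, 0.5, 101)

# sigma1 + sigma2 with biases +b and -b, i.e., p = w*x - b and p = w*x + b
pair_sum = sigmoid(w * x - b) + sigmoid(w * x + b)

# Right-hand side of Eq. (1): constant + linear + cubic terms
eb = np.exp(b)
c1 = 2.0 * w * eb / (1.0 + eb) ** 2
c3 = (w ** 3 / 3.0) * eb * (1.0 - 4.0 * eb + eb ** 2) / (1.0 + eb) ** 4
series = 1.0 + c1 * x + c3 * x ** 3

err = np.max(np.abs(pair_sum - series))  # residual dominated by the x^5 term
```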
Prototype 3: Subtracting one defined sigmoidal term from the other

Consider two sigmoidal functions that share the same center. Here the center refers to the scaled bias b₀ in the sigmoidal variable p = wx - b = w(x - b₀) = w x̄. The rest of the argument can either follow the idea of using the Taylor series expansion of a sigmoidal function, or it can make use of the central difference method to approximate derivatives of the sigmoidal function as proposed in [6] and adopted in [5]. The latter is utilized in this study. Since the following discussion can be conveniently generalized to the case b₀ ≠ 0 by replacing x with x̄, consider one sigmoidal term with b₀ = 0, i.e.,

\sigma(w, x, 0) = \frac{1}{1+e^{-wx}}   (2)

It can be further derived that:

\frac{\partial\sigma}{\partial w}(w, x, 0) = \frac{x e^{-wx}}{(1+e^{-wx})^{2}}   (3)

This function ∂σ/∂w(w, x, 0) can be plotted versus x as shown in Fig. 9 and can demonstrate various functional shapes, including antisymmetrical wavy forms, depending on the value of w.

Figure 9: Understanding Prototype 3: the function xe^{-wx}/(1+e^{-wx})² plotted versus x over [-1, 1] for w = 0.1, 1, 2, 3, 4, 5 and 10.

Note that the first derivative in Eq. (3) can be approximated using the central difference method as follows:

\frac{\partial\sigma}{\partial w}(w, x, 0) \approx \frac{\sigma(w+\Delta w, x, 0) - \sigma(w-\Delta w, x, 0)}{2\Delta w}   (4)

where Δw is a user-defined value that controls the approximation accuracy. If one assigns

\sigma_1^{<3>} = \sigma(w+\Delta w, x, 0), \qquad \sigma_2^{<3>} = \sigma(w-\Delta w, x, 0)

then h₁^{<3>} + h₂^{<3>} = \frac{k}{2\Delta w} σ₁^{<3>} - \frac{k}{2\Delta w} σ₂^{<3>} can be used to represent Prototype 3 and its variants. Based on Eq. (4) and Fig. 9, it can be seen that Prototype 3 and its variants mimic antisymmetrical wavy forms. By adjusting the weight w, this linear sum can be dilated into a straight line as a special case of Prototype 3, which has been proven using the Taylor series expansion in [10, 16].
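Eqs. (3) and (4) can likewise be checked numerically. In the sketch below (arbitrary w and a small user-chosen Δw), the scaled difference of the two Prototype 3 sigmoidal terms reproduces the antisymmetrical wavy shape of Fig. 9:

```python
import numpy as np

def sigmoid(p):
    return 1.0 / (1.0 + np.exp(-p))

w, dw = 3.0, 1e-4        # weight and user-defined perturbation (Delta w)
x = np.linspace(-1.0, 1.0, 201)

# Eq. (3): exact partial derivative of sigma with respect to w at b0 = 0
exact = x * np.exp(-w * x) / (1.0 + np.exp(-w * x)) ** 2

# Eq. (4): central-difference approximation from the two Prototype 3 terms
approx = (sigmoid((w + dw) * x) - sigmoid((w - dw) * x)) / (2.0 * dw)
```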
5 CONCLUSION

Neural networks can be highly versatile and efficient in adapting to data when approximating nonlinear functions; however, these qualities can be achieved only if the neural networks are initialized properly, as constructively verified in this study. A structured and detailed initialization procedure has been presented as the continued development of a heuristic prototype-based initialization approach for multilayer feedforward neural networks proposed in previous studies [10, 12, 13, 15, 16]. A range of typical nonlinear functions used in engineering mechanics applications has been targeted, and training performances have been presented and compared with those of neural networks trained using the Nguyen-Widrow initialization algorithm. The proposed initialization methodology has shown satisfactory versatility in addition to being a constructive method. Technical challenges have been identified, and solution strategies have been provided in [11]. In particular, adding more nodes in a transparent and rational manner is being pursued by the authors and their co-author.

6 ACKNOWLEDGEMENT

The Junior Faculty Research Program awarded to the first author by Dr. T.H. Lee Williams, the Vice President for Research at the University of Oklahoma, is greatly appreciated. Funding from the Undergraduate Research Opportunities Program (UROP) awarded to the second author is also greatly appreciated.

References

[1] D.E. Adams and R.J. Allemang. Non-linear Vibrations class notes, Course 20-263-781, 2000.
[2] M.T. Hagan, H.B. Demuth, and M. Beale. Neural Network Design. PWS Publishing Company, 1995.
[3] A. Lapedes and R. Farber. In Neural Information Processing Systems, D. Anderson (ed.), American Institute of Physics, New York, pages 442–456, 1988.
[4] S.F. Masri, J.P. Caffrey, T.K. Caughey, A.W. Smyth, and A.G. Chassiakos. Identification of the state equation in complex non-linear systems. International Journal of Non-Linear Mechanics, 39:1111–1127, 2004.
[5] A.J. Meade Jr. Regularization of a programmed recurrent artificial neural network. Journal of Guidance, Control, and Dynamics, 2003.
[6] H.N. Mhaskar. Neural networks for optimal approximation of smooth and analytic functions. Neural Computation, 8:164–177, 1995.
[7] O. Nelles. Nonlinear System Identification: From Classical Approaches to Neural Networks and Fuzzy Models. Springer Verlag, 2000.
[8] D. Nguyen and B. Widrow. Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. In Proceedings of the IJCNN, volume III, pages 21–26, July 1990.
[9] S. Osowski. New approach to selection of initial values of weights in neural function approximation. Electronics Letters, 29(3):313–315, 1993.
[10] J.S. Pei. Parametric and Nonparametric Identification of Nonlinear Systems. Ph.D. dissertation, Columbia University, 2001.
[11] J.S. Pei and E.C. Mai. Constructing multilayer feedforward neural networks to approximate nonlinear functions in engineering mechanics applications. ASME Journal of Applied Mechanics, 2006. Under review.
[12] J.S. Pei and E.C. Mai. Neural network initialization for modeling nonlinear functions in engineering mechanics. In Proceedings of the 24th International Modal Analysis Conference (IMAC XXIV), 2006.
[13] J.S. Pei and A.W. Smyth. A new approach to design multilayer feedforward neural network architecture in modeling nonlinear restoring forces: Part I - formulation. ASCE Journal of Engineering Mechanics, December 2006. To appear.
[14] J.S. Pei and A.W. Smyth. A new approach to design multilayer feedforward neural network architecture in modeling nonlinear restoring forces: Part II - applications. ASCE Journal of Engineering Mechanics, December 2006. To appear.
[15] J.S. Pei, A.W. Smyth, and E.B. Kosmatopoulos. Analysis and modification of Volterra/Wiener neural networks for identification of nonlinear hysteretic dynamic systems. Journal of Sound and Vibration, 275(3-5):693–718, 2004.
[16] J.S. Pei, J.P. Wright, and A.W. Smyth. Mapping polynomial fitting into feedforward neural networks for modeling nonlinear dynamic systems and beyond. Computer Methods in Applied Mechanics and Engineering, 194(42-44):4481–4505, 2005.
[17] I.W. Sandberg, J.T. Lo, C.L. Fancourt, J.C. Principe, S. Katagiri, and S. Haykin. Nonlinear Dynamical Systems: Feedforward Neural Network Perspectives. Wiley-Interscience, 2001.
[18] K. Worden and G.R. Tomlinson. Nonlinearity in Structural Dynamics: Detection, Identification and Modelling. Institute of Physics Publishing, 2001.