Identification of complex nonlinear processes based on fuzzy decomposition of the steady state space

Aswin N. Venkat, P. Vijaysai, Ravindra D. Gudi*

Department of Chemical Engineering, Indian Institute of Technology-Bombay, Powai, Mumbai-400076, India

Abstract

A methodology for identification and control of complex nonlinear plants using a multi-model approach is presented in this paper. The proposed methodology is based on fuzzy decomposition of the steady state map. It is shown that such a decomposition strategy facilitates the design of input perturbation signals and helps in identifying linear or simple nonlinear models for each local region. A composition strategy to aggregate the local model predictions is proposed and shown to give excellent cross validation as well as to facilitate smooth switching between the local models. A novel control scheme that is based on the multi-model strategy is proposed. The practicality of the identification and control scheme presented here is demonstrated by application to the continuous fermenter of Henson and Seborg (M.A. Henson, D.E. Seborg, Nonlinear control strategies for continuous fermenter, in: Proceedings of 1990 American Control Conference, San Diego, 1990), which exhibits severe nonlinearities and gain directionality changes.

Keywords: Multi-model Identification; Steady state decomposition; Fuzzy segregation

1. Introduction

Nonlinear processes are invariably difficult to model and hence have traditionally been represented by a model linearized around some operational steady state. A large number of chemical and biological processes are characterized by a wide and varied set of input dependent dynamics that cannot be handled within the linear model identification framework. Changes in the directionality of the gain and in the nature of the dynamics itself, for different ranges of inputs, pose a real challenge to identification and control of such nonlinear systems.
Different methodologies have been employed by researchers to mimic nonlinear behavior. An approach that has been extensively used is the development of an overall nonlinear model that performs satisfactorily through the entire operating range. Neural Networks (NN) and other nonlinear structures have been extensively used for the purpose. One of the disadvantages of such an approach is the difficulty in arriving at a suitable model/network structure and the invariably complex structure thereof. Further, a majority of these approaches are essentially black box models, which to a good extent do not permit transparency between the plant dynamics and the modeling procedures. An alternate methodology is the multiple-model based framework that allows for partial transparency of process knowledge to the modeling procedure. In the multiple-model framework, the main challenges in the development of dynamic models are:

(i) Division of the operating space into operating regimes. Johansen and Foss [1] have proposed a strategy for decomposition of the nonlinear space into multiple linear structures utilizing a priori process information. An operating regime decomposition strategy based on heuristics and local model validity functions has been proposed by Johansen and Foss [2]. Ogunnaike and Ray [3] have considered a steady state map and classified operating regimes based on a priori knowledge of the input and output ranges on the steady state map. They postulate appropriate controllers based on an analysis of the RGA within each operating regime. While analysis of the steady state map, as has been done by Ogunnaike and Ray [3], does give an indication of the steady state gain variation, inclusion of dynamics in the decomposition step may characterize the variation in dynamics more accurately.
Towards this, Venkat and Gudi [4] have proposed a strategy for decomposition of the operating space based on an analysis of the dynamic input–output data obtained by plant perturbation. However, for complex plants having varied regions of operation (including gain sign changes and variations in time constants), perturbation design is not an easy task and needs further analysis.

(ii) Construction of local dynamic models. Segregation of the operating regions as in (i) divides the input–output space into a collection of local regions. The next step is the development of dynamic models within each region. Kuipers and Astrom [5] have proposed a multiple-model strategy wherein interpolation between local model parameters is employed. In Venkat and Gudi [6], the problem of local model structure selection was solved as a combinatorial exercise with the goal of minimum local cross validation error.

(iii) Aggregation of local models through a suitable switching strategy. For the deployment of these local models in closed loop operation, the local model predictions need to be aggregated to yield a representative prediction in the region of operation. A number of switching and aggregation strategies have been proposed. Hard switching strategies work by invoking only the most representative model/controller at any given time. Such a crisp strategy leads to deterioration in performance during transitions between local regions. Several model/controller scheduling techniques have been proposed in recent times for use within the multiple-model based framework. Schott and Bequette [7] have used an estimator based scheduling algorithm to weigh local models towards control of chemical reactors. Local model performance indices to select local controllers have been outlined by Narendra et al. [8]. A supervisor based hierarchical technique for local controller selection based on virtual control loop feedback error has been proposed by Kordon et al. [9].
The objective of this paper is to analyze the above issues related to multiple model control for nonlinear systems by using fuzzy decomposition and aggregation techniques. Unlike in our earlier approach [4], the focus here is on decomposition of the steady state space. In a typical operating plant having varying regions of operation (such as polymerization for the production of different polymer grades), steady state information corresponding to each region is typically available from the plant. Thus, based on the proposition that the steady state map characterizes the variation in the operating space at least in terms of the gain changes (Ogunnaike and Ray [3]), we propose a decomposition of the steady state map using fuzzy segregation. Towards the latter decomposition, we propose a modification of the existing segregation methodologies and develop a new iterative segregation scheme that includes conditions on the linearity and parsimony of the resulting local models. For the development of dynamic models in each local region that results from the above segregation, we analyze issues related to perturbation within each operating region for the development of simpler linear/nonlinear models. Towards obtaining accurate predictions over the entire operating range, we propose a method of generating composite predictions based on fuzzy aggregation, which ensures smoother transitions between operating regions. For control, we propose a multiple model control scheme that employs the local models through fuzzy switching. The superiority of fuzzy switching over crisp controller switching in terms of control performance is also highlighted.
Thus, the merits of the proposed approach over those proposed in the literature are (i) fuzzy classification of the steady state information to decompose the relationships with a view to better input signal design, (ii) suitable division of the relationships based on a linearity index to generate local linear models and (iii) a fuzzy composition based control law in a model predictive control framework. We demonstrate the validity of the identification and control methodologies proposed here by application to the nonlinear fermenter problem of Henson and Seborg [10], which is characterized by severe nonlinearities and gain direction changes.

The organization of the paper is as follows. Section 2 provides an overview of the fuzzy clustering techniques employed in this work. The proposed cluster validity measure based on minimum linear model reconstruction error is also detailed. In Section 3, an analysis of a key variable selection technique that has been used for model structure selection is presented. Two indices to ascertain linearity within local regions are also discussed. A detailed discussion of the proposed methodology is provided in Section 4. Section 5 describes the relevant issues in the implementation of the multiple model based MPC framework towards achieving good control of complex nonlinear systems. The proposed identification and control strategies are tested on the nonlinear fermenter of Henson and Seborg [10]. The case study and the results obtained employing the proposed schemes are detailed in Section 6. A discussion of the proposed technique and the scope for future work is outlined in Section 7.

2. Fuzzy clustering

Clustering enables division of the complex nonlinear regions into simpler subspaces. Incorporation of fuzziness into the clustering methodology allows for overlap of subspaces and thereby smoother transitions between operating regions. Fig. 1(a) shows the steady state map for a certain process.
From an examination of the figure it is evident that there are three characteristic and distinct divisions based on steady state gains. A point in such a map belongs to one and only one uniform-gain operating region. Fig. 1(b) shows an alternate steady state map with two operating regions based on uniform gain. For this case it would be incorrect to classify intermediate points crisply to either region. A transitional point in such a map would belong to both regions, albeit to varying extents. By allowing a point in the steady state map to belong to more than one region, we introduce the concept of fuzziness within the clustering scheme. This enables smoother transitions between operating regions.

Consider the set $X$ (matrix) of feature points (in our case, the input–output steady state data points) for a process, where each feature point lies in $\mathbb{R}^n$. A typical structure for $X$ can be written as

$$X = \begin{bmatrix} \tilde{U} \\ y(\tilde{U}) \end{bmatrix}_{n \times N} \tag{1}$$

where $\tilde{U}$ represents the set of steady state inputs and $y(\tilde{U})$ is the steady state output corresponding to $\tilde{U}$. Let $c$ be the number of clusters and $m$ be the fuzzy exponent (defined later), where $m > 1$. Partitioning the data, in the $n$-dimensional space, into these $c$ clusters places feature points that have similar gains close to each other within a cluster. Often the transition between clusters is not crisp. Let $\mu_{ij}$ therefore denote the membership of the $j$th feature point in $X$ to the $i$th cluster and let $V_k$ denote the prototype point for the $k$th cluster. A fuzzy partition is characterized by the following equations:

$$\mu_{ij} \in [0, 1] \tag{2}$$

$$\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j = 1, 2, \ldots, N \tag{3}$$

$$0 < \sum_{j=1}^{N} \mu_{ij} < N \quad \forall i = 1, 2, \ldots, c \tag{4}$$

The partitioning involves the minimization of an objective function $J$ in the space of $M_i$ and $\mu_{ij}$:

$$J = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m \, \| X_j - V_i \|^2 \tag{5}$$

where

$$\| X_j - V_i \|^2 = (X_j - V_i)^T M_i (X_j - V_i) \tag{6}$$

$X_j$ represents the $j$th feature vector and $M_i$ the norm matrix that characterizes the orientation of the clusters.
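As a small illustration of Eqs. (2)–(5), the sketch below checks whether a membership matrix is a valid fuzzy partition and evaluates the objective $J$. It assumes identity norm matrices ($M_i = I$, i.e. a Euclidean norm) purely for simplicity. The function names and the three-point data set are hypothetical and are not taken from the paper.

```python
def is_fuzzy_partition(mu, tol=1e-9):
    """Check Eqs. (2)-(4) for mu[i][j], the membership of point j in cluster i."""
    c, n_points = len(mu), len(mu[0])
    in_unit_interval = all(0.0 <= mu[i][j] <= 1.0
                           for i in range(c) for j in range(n_points))
    columns_sum_to_one = all(abs(sum(mu[i][j] for i in range(c)) - 1.0) < tol
                             for j in range(n_points))
    no_empty_or_full_cluster = all(0.0 < sum(mu[i]) < n_points for i in range(c))
    return in_unit_interval and columns_sum_to_one and no_empty_or_full_cluster


def objective(mu, points, centers, m=2.0):
    """Evaluate J of Eq. (5), assuming M_i = I (Euclidean norm) for every cluster."""
    total = 0.0
    for i, center in enumerate(centers):
        for j, point in enumerate(points):
            dist2 = sum((x - v) ** 2 for x, v in zip(point, center))
            total += (mu[i][j] ** m) * dist2
    return total


# Hypothetical steady state points (u, y) and two cluster prototypes.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
centers = [(0.0, 0.0), (2.0, 2.0)]
fuzzy = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]   # middle point shared equally
crisp = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # middle point forced into the first cluster

J_fuzzy = objective(fuzzy, points, centers)
J_crisp = objective(crisp, points, centers)
```

For this toy data set the shared membership of the transitional point gives a lower objective value than the crisp assignment, which is the behavior the fuzzy partition is meant to capture.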
The minimization of $J$ yields a partition of the data with $V_i$, $i = 1, \ldots, c$, as the cluster centers. It also produces a matrix of memberships of each feature vector to each cluster. A number of fuzzy clustering algorithms are available in the literature [11,12]. Two parameters critically affect the nature of clustering. The first is the norm employed for partitioning. We have employed the Gustafson–Kessel (G–K) algorithm [13] for adaptive norm based segregation. The alternate fixed norm based clustering [14] forces the clusters into characteristic norm based topographies, even if such a structure is not present in the data. For example, a Euclidean norm based clustering induces spherical cluster topographies irrespective of the actual distribution of the data. Employing an adaptive norm based clustering methodology enables the cluster topology to vary depending on the data spread. The expression for $M_i$ in such a case can be analytically derived [14,15] as

$$M_i = \left( \rho_i \det(F_i) \right)^{1/p} F_i^{-1} \tag{7}$$

where

$$F_i = \frac{\sum_{j=1}^{N} (\mu_{ij})^m (Z_j - V_i)(Z_j - V_i)^T}{\sum_{j=1}^{N} (\mu_{ij})^m} \tag{8}$$

and $\rho_i$ is the volume constraint in the $i$th cluster and $Z$ is the data vector.

Fig. 1. Division of the steady state map into local regions: (a) crisp division; (b) fuzzy division.

The other parameter, the fuzziness exponent $m$, also significantly affects the nature of the partition. As $m$ approaches 1 from above, the partition becomes crisp. As $m \to \infty$ the partition becomes totally fuzzy. In this limit, the memberships are all equal with a value of $1/c$, i.e. all points belong to all the clusters with an equal membership. An overview of the G–K algorithm is provided in Table 1(a).

2.1. Minimum linear model reconstruction error

Determination of the optimal number of divisions of the operating space marks an important step in the segregation methodology.
Dividing the operating space into a large number of local regions may represent the behavior more accurately but is a non-parsimonious representation; a very small number of local regions, however, may be inadequate to represent the overall process. To aid the determination of the number of divisions necessary, two cluster validity measures, namely the Fuzzy Hypervolume [16] and the Average within cluster distance [17], have been proposed. The primary thrust of the modeling philosophy discussed here is to construct locally linear models whenever possible. The clustering techniques discussed here are data driven and as such do not guarantee linearity within each subspace. In this section, we propose a new cluster validity measure that indicates the number of clusters into which the operating space has to be divided such that the error due to linear modeling is minimum. Thus, at the end of the clustering/decomposition step, it is expected that the relationships between points within each cluster can be modeled by linear or simpler nonlinear structures.

Let $n$ be the dimension of the input(s)–output space. A typical feature vector (say the $j$th set, $j = 1, 2, \ldots, N$) is represented as

$$Z_j = \begin{bmatrix} U_1^j \\ \vdots \\ U_{n-1}^j \\ Y^j \end{bmatrix}_{n \times 1} = \begin{bmatrix} Z_j^1 \\ \vdots \\ Z_j^{n-1} \\ Z_j^n \end{bmatrix}_{n \times 1} \tag{9}$$

Let $Z_j^n$ denote the output belonging to the $j$th set, namely $Y^j$, and let $Z_j^{1:(n-1)}$ denote the set of inputs resulting in the output $Z_j^n$, i.e. the vector $[U_1^j \; U_2^j \; \cdots \; U_{n-1}^j]^T$. In other words, the vector $Z_j$ is the representation of the steady state output $Y$ with respect to each of the inputs $U$, and the steady state relationship between the inputs and the output is given by

$$Z_j^n = Z_j^{1:(n-1)} \theta \tag{10}$$

where $\theta$ is the parameter vector for the relationship. Consider the objective function

$$J = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m (Z_j - V_i)^T M_i (Z_j - V_i) + \sum_{i=1}^{c} \sum_{j=1}^{N} \beta_i \left( Z_j^n - Z_j^{1:(n-1)} \theta_i \right)^2 \tag{11}$$

where $\theta_i$ is the parameter vector for the $i$th cluster. The first term in the objective function is identical to the distance minimization term in the adaptive clustering methodology discussed earlier in Section 2. The second term represents a weighted sum of the errors that result from the linear model assumption in each local region. For linear modeling at this step, overlap between local regions is not allowed and only those points that have the highest membership to a particular cluster are considered for construction of the local linear models. The expression for $\beta_i$ can thus be written as

$$\beta_i = \begin{cases} 1, & \text{if } \mu_{ij} > \mu_{kj} \quad \forall k = 1, 2, \ldots, c,\ k \neq i \\ 0, & \text{otherwise} \end{cases} \tag{12}$$

Further, noting that the linear model has to satisfy the conditions at the center of each cluster, the following equation becomes mandatory:

$$\theta_i = \left[ \left( V_i^{1:(n-1)} \right)^T V_i^{1:(n-1)} \right]^{-1} V_i^{1:(n-1)} V_i^n, \qquad \text{where } V_i = \frac{\sum_{j=1}^{N} (\mu_{ij})^m Z_j}{\sum_{j=1}^{N} (\mu_{ij})^m} \tag{13}$$

The constraint set for the objective function in (11) comprises the following equations [13]:

$$\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j = 1, 2, \ldots, N \tag{14}$$

$$\det(M_i) = \rho_i \quad \forall i = 1, 2, \ldots, c \tag{15}$$

$$0 < \mu_{ij} < 1 \quad \forall i = 1, 2, \ldots, c; \; j = 1, 2, \ldots, N \tag{16}$$

$$0 < \sum_{j=1}^{N} (\mu_{ij})^m < N \quad \forall i = 1, 2, \ldots, c \tag{17}$$

Numerical solution of the optimization problem in the space of $\mu_{ij}$, $V_i$ and $M_i$ is complex, as the number of constraints depends directly on the number of data points. For a data set of size $N$ and number of clusters $c$, there are $c + N$ equality constraints and $2c + 2cN$ inequality constraints. Below, we propose a scheme for the minimization of the objective function (11). The step-by-step procedure is provided in Table 1(b).

While the above optimization can be carried out using constrained minimization techniques as in Table 1(b), an alternative technique could be to view the two terms appearing in the objective function $J$ separately. As mentioned in the earlier section, the minimization of the first term along with the constraints can be carried out analytically. The second term could be used independently as a validity measure that gives the requisite number of clusters resulting in minimum modeling error.

Table 1

(a) The Gustafson–Kessel algorithm

Given the data set $Z$, choose the number of clusters $c$, the weighting exponent $m$ ($m > 1$) and the termination tolerance tol. Initialize the fuzzy partition $U^{(0)}$ and the cluster centers $V^{(0)}$, and define $\mu_{ij}$ to be the $(i, j)$th element of $U$.

Repeat for ctr = 1, 2, ...

  Update the cluster prototypes $V^{(ctr)}$ as

  $$V_i^{(ctr)} = \frac{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m Z_j}{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m}$$

  Re-compute the cluster covariance matrices $F_i$:

  $$F_i^{(ctr)} = \frac{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m \left( Z_j - V_i^{(ctr)} \right) \left( Z_j - V_i^{(ctr)} \right)^T}{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m}$$

  Update the partitions:

  $$\mu_{ij}^{(ctr)} = \frac{1}{\sum_{k=1}^{c} \left( D_{ij} / D_{kj} \right)^{2/(m-1)}}$$

  where

  $$D_{ij}^2 = \left( Z_j - V_i^{(ctr)} \right)^T M_i \left( Z_j - V_i^{(ctr)} \right), \qquad M_i = \left( \rho_i \det(F_i) \right)^{1/p} F_i^{-1}$$

Until $\| U^{(ctr)} - U^{(ctr-1)} \| < \text{tol}$

(b) Clustering for minimum linear modeling errors

Given the data set $Z$, choose the number of clusters $c$, the weighting exponent $m$ ($m > 1$) and the termination tolerance tol. Initialize the fuzzy partition $U^{(0)}$ and the cluster centers $V^{(0)}$, and define $\mu_{ij}$ to be the $(i, j)$th element of $U$.

Repeat for ctr = 1, 2, ...

  Update the cluster prototypes $V^{(ctr)}$ as

  $$V_i^{(ctr)} = \frac{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m Z_j}{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m}$$

  Compute $\beta_i$ as

  $$\beta_i^{ctr} = \begin{cases} 1, & \text{if } \mu_{ij}^{ctr-1} > \mu_{kj}^{ctr-1} \quad \forall k = 1, 2, \ldots, c,\ k \neq i \\ 0, & \text{otherwise} \end{cases}$$

  Calculate $\theta_i$ for each cluster, using the values of $V$ at (ctr), from

  $$\theta_i = \left[ \left( V_i^{1:(n-1)} \right)^T V_i^{1:(n-1)} \right]^{-1} V_i^{1:(n-1)} V_i^n$$

  Update the partition matrix $U$ through the solution of the objective function

  $$J = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m (Z_j - V_i)^T M_i (Z_j - V_i) + \sum_{i=1}^{c} \sum_{j=1}^{N} \beta_i \left( Z_j^n - Z_j^{1:(n-1)} \theta_i \right)^2$$

  in the space of $M_i$ and $\mu_{ij}$, subject to the constraint set

  $$\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j = 1, 2, \ldots, N$$

  $$\det(M_i) = \rho_i \quad \forall i = 1, 2, \ldots, c$$

  $$0 < \mu_{ij} < 1 \quad \forall i = 1, 2, \ldots, c; \; j = 1, 2, \ldots, N$$

  $$0 < \sum_{j=1}^{N} (\mu_{ij})^m < N \quad \forall i = 1, 2, \ldots, c$$

Until $\| U^{(ctr)} - U^{(ctr-1)} \| < \text{tol}$
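The iteration of Table 1(a) can be sketched in a few lines of plain Python for the two-dimensional case, where a feature point is a single input–output pair $(u, y)$. This is only a minimal sketch under stated assumptions, not the authors' implementation: the exponent $1/p$ in the norm matrix is taken with $p = 2$ (the data dimension), the partition is initialized randomly, a small constant is added to the covariance diagonal as a numerical safeguard, and the stopping norm is the largest absolute membership change. The function name and the test data are hypothetical.

```python
import math
import random


def gustafson_kessel_2d(Z, c, m=2.0, tol=1e-6, max_iter=200, rho=1.0, seed=0):
    """Sketch of Table 1(a) for 2-D feature points Z = [(u, y), ...]."""
    rng = random.Random(seed)
    N = len(Z)
    # Random initial partition U(0); each column is normalised so that Eq. (3) holds.
    U = [[rng.random() + 1e-3 for _ in range(N)] for _ in range(c)]
    for j in range(N):
        col = sum(U[i][j] for i in range(c))
        for i in range(c):
            U[i][j] /= col
    V = []
    for _ in range(max_iter):
        V, D2 = [], []
        for i in range(c):
            w = [U[i][j] ** m for j in range(N)]
            sw = sum(w)
            # Cluster prototype V_i: membership-weighted mean of the data.
            vu = sum(wj * z[0] for wj, z in zip(w, Z)) / sw
            vy = sum(wj * z[1] for wj, z in zip(w, Z)) / sw
            # Fuzzy covariance F_i; eps on the diagonal is an added safeguard (assumption).
            eps = 1e-8
            fuu = sum(wj * (z[0] - vu) ** 2 for wj, z in zip(w, Z)) / sw + eps
            fyy = sum(wj * (z[1] - vy) ** 2 for wj, z in zip(w, Z)) / sw + eps
            fuy = sum(wj * (z[0] - vu) * (z[1] - vy) for wj, z in zip(w, Z)) / sw
            det = fuu * fyy - fuy * fuy
            # M_i = (rho * det F_i)^(1/p) * F_i^{-1}, taking p = 2 (assumption).
            scale = math.sqrt(rho * det) / det
            m11, m12, m22 = scale * fyy, -scale * fuy, scale * fuu
            V.append((vu, vy))
            # Squared distances D_ij^2, clamped away from zero to avoid division by zero.
            D2.append([max(m11 * (z[0] - vu) ** 2
                           + 2.0 * m12 * (z[0] - vu) * (z[1] - vy)
                           + m22 * (z[1] - vy) ** 2, 1e-12) for z in Z])
        # Membership update: (D_ij / D_kj)^(2/(m-1)) equals (D2_ij / D2_kj)^(1/(m-1)).
        expo = 1.0 / (m - 1.0)
        U_new = [[1.0 / sum((D2[i][j] / D2[k][j]) ** expo for k in range(c))
                  for j in range(N)] for i in range(c)]
        change = max(abs(U_new[i][j] - U[i][j]) for i in range(c) for j in range(N))
        U = U_new
        if change < tol:
            break
    return U, V


# Hypothetical steady state map with a gain change near u = 10.
Z = [(0.5 * k, min(0.5 * k, 10.0 + 0.1 * (0.5 * k - 10.0)) + 0.05 * math.sin(0.5 * k))
     for k in range(41)]
U, V = gustafson_kessel_2d(Z, c=2)
```

Because the norm matrix adapts to the covariance of each cluster, elongated clusters that follow a locally linear stretch of the map are not penalized the way they would be under a fixed Euclidean norm.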
To use the second term of the objective function (11) as an independent validity measure, we define the linear model based reconstruction error (LMRE) as

$$\text{LMRE} = \sum_{i=1}^{c} \sum_{j=1}^{N} \beta_i \left( Z_j^n - Z_j^{1:(n-1)} \theta_i \right)^2 \tag{18}$$

The average LMRE is defined by

$$\text{Av. LMRE} = \frac{\text{LMRE}}{c} \tag{19}$$

The behavior of LMRE with the number of clusters is monotonic and the knee of the graph corresponds to the optimal selection of the number of clusters. An alternative measure could be the average reconstruction error per cluster defined as

$$\text{Av. LMRE/cluster} = \frac{1}{c} \sum_{i=1}^{c} \left[ \frac{\sum_{j=1}^{N} \beta_i \left( Z_j^n - Z_j^{1:(n-1)} \theta_i \right)^2}{N} \right] \tag{20}$$

Measures similar to the final prediction error index employed in linear identification schemes may also be used:

$$\text{FPE(LMRE)} = \text{Av. LMRE} \; \frac{1 + c/N}{1 - c/N} \tag{21}$$

An examination of each of the above cluster validity indices provides an insight into the number of clusters that may be required. In this work we have used this latter approach towards determining the number of divisions required.

3. Local dynamic model structure selection

The segregation of the data into different local regions is based on the steady state information. The dynamic model identification is done within each of the local regions by using suitable input signals, so that the composite dynamics can be reproduced by a number of local dynamic models of fairly constant gains. However, as the segregation step itself does not guarantee (dynamic) linearity of the sub-regions, it is also necessary to ascertain whether the local regions obtained are adequately linear.

3.1. Linearity ascertaining indices

Two techniques were employed as initial tests to assess the extent of linearity of the obtained local regions.

3.1.1. Mean of the output

Each cluster center corresponds to the mean of the local operating region. Perturbation is carried out within each such local operating region utilizing a predetermined input sequence. As long as the process is linear and the input sequence is mean centered, the output is also expected to have a mean around its local steady state value. This assumption holds good only for linear processes, since positive and negative inputs of equal amplitude perturb the process uniformly in either direction. A significant deviation of the mean of the local process output from the mean of the local region can be taken as an indication of nonlinearity.

3.1.2. Condition number

If the dynamics in each cluster are linear and can be represented by an ARX structure, the following equations hold for assumed model orders $na$ and $nb$ that are greater than the true model order. For a general SISO dynamic model under open-loop conditions,

$$y_k = a_1 y_{k-1} + a_2 y_{k-2} + \cdots + a_{na} y_{k-na} + b_1 u_{k-1} + b_2 u_{k-2} + \cdots + b_{nb} u_{k-nb} + e_k \tag{22}$$

Then

$$Y = XK + E \tag{23}$$

where $Y = [y_k, y_{k+1}, \ldots, y_{k+m}]^T$ is a vector of outputs recorded from instant $k$ to instant $k + m$, such that $m > na + nb + 1$. The matrix $X$ can be written as

$$X = \begin{bmatrix} y_{k-1} & y_{k-2} & \cdots & y_{k-na} & u_{k-1} & u_{k-2} & \cdots & u_{k-nb} \\ y_{k} & y_{k-1} & \cdots & y_{k+1-na} & u_{k} & u_{k-1} & \cdots & u_{k+1-nb} \\ \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\ y_{k+m-1} & y_{k+m-2} & \cdots & y_{k+m-na} & u_{k+m-1} & u_{k+m-2} & \cdots & u_{k+m-nb} \end{bmatrix} \tag{24}$$

The constant $K = [a_1 \; a_2 \; \cdots \; a_{na} \; b_1 \; b_2 \; \cdots \; b_{nb}]^T$ and the error $E = [e_k \; e_{k+1} \; \cdots \; e_{k+m}]^T$. Let the new matrix obtained by augmenting the vector $Y$ with the matrix $X$ be represented by

$$\Phi = [\, Y \;\; X \,] \tag{25}$$

An examination of this augmented matrix would indicate that it is ill conditioned if the process is linear and $na$ and $nb$ are sufficiently larger than the true model order, since $Y$ is then close to a linear combination of the columns of $X$. It must be noted that the presence of higher signal to noise ratios or colored noise could somewhat affect this interpretation regarding the linearity of the process in the local region. If the local dynamics are nonlinear, NARX models could be used to effectively capture the nonlinearity, or each local region may further be dynamically clustered into an array of simpler structures as shown in [4].

3.2. Parsimonious model estimation

Ordinary least squares techniques and prediction error methods (PEM) have been extensively quoted in the literature for parameter estimation [18].
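As a concrete instance of ordinary least squares parameter estimation, the sketch below fits a first-order ARX model $y_k = a\,y_{k-1} + b\,u_{k-1}$ to noise-free simulated data by solving the 2×2 normal equations directly. The model order is assumed to be known here, the data are simulated rather than taken from the paper, and all function names are hypothetical.

```python
import random


def simulate_arx(a, b, u, y0=0.0):
    """Simulate y_k = a*y_{k-1} + b*u_{k-1} (noise-free first-order ARX)."""
    y = [y0]
    for k in range(1, len(u)):
        y.append(a * y[k - 1] + b * u[k - 1])
    return y


def fit_arx_1_1(u, y):
    """Ordinary least squares estimate of (a, b) from the 2x2 normal equations."""
    idx = range(1, len(y))
    syy = sum(y[k - 1] * y[k - 1] for k in idx)   # regressor cross-products
    syu = sum(y[k - 1] * u[k - 1] for k in idx)
    suu = sum(u[k - 1] * u[k - 1] for k in idx)
    ry = sum(y[k - 1] * y[k] for k in idx)        # regressor-output products
    ru = sum(u[k - 1] * y[k] for k in idx)
    det = syy * suu - syu * syu
    a_hat = (ry * suu - ru * syu) / det
    b_hat = (ru * syy - ry * syu) / det
    return a_hat, b_hat


# Mean-centered binary excitation (hypothetical input design).
rng = random.Random(1)
u = [rng.choice([-1.0, 1.0]) for _ in range(200)]
y = simulate_arx(0.9, 0.1, u)

a_hat, b_hat = fit_arx_1_1(u, y)
gain = b_hat / (1.0 - a_hat)   # steady state gain implied by the fitted model
```

With the correct order and noise-free data the true parameters are recovered to within rounding error, and the implied steady state gain $b/(1-a)$ follows directly from the estimates.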
These techniques, however, have been proven to perform well only when the model order is known. Huang [19] proposed a generalized last principal component (GLPC) analysis for estimating the model orders for linear systems. A two step method of estimating nonlinear low order models by augmenting an average linear low order model with selected nonlinear terms has been proposed by Docter and Georgakis [20]. Orthogonal least squares techniques based on Classical Gram Schmidt (CGS), Modified Gram Schmidt (MGS) and Householder transformation have been shown by Chen et al. [21] to be effective in parameter estimation and structure selection of NARX models. Thus, based on the linearity test, a suitable identification methodology has to be used to determine the model order (for linear systems) or identify the structure (if the system is nonlinear) and to estimate the parameters. In another ongoing work, as yet unpublished [22], an alternative method of estimating appropriate model order/structure for ARX/NARX type models has been investigated and has been found to be promising. This proposed method is a two step technique based on principal component regression. In the outer loop, the correct model orders are identified by evaluating the key variables that describe the dynamic variation in the input–output data blocks. The parameter estimation for the assumed set of key variables (model orders) is carried out in the inner loop. This strategy has been used in this paper to estimate the model orders/structures and the parameters for the local regions.

4. Fuzzy modeling methodology

The proposed fuzzy modeling procedure utilizing the concepts discussed in Sections 2 and 3 can be described as follows.

4.1. Division of the operating space

The steady state map is segregated into a number of clusters using the adaptive norm based segregation technique described in Section 2.
The number of divisions necessary is determined through the application of the cluster validity measures proposed. From the nature of the membership functions obtained, the input space is divided into a number of local grids. Suitable signal design ensuring adequate excitation is carried out within each local grid for the generation of dynamic data.

4.2. Construction of local models

The dynamic data generated through excitation of the plant within each local grid are utilized for local dynamic model construction. To ascertain the most influential lagged variables, the proposed PCR based key variable identification technique described in Section 3 is employed. To ascertain a suitable basic model structure within each local region, i.e. linear or nonlinear, the linearity of each sub-region is assessed using the linearity determination indices discussed in Section 3.1. Based on the results of this test, a suitable representative structure, linear (ARX or FIR) or nonlinear (NARX), is assumed for key variable selection. The proposed key variable selection technique then enables identification of the most significant lagged variables. The coefficients for the ARX and FIR models (if linear) and the NARX model (if nonlinear) are identified through PLS based regression. The suitability of the identified model is ascertained through cross-validation and residual correlation tests.

4.2.1. Aggregation of local models through a smooth switching strategy

Given a new data point $X_j$, the strategy for aggregation involves the computation of cluster-wise memberships for the data point and weighting each local model output by this cluster-wise membership. A projection based technique for fuzzy composition has been employed by Gomez-Skarmeta et al. [23] and Venkat and Gudi [6]. An alternate 'Method of Quasi Invariance' for application to chemical systems has been discussed by Venkat and Gudi [6]. In this work, we have employed the projection technique for fuzzy aggregation.
Direct application of this technique for prediction is not possible, as at any instant $k$ only a maximum of $n - 1$ dimensions of the $n$-dimensional MISO mapping are known. This scheme thus involves the projection of the $n$-dimensional cluster onto the known $p$-dimensional space. The membership of the point $X_j$ to cluster $i$ is obtained using a probabilistic measure as

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left[ \dfrac{\| X_j - V_i \|^P}{\| X_j - V_k \|^P} \right]^{2/(m-1)}} \tag{26}$$

where the superscript $P$ on the norm denotes a projection onto the lower $P$-dimensional space.

Remark 1. In an earlier work [4], the decomposition to build local parsimonious models was based on an analysis in the dynamic operating space after suitably perturbing the plant with appropriate inputs. For a complex nonlinear plant with varying dynamics and gain sign changes (as seen in the case study presented here), the design of inputs is by itself a difficult task. In the approach proposed here, the decomposition based on steady state information helps simplify the task of input design and local model building.

Table 2
Nominal model parameters for the continuous fermenter

  $Y_{x/s} = 0.4$
  $\alpha = 2.2$
  $\beta = 0.2\ \mathrm{h}^{-1}$
  $\mu_m = 0.48\ \mathrm{h}^{-1}$
  $P_m = 50$ g/l
  $K_m = 1.2$ g/l
  $K_i = 22$ g/l

Fig. 2. Cluster validity measures based on minimum linear model error.

Fig. 3. Segregation of the input–output steady state map: (a) location of regime centers; (b) regime wise membership functions.

Table 3
Segregation results for steady state data

  Input (U2) range      Output (Y1) range     Regime center
  0 < u < 12.8          0 < y < 4.6           u = 6.6,  y = 2.3
  12.8 < u < 20.2       4.6 < y < 6.9         u = 16.5, y = 5.8
  20.2 < u < 25.2       6.9 < y < 7.15        u = 22.7, y = 7.2
  25.2 < u < 31         5.2 < y < 7.15        u = 28.1, y = 6.3
  31 < u < 40           0 < y < 5.2           u = 35.7, y = 3.0

5. Control: multiple model MPC

The MPC technique has been extensively employed for control of chemical processes. In Garcia et al. [24] an introduction to DMC based control is provided. A number of researchers have looked into the utilization of nonlinear process models within the MPC framework [25,26]. In this work we employ a multiple model based MPC scheme wherein the composite model predictions are obtained through fuzzy composition of individual local model outputs. The advantage of fuzzy aggregation over crisp composition techniques is illustrated through control performance. The output of the composite model when employing fuzzy composition can be expressed as

$$O = \frac{\sum_{i=1}^{c} (\mu_i)^m O_i}{\sum_{i=1}^{c} (\mu_i)^m} \tag{27}$$

where $O_i$ represents the local model output in cluster $i$, $\mu_i$ denotes the weight associated with the local model and $O$ the composite output. This scheme is compared with two crisp aggregation strategies, one wherein the output is computed through a crisp weighted composition and the other involving a non-weighted output averaging of relevant local model outputs.

In a crisp composition strategy, only the subset of models having the largest associated membership is employed. Let $p$, $q$ and $r$ denote the local models associated with the largest membership values ($p, q, r \le c$). The weighted crisp composition yields an output $O$ which is expressed as

$$O = \frac{\sum_{i=p,q,r} (\mu_i)^m O_i}{\sum_{i=p,q,r} (\mu_i)^m} \tag{28}$$

In a non-weighted crisp composition scheme the output $O$ is obtained as

$$O = \frac{\sum_{i=p,q,r} O_i}{\nu} \tag{29}$$

where $\nu$ denotes the number of local models associated with the largest membership value. Both the above crisp schemes [Eqs. (28) and (29)] reduce to the same expression when only one local model is fired. To obtain good performance for the MPC scheme, the local model parameters are optimized for long-range prediction capability.

Remark 2. As can be seen in the preceding section, the algorithm requires information about the regions of operation, which is typically known through a priori knowledge or through the above analysis. Perturbation in each individual region and an online composition of the models as proposed above are required to implement this scheme.

Fig. 4. Model coefficients for local region 1: (a) FIR coefficients derived from ARX; (b) direct FIR estimation.

Fig. 5. Model coefficients for local region 3: (a) FIR coefficients derived from ARX; (b) direct FIR estimation.

Fig. 6. Model coefficients for local region 5: (a) FIR coefficients derived from ARX; (b) direct FIR estimation.

Fig. 7. (a) Cross validation results for cluster 3; (b) comparison between direct FIR estimated model and ARX derived FIR model for cluster 4.

Table 4
Local model structures, coefficients and steady state gains

  Local region   Settling time   Model structure                                                                      Steady state gain
  1              20              $y_{k+1} = 0.0045 u_k + 0.0037 u_{k-1} + 1.6338 y_k - 0.6554 y_{k-1}$                0.3806
  2              30              $y_{k+1} = 0.3503 u_k + 0.003 u_{k-1} + 1.7 y_k - 0.7189 y_{k-1}$                    0.3188
  3              40              $y_{k+1} = 0.0003 u_k + 2.2196 y_k - 1.5927 y_{k-1} + 0.3678 y_{k-2}$                0.0682
  4              100             $y_{k+1} = -0.0006 u_k - 0.0003 u_{k-1} + 1.9086 y_k - 0.9113 y_{k-1}$               -0.3352
  5              300             $y_{k+1} = -0.0003 u_k + 1.9561 y_k - 0.957 y_{k-1}$                                 -0.4678

Fig. 8. Composite model predictions employing fuzzy aggregation: (a) disturbance in input U1 (10% noise); (b) variation in input U2; (c) comparison of plant and composite model predictions; (d) magnified view of the comparison between sampling instants 800 and 1600. RMSE of prediction errors = 0.0176.

6. Illustrative examples and a case study

The system under consideration is the well-mixed continuous fermenter of Henson and Seborg [10]. The dilution rate (U1) and the feed substrate concentration (U2) are the manipulated variables, while the cell mass concentration (X1), substrate concentration (X2) and product concentration (X3) are the state variables. The model equation set for the continuous fermenter is as follows:

$$\frac{dX_1}{dt} = -U_1 X_1 + \mu X_1 \tag{30}$$

$$\frac{dX_2}{dt} = U_1 (U_2 - X_2) - \frac{\mu X_1}{Y_{x/s}} \tag{31}$$

$$\frac{dX_3}{dt} = -U_1 X_3 + (\alpha \mu + \beta) X_1 \tag{32}$$

$$Y_1 = X_1 \tag{33}$$

$$Y_2 = X_2 \tag{34}$$

where $\mu$ is the specific growth rate, $\alpha \mu + \beta$ is the specific product formation rate and $Y_{x/s}$ is the yield of the cell mass with respect to the limiting substrate.
The specific growth rate is given through the substrate and product inhibition model as

$$\mu = \frac{\mu_m \left(1 - \frac{X_3}{P_m}\right) X_2}{K_m + X_2 + \frac{X_2^2}{K_1}} \qquad (36)$$

The nominal model parameters are listed in Table 2. This application exhibits gain direction changes as well as nonlinearities in different regions of operation, as seen in Fig. 3(a). In this work, we restrict the analysis to the SISO case. The aim is to obtain a dynamic model that describes the effect of feed substrate concentration ($U_2$) on cell mass concentration ($Y_1$). The value of the dilution rate ($U_1$) is maintained at 0.1636, which corresponds to the optimal operational value for operation of the fermenter [10]. The steady state map of the input(s)–output pair is generated through a simulation model. The allowable range of $U_2$ is fixed between 0 and 40.

Fig. 9. (a) Proposed multiple model based MPC scheme; (b) internal structure of the multiple model MPC.

An analysis of the data using the LMRE based validity measures (Fig. 2) indicates that the optimal number of clusters is 5. The segregation of the steady state map into 5 clusters is carried out employing the adaptive norm based technique discussed in Section 2. The location of the center and the nature of the membership function obtained for each cluster are shown in Fig. 3. Based on the membership plots, the input space is divided into 5 grids. Membership cuts for each of the membership functions are selected such that the input range for each cluster spans midway through the region of overlap between adjacent membership functions. The location and characteristics of the grids are given in Table 3.

Within each local grid, perturbation data is generated through appropriate signal design. The linearity indices are evaluated to ascertain linearity within each local region. In this case study, all five regions were found to be approximately linear. The key lagged variables are estimated through the PCR based technique discussed earlier.
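The steady state map used for the decomposition above comes from simulating Eqs. (30)–(36). The sketch below shows one way this could be set up in plain Python, integrating the model with a fixed-step Runge-Kutta scheme. The parameter values are those commonly quoted for the Henson and Seborg fermenter and are an assumption here, since Table 2 is not reproduced in this section and may differ; the function names, step size and integration horizon are likewise illustrative choices.

```python
# Sketch of the fermenter model, Eqs. (30)-(36).
# ASSUMPTION: the parameter values below are the ones commonly quoted for
# the Henson-Seborg fermenter; the paper's Table 2 may use different values.
Y_XS = 0.4     # cell-mass yield Y_x/s
ALPHA = 2.2    # growth-associated product yield (alpha)
BETA = 0.2     # non-growth-associated product formation rate (beta)
MU_M = 0.48    # maximum specific growth rate (mu_m)
P_M = 50.0     # product saturation constant (P_m)
K_M = 1.2      # substrate saturation constant (K_m)
K_1 = 22.0     # substrate inhibition constant (K_1 in Eq. (36))


def growth_rate(x2, x3):
    """Specific growth rate mu, Eq. (36)."""
    return MU_M * (1.0 - x3 / P_M) * x2 / (K_M + x2 + x2 * x2 / K_1)


def fermenter_rhs(x, u1, u2):
    """Right-hand sides of Eqs. (30)-(32) for the state x = (X1, X2, X3)."""
    x1, x2, x3 = x
    mu = growth_rate(x2, x3)
    dx1 = -u1 * x1 + mu * x1
    dx2 = u1 * (u2 - x2) - mu * x1 / Y_XS
    dx3 = -u1 * x3 + (ALPHA * mu + BETA) * x1
    return (dx1, dx2, dx3)


def rk4_step(x, u1, u2, dt):
    """One classical fourth-order Runge-Kutta step with constant inputs."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = fermenter_rhs(x, u1, u2)
    k2 = fermenter_rhs(shifted(x, k1, dt / 2), u1, u2)
    k3 = fermenter_rhs(shifted(x, k2, dt / 2), u1, u2)
    k4 = fermenter_rhs(shifted(x, k3, dt), u1, u2)
    return tuple(
        xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


def simulate(x0, u1, u2, t_end=500.0, dt=0.05):
    """Integrate from x0 with constant inputs and return the final state."""
    x = tuple(x0)
    for _ in range(int(round(t_end / dt))):
        x = rk4_step(x, u1, u2, dt)
    return x
```

Sweeping `u2` over the allowable range with `u1 = 0.1636` and recording the first element of `simulate(...)` would trace $Y_1$ against $U_2$. The model admits more than one steady state (washout, $X_1 = 0$, is always one of them), so the result depends on the initial state; starting each sweep point from the previous solution is one reasonable choice.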
Since the local regions are linear, ARX and FIR forms for the models are selected. The tolerance limits on the percentage residual in $y$ are set based on the minimum deviation between the actual FIR and the ARX derived FIR coefficients. The coefficients of the ARX (reduced to FIR form) and direct FIR models constructed for local regions 1, 3 and 5 are depicted in Figs. 4–6.

Fig. 10. Comparative control performance of the crisp and fuzzy model switching strategies in the multiple model MPC framework (fuzzy aggregation is employed to get composite model predictions in either case).

Fig. 11. Disturbance rejection performance of the proposed MPC scheme subjected to a 10% change in $U_1$.

Fig. 7(a) depicts the cross-validation performance of model 3. This model corresponds to the peak of the steady state map. This region is characterized by low gains and is typically difficult to model. However, the model constructed through the proposed scheme for local region 3 shows excellent cross validation, as ascertained from Fig. 7(a). Fig. 7(b) shows a comparison of the ARX derived FIR and directly estimated FIR coefficients for local region 4.

The local model structures, coefficients and steady state gains are tabulated in Table 4. It is noted from Table 4 that the dynamics in the local regions are significantly disparate, with the local time constants varying from 5 s in region 1 to 75 s in region 5. It is further observed that the trend of the steady state gains of the local models follows the behavior expected from the steady state map of the process as depicted in Fig. 2(a). The adequate performance of each local model is verified through cross validation within each local region. Each local model is constructed based on a sampling rate that is characteristic of the local time constant. Fuzzy aggregation is used for the composition of the local models, and the performance of the aggregated model through the range of operation is shown in Fig. 8.
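Two of the calculations above are simple enough to sketch in a few lines: reducing a local ARX model to FIR form (together with its steady state gain), and aggregating local model outputs by Eqs. (27)–(29). The following is a minimal Python illustration under stated assumptions, not the authors' implementation. The function names, the exponent `m = 2`, the default of three retained models and the membership values used in the note afterwards are all hypothetical; the region 1 coefficients are taken from Table 4 with a minus sign assumed on the $y_{k-1}$ term.

```python
def arx_gain(b, a):
    """Steady state gain of y[k+1] = sum_j b[j]*u[k-j] + sum_j a[j]*y[k-j]."""
    return sum(b) / (1.0 - sum(a))


def arx_to_fir(b, a, n_terms):
    """First n_terms impulse-response (FIR) coefficients of the ARX model."""
    h = []
    for k in range(n_terms):
        direct = b[k] if k < len(b) else 0.0
        feedback = sum(
            a[j] * h[k - 1 - j] for j in range(len(a)) if k - 1 - j >= 0
        )
        h.append(direct + feedback)
    return h


def fuzzy_composite(outputs, memberships, m=2.0):
    """Eq. (27): membership-weighted average over all local models."""
    weights = [mu ** m for mu in memberships]
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)


def top_models(memberships, n_keep):
    """Indices of the n_keep local models with the largest memberships."""
    order = sorted(
        range(len(memberships)), key=lambda i: memberships[i], reverse=True
    )
    return order[:n_keep]


def crisp_weighted(outputs, memberships, n_keep=3, m=2.0):
    """Eq. (28): weighted average restricted to the retained models."""
    idx = top_models(memberships, n_keep)
    return fuzzy_composite(
        [outputs[i] for i in idx], [memberships[i] for i in idx], m
    )


def crisp_unweighted(outputs, memberships, n_keep=3):
    """Eq. (29): plain average of the retained model outputs."""
    idx = top_models(memberships, n_keep)
    return sum(outputs[i] for i in idx) / len(idx)


# Region 1 of Table 4 (sign of the y[k-1] coefficient is an assumption).
B1 = [0.0045, 0.0037]
A1 = [1.6338, -0.6554]
gain_region_1 = arx_gain(B1, A1)      # close to the tabulated 0.3806
fir_region_1 = arx_to_fir(B1, A1, 400)
```

For region 1 the FIR coefficients sum to the same value as `arx_gain`, which is close to the tabulated gain; the small difference is consistent with the rounding of the published coefficients. With hypothetical memberships (0.6, 0.3, 0.1), `fuzzy_composite` uses all three local outputs, whereas with `n_keep=1` both crisp rules return the output of the single dominant model, which is the reduction noted for Eqs. (28) and (29).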
It is observed that the composite model cross validates excellently through the entire range of operation (RMSE of prediction errors = 0.0176, Fig. 8).

6.1. Control

Control of the fermenter through the range of operation is attempted through the incorporation of the proposed multiple model based MPC scheme (Fig. 9). Both fuzzy and crisp switching strategies are employed, and a comparative performance of either MPC scheme is shown in Fig. 10. The dashed lines show the control response when a crisp weighted average switching strategy is used and the dotted lines show the control response when fuzzy switching is employed. From the control results obtained, it is clearly observed that the fuzzy switching strategy performs better than the crisp switching technique. The MPC scheme is observed to fail when the crisp non-weighted output averaging technique is employed.

A disturbance in the form of a variation in the value of $U_1$ is introduced for a fixed time, and the ability of the fuzzy switching based MPC scheme to reject the disturbance is studied. The disturbance rejection performance of the proposed scheme is observed to be excellent, as shown in Fig. 11. The performance of the MPC scheme in the presence of process and measurement noise is shown in Fig. 12.

Fig. 12. Performance of the proposed MPC scheme in presence of both process and measurement noise (up to 10% deviations considered).

7. Conclusion

In this work, we propose a new fuzzy modeling based methodology for decomposition of the overall nonlinear plant behavior to obtain simpler dynamic models. New validity measures, based on linear model reconstruction errors, that aid in the determination of the optimal number of divisions of the operating space have been proposed. This decomposition, which is based on a steady state map of the input–output space, is shown to be feasible for dynamic model development in local regions. Linearity indices are suggested to ascertain linearity (or non-linearity) of the local regions.
Towards obtaining accurate predictions over the entire operating range, we have proposed and evaluated a method for generating composite predictions based on fuzzy aggregation. An MPC scheme is also proposed within the multiple model framework. The validity of the proposed identification and control strategies is demonstrated through application to the nonlinear fermenter problem of Henson and Seborg [10]. The superiority of the fuzzy switching strategy over the crisp switching methodology is highlighted through control performance. The multiple model based MPC scheme shows good set point tracking and disturbance rejection through the entire range of operation.

Since the local models have stable realizable inverses, the local controllers can therefore also be assumed stable. Further, since the fuzzy composition is done by weighting each controller output by the appropriate membership, and the memberships themselves sum to unity, it may be concluded that the composite control action will also be stable in a bounded input bounded output sense. However, a rigorous analysis of the stability aspects is still mandatory and will be a focus of future research.

References

[1] T.A. Johansen, B.A. Foss, Constructing NARMAX models using ARMAX models, Int. J. Control 58 (1993) 1125–1153.
[2] T.A. Johansen, B.A. Foss, Identification of nonlinear system structure and parameters using regime decomposition, Automatica 31 (2) (1995) 321–326.
[3] B.A. Ogunnaike, W.H. Ray, Process Dynamics, Modeling and Control, Oxford University Press, 1994.
[4] A.N. Venkat, R.D. Gudi, Fuzzy segregation based identification of nonlinear dynamic system, presented at ACC 2001, Washington DC.
[5] B. Kuipers, K. Astrom, The composition and validation of heterogeneous control laws, Automatica 20 (1994) 233–249.
[6] A.N. Venkat, R.D. Gudi, Fuzzy segregation based identification and control of nonlinear dynamic systems, Journal of Industrial and Engineering Chemistry Research (submitted).
[7] K. Schott, W. Bequette, Control of chemical reactors using multiple model adaptive control, IFAC DYCORD+ '95, Helsingør, Denmark (1995) 345–350.
[8] K.S. Narendra, J. Balakrishnan, M.K. Ciliz, Adaptation and learning using multiple models, switching and tuning, IEEE Control Systems Magazine 15 (3) (1995) 37–51.
[9] A. Kordon, Y.O. Fuentes, B.A. Ogunnaike, P.S. Dhurjati, An intelligent parallel control system structure for plants with multiple operating regimes, J. Proc. Control 9 (1999) 453–460.
[10] M.A. Henson, D.E. Seborg, Nonlinear control strategies for continuous fermenter, Proc. of 1990 American Control Conference, San Diego, 1990.
[11] A. Baraldi, P.A. Blonda, Survey of fuzzy clustering algorithms for pattern recognition, part I, IEEE Trans. on Systems, Man and Cybernetics, Part B: Cybernetics 29 (6) (1999) 778–785.
[12] A. Baraldi, P.A. Blonda, Survey of fuzzy clustering algorithms for pattern recognition, part II, IEEE Trans. on Systems, Man and Cybernetics, Part B: Cybernetics 29 (6) (1999) 786–801.
[13] D.E. Gustafson, W.C. Kessel, Fuzzy clustering with fuzzy covariance matrix, Proc. IEEE Intl. Conference on Fuzzy Systems, San Diego (1979) 761–766.
[14] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Algorithms, Plenum Press, New York, 1981.
[15] A. Kandel, Fuzzy Techniques in Pattern Recognition, Wiley-Interscience Publishers, New York, 1982.
[16] I. Gath, A. Geva, Unsupervised optimal fuzzy clustering, IEEE Trans. Pattern Analysis and Machine Intelligence 7 (1989) 773–781.
[17] R. Krishnapuram, C.P. Freg, Fitting an unknown number of lines and planes to image data through compatible cluster merging, Pattern Recognition 25 (4) (1992) 385–400.
[18] T. Soderstrom, P. Stoica, System Identification, Prentice-Hall International, UK, 1989.
[19] B. Huang, Process identification based on last principal component analysis, Journal of Process Control 11 (2001) 19–33.
[20] W.A. Docter, C. Georgakis, Nonlinear reduced order models for separation processes via augmentation of linear subspace models, Proceedings of the American Control Conference (1998) 2108–2112.
[21] S. Chen, S.A. Billings, W. Luo, Orthogonal least squares methods and their application to non-linear system identification, Int. J. Control 50 (5) (1989) 1873–1896.
[22] P. Vijaysai, R.D. Gudi, Issues in key variable selection in dynamic model identification, Canadian Journal of Chemical Engineering (to be submitted).
[23] A.F. Gomez-Skarmeta, M. Delgado, M.A. Vila, About the use of fuzzy clustering techniques for fuzzy model identification, Fuzzy Sets and Systems 10 (6) (1999) 179–188.
[24] C.E. Garcia, D.M. Prett, M. Morari, Model predictive control: theory and practice - a survey, Automatica 25 (3) (1989) 335–348.
[25] J.J. Song, S. Park, Neural model predictive control for nonlinear chemical processes, J. Chem. Engg. Jpn. 26 (4) (1993) 347–354.
[26] J. Saint-Donat, N. Bhat, T.J. McAvoy, Neural net based model predictive control, Int. J. Control 54 (6) (1991) 1453–1468.