
Identification of complex nonlinear processes based on fuzzy
decomposition of the steady state space
Aswin N. Venkat, P. Vijaysai, Ravindra D. Gudi*
Department of Chemical Engineering, Indian Institute of Technology-Bombay, Powai, Mumbai-400076, India
Abstract
A methodology for identification and control of complex nonlinear plants using a multi-model approach is presented in this paper.
The proposed methodology is based on fuzzy decomposition of the steady state map. It is shown that such a decomposition strategy
facilitates the design of input perturbation signals and helps in identifying linear or simple nonlinear models for each local region. A
composition strategy to aggregate the local model predictions is proposed and shown to give excellent cross validation as well as to
facilitate smooth switching between the local models. A novel control scheme that is based on the multi model strategy is proposed.
The practicality of the identification and control scheme presented here is demonstrated by application to the continuous fermenter
of Henson and Seborg (M.A. Henson, D.E. Seborg, Nonlinear control strategies for continuous fermenter, in: Proceedings of 1990
American Control Conference, San Diego, 1990), which exhibits severe nonlinearities and gain directionality changes.
Keywords: Multi-model Identification; Steady state decomposition; Fuzzy segregation
1. Introduction
Nonlinear processes are invariably difficult to model
and hence have been traditionally represented employing a linearized model around some operational steady
state. A large number of chemical and biological processes are characterized by a wide and varied set of
input dependent dynamics that cannot be handled
within the linear model identification framework.
Changes in the directionality of the gain and in the nature of the dynamics itself, for different ranges of inputs,
pose a real challenge to identification and control of
such nonlinear systems.
Different methodologies have been employed by
researchers to mimic nonlinear behavior. An approach
that has been extensively used is the development of an
overall nonlinear model that performs satisfactorily
through the entire operating range. Neural Networks
(NN) and other nonlinear structures have been extensively used for the purpose. One of the disadvantages of
such an approach is the difficulty in arriving at a
suitable model/network structure and the invariably
complex structure thereof. Further, a majority of these
approaches are essentially black box models, which to a
good extent do not permit transparency between the
plant dynamics and the modeling procedures. An alternate methodology is the multiple-model based framework that allows for partial transparency of process
knowledge to the modeling procedure.
In the multiple modeling frameworks, the main challenges in the development of dynamic models are:
(i) Division of the operating space into operating
regimes.
Johansen and Foss [1] have proposed a strategy for
decomposition of the nonlinear space into multiple linear structures utilizing a priori process information. An
operating regime decomposition strategy based on
heuristics and local model validity functions has been
proposed by Johansen and Foss [2]. Ogunnaike and
Ray [3] have considered a steady state map and classified operating regimes based on a priori knowledge of
the input and output ranges on the steady state map.
They postulate appropriate controllers based on an
analysis of the RGA within each operating regime.
While analysis of the steady state map, as has been done
by Ogunnaike and Ray [3] does give an indication of the
steady state gain variation, inclusion of dynamics in the
decomposition step may characterize the variation in
dynamics more accurately. Towards this, Venkat and
Gudi [4] have proposed a strategy for decomposition of
the operating space based on an analysis of the dynamic
input–output data obtained by plant perturbation.
However, for complex plants having varied regions of
operation (including gain sign changes and variations in
time constants), perturbation design is not an easy task
and needs further analysis.
(ii) Construction of local dynamic models.
Segregation of the operating regions as in (i), divides
the input-output space into a collection of local regions.
The next step would be the development of dynamic
models within each region. Kuipers and Astrom [5] have
proposed a multiple-model strategy wherein interpolation between local model parameters is employed. In
Venkat and Gudi [6], the problem of local model structure
selection was solved as a combinatorial exercise with the
goal being that of minimum local cross validation error.
(iii) Aggregation of local models through a suitable
switching strategy.
For the deployment of these local models in closed
loop operation, the local model predictions need to be
aggregated to yield a representative prediction in the
region of operation. A number of switching and aggregation strategies have been proposed. Hard switching
strategies work by invoking only the most representative model/controller at any given time. Such a crisp
strategy leads to deterioration in performance during
transitions between local regions. Several model/controller scheduling techniques have been proposed in
recent times for use within the multiple-model based
framework. Schott and Bequette [7] have used an estimator based scheduling algorithm to weigh local models
towards control of chemical reactors. Local model performance indices to select local controllers have been
outlined by Narendra et al. [8]. A supervisor based
hierarchical technique for local controller selection
based on virtual control loop feedback error has been
proposed by Kordon et al. [9].
The objective of this paper is to analyze the above
issues related to multiple model control for nonlinear
systems by using fuzzy decomposition and aggregation
techniques. Unlike in our earlier approach [4], the focus
here is on decomposition of the steady state space. In a
typical operating plant having varying regions of operation (such as polymerization for the production of different polymer grades), steady state information
corresponding to each region is typically available from
the plant. Thus, based on the proposition that the
steady state map characterizes the variation in the
operating space at least in terms of the gain changes
(Ogunnaike and Ray [3]), we propose a decomposition
of the steady state map using fuzzy segregation.
Towards the latter decomposition, we propose a modification
of the existing segregation methodologies and
develop a new iterative segregation scheme that includes
conditions on the linearity and parsimony of the resulting local models. For the development of dynamic
models in each local region that results from the above
segregation, we analyze issues related to perturbation
within each operating region for the development of
simpler linear/nonlinear models. Towards obtaining
accurate predictions over the entire operating range, we
propose a method of generating composite predictions
based on fuzzy aggregation, which ensures smoother
transitions between operating regions. For control, we
propose a multiple model control scheme that employs
the local models through fuzzy switching. The superiority of fuzzy switching over crisp controller switching
in terms of control performance is also highlighted.
Thus, the merits of the proposed approach over those
proposed in literature are (i) fuzzy classification of the
steady state information to decompose the relationships
with a view to better input signal design, (ii) suitable
division of the relationships based on a linearity index
to generate local linear models and (iii) fuzzy composition based control law in a model predictive control
framework. We demonstrate the validity of the identification and control methodologies proposed here by
application to the nonlinear fermenter problem of Henson and Seborg [10] which is characterized by severe
non-linearities and gain direction changes.
The organization of the paper is as follows. Section 2
provides an overview of the fuzzy clustering techniques
employed in this work. The proposed cluster validity
measure based on minimum linear model reconstruction
error is also detailed. In Section 3, an analysis of a key
variable selection technique that has been used for
model structure selection is presented. Two indices to
ascertain linearity within local regions are also discussed. A detailed discussion on the proposed methodology is provided in Section 4. Section 5 describes the
relevant issues in the implementation of the multiple
model based MPC framework towards achieving good
control of complex nonlinear systems. The proposed
identification and control strategies are tested on the
nonlinear fermenter of Henson and Seborg [10]. The
case study and the results obtained employing the proposed schemes are detailed in Section 6. A discussion on
the proposed technique and scope for future work is
outlined in Section 7.
2. Fuzzy clustering
Clustering enables division of the complex nonlinear
regions into simpler subspaces. Incorporation of fuzziness into the clustering methodology allows for overlap
of subspaces and thereby smoother transitions between
operating regions.
Fig. 1(a) shows the steady state map for a certain
process. From an examination of the figure it is evident
that there are three characteristic and distinct divisions
that exist based on steady state gains. A point in such a
map belongs to one and only one uniform-gain operating region. Fig. 1(b) shows an alternate steady state map
with two operating regions based on uniform gain. For
this case it would be incorrect to classify intermediate
points crisply to either region. A transitional point in
such a map would belong to both regions albeit to
varying extents. By allowing for a point in the steady
state map to belong to more than one region, we introduce the concept of fuzziness within the clustering
scheme. This enables smoother transitions between
operating regions.
Consider the set X (matrix) of feature points (in our
case, the input–output steady state data points) for a
process, where $X \in \mathbb{R}^{n \times N}$. A typical structure for X can be
written as
$$X = \begin{bmatrix} \tilde{U} \\ y(\tilde{U}) \end{bmatrix}_{n \times N} \tag{1}$$
where $\tilde{U}$ represents the set of steady state inputs and
$y(\tilde{U})$ is the steady state output corresponding to $\tilde{U}$. Let c
be the number of clusters and m be the fuzzy exponent
(defined later), where m > 1. Partitioning of the data, in
the n-dimensional space, into each of these c clusters
would result in feature points that have similar gains to
lie closer to each other in a cluster. Often times this
transition may not be crisp. Let $\mu_{ij}$ therefore denote the
membership of the jth feature point in X to the ith
cluster and let $V_k$ denote the prototype point for the kth
cluster.
A fuzzy partition is characterized by the following
equations:
$$\mu_{ij} \in [0, 1] \tag{2}$$
$$\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j = 1, 2, \ldots, N \tag{3}$$
$$0 < \sum_{j=1}^{N} \mu_{ij} < N \quad \forall i = 1, 2, \ldots, c \tag{4}$$
The partitioning involves the minimization of an
objective function J in the space of $M_i$ and $\mu_{ij}$
$$J = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m \left\| X_j - V_i \right\|^2 \tag{5}$$
where
$$\left\| X_j - V_i \right\|^2 = (X_j - V_i)^T M_i \, (X_j - V_i)$$
$X_j$ represents the jth feature vector and $M_i$ the norm
matrix that characterizes the orientation of the clusters.
The minimization of J yields a partition of the data
with $V_i$, $i = 1, \ldots, c$, as the cluster centers. It also produces a matrix of memberships of each feature vector to
each cluster.
A number of fuzzy clustering algorithms are available
in literature [11,12]. Two parameters critically affect the
nature of clustering. The first is the norm employed for
partitioning. We have employed the Gustafson–Kessel
algorithm (G–K) [13] for adaptive norm based segregation. The alternate fixed norm based clustering [14] forces the clusters into characteristic norm based
topographies, even if such a structure is not present in
the data. For example, a Euclidean norm based clustering induces spherical cluster topographies irrespective of
the actual distribution of the data. Employing an adaptive norm based clustering methodology enables the
cluster topology to vary depending on the data spread.
The expression in such a case for $M_i$ can be analytically
derived [14,15] as
$$M_i = \left( \rho_i \det(F_i) \right)^{1/p} F_i^{-1} \tag{7}$$
where
$$F_i = \frac{\sum_{j=1}^{N} (\mu_{ij})^m (Z_j - V_i)(Z_j - V_i)^T}{\sum_{j=1}^{N} (\mu_{ij})^m} \tag{8}$$
and $\rho_i$ is the volume constraint in the ith cluster and Z is
the data vector. The other parameter, the fuzziness
exponent ‘m’, also significantly affects the nature of the
partition. As m approaches 1 from above, the partition
becomes crisp. As $m \to \infty$ the partition becomes
totally fuzzy. In this limit, the magnitudes of the memberships are all equal with a value of 1/c, i.e. all points
belong to all the clusters with an equal membership. An
overview of the G–K algorithm is provided in
Table 1(a).
Fig. 1. Division of the steady state map into local regions: (a) crisp
division; (b) fuzzy division.
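The effect of the fuzziness exponent can be checked numerically with the membership update of Table 1(a). The following is a minimal sketch rather than the authors' implementation; the function name `memberships_from_distances` and the three example distances are illustrative assumptions.

```python
def memberships_from_distances(distances, m):
    """Membership of one point in each cluster from its distances D_i to the
    cluster prototypes: mu_i = 1 / sum_k (D_i / D_k)^(2/(m-1)), with m > 1."""
    if m <= 1.0:
        raise ValueError("the fuzziness exponent m must be greater than 1")
    # A point that coincides with a prototype belongs fully to that cluster
    zeros = [d == 0.0 for d in distances]
    if any(zeros):
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    expo = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** expo for d_k in distances)
            for d_i in distances]


# Hypothetical distances of one point to three cluster prototypes
distances = [1.0, 2.0, 3.0]
nearly_crisp = memberships_from_distances(distances, m=1.05)  # m close to 1
very_fuzzy = memberships_from_distances(distances, m=50.0)    # large m
```

With m close to 1 almost all of the membership goes to the nearest prototype, while for large m the three memberships approach 1/c = 1/3, in line with the limits described in the text.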
2.1. Minimum linear model reconstruction error
Determination of the optimal number of divisions of
the operating space marks an important step in the segregation methodology. Dividing the operating space
into a large number of local regions may represent the
behavior more accurately but is a non-parsimonious
representation; however, a very small number of local
regions may perhaps be inadequate to represent the
overall process. To aid the determination of the number
of divisions necessary, two cluster validity measures,
namely Fuzzy Hypervolume [16] and Average within
cluster distance [17], have been proposed.
The primary thrust of the modeling philosophy discussed here is to construct locally linear models whenever possible. The clustering techniques as discussed
here, are data driven techniques and as such do not
guarantee linearity within each subspace. In this section,
we propose a new cluster validity measure that provides
an indication of the number of clusters into which the
operating space has to be divided into, such that the
error due to linear modeling is minimum. Thus at the
end of the clustering/decomposition step, it is expected
that the relationships between points within each cluster
could be modeled by linear or simpler nonlinear structures.
Let n be the dimension of the input(s)–output space.
A typical feature vector (say the jth set, j = 1, 2, …, N) is
represented as
$$Z_j = \begin{bmatrix} U_1^j \\ \vdots \\ U_{n-1}^j \\ Y^j \end{bmatrix}_{n \times 1} = \begin{bmatrix} Z_j^1 \\ \vdots \\ Z_j^{n-1} \\ Z_j^n \end{bmatrix}_{n \times 1} \tag{9}$$
Let $Z_j^n$ denote the output belonging to the jth set,
namely $Y^j$, and let $Z_j^{1:(n-1)}$ denote the set of inputs resulting
in the output $Z_j^n$, i.e. the vector $[U_1^j \; U_2^j \; \cdots \; U_{n-1}^j]^T$. In
other words, the vector $Z_j$ is the representation of the
steady state output Y with respect to each of the inputs U,
and the steady state relationship between the inputs and
the output is given by
$$Z_j^n = Z_j^{1:(n-1)} \theta \tag{10}$$
where $\theta$ is the parameter vector for the relationship.
Consider the objective function
$$J = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m (Z_j - V_i)^T M_i (Z_j - V_i) + \sum_{i=1}^{c} \sum_{j=1}^{N} \delta_i \left\| Z_j^n - Z_j^{1:(n-1)} \theta_i \right\|^2 \tag{11}$$
where $\theta_i$ is the parameter vector for the ith cluster. The
first term in the objective function is identical to the
distance minimization term in the adaptive clustering
methodology discussed earlier in Section 2. The second
term represents a weighted sum of the errors resulting
from the linear model assumption in each local region.
For linear modeling at this step, overlap between local
regions is not allowed and only those points that have
the highest membership to a particular cluster are considered for construction of the local linear models. The
expression for $\delta_i$ can thus be written as
$$\delta_i = \begin{cases} 1, & \text{if } \mu_{ij} > \mu_{kj} \;\; \forall k = 1, 2, \ldots, c, \; k \neq i \\ 0, & \text{otherwise} \end{cases} \tag{12}$$
Further, noting that the linear model has to satisfy the
conditions at the center of each cluster, the following
equation becomes mandatory
$$\theta_i = \left[ \left( V_i^{1:(n-1)} \right)^T V_i^{1:(n-1)} \right]^{-1} V_i^{1:(n-1)} V_i^n, \quad \text{where } V_i = \frac{\sum_{j=1}^{N} (\mu_{ij})^m Z_j}{\sum_{j=1}^{N} (\mu_{ij})^m} \tag{13}$$
The constraint set for the objective function in (11)
comprises the following equations [13]
$$\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j = 1, 2, \ldots, N \tag{14}$$
$$\det(M_i) = \rho_i \quad \forall i = 1, 2, \ldots, c \tag{15}$$
$$0 < \mu_{ij} < 1 \quad \forall i = 1, 2, \ldots, c; \;\; j = 1, 2, \ldots, N \tag{16}$$
$$0 < \sum_{j=1}^{N} (\mu_{ij})^m < N \quad \forall i = 1, 2, \ldots, c \tag{17}$$
Numerical solution of the optimization problem in
the space of $\mu_{ij}$, $V_i$ and $M_i$ is complex as the number of
constraints depends directly on the number of data
points. For a data set of size N and number of clusters c,
there are c + N equality constraints and 2c + 2cN
inequality constraints. Below, we propose a scheme for
the minimization of the objective function (11). The
step-by-step procedure is provided in Table 1(b).
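As a rough illustration of how quickly the constraint set grows, the counts quoted above can be evaluated for a hypothetical data set. The values N = 100 and c = 5 are assumed only for the arithmetic and are not taken from the case study; each two-sided bound is read here as two inequalities.

```latex
\begin{align*}
\text{equality constraints:}\quad
  & \underbrace{N}_{\text{one per data point}} + \underbrace{c}_{\text{one per cluster}}
    = 100 + 5 = 105 \\
\text{inequality constraints:}\quad
  & \underbrace{2cN}_{\text{bounds on each membership}} + \underbrace{2c}_{\text{bounds on each cluster sum}}
    = 1000 + 10 = 1010
\end{align*}
```

Doubling the number of data points roughly doubles both counts, which is the dependence on N that makes a direct constrained solution unattractive.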
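Table 1
(a) The Gustafson–Kessel algorithm
Given the data set Z, choose the number of clusters c, the weighting exponent m (m > 1) and the termination tolerance, tol.
Initialize the fuzzy partition $U^{(0)}$ and the cluster centers $V^{(0)}$ and define $\mu_{ij}$ to be the (i, j)th element of U.
Repeat for ctr = 1, 2, …
Update the cluster prototypes $V^{(ctr)}$ as
$$V_i^{(ctr)} = \frac{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m Z_j}{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m}$$
Re-compute the cluster covariance matrices $F_i$
$$F_i^{(ctr)} = \frac{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m \left( Z_j - V_i^{(ctr)} \right) \left( Z_j - V_i^{(ctr)} \right)^T}{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m}$$
Update the partitions
$$\mu_{ij}^{(ctr)} = \frac{1}{\sum_{k=1}^{c} \left( D_{ij} / D_{kj} \right)^{2/(m-1)}}$$
where
$$D_{ij}^2 = \left( Z_j - V_i^{(ctr)} \right)^T M_i \left( Z_j - V_i^{(ctr)} \right), \qquad M_i = \left( \rho_i \det(F_i) \right)^{1/p} F_i^{-1}$$
Until $\| U^{(ctr)} - U^{(ctr-1)} \| <$ tol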
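(b) Clustering for minimum linear modeling errors
Given the data set Z, choose the number of clusters c, the weighting exponent m (m > 1) and the termination tolerance, tol.
Initialize the fuzzy partition $U^{(0)}$ and the cluster centers $V^{(0)}$ and define $\mu_{ij}$ to be the (i, j)th element of U.
Repeat for ctr = 1, 2, …
Update the cluster prototypes $V^{(ctr)}$ as
$$V_i^{(ctr)} = \frac{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m Z_j}{\sum_{j=1}^{N} \left( \mu_{ij}^{(ctr-1)} \right)^m}$$
Compute $\delta_i$ as
$$\delta_i = \begin{cases} 1, & \text{if } \mu_{ij}^{(ctr-1)} > \mu_{kj}^{(ctr-1)} \;\; \forall k = 1, 2, \ldots, c, \; k \neq i \\ 0, & \text{otherwise} \end{cases}$$
Calculate $\theta_i$ for each cluster from the equation
$$\theta_i = \left[ \left( V_i^{1:(n-1)} \right)^T V_i^{1:(n-1)} \right]^{-1} V_i^{1:(n-1)} V_i^n \quad \text{using values of V at (ctr)}$$
Update the partition matrix U through the solution of the objective function
$$J = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m (Z_j - V_i)^T M_i (Z_j - V_i) + \sum_{i=1}^{c} \sum_{j=1}^{N} \delta_i \left\| Z_j^n - Z_j^{1:(n-1)} \theta_i \right\|^2$$
in the space of $M_i$ and $\mu_{ij}$, subject to the constraint set
$$\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j = 1, 2, \ldots, N$$
$$\det(M_i) = \rho_i \quad \forall i = 1, 2, \ldots, c$$
$$0 < \mu_{ij} < 1 \quad \forall i = 1, 2, \ldots, c; \;\; j = 1, 2, \ldots, N$$
$$0 < \sum_{j=1}^{N} (\mu_{ij})^m < N \quad \forall i = 1, 2, \ldots, c$$
Until $\| U^{(ctr)} - U^{(ctr-1)} \| <$ tol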
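The steps of the Gustafson–Kessel iteration in Table 1(a) can be sketched for two-dimensional data (one input, one output). This is a minimal illustration under stated assumptions, not the authors' code: the function name, the block-wise initial partition, the small `ridge` added to the covariance diagonal, the max-absolute-change stopping norm and the toy data are all choices made here for the sketch.

```python
def gustafson_kessel_2d(Z, c, m=2.0, tol=1e-6, max_iter=200, rho=1.0, ridge=1e-9):
    """Gustafson-Kessel clustering of 2-D points Z = [(x, y), ...] into c >= 2
    clusters. Returns (U, V): memberships U[i][j] and prototypes V[i]."""
    N, p = len(Z), 2  # p is the data dimension (2x2 algebra is hard-coded)
    # Assumed initial partition: consecutive index blocks, softened to 0.8 / 0.2
    U = [[0.8 if (j * c) // N == i else 0.2 / (c - 1) for j in range(N)]
         for i in range(c)]
    V = []
    for _ in range(max_iter):
        U_old = [row[:] for row in U]
        V, M = [], []
        for i in range(c):
            w = [u ** m for u in U_old[i]]
            sw = sum(w)
            # Cluster prototype: membership-weighted mean
            vx = sum(wj * z[0] for wj, z in zip(w, Z)) / sw
            vy = sum(wj * z[1] for wj, z in zip(w, Z)) / sw
            # Fuzzy covariance F_i; the ridge (an assumption) keeps it invertible
            fxx = sum(wj * (z[0] - vx) ** 2 for wj, z in zip(w, Z)) / sw + ridge
            fyy = sum(wj * (z[1] - vy) ** 2 for wj, z in zip(w, Z)) / sw + ridge
            fxy = sum(wj * (z[0] - vx) * (z[1] - vy) for wj, z in zip(w, Z)) / sw
            det = fxx * fyy - fxy * fxy
            scale = (rho * det) ** (1.0 / p) / det
            # Norm matrix M_i = (rho det F_i)^(1/p) F_i^{-1}, kept as (mxx, mxy, myy)
            M.append((scale * fyy, -scale * fxy, scale * fxx))
            V.append((vx, vy))
        for j, z in enumerate(Z):
            d2 = []
            for (vx, vy), (mxx, mxy, myy) in zip(V, M):
                dx, dy = z[0] - vx, z[1] - vy
                dist = mxx * dx * dx + 2.0 * mxy * dx * dy + myy * dy * dy
                d2.append(max(dist, 1e-12))  # guard against a zero distance
            for i in range(c):
                # (D_ij / D_kj)^(2/(m-1)) written in terms of squared distances
                U[i][j] = 1.0 / sum((d2[i] / d2[k]) ** (1.0 / (m - 1.0))
                                    for k in range(c))
        change = max(abs(U[i][j] - U_old[i][j])
                     for i in range(c) for j in range(N))
        if change < tol:
            break
    return U, V


# Toy data: two well-separated 3x3 grids of points (illustrative only)
offsets = [-0.5, 0.0, 0.5]
blob_a = [(dx, dy) for dx in offsets for dy in offsets]
blob_b = [(10.0 + dx, 10.0 + dy) for dx in offsets for dy in offsets]
U, V = gustafson_kessel_2d(blob_a + blob_b, c=2)
```

On steady state data that lie exactly on straight segments the covariance of a cluster is singular, which is why the sketch adds a small ridge before inverting; how the original work handles that case is not stated in the text.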
While the above optimization can be carried out using
constrained minimization techniques as in Table 1(b),
an alternative technique could be to view the two terms
appearing in the objective function J separately. As
mentioned in the earlier section, the minimization of the
first term along with the constraints can be carried out
analytically. The second term could be used independently as a validity measure that gives the requisite
number of clusters resulting in minimum modeling error.
Towards this, we define the linear model based
reconstruction error (LMRE) as
$$\mathrm{LMRE} = \sum_{i=1}^{c} \sum_{j=1}^{N} \delta_i \left\| Z_j^n - Z_j^{1:(n-1)} \theta_i \right\|^2 \tag{18}$$
where $\delta_i$ and $\theta_i$ are the crisp assignment weight and the local parameter vector of Eqs. (12) and (13).
The average LMRE is defined as
$$\mathrm{Av.LMRE} = \frac{\mathrm{LMRE}}{c} \tag{19}$$
The behavior of LMRE with the number of clusters is
monotonic and the knee of the graph corresponds to the
optimal selection of the number of clusters. An alternative measure could be the average reconstruction
error per cluster defined as
$$\mathrm{Av.LMRE/cluster} = \frac{1}{c} \sum_{i=1}^{c} \left[ \frac{\sum_{j=1}^{N} \delta_i \left\| Z_j^n - Z_j^{1:(n-1)} \theta_i \right\|^2}{N} \right] \tag{20}$$
Measures similar to the final prediction error index
employed in linear identification schemes may also be
used
$$\mathrm{FPE(LMRE)} = \frac{1 + c/N}{1 - c/N} \, \mathrm{Av.LMRE} \tag{21}$$
An examination of each of the above cluster validity
indices provides an insight into the number of clusters
that may be required. In this work we have used this
latter approach towards determining the number of
divisions required.
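The validity measures of Eqs. (18)–(21) can be sketched for the single-input case (n = 2), where the local parameter of Eq. (13) reduces to the ratio of the output and input coordinates of the cluster center. This is a hedged illustration: the function names and the four data points are assumptions, and a cluster center with zero input coordinate is not handled.

```python
def lmre(Z, U, V):
    """Linear model reconstruction error, Eq. (18), for points Z = [(u, y), ...],
    memberships U[i][j] and cluster centers V = [(v_u, v_y), ...]."""
    c = len(U)
    total = 0.0
    for j, (u, y) in enumerate(Z):
        # Crisp assignment: only the cluster with the highest membership counts
        i_star = max(range(c), key=lambda i: U[i][j])
        # Local linear model forced through the cluster center (single input)
        theta = V[i_star][1] / V[i_star][0]
        total += (y - u * theta) ** 2
    return total


def average_lmre(value, c):
    """Average LMRE, Eq. (19)."""
    return value / c


def average_lmre_per_cluster(value, c, N):
    """Average reconstruction error per cluster, Eq. (20)."""
    return value / (c * N)


def fpe_lmre(value, c, N):
    """FPE-like index, Eq. (21)."""
    return (1.0 + c / N) / (1.0 - c / N) * average_lmre(value, c)


# Hypothetical data: one cluster, one point slightly off the line y = 2u
Z = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.5), (4.0, 8.0)]
U = [[1.0, 1.0, 1.0, 1.0]]
V = [(2.5, 5.0)]
error = lmre(Z, U, V)
index = fpe_lmre(error, c=len(U), N=len(Z))
```

In practice these functions would be evaluated after clustering with c = 2, 3, 4, … and the values plotted against c to locate the knee described in the text.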
3. Local dynamic model structure selection
The segregation of the data into different local regions
is based on the steady state information. The dynamic
model identification is done within each of the local
regions by using suitable input signals, so that the
composite dynamics can be reproduced by a number of
local dynamic models of fairly constant gains. However,
as the segregation step itself does not guarantee (dynamic)
linearity of the sub-regions, it is also necessary to ascertain
whether the local regions obtained are adequately linear.
3.1. Linearity ascertaining indices
Two techniques were employed as initial tests to
assess the extent of linearity of the obtained local
regions.
3.1.1. Mean of the output
Each cluster center corresponds to the mean of the
local operating region. Perturbation is carried out
within each such local operating region utilizing a predetermined input sequence. As long as the process is
linear and the input sequence is mean centered, the
output is also expected to have a mean around its local
steady state value. This assumption holds good only for
linear processes, since positive and negative inputs of
equal amplitudes perturb the process uniformly in either
direction. Significant deviation of the mean of the local
process output from the mean of the local region could
be assumed as an indication of nonlinearity.
3.1.2. Condition number
If the dynamics in each cluster are linear and can be
represented by an ARX structure, the following equation
holds good for the assumed model orders na and nb,
which are greater than the true model order.
For a general SISO dynamic model under open-loop
conditions
$$y_k = a_1 y_{k-1} + a_2 y_{k-2} + \cdots + a_{na} y_{k-na} + b_1 u_{k-1} + b_2 u_{k-2} + \cdots + b_{nb} u_{k-nb} + e_k \tag{22}$$
Then
$$Y = XK + E \tag{23}$$
where $Y = [y_k, y_{k+1}, \ldots, y_{k+m}]^T$ is a vector of outputs
recorded from k to k + m instants, such that
m > na + nb + 1. The matrix X can be written as
$$X = \begin{bmatrix}
y_{k-1} & y_{k-2} & \cdots & y_{k-na} & u_{k-1} & u_{k-2} & \cdots & u_{k-nb} \\
y_{k} & y_{k-1} & \cdots & y_{k+1-na} & u_{k} & u_{k-1} & \cdots & u_{k+1-nb} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
y_{k+m-1} & y_{k+m-2} & \cdots & y_{k+m-na} & u_{k+m-1} & u_{k+m-2} & \cdots & u_{k+m-nb}
\end{bmatrix} \tag{24}$$
The parameter vector $K = [a_1 \; a_2 \; \cdots \; a_{na} \; b_1 \; b_2 \; \cdots \; b_{nb}]^T$ and the
error $E = [e_k \; e_{k+1} \; \cdots \; e_{k+m}]^T$.
Let the new matrix obtained by augmenting vector Y
with the matrix X be represented by
$$\Phi = [\, Y \;\; X \,] \tag{25}$$
An examination of the augmented regressor matrix would
indicate that it is ill conditioned if the process is
linear and na and nb are sufficiently larger than the true
model order. It must be noted that the presence of
higher signal to noise ratios or colored noise could
somewhat affect this interpretation regarding the linearity of the process in the local region. If the local
dynamics are non-linear, NARX models could be used
to effectively capture the nonlinearity, or each local
region may further be dynamically clustered into an
array of simpler structures as shown in [4].
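The mean-of-the-output test of Section 3.1.1 can be illustrated with two toy first-order processes driven by the same mean-centered input. This is a sketch under assumed dynamics: the coefficients 0.9 and 0.1, the quadratic input term and the square-wave input are hypothetical and are not taken from the fermenter case study. Signals are deviations from the local steady state.

```python
def simulate(process, u_seq, y0=0.0):
    """Run a one-step-ahead process model y[k+1] = process(y[k], u[k])."""
    y, out = y0, []
    for u in u_seq:
        y = process(y, u)
        out.append(y)
    return out


def mean(values):
    return sum(values) / len(values)


def linear_process(y, u):
    # Hypothetical linear first-order dynamics with unit steady state gain
    return 0.9 * y + 0.1 * u


def nonlinear_process(y, u):
    # Same dynamics with an added quadratic input term (hypothetical)
    return 0.9 * y + 0.1 * (u + u * u)


# Mean-centered square wave: alternating blocks of 10 samples at +1 and -1
u_seq = [1.0 if (k // 10) % 2 == 0 else -1.0 for k in range(400)]

mean_linear = mean(simulate(linear_process, u_seq))
mean_nonlinear = mean(simulate(nonlinear_process, u_seq))
```

For the linear process the output mean stays close to the local steady state (zero in deviation variables), whereas the quadratic term shifts the mean of the nonlinear output well away from it, which is the deviation the test looks for.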
3.2. Parsimonious model estimation
Ordinary least squares techniques and prediction error
methods (PEM) have been extensively quoted in the literature for parameter estimation [18]. But these techniques have been proven to perform well only when the
model order is known. Huang [19] proposed a generalized last principal component (GLPC) analysis for estimating the model orders for linear systems. A two step
method of estimating nonlinear low order models by
augmenting an average linear low order model with
selected nonlinear terms has been proposed by Docter
and Georgakis [20]. The orthogonal least squares techniques based on Classical Gram Schmidt (CGS), Modified
Gram Schmidt (MGS) and Householder transformation
have been shown to be effective by Chen et al. [21] in
parameter estimation and structure selection of NARX
models. So, based on the linearity test, a suitable identification methodology has to be used to determine the model order for
linear systems, or to identify the structure (if the system is
nonlinear), and to estimate the parameters.
In another ongoing work, yet unpublished [22], an
alternative method of estimating appropriate model
order/structure for ARX /NARX type models has been
investigated and has been found to be promising. This
proposed method is a two step technique based on
principal component regression. In the outer loop, the
correct model orders are identified by evaluating the key
variables that describe the dynamic variation in the
input output data blocks. The parameter estimation, for
the assumed set of key variables (model orders) is carried out in the inner loop. This strategy has been used in
this paper to estimate the model orders/structures and
estimate parameters for the local regions.
4. Fuzzy modeling methodology
The proposed fuzzy modeling procedure utilizing the
concepts discussed in Sections 2 and 3 can be described
as follows.
4.1. Division of the operating space
The steady state map is segregated into a number of
clusters using the adaptive norm based segregation
technique described in Section 2. The number of divisions necessary is determined through the application of
the cluster validity measures proposed. From the nature
of the membership functions obtained, the input space
is divided into a number of local grids. Suitable signal
design ensuring adequate excitation is carried out within
each local grid for the generation of dynamic data.
4.2. Construction of local models
The dynamic data generated through excitation of the
plant within each local grid are utilized for local
dynamic model construction. To ascertain the most
influential lagged variables, the proposed PCR based
key variable identification technique described in Section 3 is employed. To select a suitable basic model
structure (linear or nonlinear) within each local region, the linearity of each sub-region is ascertained
using the linearity determination indices discussed in
Section 3.1. Based on the results of this test, a
suitable representative structure, linear (ARX or FIR) or
nonlinear (NARX), is assumed for key variable selection.
The proposed key variable selection technique then
enables identification of the most significant lagged variables. The coefficients for the ARX and FIR models (if
linear) and NARX model (if nonlinear) are identified
through PLS based regression. The suitability of the
identified model is ascertained through cross-validation
and residual correlation tests.
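One simple consistency check on an identified local ARX model is to compare its steady state gain with the slope of the steady state map in that region. The sketch below uses the region 1 coefficients reported in Table 4, with the coefficient of the second output lag entered as −0.6554; that sign is an assumption made here because it reproduces the tabulated gain to within rounding. The function name is illustrative.

```python
def arx_steady_state_gain(a, b):
    """Steady state gain of y[k+1] = sum(a_i * y[k+1-i]) + sum(b_i * u[k+1-i]):
    setting all y equal and all u equal gives gain = sum(b) / (1 - sum(a))."""
    return sum(b) / (1.0 - sum(a))


# Local region 1 of Table 4 (sign of the second output coefficient assumed)
gain_region_1 = arx_steady_state_gain(a=[1.6338, -0.6554], b=[0.0045, 0.0037])
```

The result is about 0.38, close to the value 0.3806 listed for region 1; the small difference is what one would expect from coefficients rounded to four decimals when the denominator is as small as 0.0216.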
4.2.1. Aggregation of local models through a smooth
switching strategy
Given a new data point Xj, the strategy for aggregation involves the computation of cluster-wise memberships for the data point and weighting each local model
output by this cluster-wise membership. A projection
based technique for fuzzy composition has been
employed by Gomez-Skarmeta et al. [23] and Venkat
and Gudi [6]. An alternate ‘Method of Quasi Invariance’ for application to chemical systems has been discussed by Venkat and Gudi [6]. In this work, we have
employed the projection technique for fuzzy aggregation.
Direct application of this technique for prediction is not
possible as at any instant ‘k’, out of the n-dimensional
MISO mapping only a maximum of ‘n − 1’ dimensions
are known. This scheme thus involves the projection of
the ‘n’ dimensional cluster onto the known ‘p’ dimensional space.
The membership of the point Xj to cluster ‘i’ is
obtained using a probabilistic measure as
$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left[ \dfrac{\left\| X_j - V_i \right\|^P}{\left\| X_j - V_k \right\|^P} \right]^{2/(m-1)}} \tag{26}$$
where the superscript P on the norm denotes a projection onto the lower P dimensional space.
Remark 1. In an earlier work [4], the decomposition to
build local parsimonious models was based on an analysis
in the dynamic operating space after suitably perturbing
the plant with appropriate inputs. For a complex nonlinear plant with varying dynamics and gain sign changes
(as seen in the case study presented here), the design of
inputs is by itself a difficult task. In the approach proposed here, the decomposition based on steady state
information helps simplify the task of input design and
local model building.
Table 2
Nominal model parameters for the continuous fermenter
Yx/s = 0.4
α = 2.2
β = 0.2 h⁻¹
μm = 0.48 h⁻¹
Pm = 50 g/l
Km = 1.2 g/l
Ki = 22 g/l
Fig. 2. Cluster validity measures based on minimum linear model error.
Fig. 3. Segregation of the input–output steady state map: (a) location of regime centers; (b) regime wise membership functions.
Table 3
Segregation results for steady state data
Input (U2) range | Output (Y1) range | Regime center
0 < u < 12.8 | 0 < y < 4.6 | u = 6.6, y = 2.3
12.8 < u < 20.2 | 4.6 < y < 6.9 | u = 16.5, y = 5.8
20.2 < u < 25.2 | 6.9 < y < 7.15 | u = 22.7, y = 7.2
25.2 < u < 31 | 5.2 < y < 7.15 | u = 28.1, y = 6.3
31 < u < 40 | 0 < y < 5.2 | u = 35.7, y = 3.0
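The projected membership of Eq. (26) can be sketched for the SISO case, where only the input coordinate is known at prediction time. The sketch uses the regime centers of Table 3 and assumes a plain Euclidean distance in the known coordinates; the norm actually used in the paper may differ, and the function name, the `known_dims` argument and the query value u = 10 are illustrative.

```python
import math


def projected_memberships(x_known, centers, known_dims, m=2.0):
    """Memberships of a point in each cluster, with distances measured only
    along the known coordinates (listed in known_dims) of each cluster center."""
    d = [math.sqrt(sum((x - v[k]) ** 2 for x, k in zip(x_known, known_dims)))
         for v in centers]
    # A point that sits exactly on a projected center belongs fully to it
    hits = [di == 0.0 for di in d]
    if any(hits):
        return [1.0 / sum(hits) if h else 0.0 for h in hits]
    expo = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** expo for dk in d) for di in d]


# Regime centers (u, y) from Table 3; only the input u (index 0) is known
centers = [(6.6, 2.3), (16.5, 5.8), (22.7, 7.2), (28.1, 6.3), (35.7, 3.0)]
mu = projected_memberships([10.0], centers, known_dims=[0])
```

For u = 10 the first regime receives the largest membership (about 0.72 with m = 2) and the second most of the remainder, so a prediction at this input would blend mainly the first two local models.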
5. Control: multiple model MPC
The MPC technique has been extensively employed
for control of chemical processes. In Garcia et al. [24] an
introduction to DMC based control is provided. A
number of researchers have looked into the utilization
of nonlinear process models within the MPC framework
[25,26]. In this work we employ a multiple model based
MPC scheme wherein the composite model predictions are
obtained through fuzzy composition of individual local
model outputs. The advantage of fuzzy aggregation over
crisp composition techniques is illustrated through
control performance. The output of the composite
model when employing fuzzy composition can be
expressed as
$$O = \frac{\sum_{i=1}^{c} (\mu_i)^m O_i}{\sum_{i=1}^{c} (\mu_i)^m} \tag{27}$$
where $O_i$ represents the local model output in cluster i,
$\mu_i$ denotes the weight associated with the local model
and O the composite output. This scheme is compared
with two crisp aggregation strategies, one wherein the
output is computed through a crisp weighted composition and the other involving a non-weighted output
averaging of relevant local model outputs.
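The fuzzy composition of Eq. (27) and the two crisp alternatives it is compared with can be sketched as follows. This is a minimal illustration: the function names, the choice of keeping the `k` largest memberships in the crisp variants, and the example memberships and local outputs are assumptions, not values from the case study.

```python
def fuzzy_composition(mu, outputs, m=2.0):
    """Composite output as the mu^m-weighted average of all local model outputs."""
    w = [u ** m for u in mu]
    return sum(wi * oi for wi, oi in zip(w, outputs)) / sum(w)


def crisp_composition(mu, outputs, k=1, m=2.0, weighted=True):
    """Composite output from only the k local models with the largest memberships,
    either mu^m-weighted or as a plain (non-weighted) average."""
    top = sorted(range(len(mu)), key=lambda i: mu[i], reverse=True)[:k]
    w = [mu[i] ** m if weighted else 1.0 for i in top]
    return sum(wi * outputs[i] for wi, i in zip(w, top)) / sum(w)


# Hypothetical memberships and local model predictions at one sampling instant
mu = [0.6, 0.3, 0.1]
outputs = [1.0, 2.0, 4.0]

o_fuzzy = fuzzy_composition(mu, outputs)
o_hard = crisp_composition(mu, outputs, k=1)
o_crisp_weighted = crisp_composition(mu, outputs, k=2, weighted=True)
o_crisp_plain = crisp_composition(mu, outputs, k=2, weighted=False)
```

Because every local model contributes in proportion to its membership, the fuzzy output changes continuously as the operating point moves, whereas the crisp variants jump whenever the set of selected models changes.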
Fig. 4. Model coefficients for local region 1: (a) FIR coefficients derived from ARX; (b) direct FIR estimation.
Fig. 5. Model coefficients for local region 3: (a) FIR coefficients derived from ARX; (b) direct FIR estimation.
Fig. 6. Model coefficients for local region 5: (a) FIR coefficients derived from ARX; (b) direct FIR estimation.
Fig. 7. (a) Cross validation results for cluster 3; (b) comparison between direct FIR estimated model and ARX derived FIR model for cluster 4.
In a crisp composition strategy, only the subset of
models having the largest associated memberships is
employed. Let p, q and r denote the local models associated with the largest membership values (p, q, r ≤ c).
The weighted crisp composition yields an output O
which is expressed as
$$O = \frac{\sum_{i=p,q,r} (\mu_i)^m O_i}{\sum_{i=p,q,r} (\mu_i)^m} \tag{28}$$
In a non-weighted crisp composition scheme the output O is obtained as
$$O = \frac{\sum_{i=p,q,r} O_i}{\eta} \tag{29}$$
where $\eta$ denotes the number of local models associated
with the largest membership value. Both the above crisp
schemes [Eqs. (28) and (29)] reduce to the same expression
when only one local model is fired. To obtain good performance for the MPC scheme, the local model parameters
are optimized for long-range prediction capability.
Remark 2. As can be seen in the preceding section, the
algorithm requires information about the regions of the
operation, which is typically known through a priori
knowledge or through the above analysis. Perturbation in
each individual region and online composition of the models
as proposed above are required to implement this scheme.
6. Illustrative examples and a case study
The system under consideration is the well-mixed
continuous fermenter of Henson and Seborg [10]. The
dilution rate (U1) and the feed substrate concentration
(U2) are the manipulated variables while the cell mass
concentration (X1), substrate concentration (X2) and
product concentration (X3) are the state variables. The
model equation set for the continuous fermenter is as
follows
$$\frac{dX_1}{dt} = -U_1 X_1 + \mu X_1 \tag{30}$$
$$\frac{dX_2}{dt} = U_1 (U_2 - X_2) - \frac{\mu X_1}{Y_{x/s}} \tag{31}$$
$$\frac{dX_3}{dt} = -U_1 X_3 + (\alpha \mu + \beta) X_1 \tag{33}$$
$$Y_1 = X_1 \tag{34}$$
$$Y_2 = X_2 \tag{35}$$
$\mu$ is the specific growth rate, $\alpha \mu + \beta$ is the specific product
formation rate and $Y_{x/s}$ is the yield of the cell mass with
respect to the limiting substrate. The specific growth
rate is given through the substrate and product inhibition model as
$$\mu = \frac{\mu_m \left( 1 - \dfrac{X_3}{P_m} \right) X_2}{K_m + X_2 + \dfrac{X_2^2}{K_i}} \tag{36}$$
The nominal model parameters are listed in Table 2.
Table 4
Local model structures, coefficients and steady state gains
Local region | Settling time | Model structure | Steady state gain
1 | 20 | $y_{k+1} = 0.0045 u_k + 0.0037 u_{k-1} + 1.6338 y_k - 0.6554 y_{k-1}$ | 0.3806
2 | 30 | $y_{k+1} = 0.3503 u_k + 0.003 u_{k-1} + 1.7 y_k - 0.7189 y_{k-1}$ | 0.3188
3 | 40 | $y_{k+1} = 0.0003 u_k + 2.2196 y_k - 1.5927 y_{k-1} + 0.3678 y_{k-2}$ | 0.0682
4 | 100 | $y_{k+1} = -0.0006 u_k - 0.0003 u_{k-1} + 1.9086 y_k - 0.9113 y_{k-1}$ | −0.3352
5 | 300 | $y_{k+1} = -0.0003 u_k + 1.9561 y_k - 0.957 y_{k-1}$ | −0.4678
Fig. 8. Composite model predictions employing fuzzy aggregation: (a) disturbance in input U1 (10% noise); (b) variation in input U2; (c) comparison of plant and composite model predictions; (d) magnified view of the comparison between sampling instants 800 and 1600. RMSE of prediction errors = 0.0176.
Fig. 9. (a) Proposed multiple model based MPC scheme; (b) internal structure of the multiple model MPC.
This application exhibits problems of gain direction changes as well as nonlinearities in different regions of operation, as seen in Fig. 3(a). In this work, we restrict the analysis to the SISO case. The aim is to obtain a dynamic model that describes the effect of feed substrate concentration (U2) on cell mass concentration (Y1). The value of the dilution rate (U1) is maintained at 0.1636, which corresponds to the optimal operational value for operation of the fermenter [10].
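The fermenter model of Eqs. (30)-(36) can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used in this work. The parameter values in `ASSUMED_PARAMS`, the operating point `x_nominal` and the dilution rate 0.202 are values commonly quoted for the Henson and Seborg fermenter and are assumptions here; Table 2 remains the authoritative source. The function names and the fixed-step Runge-Kutta integrator are likewise illustrative choices.

```python
# Assumed parameter values (commonly quoted for this fermenter; see Table 2
# of the paper for the values actually used).
ASSUMED_PARAMS = {
    "Y_xs": 0.4,    # cell-mass yield Y_{x/s}
    "alpha": 2.2,   # growth-associated product formation coefficient
    "beta": 0.2,    # non-growth-associated product formation coefficient
    "mu_m": 0.48,   # maximum specific growth rate
    "P_m": 50.0,    # product saturation constant
    "K_m": 1.2,     # substrate saturation constant
    "K_1": 22.0,    # substrate inhibition constant
}


def specific_growth_rate(x2, x3, p):
    """Eq. (36): growth rate with substrate and product inhibition."""
    return p["mu_m"] * (1.0 - x3 / p["P_m"]) * x2 / (
        p["K_m"] + x2 + x2 ** 2 / p["K_1"]
    )


def fermenter_rhs(x, u1, u2, p):
    """Right-hand sides of the three state equations for x = [X1, X2, X3]."""
    x1, x2, x3 = x
    mu = specific_growth_rate(x2, x3, p)
    dx1 = -u1 * x1 + mu * x1
    dx2 = u1 * (u2 - x2) - mu * x1 / p["Y_xs"]
    dx3 = -u1 * x3 + (p["alpha"] * mu + p["beta"]) * x1
    return [dx1, dx2, dx3]


def rk4_step(x, u1, u2, p, dt):
    """One classical fourth-order Runge-Kutta step with inputs held constant."""
    def shifted(state, slope, scale):
        return [s + scale * k for s, k in zip(state, slope)]

    k1 = fermenter_rhs(x, u1, u2, p)
    k2 = fermenter_rhs(shifted(x, k1, dt / 2.0), u1, u2, p)
    k3 = fermenter_rhs(shifted(x, k2, dt / 2.0), u1, u2, p)
    k4 = fermenter_rhs(shifted(x, k3, dt), u1, u2, p)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# Assumed nominal operating point: the derivatives should be close to zero.
x_nominal = [6.0, 5.0, 19.14]
residual = fermenter_rhs(x_nominal, 0.202, 20.0, ASSUMED_PARAMS)

# One step at the dilution rate used in this case study (U1 = 0.1636).
x_next = rk4_step(x_nominal, 0.1636, 20.0, ASSUMED_PARAMS, dt=0.1)
print(residual, x_next)
```

Repeating `rk4_step` until the state stops changing, for a sweep of U2 values at fixed U1, is one way a steady state map of the kind used for the decomposition could be generated.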
The steady state map of the input(s)–output pair is
generated through a simulation model. The allowable
range of U2 is fixed between 0 and 40. An analysis of the
data based on the LMRE based validity measures
(Fig. 2) indicates the optimal number of clusters to be 5.
The segregation of the steady state map into 5 clusters is
carried out employing the adaptive norm based technique discussed in Section 2. The location of the center
and nature of the membership function obtained for
each cluster is shown in Fig. 3.
Based on the membership plots, the input space is
divided into 5 grids. Membership cuts for each of the
membership functions are selected such that the input
range for each cluster spans midway through the region
of overlap between adjacent membership functions. The
location and characteristics of the grids are depicted in
Table 3.
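The choice of membership cuts can be illustrated with a short sketch. It assumes that the membership functions are available as sampled values on a grid of the input and that "midway through the region of overlap" can be approximated by the point where two adjacent membership functions cross; this is one possible reading, not necessarily the rule used to build Table 3. The function name `membership_cuts` and the triangular memberships in the example are hypothetical.

```python
def membership_cuts(u_grid, memberships):
    """Return the inputs at which adjacent membership functions cross.

    u_grid      : increasing list of input values
    memberships : list of membership curves (one list per cluster, sampled
                  on u_grid), ordered by increasing cluster centre
    """
    cuts = []
    for left, right in zip(memberships[:-1], memberships[1:]):
        for k in range(1, len(u_grid)):
            d_prev = left[k - 1] - right[k - 1]
            d_curr = left[k] - right[k]
            if d_prev > 0.0 and d_curr <= 0.0:
                # Linear interpolation of the crossing between two samples.
                frac = d_prev / (d_prev - d_curr)
                cuts.append(u_grid[k - 1] + frac * (u_grid[k] - u_grid[k - 1]))
                break
    return cuts


# Hypothetical example with two overlapping memberships on 0 <= U2 <= 40.
u_grid = [float(v) for v in range(0, 41)]
mu_a = [max(0.0, 1.0 - u / 20.0) for u in u_grid]
mu_b = [max(0.0, min((u - 10.0) / 20.0, 1.0)) for u in u_grid]
cuts = membership_cuts(u_grid, [mu_a, mu_b])
print(cuts)  # the two curves cross at U2 = 15
```

With five clusters the function returns four cuts, which together with the limits of the allowable input range define five grids.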
Within each local grid, perturbation data is generated through appropriate signal design. The linearity indices are evaluated to ascertain linearity within each local region. In this case study, all five regions were found to be approximately linear. The key lagged variables are estimated through the PCR based technique discussed earlier. Since the local regions are linear, ARX and FIR forms for the models are selected. The tolerance limits on the percentage residual in y are set based on the minimum deviation between the actual FIR and the ARX derived FIR coefficients. The coefficients of the ARX (reduced to FIR form) and direct FIR models constructed for the local regions 1, 3 and 5 are depicted in Figs. 4–6.
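Reducing an ARX model to FIR form amounts to computing its impulse response. The sketch below does this by direct recursion for a model written as y(k+1) = sum of b_j u(k-j) + sum of a_j y(k-j). The function name `arx_to_fir` is hypothetical, and the coefficients are those listed for local region 1 in Table 4, with the coefficient of y(k-1) taken as negative, which is an assumption consistent with the tabulated steady state gain.

```python
def arx_to_fir(b, a, n_terms):
    """Impulse response of y[k+1] = sum_j b[j]*u[k-j] + sum_j a[j]*y[k-j].

    Returns [y[1], y[2], ...] for a unit pulse applied at k = 0, which are
    the FIR coefficients of the equivalent model truncated at n_terms.
    """
    u_hist = [0.0] * len(b)  # u[k], u[k-1], ... (most recent first)
    y_hist = [0.0] * len(a)  # y[k], y[k-1], ... (most recent first)
    fir = []
    for k in range(n_terms):
        u_now = 1.0 if k == 0 else 0.0
        u_hist = [u_now] + u_hist[:-1]
        y_next = sum(bj * uj for bj, uj in zip(b, u_hist)) + sum(
            aj * yj for aj, yj in zip(a, y_hist)
        )
        fir.append(y_next)
        y_hist = [y_next] + y_hist[:-1]
    return fir


# Local region 1 (coefficient signs as assumed in the lead-in).
b_region_1 = [0.0045, 0.0037]
a_region_1 = [1.6338, -0.6554]
fir_region_1 = arx_to_fir(b_region_1, a_region_1, 400)
print(fir_region_1[:2])   # first two FIR coefficients
print(sum(fir_region_1))  # approaches the steady state gain of the ARX model
```

Comparing such ARX derived coefficients with a directly estimated FIR model, term by term, is the kind of check reported in Figs. 4–6.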
Fig. 10. Comparative control performance of the crisp and fuzzy model switching strategies in the multiple model MPC framework (fuzzy aggregation is employed to get composite model predictions in either case).
Fig. 11. Disturbance rejection performance of the proposed MPC scheme subjected to a 10% change in U1.
Fig. 7(a) depicts the cross-validation performance of model 3. This model corresponds to the peak of the steady state map. This region is characterized by low gains and is typically difficult to model. However, the model constructed through the proposed scheme for local region 3 shows excellent cross validation, as ascertained from Fig. 7(a). Fig. 7(b) shows a comparison of the ARX derived FIR and directly estimated FIR coefficients for local region 4. The local model structures, coefficients and steady state gains are tabulated in Table 4. It is noted from Table 4 that the dynamics in each of the local regions are significantly disparate, with the local time constants varying from 5 s in region 1 to 75 s in region 5. It is further observed that the trend of the steady state gains of the local models constructed follows the behavior expected from the steady state map of the process as depicted in Fig. 2(a). The adequate performance of each local model is verified through cross validation within each local region.
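The entries of Table 4 can be cross-checked with a small calculation. The sketch below assumes the ARX form y(k+1) = sum of b_j u(k-j) + sum of a_j y(k-j), for which the steady state gain is sum(b) / (1 - sum(a)), and it assumes the common rule of thumb that the settling time is about four time constants. The function name is hypothetical, and the sign of the y(k-1) coefficient for region 1 is taken as negative.

```python
def arx_steady_state_gain(b, a):
    """Steady state gain of y[k+1] = sum_j b[j]*u[k-j] + sum_j a[j]*y[k-j]."""
    return sum(b) / (1.0 - sum(a))


# Local region 1 coefficients from Table 4 (sign of the second a-term assumed).
gain_region_1 = arx_steady_state_gain([0.0045, 0.0037], [1.6338, -0.6554])

# Settling times from Table 4; time constant taken as settling time / 4.
settling_times = {1: 20.0, 2: 30.0, 3: 40.0, 4: 100.0, 5: 300.0}
time_constants = {region: ts / 4.0 for region, ts in settling_times.items()}

print(round(gain_region_1, 4))               # about 0.38
print(time_constants[1], time_constants[5])  # 5.0 75.0
```

The computed gain for region 1 is close to the tabulated 0.3806, the small difference presumably being due to rounding of the printed coefficients. The time constants obtained for regions 1 and 5 agree with the 5 s and 75 s quoted above.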
Each local model is constructed based on a sampling
rate that is characteristic of the local time constant.
Fuzzy aggregation is used for the composition of the
local models and the performance of the aggregated
model through the range of operation is shown in Fig. 8.
It is observed that the composite model cross validates
excellently through the entire range of operation.
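For comparison with the fuzzy aggregation used here, the two crisp composition rules of Eqs. (28) and (29) can be sketched as follows. The function name `crisp_composition`, the choice of three fired models and the exponent m = 2 are assumptions made for illustration; the local model outputs and membership values in the example are made up.

```python
def crisp_composition(outputs, memberships, n_fired=3, m=2.0, weighted=True):
    """Combine the outputs of the n_fired local models with largest membership.

    weighted=True  : Eq. (28), weights are the memberships raised to power m
    weighted=False : Eq. (29), plain average over the fired models
    """
    ranked = sorted(range(len(outputs)), key=lambda i: memberships[i], reverse=True)
    fired = ranked[:n_fired]
    if weighted:
        weights = [memberships[i] ** m for i in fired]
        return sum(w * outputs[i] for w, i in zip(weights, fired)) / sum(weights)
    return sum(outputs[i] for i in fired) / len(fired)


# Hypothetical predictions of five local models and their memberships.
local_outputs = [1.0, 2.0, 3.0, 4.0, 5.0]
local_memberships = [0.0, 0.1, 0.6, 0.3, 0.0]

o_weighted = crisp_composition(local_outputs, local_memberships, weighted=True)
o_plain = crisp_composition(local_outputs, local_memberships, weighted=False)
print(o_weighted, o_plain)
```

In this example the weighted result stays close to the output of the dominant model, while the plain average gives equal weight to a model whose membership is only 0.1. With `n_fired=1` both rules return the output of the single fired model, as noted for Eqs. (28) and (29).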
6.1. Control
Control of the fermenter through the range of operation is attempted through the incorporation of the proposed multiple model based MPC scheme (Fig. 9). Both fuzzy and crisp switching strategies are employed and a comparative performance of either MPC scheme is shown in Fig. 10. The dashed lines show the control response when a crisp weighted average switching strategy is used and the dotted lines show the control response when fuzzy switching is employed. From the control results obtained, it is clearly observed that the fuzzy switching strategy performs better than the crisp switching technique. The MPC scheme is observed to fail when the crisp non-weighted output averaging technique is employed. A disturbance in the form of a variation in the value of U1 is introduced for a fixed time and the ability of the fuzzy switching based MPC scheme to reject the disturbance is studied. The disturbance rejection performance of the proposed scheme is observed to be excellent, as shown in Fig. 11. The performance of the MPC scheme in the presence of process and measurement noise is shown in Fig. 12.
Fig. 12. Performance of the proposed MPC scheme in presence of both process and measurement noise (up to 10% deviations considered).
7. Conclusion
In this work, we propose a new fuzzy modeling based
methodology for decomposition of the overall nonlinear
plant behavior to obtain simpler dynamic models. New
validity measures that aid in the determination of the
optimal number of divisions of the operating space
based on linear model reconstruction errors have been
proposed. This decomposition, which is based on a
steady state map of the input–output space, is shown to
be feasible for dynamic model development in local
regions. Linearity indices are suggested to ascertain linearity (or non-linearity) of the local regions. Towards
obtaining accurate predictions over the entire operating
range, we have proposed and evaluated a method for generating composite predictions based on fuzzy aggregation.
An MPC scheme is also proposed within the multiple model framework. The validity of the identification and control strategies proposed is demonstrated through application to the nonlinear fermenter problem of Henson and Seborg [10]. The superiority of the fuzzy switching strategy over the crisp switching methodology is highlighted through control performance. The multiple model based MPC scheme shows good set point tracking and disturbance rejection through the entire range of operation.
It can be seen that, since the local models have stable realizable inverses, the local controllers can also be assumed stable. Also, since the fuzzy composition is done by weighting each controller output by the appropriate membership, and the memberships sum to unity, it may be concluded that the composite control action will also be stable in a bounded input bounded output sense. However, a rigorous analysis of the stability aspects is still required and will be a focus of
future research.
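The boundedness argument above can be written out in one line. The sketch assumes that the composite control action is the membership-weighted sum of the local controller outputs, with nonnegative memberships that sum to one; the symbols $u_i(k)$ for the local controller outputs and $u(k)$ for the composite action are notation introduced here for illustration.

$$u(k) = \sum_{i=1}^{c} \mu_i(k)\, u_i(k), \quad \mu_i(k) \ge 0, \quad \sum_{i=1}^{c} \mu_i(k) = 1 \;\;\Longrightarrow\;\; |u(k)| \le \sum_{i=1}^{c} \mu_i(k)\, |u_i(k)| \le \max_{i} |u_i(k)|$$

This shows only that the blended action is bounded whenever every local action is bounded. It does not address stability of the closed loop under switching between regions, which is the part that needs the rigorous analysis mentioned above.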
References
[1] T.A. Johansen, B.A. Foss, Constructing NARMAX models
using ARMAX models, Int. J. Control. 58 (1993) 1125–1153.
[2] T.A. Johansen, B.A. Foss, Identification of nonlinear system
structure and parameters using regime decomposition, Automatica 31 (2) (1995) 321–326.
[3] B.A. Ogunnaike, W.H. Ray, Process Dynamics, Modeling and
Control, Oxford University Press, 1994.
[4] A.N. Venkat, R.D. Gudi, Fuzzy Segregation Based Identification of Nonlinear Dynamic System, presented at ACC 2001,
Washington DC.
[5] B. Kuipers, K. Astrom, The composition and validation of heterogeneous control laws, Automatica 20 (1994) 233–249.
[6] A.N. Venkat, R.D. Gudi, Fuzzy Segregation based identification
and control of nonlinear dynamic systems. Journal of Industrial
and Engineering Chemistry Research (submitted).
[7] K. Schott, W. Bequette, Control of chemical reactors using multiple model adaptive control, IFAC DYCORD+ '95, Helsingør, Denmark (1995) 345–350.
[8] K.S. Narendra, J. Balakrishnan, M.K. Ciliz, Adaptation and
learning using multiple models, switching and tuning, IEEE
Control Systems Magazine 15 (3) (1995) 37–51.
[9] A. Kordon, Y.O. Fuentes, B.A. Ogunnaike, P.S. Dhurjati,
An intelligent parallel control system structure for plants
with multiple operating regimes, J. Proc. Control 9 (1999)
453–460.
[10] M.A. Henson, D.E. Seborg, Nonlinear control strategies for
Continuous Fermenter, Proc. of 1990 American Control Conference, San Diego, 1990.
[11] A. Baraldi, P.A. Blonda, Survey of fuzzy clustering
algorithms for pattern recognition-part I, IEEE Trans. on
Systems, Man and Cybernetics- Part B: Cybernetics. 29 (6)
(1999) 778–785.
[12] A. Baraldi, P.A. Blonda, Survey of fuzzy clustering
algorithms for pattern recognition-part II, IEEE Trans. on
Systems, Man and Cybernetics- Part B: Cybernetics. 29 (6)
(1999) 786–801.
[13] D.E. Gustafson, W.C. Kessel, Fuzzy clustering with fuzzy covariance matrix, Proc. IEEE Intl. Conference on Fuzzy Systems,
San Diego (1979) 761–766.
[14] J.C. Bezdek, Pattern recognition with fuzzy objective algorithms,
Plenum Press, New York, 1981.
[15] A. Kandel, Fuzzy Techniques in Pattern Recognition, Wiley-Interscience Publishers, New York, 1982.
[16] I. Gath, A. Geva, Unsupervised optimal fuzzy clustering, IEEE
Trans. Pattern Analysis and Machine Intelligence 7 (1989) 773–781.
[17] R. Krishnapuram, C.P. Freg, Fitting an unknown number of
lines and planes to image data through compatible cluster merging, Pattern Recognition 25 (4) (1992) 385–400.
[18] T. Soderstrom, P. Stoica, System Identification, Prentice-Hall
International, UK, 1989.
[19] B. Huang, Process identification based on last principal component
analysis, Journal of Process Control 11 (2001) 19–33.
[20] W.A. Docter, C. Georgakis, Nonlinear reduced order models for
separation processes via augmentation of linear subspace models,
Proceedings of the American Control Conference (1998) 2108–
2112.
[21] S. Chen, S.A. Billings, W. Luo, Orthogonal least squares methods and their application to non-linear system identification, Int.
J. Control 50 (5) (1989) 1873–1896.
[22] P. Vijaysai, R.D. Gudi, Issues in key variable selection in
dynamic model identification. Canadian Journal of Chemical
Engineering (to be submitted).
[23] A.F. Gomez-Skarmeta, M. Delgado, M.A. Vila, About the use of
fuzzy clustering techniques for fuzzy model identification, Fuzzy
Sets Systems 10 (6) (1999) 179–188.
[24] C.E. Garcia, D.M. Prett, M. Morari, Model predictive control:
theory and practice—a survey, Automatica 25 (3) (1989) 335–348.
[25] J.J. Song, S. Park, Neural model predictive control for nonlinear chemical processes, J. Chem. Engg. Jpn. 26 (4) (1993)
347–354.
[26] J. Saint-Donat, N. Bhat, T.J. McAvoy, Neural net based model
predictive control, Int. J. Control. 54 (6) (1991) 1453–1468.