
International Conference
AUTOMATICS AND INFORMATICS’2013
3-7 October 2013, Sofia, Bulgaria
GENERALIZED REPRESENTATIONS OF MULTIVARIABLE LINEAR
PARAMETERIZED MODELS
A. Efremov
Technical University of Sofia, Sofia 1000, blvd. Kliment Ohridski 8
[email protected]
Abstract: There are two possible generalized representations of multiple input multiple output (MIMO) regression models which are linear w.r.t. their parameters. In one of them, the model parameters are collected in a matrix and the regressors are in a vector; in the other representation, the parameters are in a vector and the regressors at a given time instant are gathered in a matrix. In the paper both types of representations are considered. Their advantages and disadvantages are summarized, and their applicability within the whole experimental modelling process is discussed.
Key words: MIMO model, linear parameterization, parameter matrix, parameter vector.
INTRODUCTION
Nowadays, mankind possesses a tremendous amount of data regarding many real-life phenomena. The availability of such a source of information is the main premise for the widespread use of experimental modelling. Incorporating more (appropriate) factors in a model improves its ability to represent the specifics of the investigated system's behaviour. Naturally, in fields like economics, sociology, finance, medicine, etc. the models are MIMO. Also, models that are linear w.r.t. their parameters are preferable (and applicable in many cases), because the theory used for their determination is well known and easy to apply. For instance, the iterative methods for numerical optimization, which are frequently used when the model is a non-linear function of the parameters, are avoided in the case of linear parameterized models. Furthermore, these models have proven to be effective representations of systems with even strongly non-linear behaviour. The key point is to transform the model into a linear parameterized form by finding appropriate (non-linear) functions of the initial factors and/or the outputs [3].
The considerations in the paper are focused on multivariable linear parameterized regression models.
GENERAL FORMS
When the linear parameterized model is SISO, it can be written in the following general form

y_k = φ^T_k θ + e_k ,                                        (1)

where y_k is the system output, e_k is the residual containing the output variation not accounted for by the model, all factors (possibly transformed) are in the regression vector φ_k and all model parameters are arranged in the vector θ. This form is convenient, because all parameters are gathered in a single vector, which has to be estimated from the available data. Obviously the dimensions of both vectors are equal and their product is the output predicted by the model (a scalar), which is denoted below by ŷ_k.
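As an illustration, θ in a model of form (1) can be estimated by ordinary least squares once the regression vectors for N time instants are stacked into a matrix. A minimal numpy sketch with illustrative synthetic data (the variable names and numeric values are assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# True SISO model y_k = phi_k^T theta + e_k with z = 3 regressors.
theta_true = np.array([0.5, -1.2, 2.0])
N = 200
Phi = rng.normal(size=(N, 3))                       # row k is phi_k^T
y = Phi @ theta_true + 0.01 * rng.normal(size=N)    # small residual e_k

# Ordinary least-squares estimate: minimizes ||y - Phi theta||^2.
theta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
```

With a low noise level the estimate is close to the true parameter vector, which is exactly why the single-vector arrangement of the parameters is convenient for estimation.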
JOHN ATANASOFF SOCIETY OF AUTOMATICS AND INFORMATICS

In this paper, the possible representations of dynamic MIMO linear parameterized models are investigated. Let the model have m inputs and ℓ outputs, and let the input and output values at the current (k-th) time instant be gathered in the vectors

u_k = [u_{1,k}  u_{2,k}  ...  u_{m,k}]^T ,
y_k = [y_{1,k}  y_{2,k}  ...  y_{ℓ,k}]^T .
The multiplication between the factors and the parameters returns the vector ŷ_k ∈ R^ℓ. With this notation, there are two possible ways to represent the multivariable linear parameterized model in a general form. One way is for the parameters to be gathered in the matrix Θ ∈ R^{z×ℓ} and the regressors in the vector φ_k ∈ R^z (z is the number of factors). This form is [4, 5]

y_k = Θ^T φ_k + e_k .                                        (2)
The other way is for the parameters to be placed in the vector θ ∈ R^p (i.e. the model has p parameters), while the regressors at a given time instant are arranged in the matrix Φ_k ∈ R^{ℓ×p}, i.e.

y_k = Φ_k θ + e_k .                                          (3)
The matrices and vectors are of appropriate structures. There are many realizations of each form, depending on how the parameters and regressors are arranged. At first sight both general forms should be equivalent, because, regardless of the right-hand side, the result is the observed system output (the residual is introduced in the expressions). But as will be shown, (2) and (3) have specifics which lead to different advantages and disadvantages and, respectively, to different applications.
MIMO ARX MODEL
When input-output data are available (and assuming they represent the dynamics of the investigated system), it is natural first to use an auto-regressive model with exogenous input (ARX), which gives the relation between y_k and the past inputs and outputs. Then, if an appropriate ARX structure is not found, another type of regression model could be chosen (like ARMAX – an ARX model with a moving average (MA) noise filter). Alternatively, another parameter estimator could be used, or more factors could be included in the modelling process. In this paper only the ARX model is considered as an initial system representation. It has the form

A(q^{-1}) y_k = B(q^{-1}) u_k + e_k .                        (4)

Keeping in mind that u_k ∈ R^m and y_k ∈ R^ℓ, the polynomial matrices are A(q^{-1}) ∈ R^{ℓ×ℓ}_{na} and B(q^{-1}) ∈ R^{ℓ×m}_{nb}, where na and nb denote the maximum degrees of the polynomials in A(q^{-1}) and B(q^{-1}). When the model (4) is MIMO, it can be considered as a system of ℓ equations. Each equation is a MISO sub-model, associated with a particular output.
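The recursion behind (4) can be made concrete with a small simulation. The sketch below (with coefficients chosen arbitrarily, only ensuring stability) runs a 2-output, 1-input ARX model with na = nb = 1, where each row of the vector update is one MISO sub-model:

```python
import numpy as np

rng = np.random.default_rng(1)

# ARX model (I + A1 q^-1) y_k = B1 q^-1 u_k + e_k, ell = 2, m = 1,
# i.e. y_k = -A1 y_{k-1} + B1 u_{k-1} + e_k. Values are illustrative.
A1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])      # ell x ell
B1 = np.array([[1.0],
               [0.5]])           # ell x m

N = 50
u = rng.normal(size=(N, 1))
y = np.zeros((N, 2))
for k in range(1, N):
    e_k = 0.01 * rng.normal(size=2)
    # Each row of this vector equation is one MISO sub-model.
    y[k] = -A1 @ y[k - 1] + B1 @ u[k - 1] + e_k
```

Reading the update row by row shows the decomposition into ℓ MISO equations: output 1 depends on both past outputs and the past input, output 2 only on its own past value and the input.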
PARAMETER MATRIX FORM
Both polynomial matrices in (4) can be represented as the following matrix polynomials

A(q^{-1}) = I_ℓ + A_1 q^{-1} + ... + A_{na} q^{-na} ,        (5)
B(q^{-1}) = 0_{ℓ×m} + B_1 q^{-1} + ... + B_{nb} q^{-nb} ,    (6)

where A_i ∈ R^{ℓ×ℓ} and B_j ∈ R^{ℓ×m} contain the parameters of all polynomials in A(q^{-1}) and B(q^{-1}) which multiply y_{k-i} and u_{k-j} respectively. The intercept in (5) is chosen to be the identity matrix, since in each equation (a MISO sub-model) this provides the presence of only one of the outputs at the current time instant (the i-th equation contains only y_{i,k}). All other regressors are from previous time instants. This simplifies the model determination and its usage. The intercept in (6) is a zero matrix, because it is assumed that first y_k is measured, then the input u_k is determined as a function of y_k and finally u_k is applied to the investigated system. This is the standard case in prediction, control and other model applications.
The matrix polynomials (5) and (6) can be used to obtain a specific realization of the general form (2), where all parameters are placed in a matrix.
Let (5) be rewritten as

A(q^{-1}) = I_ℓ + Ã(q^{-1}) .

Then (4) can be formulated as

y_k = −Ã(q^{-1}) y_k + B(q^{-1}) u_k + e_k
    = −A_1 y_{k-1} − ... − A_{na} y_{k-na} + B_1 u_{k-1} + ... + B_{nb} u_{k-nb} + e_k .

Let the vector φ_k, containing all regressors, be

φ_k = [−y^T_{k-1} ... −y^T_{k-na}  u^T_{k-1} ... u^T_{k-nb}]^T              (7)

and the parameter matrix Θ be

Θ = [A_1 ... A_{na}  B_1 ... B_{nb}]^T .

Here the number of factors necessary to predict each model output at a given time instant is z = ℓ·na + m·nb. Then the ARX model, represented in the general parameter matrix form, becomes

y_k = Θ^T φ_k + e_k .

Different realizations of this form can be obtained by rearranging the regressors in φ_k (the structure of Θ has to be changed accordingly, so that ŷ_k = Θ^T φ_k remains the same regardless of the structure of φ_k).

From (7) it is seen that, when the parameters are gathered in a matrix, all polynomials in a polynomial matrix have to be of the same degree. For instance, let the degrees na_{ij} = deg(a_{ij}(q^{-1})) be known (and not equal for all polynomials), where a_{ij}(q^{-1}) is the ij-th polynomial of A(q^{-1}). Then, in order to construct Θ, it is assumed that all polynomial degrees are

na = max(na_{11}, na_{12}, ..., na_{ℓℓ}) .

The same restriction is imposed on the polynomial degrees in B(q^{-1}). This is the main disadvantage of the parameter matrix form, which may sometimes lead to overfitted models and/or to numerical problems.

From the last discussion it follows that this representation is not appropriate for an accurate determination of the model structure. But if no a priori information about the polynomial degrees is available, (2) can be used for a fast orientation in the model structure. This topic is discussed later.

Another application of (2) is when restrictions are imposed on the model structure for economic reasons. An example from the finance industry, and more precisely from credit scoring, is when part of the factors is provided (at a corresponding price) by credit bureaus. These factors are significantly more discriminative and, from this point of view, more desirable as factors in the model, compared with other factors, like those provided by the credit applicants. When a bank estimates how risky a credit applicant is, it may use a regression model to predict the applicant's behaviour. Normally the model requires some bureau factors. As each bureau factor costs money, it is reasonable to reduce the number of these factors in the model. In this application, the model outputs can be the risk levels associated with different bank strategies, potential losses, etc. Therefore, if a bureau factor is introduced in the model to predict a specific output, the same factor has to contribute to the prediction of the other outputs. If the regression model is in the form (2), this restriction is naturally accounted for in the process of model structure determination.

Another example of the application of (2), concerning the economic aspects, is the building of a control system. Here, each process included in the model is connected with an investment (especially in serial production). If a new input is incorporated in the model, then a corresponding actuator has to be provided, and if an output is added, then a corresponding sensor is needed. In order to reduce the investment, again the parameter matrix representation can be used.
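The parameter matrix realization above can be sketched in code. For an illustrative 2-output, 1-input toy system with na = nb = 1 (so Θ = [A_1 B_1]^T and φ_k = [−y^T_{k−1} u^T_{k−1}]^T), Θ is recovered from simulated data by multi-output least squares; all numeric values are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative 2-output, 1-input ARX with na = nb = 1.
A1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])
B1 = np.array([[1.0],
               [0.5]])
Theta_true = np.hstack([A1, B1]).T          # z x ell, here 3 x 2

# Simulate y_k = Theta^T phi_k + e_k with phi_k = [-y_{k-1}^T, u_{k-1}^T]^T.
N = 300
u = rng.normal(size=(N, 1))
y = np.zeros((N, 2))
for k in range(1, N):
    phi_k = np.concatenate([-y[k - 1], u[k - 1]])
    y[k] = Theta_true.T @ phi_k + 0.01 * rng.normal(size=2)

# Stack regressors row-wise and solve the multi-output least-squares
# problem Phi Theta = Y for the whole parameter matrix at once.
Phi = np.array([np.concatenate([-y[k - 1], u[k - 1]]) for k in range(1, N)])
Y = y[1:]
Theta_hat, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
```

The single `lstsq` call estimates all ℓ columns of Θ simultaneously, which is the computational attraction of the parameter matrix form: one shared regressor vector serves every output.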
PARAMETER VECTOR FORM
In order to translate the ARX model into the general form (3), it is convenient to start with the model written as

y_k = −Ã(q^{-1}) y_k + B(q^{-1}) u_k + e_k .                 (8)

Let us introduce the block-diagonal matrices Y_k ∈ R^{ℓ×ℓ²} and U_k ∈ R^{ℓ×ℓm}, which have ℓ (equal) diagonal blocks, equal to y^T_k and u^T_k respectively. Let also a(q^{-1}) and b(q^{-1}) be the following polynomial vectors

a(q^{-1}) = vec(Ã^T(q^{-1})) ,    b(q^{-1}) = vec(B^T(q^{-1})) .

Here vec(·) denotes matrix vectorization. For a matrix M ∈ R^{m×n}, the resulting vector after the vectorization is

vec(M) = [M^T_{·1}  M^T_{·2}  ...  M^T_{·n}]^T = [m_{11}  m_{21}  ...  m_{mn}]^T ∈ R^{mn} ,

where M_{·i} is the i-th column of M. Then (4) can be written as

y_k = −Y_k a(q^{-1}) + U_k b(q^{-1}) + e_k .

The multiplication between Y_k and a(q^{-1}) leads to a vector with ℓ components. Actually, the i-th component is the scalar product between the i-th row of Ã(q^{-1}) and y_k. It represents the overall influence of the previous values of the outputs on the current value of the i-th output. By analogy, the i-th element of U_k b(q^{-1}) corresponds to the overall effect of the previous inputs on the i-th output. Indeed, because the polynomials in Ã(q^{-1}) and B(q^{-1}) (and hence in a(q^{-1}) and b(q^{-1})) don't have intercepts, all regressors in (4) are before the k-th time instant.

In order to arrive at (3), it is necessary that all parameters be gathered in a vector θ. Also, all regressors should be arranged in a matrix Φ_k, with the non-zero elements of its i-th row equal to the regressors of the i-th MISO model. Moreover, they are placed in an order that corresponds to the arrangement of the parameters in θ.

Again, as in the parameter matrix form, the elements of θ can be distributed in different ways. One case is for the first elements of θ to be the parameters of all polynomials in Ã(q^{-1}) (arranged sequentially row-wise), followed by the polynomial parameters in B(q^{-1}). According to this structure of θ, the regressors matrix Φ_k consists of two block-diagonal sub-matrices. The first one is associated with the auto-regression and the second one contains the previous values of the inputs.

Another example of the parameter vector form is the one considered in more detail below. The polynomial matrices Ã(q^{-1}) and B(q^{-1}) can be written as

            | ã_{11}(q^{-1})  ...  ã_{1ℓ}(q^{-1}) |
Ã(q^{-1}) = |      ...                  ...        |
            | ã_{ℓ1}(q^{-1})  ...  ã_{ℓℓ}(q^{-1}) |

and

            | b_{11}(q^{-1})  ...  b_{1m}(q^{-1}) |
B(q^{-1}) = |      ...                  ...        |
            | b_{ℓ1}(q^{-1})  ...  b_{ℓm}(q^{-1}) | .

The j-th polynomial from the i-th row of the matrix Ã(q^{-1}) is

ã_{ij}(q^{-1}) = ã_{ij,1} q^{-1} + ... + ã_{ij,na_{ij}} q^{-na_{ij}}

(for i = 1, ..., ℓ and j = 1, ..., ℓ) and the polynomial b_{ij}(q^{-1}) is

b_{ij}(q^{-1}) = b_{ij,1} q^{-1} + ... + b_{ij,nb_{ij}} q^{-nb_{ij}}

(for i = 1, ..., ℓ and j = 1, ..., m). The i-th row of the matrix equation (8) becomes

y_{i,k} = −ã_{i1}(q^{-1}) y_{1,k} − ... − ã_{iℓ}(q^{-1}) y_{ℓ,k} + b_{i1}(q^{-1}) u_{1,k} + ... + b_{im}(q^{-1}) u_{m,k} + e_{i,k} .

The terms on the right-hand side can be written in the following vector form

ã_{ij}(q^{-1}) y_{j,k} = y^T_{ij,k} a_{ij} ,    b_{ij}(q^{-1}) u_{j,k} = u^T_{ij,k} b_{ij} .

The parameter vectors a_{ij} ∈ R^{na_{ij}} and b_{ij} ∈ R^{nb_{ij}}, which are introduced above, are

a_{ij} = [ã_{ij,1}  ã_{ij,2}  ...  ã_{ij,na_{ij}}]^T ,
b_{ij} = [b_{ij,1}  b_{ij,2}  ...  b_{ij,nb_{ij}}]^T ,

and the vectors y_{ij,k} ∈ R^{na_{ij}} and u_{ij,k} ∈ R^{nb_{ij}} containing the regressors are

y_{ij,k} = [y_{j,k-1}  y_{j,k-2}  ...  y_{j,k-na_{ij}}]^T ,
u_{ij,k} = [u_{j,k-1}  u_{j,k-2}  ...  u_{j,k-nb_{ij}}]^T .

With this notation the i-th row of (8) becomes

y_{i,k} = −y^T_{i1,k} a_{i1} − ... − y^T_{iℓ,k} a_{iℓ} + u^T_{i1,k} b_{i1} + ... + u^T_{im,k} b_{im} + e_{i,k}

or briefly

y_{i,k} = φ^T_{i,k} θ_i + e_{i,k} .

Here all parameter vectors are gathered in the vector

θ_i = [a^T_{i1} ... a^T_{iℓ}  b^T_{i1} ... b^T_{im}]^T ∈ R^{p_i} ,

and the regressors are collected in the vector

φ_{i,k} = [−y^T_{i1,k} ... −y^T_{iℓ,k}  u^T_{i1,k} ... u^T_{im,k}]^T ∈ R^{p_i} .

Let us introduce the vector θ, which contains the parameters of all MISO models. Hence it has p = Σ_i p_i elements and the structure

θ = [θ^T_1  θ^T_2  ...  θ^T_ℓ]^T ,

where the first p_1 = Σ_{j=1..ℓ} na_{1j} + Σ_{j=1..m} nb_{1j} elements of θ correspond to the polynomials from the first rows of Ã(q^{-1}) and B(q^{-1}). These elements of θ are all parameters of the first MISO model. The next p_2 = Σ_{j=1..ℓ} na_{2j} + Σ_{j=1..m} nb_{2j} elements of θ are the parameters of the second MISO model, etc.

As the output vector is y_k ∈ R^ℓ, the regressors in the model have to be gathered in a matrix with ℓ rows and p columns, where the non-zero elements of the i-th row contain φ_{i,k}. In this way, Φ_k has a block-diagonal structure, where the i-th block contains all p_i regressors of the i-th MISO model. The structure of the full MIMO ARX regressors matrix (shown on figure 1) at the k-th time instant is

Φ_k = diag(φ^T_{1,k}, φ^T_{2,k}, ..., φ^T_{ℓ,k}) .

Figure 1. Structure of the regressors matrix Φ_k

It contains all regressors necessary for the prediction of all model outputs. Finally, the regression model (8) (and (4) respectively) in the parameter vector form is

y_k = Φ_k θ + e_k .
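The block-diagonal construction can be sketched directly. For an illustrative 2-output, 1-input model with all degrees equal to 1, the prediction Φ_k θ from form (3) coincides with the parameter matrix prediction Θ^T φ_k from form (2); all numeric values are assumptions for the example:

```python
import numpy as np

# Illustrative 2-output, 1-input ARX parameters with na = nb = 1.
A1 = np.array([[0.5, 0.1],
               [0.0, 0.3]])
B1 = np.array([[1.0],
               [0.5]])

y_prev = np.array([0.7, -0.2])   # y_{k-1}
u_prev = np.array([1.5])         # u_{k-1}

# theta_i holds the parameters of the i-th MISO sub-model (row i of A1, B1);
# phi holds the matching regressors [-y_{k-1}^T, u_{k-1}^T] (same for each row here).
theta = np.concatenate([np.concatenate([A1[i], B1[i]]) for i in range(2)])
phi = np.concatenate([-y_prev, u_prev])

# Phi_k = diag(phi_1^T, phi_2^T): block-diagonal, one row per output.
p_i = phi.size
Phi_k = np.zeros((2, 2 * p_i))
for i in range(2):
    Phi_k[i, i * p_i:(i + 1) * p_i] = phi

y_hat_vec = Phi_k @ theta                 # parameter vector form (3)
y_hat_mat = -A1 @ y_prev + B1 @ u_prev    # parameter matrix form prediction
```

Both predictions equal [1.17, 0.81] here, confirming that for equal polynomial degrees the two general forms are rearrangements of the same model.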
When all model parameters are arranged in a vector, there is no restriction imposed on the polynomial degrees. As seen from the above considerations, the degrees (na_{ij} and nb_{ij}) of all polynomials in the model can be selected independently. Therefore the representation in the parameter vector form can be used for an accurate determination of the model structure.
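This flexibility can be illustrated with a small sketch in which the two MISO sub-models of a 2-output, 1-input system use different polynomial degrees (na_{1j} = 2, nb_{1j} = 1 for the first output and na_{2j} = 1, nb_{2j} = 2 for the second; all numeric values are illustrative assumptions):

```python
import numpy as np

# Past outputs and inputs (illustrative): keys are the lags 1 and 2.
y_hist = {1: np.array([0.7, -0.2]), 2: np.array([0.3, 0.1])}
u_hist = {1: np.array([1.5]), 2: np.array([-0.4])}

# Sub-model 1 uses two output lags and one input lag (p_1 = 5);
# sub-model 2 uses one output lag and two input lags (p_2 = 4).
phi_1 = np.concatenate([-y_hist[1], -y_hist[2], u_hist[1]])
phi_2 = np.concatenate([-y_hist[1], u_hist[1], u_hist[2]])

# Block-diagonal regressors matrix: rows of different "width" coexist.
p = phi_1.size + phi_2.size
Phi_k = np.zeros((2, p))
Phi_k[0, :phi_1.size] = phi_1
Phi_k[1, phi_1.size:] = phi_2

theta = np.ones(p)         # placeholder parameters, one per regressor
y_hat = Phi_k @ theta      # one prediction per output, degrees differ per row
```

The blocks simply have different lengths p_i, which is exactly the freedom the parameter matrix form does not offer.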
Some dynamic systems, e.g. hypermarkets, have many potential inputs [1]. Even after an initial decomposition of the whole system, the sub-systems may have hundreds or even thousands of inputs. There are also many possible cross-relations within the multivariable structure, which have to be investigated when the market structure is determined. In these cases, the number of competitive models constructed during the optimization of the structure parameters may become very large. This leads to a significant time for the overall model development process, especially when a parameter vector representation is used. For this reason, the general form (3) should be used carefully.
FACTORS SELECTION
Another point, which was not covered in the previous sections, is that the structure of each MISO model can be adjusted independently from the structures of the other sub-models [2]. On this basis, effective realizations of methods for MIMO structure determination can be developed. Probably the most widespread method is stepwise regression. Its idea is shown on figure 2. It uses the fact that a change of a particular structure parameter (time delay, polynomial degree, index of an input or output entered in the model) is actually a change in the set of factors introduced in the model. The stepwise method is iterative, where at each iteration two steps are applied consecutively: 'forward selection' and 'backward elimination'. The result of the iterative process is a regression model which accounts for a subset of factors which (as a combination) are appropriate to explain the system output. In order to construct the model, the algorithm employs a series of significance tests. The factors are sequentially added to or removed from the regression model. This process continues until no significant factors can be selected from the ones not yet entered and no insignificant factors can be eliminated from the model. The two thresholds in the stopping conditions shown on figure 2 are SLE – the significance level for entry – and SLS – the significance level to stay.
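A greedy sketch of the stepwise idea for one MISO sub-model is given below. For simplicity it compares partial F-statistics against fixed thresholds instead of the p-value based SLE/SLS conditions used in practice; the data, the thresholds and all names are illustrative assumptions:

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ theta
    return r @ r

def stepwise(X, y, f_enter=10.0, f_stay=10.0):
    """Greedy stepwise factor selection (sketch, fixed F thresholds)."""
    n, z = X.shape
    selected = []
    while True:
        changed = False
        # Forward selection: add the most significant remaining factor.
        candidates = [j for j in range(z) if j not in selected]
        if candidates:
            rss0 = rss(X[:, selected], y) if selected else y @ y
            best_j, best_f = None, f_enter
            for j in candidates:
                rss1 = rss(X[:, selected + [j]], y)
                f = (rss0 - rss1) / (rss1 / (n - len(selected) - 1))
                if f > best_f:
                    best_j, best_f = j, f
            if best_j is not None:
                selected.append(best_j)
                changed = True
        # Backward elimination: drop the least significant selected factor.
        if len(selected) > 1:
            rss1 = rss(X[:, selected], y)
            worst_j, worst_f = None, f_stay
            for j in selected:
                cols = [c for c in selected if c != j]
                f = (rss(X[:, cols], y) - rss1) / (rss1 / (n - len(selected)))
                if f < worst_f:
                    worst_j, worst_f = j, f
            if worst_j is not None:
                selected.remove(worst_j)
                changed = True
        if not changed:
            return selected

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 6))
# Only columns 1 and 4 actually drive the output.
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.05 * rng.normal(size=200)
print(sorted(stepwise(X, y)))
```

With clearly informative factors the loop typically enters columns 1 and 4 and then stops, since removing either would sharply increase the residual sum of squares.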
From the independence of the MISO sub-models it follows that, when the system has ℓ outputs, ℓ changes in the current MIMO model can be applied simultaneously. Moreover, at a given iteration of the stepwise method, some of the model changes (associated with certain outputs) could be the result of forward selection, while at the same time others could be obtained by backward elimination.

Figure 2. Stepwise regression

CONCLUSION
The two possible generalized representations of MIMO linear parameterized regression models were presented. They are not equivalent and both can be used at certain steps during the model development. It was shown that the representation with a parameter matrix leads to a constraint on the degrees of the model polynomials (they should be equal within a polynomial matrix). This representation is applicable when the system dimension is very big and the model structure is unknown. In this case, an appropriate solution at the stage of model structure determination is first to use the model form with a parameter matrix for a fast isolation of a subset of significant inputs and for an initial orientation on the degrees of the model polynomials. After that, the model structure can be further specified by using the parameter vector representation (where the above-mentioned restriction is avoided). In this way the number of competitive models can be significantly reduced and the whole factors selection process becomes more efficient.

Another application of the parameter matrix representation is to take into account some requirements (from an economic perspective) introduced in the modelling problem. The constraint on the model structure mentioned above can be used to reduce the number of factors used by the model and thus to decrease the cost of model utilization.

Finally, keeping in mind the specific structures of (2) and (3), as mentioned in the previous section, there are ways to further increase the efficiency of the iterative algorithms for factors selection.

REFERENCES
1. Efremov, A. Multivariate Time-Varying System Identification at Incomplete Information. Ph.D. thesis, Technical University of Sofia, Faculty of Automatics, 2008.
2. Efremov, A. System Identification Based on Stepwise Regression for Dynamic Market Representation. International Conference on Data Mining and Knowledge Engineering, Rome, Italy, 28–30 April 2010, vol. 64, no. 2, pp. 132–137.
3. Faraway, J. Practical Regression and ANOVA Using R, http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf, 2002.
4. Van den Hof, P. M. J. Model sets and parameterizations for identification of multivariable equation error models. Automatica, 30(3), 1994, pp. 433–446.
5. Vuchkov, I. Identification. IK Jurapel, Sofia, 1996.