

The Journal of Engineering Research Vol. 2, No. 1 (2005) 12-24

1.   Introduction

1.1  Takagi-Sugeno Models
Developing mathematical models of real systems is a central topic in many disciplines of engineering and science. Models can be used for simulation, analysis of the system's behavior, better understanding of the underlying mechanisms in the system, design of new processes, or design of controllers. Takagi-Sugeno (T-S) modeling plays an essential role in deriving local linear models of the nonlinear dynamic system under consideration (Lo and Chen, 1999; Takagi and Sugeno, 1985). Through the heuristic rules inherent in fuzzy systems, T-S fuzzy models make it possible to obtain a transparent system that is governed by the fuzzy inference system and its rules. Fuzzy modeling concerns methods of describing the characteristics of a system using fuzzy inference rules. Fuzzy modeling methods have the distinguishing feature that they can express complex nonlinear systems linguistically.

*Corresponding author. E-mail: ebrgallaf@eng.uob.bh

In a similar way, fuzzy clustering has been utilized in data-driven fuzzy modeling, since it provides a methodology for assigning labels to similar data. Such an assignment gives quantitative directions for shaping the fuzzy membership functions. Model validation and verification is also an important task in the modeling paradigm, because the right model has to be chosen from a number of models that might present similar characteristics. Statistical and probabilistic validation are sometimes used to make a suitable choice of system model. In general, fuzzy control systems can be classified as linguistic (Lo and Chen, 1999; Mamdani and Assilian, 1975) or T-S type. The linguistic type of fuzzy control system is well recognized and accepted by the control community. The T-S type of fuzzy system, which is used in this article, focuses mainly on the modeling aspect. It has been reported that a T-S fuzzy system can exactly model any nonlinear system (Wang et al., 2000). On the other hand, there is a main drawback

Takagi-Sugeno Neuro-Fuzzy Modeling of a Multivariable
Nonlinear Antenna System

E. A. Al-Gallaf*

Department of Electrical and Electronics Engineering, College of Engineering
University of Bahrain, P.O. Box 13184, Kingdom of Bahrain

Received 17 August 2003;  accepted 7 April 2004

Abstract:  This article investigates the use of a cluster-based neuro-fuzzy system for nonlinear dynamic system modeling. It focuses on modeling via the Takagi-Sugeno (T-S) procedure and on the use of fuzzy clustering to generate suitable initial membership functions. T-S fuzzy modeling has been applied to model a nonlinear antenna dynamic system with two coupled inputs and outputs. Compared to other well-known approximation techniques such as artificial neural networks, the employed neuro-fuzzy system provides a more transparent representation of the nonlinear antenna system under study, mainly due to the possible linguistic interpretation in the form of rules. The initial membership functions created by clustering are then employed to construct suitable T-S models. Furthermore, the T-S fuzzy models have been validated using standard model validation techniques (such as correlation functions). This intelligent modeling scheme is useful for making complicated systems linguistically transparent in terms of fuzzy if-then rules.

Keywords: Neuro-fuzzy systems, Fuzzy clustering, Takagi-Sugeno modeling, Nonlinear systems



of the linguistic model compared with the T-S model in
that there is a difficulty in dealing with a multidimension-
al system since a large number of fuzzy rules have to be
used.

Gorzalczany et al. (2000) briefly presented and compared four neuro-fuzzy systems used for rule-based modeling of dynamic processes (the chaotic Mackey-Glass time series). The following systems were considered: NFMOD (the proposed system), the well-known ANFIS and NFIDENT systems, and an alternative neuro-fuzzy system already reported in the literature. The main criterion of comparison was performance (modeling accuracy) versus interpretability (the transparency and the ability to explain generated decisions, including an analysis and pruning of the obtained fuzzy-rule bases). Zhang and Knoll (1995) proposed an approach for solving multivariate modeling problems with neuro-fuzzy systems. Instead of using selected input variables, statistical indices are extracted to feed a fuzzy controller. The original input space is transformed into an eigen-space. If a sequence of training data is sampled in a local context, a small number of eigenvectors that possess the larger eigenvalues provide a good summary of all the original variables. Fuzzy controllers can then be trained to map the input projection in the eigen-space to the outputs. Implementations for the prediction of time series were used to validate the concept.

The article of Ikonen and Kortela (2000) is concerned with process modeling using fuzzy neural networks. In Distributed Logic Processors (DLP) the rule base is parameterized. The DLP derivatives required by gradient-based training methods are given, and the recursive prediction error method is used to adjust the model parameters. The power of the approach is illustrated with a modeling example in which NOx-emission data from a full-scale fluidized-bed combustion district heating plant are used. The method presented in their paper is general and can be applied to other complex processes as well. Bologna (2001) presented a new neuro-fuzzy model denoted the Fuzzy Discretized Interpretable Multi-Layer Perceptron (FDIMLP). Fuzzy rules were extracted in polynomial time with respect to the size of the problem and the size of the network. The model was applied to three public-domain classification problems, and FDIMLP networks compared favorably with the EFUNN and ANFIS neuro-fuzzy systems.

Ning et al. (2001) presented a fuzzy satisfactory clustering algorithm. It starts with two cluster centers and adds a new center if necessary. A system data set is quickly divided into several satisfactory fuzzy clusters by the algorithm, and a T-S type fuzzy model is then identified. Chen and Linkens (1998) introduced a three-layered RBF (Radial Basis Function) network to implement a fuzzy model. Differing from existing clustering-based methods, in their approach the structure identification of the fuzzy model, including input selection and partition validation, was implemented on the basis of a class of sub-clusters created by a self-organizing network instead of raw data. The important input variables, which independently and significantly influence the system output, can be extracted by a fuzzy neural network. In addition, the optimal number of fuzzy rules can be determined separately via the fuzzy c-means clustering algorithm, with a modified fuzzy entropy measure as the criterion of cluster validation.

Akkizidis and Roberts (2001) proposed an algorithmic methodology for identifying and modeling nonlinear control strategies. The methodology was based on choices of different fuzzy clustering algorithms, projection of clusters and merging techniques. The best features of well-known clustering methods, such as the Gustafson-Kessel and mountain methods, were combined. The latter was used to determine the number and the approximate positions of the cluster prototypes, whereas the former was used to define the shapes of the clusters according to the data distribution. The projection of the prototypes and variables of clusters is a recognized approach to extracting the information contained in the data clusters into fuzzy sets. Merging these fuzzy sets, based on proposed guidelines, can minimize the number of rules and make the identified control strategy more transparent. Bossley (1997) looked into the problem of antenna modeling via neuro-fuzzy systems; however, an optimized five-layer neural network was not easily obtained, due to the large number of generated fuzzy rules.

1.2  Article  Contribution
The system under study is typical of the type used for oceanary satellite communication systems and has a highly nonlinear coupling between its two outputs. Hence, transparent sub-models are required. This class of multivariable system has been modeled via a classical neuro-fuzzy system, as in Bossley (1997). However, that approach resulted in a large number of rules, and a large number of training patterns was required. In this respect, this research investigates the use of clustered fuzzy rules, which allows training to be achieved in less time and with fewer rules. Fuzzy sets in the antecedents of the rules are obtained from the partition matrix by projection onto the antecedent variables. The obtained point-wise fuzzy sets are then approximated by suitable parametric functions. The transparency of the antenna model obtained using the above approach may be hindered by redundancy in the form of many overlapping (compatible) membership functions. Similarity measures were therefore used to assess the compatibility (pair-wise similarity) of fuzzy sets in the rule base, in order to detect sets that can be merged. Fuzzy sets estimated from antenna training data can also be similar to the universal set, thus adding no information to the model. Sets of this nature were removed from the antecedents of the rules, thus reducing the number of fuzzy rules.



2.   Intelligent  Dynamic  Systems  Modeling

2.1  Intelligent Modeling 
Fuzzy modeling and control are typical examples of techniques that make use of human knowledge and deductive processes. Various alternative approaches have been proposed, Fuzzy Logic and Set Theory being one of them. Artificial neural networks and fuzzy models are among the most popular model structures used. From the input-output view, fuzzy systems are flexible mathematical functions, which can approximate other functions, or just data measurements, with a desired accuracy. Compared to well-known approximation techniques such as neural networks, fuzzy systems provide a more transparent representation of the system under study, mainly due to the possible linguistic interpretation in the form of rules. The logical structure of the rules facilitates the understanding and analysis of the model in a semi-qualitative manner, close to the way humans reason about the real world.

Given the state of a system with a given input,  the next
state x(k + 1) can be determined.  In the sense of dis-
crete-time setting,  it can be written as:

x(k + 1)= f (x(k), u(k)) (1)

where x(k) and u(k) are the state and the input at time  k,
respectively,  and   f is a static function.   Fuzzy models of
different types can be used to approximate the state-transi-
tion function.  As the state of a system is often not meas-
ured, input-output modeling is usually applied.  The most
common is the NARX (Nonlinear Auto-Regressive with
Exogenous input) model,  as  defined by

y(k+1) = f ( y(k), y(k-1),…, y(k - ny +1),  
u(k), u(k-1),…, u(k - nu +1) )     (2)

where y(k), …, y(k - ny + 1) and u(k), …, u(k - nu + 1) denote the past model outputs and inputs, respectively, and ny and nu are integers related to the model order (usually selected by the designer). For instance, a linguistic fuzzy model of a dynamic system may consist of rules of the following form, Eq. (3):

Ri : if  y(k)  is  Ai1 and  y(k-1)  is  Ai2  and,…, y(k- n+1) 
is  Ain and  u(k) is Bi1 and  u(k-1)  is  Bi2 

and,…, u(k-m+1)  is  Bim then  y(k+1)  is  Ci (3)
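The regressor vector used by the NARX model of Eq. (2), and by the rule antecedents of Eq. (3), can be sketched as follows. This is a minimal illustration, not the article's implementation: the function name `narx_regressors` and the toy signals are assumptions introduced here, and a single-input, single-output signal pair is assumed.

```python
import numpy as np

def narx_regressors(y, u, ny, nu):
    """Build regressors and targets for the NARX model of Eq. (2).

    The row for time k holds [y(k), ..., y(k-ny+1), u(k), ..., u(k-nu+1)];
    the matching target is y(k+1). (Hypothetical helper, for illustration.)
    """
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(ny, nu) - 1                    # first k with a full history
    rows, targets = [], []
    for k in range(start, len(y) - 1):
        past_y = y[k - ny + 1:k + 1][::-1]     # y(k), y(k-1), ...
        past_u = u[k - nu + 1:k + 1][::-1]     # u(k), u(k-1), ...
        rows.append(np.concatenate([past_y, past_u]))
        targets.append(y[k + 1])
    return np.array(rows), np.array(targets)

# Toy signals, chosen only so that the lag structure is easy to read.
y = np.arange(6.0)
u = 10.0 * np.arange(6.0)
X, t = narx_regressors(y, u, ny=2, nu=1)
```

Each row of `X` is one antecedent input vector and the corresponding entry of `t` is the value that the rule consequent should predict.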

In Eq. (3), the input dynamic filter is a simple generator of the lagged inputs and outputs, and no output filter is used. Since fuzzy models can approximate any smooth function to any degree of accuracy, models of the type in Eq. (3) can approximate any observable and controllable modes of a large class of discrete-time nonlinear systems. To facilitate data-driven optimization of fuzzy models (learning), differentiable operators (product, sum) are often preferred to the standard min and max operators. Once the structure is fixed, the performance of a fuzzy model can be fine-tuned by adjusting its parameters. The tunable parameters of linguistic models are the parameters of the antecedent and consequent membership functions (which determine their shape and position) and the rules (which determine the mapping between the antecedent and consequent fuzzy regions).

$$\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_N]^T \qquad (4)$$

$$\mathbf{y} = [y_1, \ldots, y_N]^T \qquad (5)$$

3.   Neuro-Fuzzy Modeling

Figure 1 shows a typical five-layer neuro-fuzzy system that can be employed to realize a rule network. Typically, such rules are

if  x1 is  A11 and  x2 is  A21 then  y=b1 (6)

if  x1 is  A12 and  x2 is  A22 then  y=b2 (7)

Nodes in the first layer compute the membership degree of the inputs in the antecedent fuzzy sets. The product nodes Π in the second layer represent the antecedent conjunction operator. The normalization node Ν and the summation node Σ realize the fuzzy-mean operator. With smooth antecedent membership functions, such as the Gaussian function given below,

(8)

the parameters cij and τij can be adjusted by gradient-descent learning algorithms, such as back-propagation. This allows fine-tuning of the fuzzy model to the available data in order to optimize its prediction accuracy.
There may be a lot of structure/parameter combinations
which make the fuzzy model behave in a satisfactory way.
The problem can be formulated as that of finding the struc-
ture complexity which will give the best performance in
generalization.  In our approach we  choose the number of
rules as the measure of complexity to be properly tuned on
the basis of available data.  We adopt an incremental
approach where different architectures having different
complexity  (i.e. number of rules) are first assessed in
cross-validation and then compared in order to select the
best one.
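The layer-by-layer computation described for the network of Fig. 1 can be sketched for rules with constant consequents, as in Eqs. (6) and (7). This is a minimal sketch under stated assumptions: the Gaussian membership function is taken as exp(-((x - c) / (2τ))²), and the names `gaussian_mf` and `neuro_fuzzy_output` as well as the numerical values are hypothetical.

```python
import numpy as np

def gaussian_mf(x, c, tau):
    # Assumed Gaussian membership function with center c and width tau.
    return np.exp(-((x - c) / (2.0 * tau)) ** 2)

def neuro_fuzzy_output(x, centers, widths, b):
    """Forward pass through the five layers for one input vector.

    x: (n_inputs,), centers and widths: (n_rules, n_inputs), b: (n_rules,).
    """
    mu = gaussian_mf(x, centers, widths)   # layer 1: membership degrees
    beta = mu.prod(axis=1)                 # layer 2: product (AND) nodes
    gamma = beta / beta.sum()              # layer 3: normalization
    return float(gamma @ b)                # layers 4-5: weighted sum (fuzzy mean)

# Two rules on one input; consequents b1 = 0 and b2 = 2 (illustrative values).
centers = np.array([[0.0], [1.0]])
widths = np.ones((2, 1))
b = np.array([0.0, 2.0])
out_mid = neuro_fuzzy_output(np.array([0.5]), centers, widths, b)
out_left = neuro_fuzzy_output(np.array([0.0]), centers, widths, b)
```

Because every operation is differentiable in `centers`, `widths` and `b`, the same forward pass is what a gradient-descent tuner would differentiate.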


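It is assumed that a set of N input-output data pairs $\{(\mathbf{x}_i, y_i) \mid i = 1, 2, \ldots, N\}$ is available, where $\mathbf{x}_i \in \mathbb{R}^{F_{inp}}$ are input vectors and $y_i$ are output scalars. Denote by $\mathbf{X} \in \mathbb{R}^{N \times F_{inp}}$ a matrix having the vectors $\mathbf{x}_k^T$ in its rows, and by $\mathbf{y} \in \mathbb{R}^{N}$ a vector containing the outputs $y_k$, as given in Eqs. (4) and (5).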

$$\mu_{A_{ij}}(x_j; c_{ij}, \tau_{ij}) = \exp\left(-\left(\frac{x_j - c_{ij}}{2\tau_{ij}}\right)^{2}\right) \qquad (8)$$



 
Figure 1.  A five layer neurofuzzy network architecture (Layer No. 1 to Layer No. 5)

 
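Figure 2.  Hyper ellipsoidal fuzzy clusters. The figure shows the data in the (x, y) plane with cluster centers v1 and v2, the curves of equidistance and the local linear models. The projected clusters give the membership functions A1 and A2 on µ(x) and B1 and B2 on µ(y), corresponding to the rules "If x is A1 then y is B1" and "If x is A2 then y is B2".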


The initialization of the architecture is provided by a hyper-ellipsoid fuzzy clustering procedure inspired by Babuska and Verbruggen (1995). This procedure clusters the data in the input-output domain, obtaining a set of hyper-ellipsoids which are a preliminary rough representation of the input/output mapping. The parameters of the fuzzy inference system are then initialized from the outcome of the fuzzy clustering procedure. Here we use the axes of the ellipsoids (eigenvectors of the scatter matrix) to initialize the parameters of the consequent functions. We project each cluster onto the input domain to initialize the centers of the antecedents, and we use the scatter matrix to compute the widths of the membership functions. Once the initialization is done, the learning procedure begins. In the case of linear T-S models, this minimization procedure can be decomposed into a least-squares problem to estimate the linear parameters of the consequent models and a nonlinear minimization to find the parameters of the membership functions. The structural identification loop (the outer one) searches for the best structure, in terms of the optimal number of rules, by gradually increasing the number of local models.
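The least-squares part of this decomposition can be illustrated as follows. The sketch assumes that the normalized degrees of fulfillment of the rules are already available (for instance from the clustering step) and estimates all consequent parameters in one global least-squares problem. The function name `estimate_consequents`, the sigmoid-shaped memberships and the "true" parameters are assumptions made for the example only.

```python
import numpy as np

def estimate_consequents(X, y, gamma):
    """Least-squares estimate of T-S consequents y = a_i^T x + b_i.

    X: (N, n) regressors, y: (N,) targets,
    gamma: (N, c) normalized degrees of fulfillment (each row sums to 1).
    Returns a (c, n + 1) array with rows [a_i, b_i].
    """
    N, n = X.shape
    c = gamma.shape[1]
    Xe = np.hstack([X, np.ones((N, 1))])               # extended regressor [x, 1]
    # Each rule contributes the block gamma_i(x) * [x, 1] to the design matrix.
    Phi = np.hstack([gamma[:, [i]] * Xe for i in range(c)])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta.reshape(c, n + 1)

# Synthetic two-rule example with known consequents (illustrative values).
x = np.linspace(-1.0, 1.0, 50)
g1 = 1.0 / (1.0 + np.exp(-4.0 * x))
gamma = np.column_stack([g1, 1.0 - g1])
y = g1 * (2.0 * x + 1.0) + (1.0 - g1) * (-1.0 * x + 0.5)
theta = estimate_consequents(x[:, None], y, gamma)
```

With the memberships held fixed the problem is linear in the consequent parameters, which is why this step can be solved directly, while the membership parameters still require nonlinear minimization.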

4.   Fuzzy  Pattern Clustering 

4.1  Fuzzy Clustering
Identification methods based on fuzzy clustering originate from data analysis and pattern recognition, where the concept of graded membership is employed to represent the degree to which a given object, represented as a vector of features, is similar to some prototypical object. Based on that similarity, feature vectors can be clustered such that vectors within a cluster are as similar as possible, and vectors from different clusters are as dissimilar as possible. This idea of fuzzy clustering is depicted in Fig. 2. Data are clustered into two groups with prototypes v1 and v2, using the Euclidean distance measure. The partitioning of the data is expressed in the fuzzy partition matrix, whose elements µij are the degrees of membership of the data points (xi, yi) in a fuzzy cluster with prototype vj. The concept of similarity of data to a given prototype leaves enough room for the choice of an appropriate distance measure and of the character of the prototype itself. Prototypes can be defined as linear subspaces, or the clusters can be ellipsoids with adaptively determined shape (Akkizidis and Roberts, 2001). From these clusters, the antecedent membership functions and the consequent parameters of the T-S model can be extracted as follows (Bossley, 1997):

if     x is A1 then      y = a1x + b1
if     x is A2 then      y = a2x + b2 (9)

Each obtained cluster is represented by one rule in the T-S
model.  Membership functions for fuzzy sets A1 and A2 are
generated by pointwise projection of the partition matrix

onto the antecedent variables.  Such pointwise defined
fuzzy sets are then approximated by a suitable parametric
function.  

4.2   Fuzzy Clustering Algorithm

Consider a finite set of elements $X = \{x_1, x_2, \ldots, x_n\}$ as being elements of the $F_{inp}$-dimensional Euclidean space $\mathbb{R}^{F_{inp}}$, that is, $x_j \in \mathbb{R}^{F_{inp}}$, $j = 1, 2, \ldots, n$. The task is to partition this collection of elements into c fuzzy sets with respect to a given criterion, where c is a given number of clusters. The criterion is usually to optimize an objective function that acts as a performance index of clustering. The end result of fuzzy clustering can be expressed by a partition matrix U such that:

$$U = [u_{ij}], \quad i = 1, \ldots, c, \quad j = 1, \ldots, n \qquad (10)$$

In Eq. (10), $u_{ij}$ is a numerical value in [0,1] and expresses the degree to which an element $x_j$ belongs to the ith cluster. However, there are two additional constraints on the value of $u_{ij}$. First, the total membership of the element $x_j \in X$ in all classes is equal to unity; that is:

$$\sum_{i=1}^{c} u_{ij} = 1 \quad \text{for all } j = 1, 2, \ldots, n \qquad (11)$$

Second, every constructed cluster is nonempty and different from the entire set, that is,

$$0 < \sum_{j=1}^{n} u_{ij} < n \quad \text{for all } i = 1, 2, \ldots, c \qquad (12)$$

The general form of the objective function used in fuzzy clustering is

$$J(u_{ij}, v_k) = \sum_{i=1}^{c} \sum_{j=1}^{n} \sum_{k=1}^{c} g[w(x_i), u_{ij}]\, d(x_j, v_k) \qquad (13)$$

where $w(x_i)$ is a prior weight for each $x_i$ and $d(x_j, v_k)$ is the degree of dissimilarity between the data $x_j$ and the supplemental element $v_k$, which can be considered as the central vector of the kth cluster. The degree of dissimilarity is defined as a measure that satisfies two assumptions given by

$$d(x_j, v_k) \geq 0 \qquad (14)$$

$$d(x_j, v_k) = d(v_k, x_j) \qquad (15)$$

Based on the above background, fuzzy clustering can be precisely formulated as an optimization problem:

Minimize

(16)

subject to 

One of the widely employed clustering methods based
on Eq. (16) is the Fuzzy C-Means (FCM) algorithm.   The
objective function of the FCM algorithm is expressed in
the form of

(17)

where m is the weighting exponent, which influences the degree of fuzziness of the membership (partition) matrix. To solve this minimization problem, the objective function J(uij,vk) in Eq. (17) is differentiated with respect to vk (for fixed uij, i=1,…,c, j=1,…,n) and with respect to uij (for fixed vk, i=1,…,c), and the conditions of Eq. (11) are applied, obtaining

(18)

(19)

The system described by the Eqs. (18) and (19) cannot
be solved analytically.  However,  the FCM algorithm pro-
vides an iterative approach to approximating the minimum
of the objective function starting from a given position. 
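The iterative scheme can be sketched as an alternation between the center update of Eq. (18) and the membership update of Eq. (19). This is a minimal NumPy sketch rather than the article's code: the function name `fuzzy_c_means`, the random initialization, the stopping rule on the change in U and the synthetic data are assumptions. The fuzziness parameter m = 2.2 mirrors the value reported for the antenna case study.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, eps=1e-2, max_iter=100, seed=0):
    """Alternate the center and membership updates of FCM.

    X: (n, p) data matrix. Returns U with shape (c, n) and centers V with shape (c, p).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0)                        # memberships of each point sum to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)            # center update
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # squared distances
        d2 = np.fmax(d2, 1e-12)               # guard against division by zero
        inv = (1.0 / d2) ** (1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)                           # membership update
        converged = np.abs(U_new - U).max() < eps
        U = U_new
        if converged:
            break
    return U, V

# Two well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.1, size=(20, 2))
blob_b = rng.normal(0.0, 0.1, size=(20, 2)) + 5.0
U, V = fuzzy_c_means(np.vstack([blob_a, blob_b]), c=2, m=2.2, eps=1e-4)
```

The result depends on the starting partition, so in practice several random starts may be compared through the value of the objective function.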

5.  Linear State Space Models  Extraction

5.1  T-S Fuzzy Space Model
At each sample time k, given an operating point condition (for example u(k - 1) and y(k - 1)), a local linear fuzzy state-space model can be constructed by calculating the degree of fulfillment µi(x(k)) of the antecedents, using the product as the fuzzy logic AND operator. The inference of the entire structure (hierarchy) over the rules i results in sub-model l, which can be expressed as:

(20)

(21)

Defining ζl, ηl and θl as follows:

(22)

(23)

(24)

where  x(k), u(k) and  y(k) for the state-space description
are defined as:

(25)

(26)

(27)

In order to employ Quadratic Programming for systems which depend on current as well as previous inputs, it is necessary to construct a state-space representation in which the state vector x(k) accommodates not only the state variables appearing in y(k), but also the previous inputs and, as its last element, the offset. This results in a system with only current inputs, but leads to a more complex A-matrix.
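The blending of the rule-wise parameters into one local linear model at the current operating point can be sketched as a membership-weighted average. This is an illustration under assumptions: the function name `blend_local_models`, the array shapes and the numbers are hypothetical, and a single output with one lag of y and one lag of u is assumed.

```python
import numpy as np

def blend_local_models(mu, zeta, eta, theta):
    """Weighted average of rule-wise parameters for one sub-model.

    mu: (r,) degrees of fulfillment at the current operating point,
    zeta: (r, n_y), eta: (r, n_u), theta: (r,) consequent parameters per rule.
    """
    w = mu / mu.sum()                 # normalized degrees of fulfillment
    return w @ zeta, w @ eta, w @ theta

# Two rules, illustrative numbers only.
mu = np.array([1.0, 3.0])
zeta = np.array([[0.5], [0.9]])
eta = np.array([[1.0], [2.0]])
theta = np.array([0.0, 4.0])
zeta_l, eta_l, theta_l = blend_local_models(mu, zeta, eta, theta)

# One-step prediction with the blended local linear model.
y_k, u_k = 1.0, 0.5
y_next = float(zeta_l @ np.array([y_k]) + eta_l @ np.array([u_k]) + theta_l)
```

The blended parameters change with the operating point, so the local linear model has to be recomputed at each sample time.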


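Equations (16) to (20), referenced in Sections 4.2 and 5.1, read as follows.

$$\min \; J(u_{ij}, v_k) = \sum_{i=1}^{c} \sum_{j=1}^{n} \sum_{k=1}^{c} g[w(x_i), u_{ij}]\, d(x_j, v_k), \quad i, k = 1, 2, \ldots, c; \;\; j = 1, 2, \ldots, n \qquad (16)$$

subject to

$$\sum_{i=1}^{c} u_{ij} = 1 \;\; \text{for all } j = 1, 2, \ldots, n \quad \text{and} \quad 0 < \sum_{j=1}^{n} u_{ij} < n \;\; \text{for all } i = 1, 2, \ldots, c$$

$$J(u_{ij}, v_k) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, \| x_j - v_i \|^{2}, \quad m > 1 \qquad (17)$$

$$v_i = \frac{1}{\sum_{j=1}^{n} (u_{ij})^{m}} \sum_{j=1}^{n} (u_{ij})^{m} x_j, \quad i = 1, 2, \ldots, c \qquad (18)$$

$$u_{ij} = \frac{\left( 1 / \| x_j - v_i \|^{2} \right)^{1/(m-1)}}{\sum_{k=1}^{c} \left( 1 / \| x_j - v_k \|^{2} \right)^{1/(m-1)}}, \quad i = 1, 2, \ldots, c; \;\; j = 1, 2, \ldots, n \qquad (19)$$

$$y_l(k+1) = \frac{\sum_{i=1}^{r} \mu_{li}[x_l(k)] \cdot y_{li}(k+1)}{\sum_{i=1}^{r} \mu_{li}[x_l(k)]} \qquad (20)$$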

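$$y_{li}(k+1) = \left[ \zeta_{li}\, y(k) + \eta_{li}\, u(k) + \theta_{li} \right] \qquad (21)$$

$$\zeta_l = \frac{\sum_{i=1}^{r} \mu_{li}[x_l(k)] \cdot \zeta_{li}}{\sum_{i=1}^{r} \mu_{li}[x_l(k)]} \qquad (22)$$

$$\eta_l = \frac{\sum_{i=1}^{r} \mu_{li}[x_l(k)] \cdot \eta_{li}}{\sum_{i=1}^{r} \mu_{li}[x_l(k)]} \qquad (23)$$

$$\theta_l = \frac{\sum_{i=1}^{r} \mu_{li}(x_l(k)) \cdot \theta_{li}}{\sum_{i=1}^{r} \mu_{li}(x_l(k))} \qquad (24)$$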

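$$\mathbf{x}(k) = \left[ x_1(k) \;\; \ldots \;\; x_1(k - n_{y1} + 1) \;\; \ldots \;\; x_{n_o}(k) \;\; \ldots \;\; x_{n_o}(k - n_{y n_o} + 1) \;\; u_1(k-1) \;\; \ldots \;\; u_{n_i}(k - n_{ud,n_i} + 1) \;\; 1 \right]^T \qquad (25)$$

$$\mathbf{u}(k) = \left[ u_1(k) \;\; u_2(k) \;\; \ldots \;\; u_{n_i}(k) \right]^T \qquad (26)$$

$$\mathbf{y}(k) = \left[ x_1(k) \;\; x_2(k) \;\; \ldots \;\; x_{n_o}(k) \right]^T \qquad (27)$$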

The A-matrix also contains the ηs corresponding to the previous inputs. If the maximal delay in input i, i = 1, …, ni, is $u_{i,d\max}$, then the number of additional columns is $\sum_{i=1}^{n_i} \max(0, u_{i,d\max} - 1)$. The offsets θ are stored in the last column of A. The columns with ηs correspond to the previous inputs stored in the state vector; these columns are not included in B. The ones in A correspond to the delayed values of a certain variable. The local linear system matrices are derived as follows: A is an α × α square matrix,


where

(28)

(29)

(30)

The ones in C are positioned such that y1(k) = x1(k).
At any time index k , initially the control signal  u(k - 1) is
used.   However, after the optimization,  u(k) is available
and could be used in next iterations. 

6.  Takagi-Sugeno Fuzzy  Model Validation

6.1   Correlation Tests
Traditionally more rigorous statistical validation tests

are employed in which model residuals are examined, and
if found to be sufficiently correlated with a function of the
data then the model is inadequate.   This is achieved by
defining a matrix  Z,  where  Z(xt) is

(31)

in which xt is the observational vector of inputs, outputs
and errors seen up to time step  t, and  m(t) represents the
degree of dependency  of the  two training signals  y(t)
and  u(t), i.e.

(32)

and  m(t - 1) is a monomial of the vector xt given by

(33)

The following two hypotheses are defined : 

where the purpose of validation is to use the data to decide whether H0 holds. Two different test statistics have been proposed in the literature, the most common being the standard sample correlation measure ρ(k), defined as

(34)

(35)

The alternative statistic d, when H0 holds, asymptotically follows a χ²(s) distribution, where s is the number of delays, td. For a given acceptance level (typically 95%) a critical point is found, and if d falls outside this acceptance region, H0 is rejected.
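The first test can be sketched as a normalized sample cross-correlation between a data monomial m(t) and the residual e(t), compared against the 95% band of ±1.96/√N. This is a minimal sketch under assumptions: the function name `correlation_test`, the removal of the sample means and the exact lag convention (residual e(t) against m(t - k), k = 0, …, td - 1) are choices made for the illustration and are not taken from the article.

```python
import numpy as np

def correlation_test(m, e, t_d):
    """Normalized cross-correlation of residuals e with lagged monomials m.

    Returns (rho, bound, accepted): rho[k] correlates e(t) with m(t - k),
    bound is 1.96 / sqrt(N), and accepted is True when every |rho[k]| <= bound.
    """
    m = np.asarray(m, dtype=float)
    e = np.asarray(e, dtype=float)
    m = m - m.mean()                      # assumed mean removal
    e = e - e.mean()
    N = len(e)
    bound = 1.96 / np.sqrt(N)
    denom = np.sqrt(np.sum(m * m) * np.sum(e * e))
    rho = np.array([np.sum(m[:N - k] * e[k:]) / denom for k in range(t_d)])
    return rho, bound, bool(np.all(np.abs(rho) <= bound))

# Deterministic toy signals: residuals identical to m (inadequate model)
# and residuals essentially uncorrelated with m at lags 0 and 1.
m = np.tile([1.0, -1.0], 50)
rho_bad, bound, ok_bad = correlation_test(m, m, t_d=2)
rho_good, _, ok_good = correlation_test(m, np.tile([1.0, 1.0, -1.0, -1.0], 25), t_d=2)
```

A residual sequence that fails the test indicates structure in the data that the fuzzy model has not captured, for example a missing regressor.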

7.   Modeling of a Nonlinear System :  A Case 
Study

7.1   Antenna   System   (Input-Output  Training 
Pattern)

To test these proposed neuro-fuzzy methodologies fur-
ther, they are applied to model a realistic nonlinear dynam-
ical system. The system considered is a nonlinear (MIMO)
dynamics of an antenna system with two coupled inputs


 
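$$\alpha = \alpha_1 + \sum_{i=1}^{n_i} \max(0, u_{i,d\max} - 1) + 1, \qquad \alpha_1 = \sum_{j=1}^{n_o} \max(n_{yj}, 1)$$

B is an α × ni matrix and C is an no × α matrix:

$$B = \begin{bmatrix}
\eta_{1,1} & \eta_{1,2} & \cdots & \eta_{1,n_i} \\
0 & \cdots & \cdots & 0 \\
\vdots & \ddots & \ddots & \vdots \\
\eta_{2,1} & \eta_{2,2} & \cdots & \eta_{2,n_i} \\
0 & \cdots & \cdots & 0 \\
\vdots & \ddots & \ddots & \vdots \\
\eta_{n_o,1} & \eta_{n_o,2} & \cdots & \eta_{n_o,n_i} \\
\vdots & \ddots & \ddots & \vdots
\end{bmatrix}$$

$$A = \begin{bmatrix}
\zeta_{1,1} & \zeta_{1,2} & \zeta_{1,3} & \cdots & \zeta_{1,\alpha_1} & \eta_{1,i} & \cdots & \eta_{1,j} & \cdots & \theta_{1,\alpha} \\
1 & 0 & 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 \\
0 & 1 & 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 \\
\vdots & & \ddots & & & & & & & \vdots \\
\zeta_{2,1} & \zeta_{2,2} & \cdots & \cdots & \zeta_{2,\alpha_1} & \eta_{2,i} & \cdots & \eta_{2,j} & \cdots & \theta_{2,\alpha} \\
\vdots & & & \ddots & & & & & & \vdots \\
\zeta_{n_o,1} & \zeta_{n_o,2} & \cdots & \cdots & \zeta_{n_o,\alpha_1} & \eta_{n_o,i} & \cdots & \eta_{n_o,j} & \cdots & \theta_{n_o,\alpha} \\
\vdots & & & & \ddots & & & & & \vdots \\
0 & 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 & 1
\end{bmatrix}$$

$$C = \begin{bmatrix}
1 & 0 & 0 & \cdots & \cdots & \cdots & 0 \\
\vdots & \ddots & & & & \ddots & \vdots \\
0 & \cdots & 0 & 1 & 0 & \cdots & 0
\end{bmatrix}$$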

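$$Z(x_t) = \left[ m(t), \; m(t-1), \; \ldots, \; m(t - t_d) \right]^T \qquad (31)$$

$$x_t = \left( u_{t-1}, \; y_{t-1}, \; e_{t-1} \right)^T \qquad (32)$$

$$m(t) = y^{2}(t-1)\, u(t-2) \qquad (33)$$

$H_0$: $e(t)$ is uncorrelated with $Z(x_t)$, $E(e(t) \cdot Z) = 0$, and
$H_1$: $e(t)$ is correlated with $Z(x_t)$, $E(e(t) \cdot Z) \neq 0$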

$$\rho(k) = \frac{\sum_{t=1}^{N-k+1} \left[ m(t-k+1)\, e(t) \right]}{\sqrt{\sum_{t=1}^{N-k+1} \left[ m(t-k+1)\, m(t-k+1) \right] \; \sum_{t=1}^{N-k+1} \left[ e(t)\, e(t) \right]}} \qquad (34)$$

in which k = 1, …, td and $\rho(k) \in [-1, +1]$. If H0 holds, this statistic asymptotically approaches a normal distribution, and with 95% confidence limits H0 is accepted if $\rho(k) \in [-1.96/\sqrt{N}, \; 1.96/\sqrt{N}]$. An alternative statistic is given by:

$$d = N \left[ \sum_{t=t_d}^{N-1} e^{2}(t) \right]^{-1} \left[ \sum_{t=t_d}^{N-1} Z(t)\, e(t) \right]^{T} \left[ \sum_{t=t_d}^{N-1} Z(t)\, Z^{T}(t) \right]^{-1} \left[ \sum_{t=t_d}^{N-1} Z(t)\, e(t) \right] \qquad (35)$$


and outputs. A data set containing 500 samples of training patterns was produced by applying random torques to the different channels, with a suitable sampling rate and an amplitude drawn from a uniform random distribution in the range (-1.5, +1.5) N·m.

Antenna system
A coupled two degree of freedom satellite dish, typical

of the type used for oceanary satellite communication sys-
tems, is presented.

The behavior of the antenna is described by the following nonlinear, idealized, time-invariant state-space equations:

(36)

(37)

and

(38)

where ϕ is the azimuth angle, ψ is the elevation angle, bϕ and bψ are the associated friction coefficients, and Tϕ and Tψ are the torques applied to the axes. To produce a more realistic simulation, the outputs are corrupted by additive Gaussian noise, [eϕ(t)  eψ(t)]T, representing a crude approximation to measurement noise. The azimuth is permitted to turn through a complete revolution, while end stops restrict the elevation to the interval [0, π]. In this antenna there are essentially two sources of nonlinearity: one produced by the end stops on the elevation and the other a result of the non-isotropic moment of inertia tensor. Indeed, when isotropy is present the state-space equations above are linear. The strength of this nonlinearity depends on the degree of anisotropy and on the angular velocities of the antenna. The torques are chosen to emulate typical operating conditions. The block diagram used to produce the identification data was simulated in SIMULINK/MATLAB, using the set of nonlinear differential equations that describe the antenna system. Half of the training pattern was used for modeling the dynamic system, whereas the other half was used to validate the resulting fuzzy models. A typical sequence of training data, showing the antenna input and output responses, is given in Fig. 3.

7.2   Training Pattern  and Clustering

7.3   Neuro-Fuzzy Modeling 
Neuro-fuzzy modeling is applied to the problem of

identifying a discrete model of the antenna.  A fuzzy
model can be constructed from data by using the output of
the clustering algorithm and by constructing regressors to
form inputs to the neuro-fuzzy network.  Hence a conven-
tional linear difference model with regressors is construct-
ed containing previous inputs and outputs, i.e.

(39)

(40)

Fuzzy IF-THEN rules can be extracted by projecting the clusters onto the axes, with the membership functions of the fuzzy sets generated by pointwise projection of the partition matrix onto the antecedent variables. The consequent parameters for each rule are then obtained as least-squares estimates. Once an initial structure has been obtained through clustering, the membership functions and the consequent parameters are tuned to minimize a cost function through the learning procedure of the neuro-fuzzy network.

7.4  Membership Functions and Associated Fuzzy 
Rules

As a result, the membership functions of all the inputs
(regressors) and outputs are shown in Fig. 5  for azimuth
angle.  The antenna system has six inputs (in terms of fuzzy
model) and two outputs, hence two groups of seven sets of
MFs are shown.  Each universe of discourse (set) has three


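$$\mathbf{z} = \left[ \varphi \;\; \dot{\varphi} \;\; \psi \;\; \dot{\psi} \right]^T \qquad (36)$$

$$\dot{\mathbf{z}} = \begin{bmatrix}
\dot{\varphi} \\[4pt]
\dfrac{T_\varphi - b_\varphi \dot{\varphi} - (I' - I) \sin(2\psi)\, \dot{\varphi} \dot{\psi}}{I' \sin^{2}\psi + I \cos^{2}\psi} \\[10pt]
\dot{\psi} \\[4pt]
\dfrac{T_\psi - b_\psi \dot{\psi} - \tfrac{1}{2} (I' - I) \sin(2\psi)\, \dot{\varphi}^{2}}{I}
\end{bmatrix} \qquad (37)$$

$$\mathbf{y} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \mathbf{z} + \begin{bmatrix} e_\varphi(t) \\ e_\psi(t) \end{bmatrix} \qquad (38)$$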

     As discussed in Section 3, fuzzy modeling of any
dynamical system can be achieved by clustering the
training data.  Figure 3 shows the input-output training
pattern employed.  For this simulation example, clustering
was applied to the antenna training pattern.  Figure 4
shows the training pattern after applying the clustering
algorithm; the three clusters and their associated centers
are clearly visible.  To reduce the number of fuzzy rules
while preserving the model accuracy, the number of
clusters was chosen to be three.  The fuzziness parameter
m was kept at 2.2, with a termination criterion
ε = 0.01.  The result of the clustering algorithm is the
fuzzy partition matrix and the cluster centers matrix,
which are used to construct the fuzzy model for the
antenna system.
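For orientation, a minimal fuzzy c-means sketch with the settings quoted above (three clusters, m = 2.2, termination tolerance 0.01) is given below. It assumes the standard fuzzy c-means update equations with Euclidean distance and stops when the largest change in the partition matrix falls below the tolerance; the paper does not state which clustering variant or stopping test was used, so both are assumptions. The function name and the data are hypothetical.

```python
import random


def fuzzy_c_means(data, c=3, m=2.2, eps=0.01, max_iter=100, seed=0):
    """Return (centers, U); U[i][k] is the membership of sample k in cluster i."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial partition matrix; each column is normalised to sum to 1.
    U = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    for _ in range(max_iter):
        # Cluster centers: means of the data weighted by u^m.
        centers = []
        for i in range(c):
            wts = [U[i][k] ** m for k in range(n)]
            total = sum(wts)
            centers.append([sum(wts[k] * data[k][d] for k in range(n)) / total
                            for d in range(dim)])
        # Membership update from squared Euclidean distances.
        new_U = [[0.0] * n for _ in range(c)]
        for k in range(n):
            d2 = [sum((data[k][d] - centers[i][d]) ** 2 for d in range(dim))
                  for i in range(c)]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            if zeros:  # the sample coincides with one or more centers
                for i in zeros:
                    new_U[i][k] = 1.0 / len(zeros)
                continue
            for i in range(c):
                new_U[i][k] = 1.0 / sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0))
                                        for j in range(c))
        change = max(abs(new_U[i][k] - U[i][k])
                     for i in range(c) for k in range(n))
        U = new_U
        if change < eps:  # termination criterion
            break
    return centers, U


# Made-up one-dimensional samples, used only to exercise the routine.
data = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2], [10.0], [10.1], [10.2]]
centers, U = fuzzy_c_means(data)
```

The two return values correspond to the cluster centers matrix and the fuzzy partition matrix mentioned above; each column of the partition matrix sums to one.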

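The regressor vectors for the azimuth and elevation models are

\[ \varphi(k) = \begin{bmatrix} T_\varphi(k-1), & T_\varphi(k-2), & T_\psi(k-1), & \varphi(k-1), & \varphi(k-2), & \psi(k-1), & \psi(k-2), & \Delta\varphi(k-1) \end{bmatrix}^{T} \]

\[ \psi(k) = \begin{bmatrix} T_\varphi(k-1), & T_\psi(k-1), & T_\psi(k-2), & \varphi(k-1), & \varphi(k-2), & \psi(k-1), & \psi(k-2), & \Delta\psi(k-1) \end{bmatrix}^{T} \]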



 
Figure 3.  Input-output data training pattern (four panels versus time in s, 0 to 600: input azimuth torque (Nm), input elevation torque (Nm), output azimuth and elevation angles (rad))

 
Figure 4.  Extracted clusters and their three centers associated with the azimuth angle (axes: Torque 1 and angle in rad)

Figure 5.  Extracted membership functions of all 
inputs


MFs representing the three assigned clusters.  These
membership functions span the range of the inputs.

7.5  Fuzzy Sub-models
The T-S fuzzy model presented has been used to identify
the nonlinear antenna system.  As mentioned
before, the number of rules in the T-S fuzzy model equals
the number of clusters in the product space.  The
consequent of each rule is a local model that approximates the
output of the real function over the range of x for which the
rule is applicable.  The modeling procedure yields
Rules 1 to 3 for the azimuth and elevation angles.

Here the C and D matrices are common to all
three fuzzy sub-models, and the D matrix is equal to zero.
Furthermore, the elevation angle dynamics have the same
structure.  The response of the antenna simulation system
incorporating the three sub-models is shown in Fig. 6, with
the actual antenna output superimposed on the evaluated
fuzzy model output.  The figure shows that the fuzzy model
output closely follows the actual system output.
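How the three local models combine into one response can be sketched with the standard T-S inference, in which the rule consequents are averaged using normalised firing strengths. This blending formula is the usual T-S one and is assumed here rather than quoted from the paper; the function names, the two-state matrices and the firing strengths are made up for illustration and are not the antenna matrices.

```python
def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def ts_state_derivative(x, u, firing, A_list, B_list):
    """Blend local linear models: x_dot = sum_i h_i (A_i x + B_i u)."""
    total = sum(firing)
    if total <= 0.0:
        raise ValueError("at least one rule must fire")
    h = [w / total for w in firing]  # normalised firing strengths
    x_dot = [0.0] * len(x)
    for h_i, A, B in zip(h, A_list, B_list):
        ax = mat_vec(A, x)
        bu = mat_vec(B, u)
        for r in range(len(x)):
            x_dot[r] += h_i * (ax[r] + bu[r])
    return x_dot


# Two made-up local models with two states and one input.
A1 = [[0.0, 1.0], [-1.0, -1.0]]
A2 = [[0.0, 1.0], [-3.0, -1.0]]
B1 = [[0.0], [1.0]]
B2 = [[0.0], [2.0]]
x_dot = ts_state_derivative(x=[1.0, 0.0], u=[1.0],
                            firing=[0.25, 0.75],
                            A_list=[A1, A2], B_list=[B1, B2])
```

In a full model the firing strength of each rule would come from the membership functions of its antecedents evaluated at the current regressor values.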

7.6  Fuzzy Sub-models Validation
Figure 7 displays the cross-correlation function of the
error signal with the first input signal of the antenna.  The
correlation in the figure is within the confidence interval,
which indicates that the two signals are not correlated.  To
further examine the constructed local linear sub-models
of the antenna, Figure 8 shows the linearized
sub-models obtained over the antenna time response.  The
entire nonlinear operating region of the antenna has been
subdivided into a number of local models, which can be
employed for subsequent control synthesis.  The antenna
response shows that fuzzy models are useful for describing
the antenna dynamics where
the underlying physical mechanisms are not completely
known and the antenna behavior is understood only in
qualitative terms.  An important property of fuzzy
models is their capability to represent nonlinear dynamic
systems.  The obtained fuzzy sub-models can therefore
also be applied to systems that are well understood but,
because of their nonlinearities, are intractable with standard
linear methods.  The rule-based structure of fuzzy models allows for
integrating heuristic knowledge with information obtained

Rule 1: IF $T_\varphi(k-1)$ is $F_{11}$ and $T_\varphi(k-2)$ is $F_{12}$ and $T_\psi(k-1)$ is $F_{13}$ and $\varphi(k-1)$ is $F_{14}$ and $\varphi(k-2)$ is $F_{15}$ and $\psi(k-1)$ is $F_{16}$ and $\psi(k-2)$ is $F_{17}$ THEN $\dot{x}(t) = A_1 x(t) + B_1 u(t)$ and $y(t) = C_1 x(t)$.

Rule 2: IF $T_\varphi(k-1)$ is $F_{21}$ and $T_\varphi(k-2)$ is $F_{22}$ and $T_\psi(k-1)$ is $F_{23}$ and $\varphi(k-1)$ is $F_{24}$ and $\varphi(k-2)$ is $F_{25}$ and $\psi(k-1)$ is $F_{26}$ and $\psi(k-2)$ is $F_{27}$ THEN $\dot{x}(t) = A_2 x(t) + B_2 u(t)$ and $y(t) = C_2 x(t)$.

Rule 3: IF $T_\varphi(k-1)$ is $F_{31}$ and $T_\varphi(k-2)$ is $F_{32}$ and $T_\psi(k-1)$ is $F_{33}$ and $\varphi(k-1)$ is $F_{34}$ and $\varphi(k-2)$ is $F_{35}$ and $\psi(k-1)$ is $F_{36}$ and $\psi(k-2)$ is $F_{37}$ THEN $\dot{x}(t) = A_3 x(t) + B_3 u(t)$ and $y(t) = C_3 x(t)$.

 
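where the entries of $A_1$, $A_2$ and $A_3$ are listed by magnitude: in the source, four entries of $A_1$, five of $A_2$ and four of $A_3$ carry a minus sign, but the extracted text does not preserve which ones.

\[ A_1 = \begin{bmatrix}
0.09510 & 0.4265 & 0.0300 & 0.2998 & 0.0116 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0.0348 & 0.0273 & 0.8022 & 0.03315 & 0.013 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \quad
B_1 = \begin{bmatrix}
0.0128 & 0 \\ 0 & 0 \\ 0 & 0.0240 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1
\end{bmatrix} \]

\[ A_2 = \begin{bmatrix}
0.9354 & 0.4667 & 0.0525 & 0.0475 & 0.0119 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0.0373 & 0.0384 & 0.0844 & 0.03715 & 0.002 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \quad
B_2 = \begin{bmatrix}
0.0128 & 0 \\ 0 & 0 \\ 0 & 0.0137 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1
\end{bmatrix} \]

\[ A_3 = \begin{bmatrix}
0.9020 & 0.4099 & 0.0142 & 0.0157 & 0.0108 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0.1294 & 0.1098 & 0.7887 & 0.02987 & 0.002 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \quad
B_3 = \begin{bmatrix}
0.0118 & 0 \\ 0 & 0 \\ 0 & 0.0028 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1
\end{bmatrix} \]

\[ C_1 = C_2 = C_3 = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}, \quad
D_1 = D_2 = D_3 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]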



 
Figure 6.  Fuzzy model responses (azimuth and elevation angles, in rad, versus time in s) compared to the antenna outputs

 
Figure 7.  Cross-correlation of the first system input with its associated error for the antenna system (panels versus lag: torque 1 with error, torque 2 with its associated error)


from antenna measurements.  The global operation of the
nonlinear antenna process is divided into several local
operating regions.  Within each region R_i, a reduced-order
linear model in ARMAX form is used to represent the
local antenna behavior.  This choice is not restrictive; any
other appropriate model form can also be used.
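The cross-correlation check used in this validation can be sketched as below. The sketch assumes a normalised cross-correlation estimate and the common large-sample 95 % band of ±1.96/√N for uncorrelated signals; the paper does not state which estimator or confidence level was used, and the function names and data are hypothetical.

```python
import math


def cross_correlation(e, u, max_lag):
    """Normalised cross-correlation r_eu(tau) for tau = -max_lag..max_lag."""
    n = len(e)
    if len(u) != n:
        raise ValueError("signals must have the same length")
    me, mu = sum(e) / n, sum(u) / n
    de = [v - me for v in e]  # mean-removed residual
    du = [v - mu for v in u]  # mean-removed input
    scale = math.sqrt(sum(v * v for v in de) * sum(v * v for v in du))
    if scale == 0.0:
        raise ValueError("signals must not be constant")
    r = {}
    for tau in range(-max_lag, max_lag + 1):
        s = 0.0
        for k in range(n):
            j = k + tau
            if 0 <= j < n:
                s += de[k] * du[j]
        r[tau] = s / scale
    return r


def confidence_bound(n):
    """Approximate 95 % bound for uncorrelated signals of length n."""
    return 1.96 / math.sqrt(n)


# Made-up input and residual sequences, used only to exercise the functions.
u = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
e = [0.1, 0.0, -0.1, 0.05, 0.0, -0.05, 0.1, -0.1]
r_eu = cross_correlation(e, u, max_lag=3)
bound = confidence_bound(len(e))
inside = [abs(v) <= bound for v in r_eu.values()]
```

A model passes this test when the estimated correlation stays inside the band at every lag, which is the behavior described for Figure 7.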

8.  Conclusions

This article has concentrated on the modeling of nonlinear
dynamic systems using the well-known Takagi-Sugeno
(T-S) fuzzy modeling paradigm.  T-S models depend heavily
on the initial membership centers of the universes of
discourse of the fuzzy variables used; these centers have been
obtained by a clustering algorithm.  Once the centers are
computed, the fuzzy system establishes its initial membership
functions, which are then updated through a neural
network learning mechanism.  One advantage of
T-S modeling is that systems can be modeled with few
rules, and consequently few linear sub-models, which
overcomes the problem of the large number of rules in
fuzzy modeling.  The fuzzy models have also been verified
and validated using standard validation techniques, which
showed that T-S techniques can model nonlinear systems
with a good degree of accuracy.

References

Akkizidis, S. and Roberts, N., 2001, "Fuzzy Clustering
Methods for Identifying and Modeling of Non-Linear
Control Strategies," Proceedings of the Institution of
Mechanical Engineers, Part I: Journal of Systems and
Control Engineering, Vol. l(215), pp. 437-452.

Babuska, R. and Verbruggen, H., 1995, "Identification of
Composite Linear Models via Fuzzy Clustering,"
Proceedings of the European Control Conference,
Italy, pp. 1207-1212.

Bologna, G., 2001,  "A New Neuro-Fuzzy Model,"
Proceedings of the International Joint Conference on
Neural Networks (IJCNN'01),  Vol. 2,  Washington-DC
(USA),  pp. 1328-1333.

Bossley,  K., 1997, "Neuro-Fuzzy Modeling Approaches
in System Identification," Ph.D. Thesis, University of
Southampton, U.K.

Gorzalczany, E.,  Marian, B. and Gluszek, A., 2000,
"Neuro-Fuzzy Networks in Time Series Modeling,"
Proceedings of the International Conference on
Knowledge-Based  Intelligent  Electronic  Systems, Vol.
1,  Brighton, (UK),  pp. 450-453.

Ikonen, E.,  Najim, K. and  Kortela, U., 2000, "Neuro-
Fuzzy  Modelling of Power Plant Flue-Gas Emissions,"
J. of Engineering Applications of Artificial Intelligence,
Vol. 3(6),  pp. 705-717.

Lo, J. and Chen, Y., 1999, "Stability Issues on Takagi-
Sugeno Fuzzy Model, Parametric Approach," IEEE
Transactions on Fuzzy Systems, Vol. 7(5), pp. 597-607.

Mamdani, E. and Assilian, S., 1975,  "An   Experiment in
Linguistic Synthesis  with  a   Fuzzy  Logic Controller,"
International J. of Man-Machine Studies,  Vol. 3(5),
pp. 1-13.

Min-You, Chen.  and Linkens, A., 1998,   "Fast Fuzzy
Modeling Approach Using Clustering Neural
Networks,"   IEEE   International   Conference on
Fuzzy Systems,  Vol. l(2), IEEE World Congress on
Computational Intelligence,  pp. 1088-1093.

Figure 8.  Obtained linearized antenna sub-models


Ning, L.,  Shaoyuan, L.  and Yugeng, X., 2001,
"Modeling pH Neutralization Processes Using Fuzzy
Satisfactory Clustering," IEEE International
Conference on Fuzzy Systems,  Vol. l(1),   pp. 308-311.

Takagi, T. and Sugeno, M., 1985, "Fuzzy Identification of
Systems and its Application to Modeling and Control,"
IEEE Transactions on Systems, Man, and Cybernetics, Vol.
15(1), pp. 116-132.

Wang, H., Li, J., Niemann, D. and Tanaka, K., 2000, "T-S
Fuzzy Model with Linear Rule Consequence and
PDC Controller: A Universal Framework for Nonlinear
Control Systems," Proceedings of the 9th IEEE
International Conference on Fuzzy Systems.

Zhang, J. and  Knoll, A., 1999,  "Modelling Multivariate
Data by Neuro-Fuzzy Systems",  Proceedings of the
1999 IEEE/IAE Conference on Computational
Intelligence for Financial Engineering (CIFEr), New
York, (USA),   pp. 267-270.
