RATIO MATHEMATICA
ISSUE N. 30 (2016) pp. 45-58

ISSN (print): 1592-7415
ISSN (online): 2282-8214

Dealing with randomness and vagueness in business and management sciences: the fuzzy-probabilistic approach as a tool for the study of statistical relationships between imprecise variables

Fabrizio Maturo
Department of Management and Business Administration

University G. d’Annunzio, Chieti - Pescara

f.maturo@unich.it

Abstract

In practical applications relating to business and management sciences,
there are many variables that, by their very nature, are better described by
a pair of ordered values (e.g. financial data). Summarizing such a measurement
with a single value entails a loss of information; thus, in these
situations, data are better described by interval values rather than by single
values. Interval arithmetic studies and analyzes this type of imprecision;
however, if the intervals have no sharp boundaries, fuzzy set theory is the
most suitable instrument. Moreover, fuzzy regression models are able to
overcome some typical limitations of classical regression because they do
not need the same strong assumptions. In this paper, we present a review
of the main methods introduced in the literature on this topic and introduce
some recent developments regarding the concept of randomness in fuzzy regression.

Keywords: fuzzy data; fuzzy regression; fuzzy random variable; tools
for business and management sciences

2010 AMS subject classifications: 62J05; 62J86; 03B52; 62A86; 97M10

doi: 10.23755/rm.v30i1.8


1 Introduction
Regression analysis offers a possible solution to study the dependence between
two sets of variables. Standard classical statistical linear regressions take the form
[27]:

$$y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_j x_{ij} + \dots + b_P x_{iP} + u_i \tag{1}$$

where:

• i=1,.....,N is the i-th observed unit;

• j=1,...,P is the j-th observed variable;

• yi is the dependent variable, observed on N units;

• xij are the P independent variables observed on N units;

• b0 is the crisp intercept and bj are the P crisp coefficients of the P variables;

• ui are the random error terms that indicate the deviation of Y from the
model;

• yi, xij, bj, ui are all crisp values.

In the classical regression model it is assumed that:

• $E(u_i) = 0$

• $\sigma^2_{u_i} = \sigma^2$

• $\sigma_{u_i,u_j} = 0 \;\; \forall\, i, j$ with $i \neq j$

In matrix form, the classical regression model is expressed as:

$$y = Xb + u \tag{2}$$

where $y = (y_1, y_2, \dots, y_N)'$, $b = (b_0, b_1, b_2, \dots, b_P)'$ and $u = (u_1, u_2, \dots, u_N)'$
are vectors and $X$ is the $N \times (P+1)$ matrix:

$$X = \begin{pmatrix}
1 & x_{11} & \cdots & x_{1P} \\
1 & x_{21} & \cdots & x_{2P} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{N1} & \cdots & x_{NP}
\end{pmatrix}$$


The aim of statistical regression is to find the set of unknown parameters so
that the model gives a good prediction of the dependent variable Y. The most
widely used regression model is the Multiple Linear Regression Model (MLRM),
and Ordinary Least Squares (OLS) [12] is the most widespread estimation
procedure. Under the OLS assumptions the estimators are BLUE (Best Linear
Unbiased Estimators), as stated by the Gauss-Markov theorem.

OLS is based on the minimization of the sum of squared deviations:

$$\min_{b} \; (y - Xb)'(y - Xb) \tag{3}$$

The optimal solution of the minimization problem is the following vector:

$$\hat{b} = (X'X)^{-1}X'y \tag{4}$$
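As a minimal sketch of the closed-form solution (4), the snippet below computes the OLS estimate with NumPy. The function name `ols` and the toy data are illustrative assumptions, not part of the original paper; the normal equations are solved directly rather than forming the inverse of X′X.

```python
import numpy as np

def ols(X, y):
    """Return the OLS estimate b_hat = (X'X)^{-1} X'y.

    X must already contain a leading column of ones for the intercept.
    """
    # Solve the normal equations (X'X) b = X'y instead of inverting X'X.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical data: 5 units, one regressor, generated exactly by y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

b_hat = ols(X, y)  # intercept first, then slope
```

With noise-free data the estimate reproduces the generating coefficients; with real data the same call returns the minimizer of (3).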

The OLS model is convenient, but its assumptions are very restrictive. Several
phenomena violate these assumptions, causing biased and inefficient estimators
[9]. In particular, the assumption $u \mid X \sim N(0, \sigma^2 I)$ is very strong and
is rarely satisfied in real phenomena. Moreover, in the case of "quasi"
multi-collinearity (many highly correlated explanatory variables), although the
OLS assumptions are not violated, there is a bad impact on the variance of $\hat{b}$.
In these circumstances the OLS estimators remain unbiased and efficient, but their
variance is large, making the estimates useless from a practical point of view.

The effects of quasi multi-collinearity are more evident when the sample
size is small [1]. The generally proposed solution consists in removing correlated
explanatory variables. This solution is unsatisfactory in many application fields
where the user would like to keep all variables in the model.

In general, we can observe that classical statistical regression has many useful
applications but runs into difficulties in the following situations [26]:

• Number of observations is inadequate (small data set);

• Difficulties verifying distribution assumptions;

• Vagueness in the relationship between input and output variables;

• Ambiguity of events or degree to which they occur;

• Inaccuracy and distortion introduced by linearization;

Furthermore, there are many variables that, by their very nature, are better
described by a pair of ordered values, like daily temperatures or financial data.
Summarizing such a measurement with a single value entails a loss of information.
In these situations data are better described by interval values rather than by single

values. Interval arithmetic studies and analyzes this type of imprecision; but if the
intervals have no sharp boundaries, fuzzy set theory is the better tool. In particular,
fuzzy regression models are able to overcome some typical limitations of classical
regression because they do not need the same strong assumptions. Furthermore,
some nuanced concepts that exist in economic and social sciences must be
treated with linguistic variables, which are by their nature imprecise
concepts.

2 Fuzzy Linear Regression Models (FLR)
There are two general ways, not mutually exclusive, to develop a fuzzy regression
model:

• Models where the relationship of the variables is fuzzy;

• Models where the variables themselves are fuzzy;

Therefore, fuzzy linear regression (FLR) can be classified into:

• Partially fuzzy linear regression (PFLR), that can be further divided into:

– PFLR with fuzzy parameters and crisp data;

– PFLR with fuzzy data and crisp parameters;

• Totally fuzzy linear regression (TFLR) where data and parameters are both
fuzzy.

Fuzzy Least Squares Regression (FLSR) is the approach closest to the traditional
statistical one. Following the Least Squares line of thought [13], the aim is to
minimize the distance between the observed and the estimated fuzzy data.

In the case of one independent variable, the model takes the form:

$$\tilde{y}_i = b_0 + b_1 \tilde{x}_i + \tilde{u}_i, \qquad i = 1, 2, \dots, N \tag{5}$$

where:

• i=1,.....,N is the i-th observed unit;

• yi is the dependent fuzzy variable, observed on N units;

• xi is the independent fuzzy variable, observed on N units;


• b0 and b1 are the crisp intercept and the crisp regression coefficient;

• ui are the fuzzy random error terms.

From a graphical point of view [26], the relation between output and input
variables can be represented as shown in Fig. 1.

Figure 1: Relation between output and input variables

In the case of several independent variables, the model takes the form:

$$\tilde{y}_i = b_0 + b_1 \tilde{x}_{i1} + b_2 \tilde{x}_{i2} + \dots + b_j \tilde{x}_{ij} + \dots + b_P \tilde{x}_{iP} + \tilde{u}_i \tag{6}$$

where:

• i=1,.....,N is the i-th observed unit;

• j=1,...,P is the j-th observed variable;

• yi is the dependent fuzzy variable, observed on N units;

• xij are the P independent fuzzy variables, observed on N units;

• b0 is the crisp intercept and bj are the P crisp regression coefficients
measured for the P fuzzy variables;

• ui are the fuzzy random error terms;

Limiting the reasoning to the first model, the error term can be expressed as
follows:

$$\tilde{u}_i = \tilde{y}_i - b_0 - b_1 \tilde{x}_i, \qquad i = 1, 2, \dots, N \tag{7}$$


Therefore, from a least squares perspective, the problem becomes:

$$\min_{b_0, b_1} \sum_{i=1}^{N} \left[\tilde{y}_i - b_0 - b_1 \tilde{x}_i\right]^2 \tag{8}$$

Many criteria for measuring this distance have been proposed over the years;
however, the two most common are:

• Diamond's approach;

• The compatibility measures approach.

2.1 FLSR using distance measures
Diamond's approach is also known as fuzzy least squares regression using
distance measures. This is the approach closest to the traditional statistical
one. Following the Least Squares line of thought, the aim is to minimize the
distance between the observed and the estimated fuzzy data, by minimizing the
output quadratic error of the model. Since the model contains fuzzy numbers, the
minimization problem considers distances between fuzzy numbers [5, 17, 20, 15,
19, 18].

Diamond defined an $L_2$-metric between two triangular fuzzy numbers; it measures
the distance between two fuzzy numbers based on their modes, left spreads
and right spreads as follows:

$$d[(c_1, l_1, r_1), (c_2, l_2, r_2)]^2 = (c_1 - c_2)^2 + [(c_1 - l_1) - (c_2 - l_2)]^2 + [(c_1 + r_1) - (c_2 + r_2)]^2 \tag{9}$$
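As a small illustration of (9), the function below evaluates the squared Diamond distance for two triangular fuzzy numbers given as (mode, left spread, right spread) triples. The function name and the numbers are illustrative assumptions.

```python
def diamond_distance_sq(a, b):
    """Squared Diamond distance (9) between triangular fuzzy numbers.

    a and b are (mode, left spread, right spread) triples.
    """
    c1, l1, r1 = a
    c2, l2, r2 = b
    return ((c1 - c2) ** 2                      # distance between modes
            + ((c1 - l1) - (c2 - l2)) ** 2      # distance between left endpoints
            + ((c1 + r1) - (c2 + r2)) ** 2)     # distance between right endpoints

# Hypothetical example: (2, 1, 1) versus (3, 1, 2)
# modes: (2 - 3)^2 = 1, left endpoints: (1 - 2)^2 = 1, right endpoints: (3 - 5)^2 = 4
d2 = diamond_distance_sq((2.0, 1.0, 1.0), (3.0, 1.0, 2.0))  # 6.0
```

The distance is zero only when mode and both endpoints coincide, which is what makes it usable as a least squares criterion.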

The methods of Diamond are rigorously justified by a projection-type theorem
for cones on a Banach space containing the cone of triangular fuzzy numbers,
where a Banach space is a normed vector space that is complete as a metric space
under the metric d(x,y) = ||x−y|| induced by the norm [25].

In the case of crisp coefficients and fuzzy variables, the problem is the following:

$$\min_{b_0, b_1} \sum_{i=1}^{N} d[\tilde{y}_i^{*}, \tilde{y}_i]^2 \tag{10}$$

where

$$\tilde{y}_i^{*} = b_0 + b_1 \tilde{x}_i \tag{11}$$


Figure 2: Compatibility measure

therefore the optimization problem can be written as follows:

$$\min_{b_0, b_1} \sum_{i=1}^{N} d[b_0 + b_1 \tilde{x}_i, \tilde{y}_i]^2 \tag{12}$$

Using Diamond's distance in this minimization problem, we can obtain the
parameters. If a solution exists, it is found by solving a system of six equations
in six unknowns, which arise from setting the derivatives of the objective
function equal to zero.
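For the simpler crisp-coefficient model (11), and under the additional assumption that b1 ≥ 0 (so that the left and right spreads of b1·x̃ are b1 times those of x̃), objective (12) reduces to an ordinary least squares problem on the stacked modes, left endpoints and right endpoints. The sketch below illustrates only this special case with two unknowns; it is not the six-equation system mentioned above, and the function name and data are hypothetical.

```python
import numpy as np

def fit_crisp_coefficients(x, y):
    """Minimize objective (12) for triangular data, assuming b1 >= 0.

    x, y: arrays of shape (N, 3) with columns (mode, left spread, right spread).
    Returns the crisp coefficients (b0, b1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def stack(t):
        # Modes, left endpoints and right endpoints in one long vector.
        c, l, r = t[:, 0], t[:, 1], t[:, 2]
        return np.concatenate([c, c - l, c + r])

    xs, ys = stack(x), stack(y)
    design = np.column_stack([np.ones_like(xs), xs])
    (b0, b1), *_ = np.linalg.lstsq(design, ys, rcond=None)
    if b1 < 0:
        # With a negative slope the spreads swap roles and this reduction is invalid.
        raise ValueError("estimated b1 is negative; the b1 >= 0 assumption fails")
    return float(b0), float(b1)

# Hypothetical noise-free data generated by y = 1 + 2x.
x_data = np.array([[1, 0.5, 0.5], [2, 0.5, 1.0], [3, 1.0, 1.0], [4, 1.0, 0.5]])
y_data = np.column_stack([1 + 2 * x_data[:, 0], 2 * x_data[:, 1], 2 * x_data[:, 2]])
b0, b1 = fit_crisp_coefficients(x_data, y_data)
```

Stacking works because, for b1 ≥ 0, each of the three squared terms in (9) is a squared residual of the same crisp line b0 + b1·t evaluated at the mode or at an endpoint of x̃.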

2.2 FLSR using compatibility measures

The second type of fuzzy least squares regression model is based on Celmins's
compatibility measures [3]. A compatibility measure can be defined by

$$\gamma(\tilde{A}, \tilde{B}) = \max_{x} \min\left(\mu_{\tilde{A}}(x), \mu_{\tilde{B}}(x)\right) \tag{13}$$

This index takes values in the interval [0,1], as shown in Fig. 2. A value of 0
means that the membership functions of the fuzzy numbers A and B are mutually
exclusive, as shown in Fig. 3. A value of 1 means that the two membership functions
attain their maximum at a common point, as happens when the membership function
of A coincides with that of B (Fig. 4).
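The compatibility index (13) can be approximated numerically. The sketch below assumes triangular fuzzy numbers with strictly positive spreads, written as (center, left spread, right spread), and evaluates the max-min on a grid; the function names and the grid size are illustrative choices.

```python
import numpy as np

def triangular_membership(x, center, left, right):
    """Membership of a triangular fuzzy number with positive spreads."""
    x = np.asarray(x, dtype=float)
    rising = 1.0 - (center - x) / left     # branch used for x <= center
    falling = 1.0 - (x - center) / right   # branch used for x > center
    return np.clip(np.where(x <= center, rising, falling), 0.0, 1.0)

def compatibility(a, b, n_grid=2001):
    """Grid approximation of gamma(A, B) = max_x min(mu_A(x), mu_B(x))."""
    lo = min(a[0] - a[1], b[0] - b[1])
    hi = max(a[0] + a[2], b[0] + b[2])
    # Both centers are added to the grid so that coinciding centers give exactly 1.
    grid = np.concatenate([np.linspace(lo, hi, n_grid), [a[0], b[0]]])
    mu_a = triangular_membership(grid, *a)
    mu_b = triangular_membership(grid, *b)
    return float(np.max(np.minimum(mu_a, mu_b)))

same = compatibility((0.0, 1.0, 1.0), (0.0, 1.0, 1.0))      # identical numbers
disjoint = compatibility((0.0, 1.0, 1.0), (5.0, 1.0, 1.0))  # non-overlapping supports
partial = compatibility((0.0, 1.0, 1.0), (1.0, 1.0, 1.0))   # sides cross near x = 0.5
```

The three calls correspond to the situations of Fig. 4, Fig. 3 and Fig. 2 respectively.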

The basic idea is to maximize the overall compatibility between data and
model. Thus, the objective may be reformulated as a minimization problem with
the following objective function:


Figure 3: Zero compatibility

Figure 4: Max compatibility


$$\min \sum_{i=1}^{N} \left[1 - \gamma_i\right]^2 \tag{14}$$

3 Fuzzy regression models with fuzzy random variables

Recent studies have reintroduced the concept of Fuzzy Random Variables
(FRVs) [24], first introduced by Puri and Ralescu [23]. The need for FRVs arises
when the data are affected not only by imprecision but also by randomness [11].
Several papers deal with this topic, which is called the fuzzy-probabilistic approach. It
consists in explicitly taking into account randomness for estimating the regression
parameters and assessing their statistical properties [22, 7, 8].

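The membership function of a fuzzy number can be expressed, in terms of its
spreads, as:

$$\mu_{\tilde{A}}(x) = \begin{cases}
L\left(\dfrac{A_m - x}{A_l}\right) & \text{for } x \le A_m,\ A_l > 0 \\
\mathbf{1}_{\{A_m\}}(x) & \text{for } x \le A_m,\ A_l = 0 \\
R\left(\dfrac{x - A_m}{A_r}\right) & \text{for } x > A_m,\ A_r > 0 \\
0 & \text{for } x > A_m,\ A_r = 0
\end{cases} \tag{15}$$

where $\mathbf{1}_{\{A_m\}}$ is the indicator function of the point $A_m$, the functions
$L, R : \mathbb{R} \to [0, 1]$ are convex upper semi-continuous
functions such that $L(0) = R(0) = 1$ and $L(z) = R(z) = 0$ for all $z \in \mathbb{R} \setminus [0, 1]$ [6],
$A_m$ is the center, and $A_l$ and $A_r$ are the left and the right spread. Of course, these
functions must be chosen by the researcher in advance and must be the same for
all the data.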

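In particular, for a triangular fuzzy number we obtain:

$$\mu_{\tilde{A}}(x) = \begin{cases}
0 & \text{for } x \le A_m - A_l \\
1 - \dfrac{A_m - x}{A_l} & \text{for } A_m - A_l \le x \le A_m \\
1 - \dfrac{x - A_m}{A_r} & \text{for } A_m \le x \le A_m + A_r \\
0 & \text{for } x \ge A_m + A_r
\end{cases} \tag{16}$$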

A distance for these fuzzy numbers [21] could be:

$$D^2(\tilde{A}, \tilde{B}) = (A_m - B_m)^2 + [(A_m - \lambda A_l) - (B_m - \lambda B_l)]^2 + [(A_m + \rho A_r) - (B_m + \rho B_r)]^2 \tag{17}$$

where,


$$\lambda = \int_0^1 L^{-1}(\alpha)\, d\alpha, \qquad \rho = \int_0^1 R^{-1}(\alpha)\, d\alpha$$

These constants take into account the shape of the membership functions; for example,
for triangular fuzzy numbers λ = ρ = 1/2.
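The distance (17) is easy to evaluate once λ and ρ are fixed. The sketch below uses the triangular defaults λ = ρ = 1/2, which follow from L(z) = R(z) = 1 − z on [0, 1], so that the inverse is 1 − α and its integral over [0, 1] is 1/2. The function name and the example numbers are illustrative assumptions.

```python
def lr_distance_sq(a, b, lam=0.5, rho=0.5):
    """Squared distance (17) between two LR fuzzy numbers.

    a and b are (center, left spread, right spread) triples;
    lam and rho default to the triangular case, lambda = rho = 1/2.
    """
    am, al, ar = a
    bm, bl, br = b
    return ((am - bm) ** 2
            + ((am - lam * al) - (bm - lam * bl)) ** 2
            + ((am + rho * ar) - (bm + rho * br)) ** 2)

# Hypothetical example: (2, 1, 1) versus (3, 1, 2)
# centers: (2 - 3)^2 = 1, left term: (1.5 - 2.5)^2 = 1, right term: (2.5 - 4)^2 = 2.25
d2_lr = lr_distance_sq((2.0, 1.0, 1.0), (3.0, 1.0, 2.0))  # 4.25
```

Compared with the Diamond distance (9), the spreads enter with weights λ and ρ instead of weight one, so shape information from L and R is reflected in the metric.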

To guarantee the non-negativity of the spreads of $\tilde{Y}$, it is possible
to solve a non-negative regression problem [14], or to transform the spreads of $\tilde{Y}$
and model the transformed spreads by means of the centers and the spreads of the
P regressors X. In this context, we use the latter method, introducing two
invertible functions [7]:

$$g : (0, +\infty) \to \mathbb{R}, \qquad h : (0, +\infty) \to \mathbb{R}$$

Thus the linear regression model takes the form

$$\begin{cases}
Y_m = x\, b_m' + a_m + u_m \\
g(Y_l) = x\, b_l' + a_l + u_l \\
h(Y_r) = x\, b_r' + a_r + u_r
\end{cases} \tag{18}$$

where $u_l$, $u_m$, $u_r$ are real-valued random variables with $E(u_l \mid x) = 0$,
$E(u_m \mid x) = 0$, $E(u_r \mid x) = 0$.

The row vector of length 3P of all the components of the regressors is:

$x = (x_{m1}, x_{l1}, x_{r1}, \dots, x_{mP}, x_{lP}, x_{rP})$

The row vectors of length 3P of the parameters related to x are:

$b_m = (b_{mm1}, b_{ml1}, b_{mr1}, \dots, b_{mmP}, b_{mlP}, b_{mrP})$

$b_l = (b_{lm1}, b_{ll1}, b_{lr1}, \dots, b_{lmP}, b_{llP}, b_{lrP})$

$b_r = (b_{rm1}, b_{rl1}, b_{rr1}, \dots, b_{rmP}, b_{rlP}, b_{rrP})$

The generic element $b_{ijt}$ is the regression coefficient between the component
$i \in \{m, l, r\}$ of $\tilde{Y}$, where m, l, r refer to the center and the transformed spreads of $\tilde{Y}$, and
the component $j \in \{m, l, r\}$ of the regressor $\tilde{x}_t$ with t = 1, ..., P, where m, l, r refer to

the corresponding center, left spread and right spread. For example, $b_{mr2}$ is the
coefficient relating the right spread of $\tilde{x}_2$ to the center of $\tilde{Y}$. Of course, $a_m$, $a_l$
and $a_r$ are the intercepts.

The covariance matrix of x is denoted by:

$$\Sigma(x) = E[(x - E(x))'(x - E(x))] \tag{19}$$

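The covariance matrix of $u_m$, $u_l$, $u_r$ is denoted by $\Sigma$ and contains the
variances $\sigma^2_{u_m}$, $\sigma^2_{u_l}$ and $\sigma^2_{u_r}$.
The regression parameters can be expressed as:

$$\begin{aligned}
b_m' &= [\Sigma(x)]^{-1} E[(x - E(x))'(Y_m - E(Y_m))] \\
b_l' &= [\Sigma(x)]^{-1} E[(x - E(x))'(g(Y_l) - E(g(Y_l)))] \\
b_r' &= [\Sigma(x)]^{-1} E[(x - E(x))'(h(Y_r) - E(h(Y_r)))] \\
a_m &= E(Y_m) - E(x)\,[\Sigma(x)]^{-1} E[(x - E(x))'(Y_m - E(Y_m))] \\
a_l &= E(g(Y_l)) - E(x)\,[\Sigma(x)]^{-1} E[(x - E(x))'(g(Y_l) - E(g(Y_l)))] \\
a_r &= E(h(Y_r)) - E(x)\,[\Sigma(x)]^{-1} E[(x - E(x))'(h(Y_r) - E(h(Y_r)))]
\end{aligned} \tag{20}$$

The intercepts follow from taking expectations on both sides of each equation in (18).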
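A sample version of these expressions can be sketched by replacing expectations with sample means. The code below is only an illustration under stated assumptions: the transformations g and h are taken to be the natural logarithm (the text only requires them to be invertible), and the function name and the dictionary layout of the result are hypothetical.

```python
import numpy as np

def fit_transformed_model(x, y_m, y_l, y_r, g=np.log, h=np.log):
    """Sample-moment estimates of (a, b) for the three equations of model (18).

    x   : (N, 3P) array, one row of regressor components per unit
    y_m : centers of the response; y_l, y_r : strictly positive spreads
    g, h: invertible spread transformations (log is an assumption made here)
    """
    x = np.asarray(x, dtype=float)
    x_mean = x.mean(axis=0)
    xc = x - x_mean                       # x - E(x)
    sigma_x = xc.T @ xc / len(x)          # sample covariance matrix of x
    targets = {"m": np.asarray(y_m, dtype=float),
               "l": g(np.asarray(y_l, dtype=float)),
               "r": h(np.asarray(y_r, dtype=float))}
    estimates = {}
    for name, t in targets.items():
        cov_xt = xc.T @ (t - t.mean()) / len(x)   # sample E[(x - E(x))'(t - E(t))]
        b = np.linalg.solve(sigma_x, cov_xt)      # slope vector
        a = t.mean() - x_mean @ b                 # intercept
        estimates[name] = (a, b)
    return estimates
```

Calling `fit_transformed_model(x, y_m, y_l, y_r)` returns, for each of the keys "m", "l" and "r", the pair (intercept, coefficient vector); fitted spreads are recovered by applying the inverse transformation (here the exponential) to the fitted values.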

Since the total variation of the response can be written in terms of variances
and covariances of real random variables, it can be decomposed into the variation
not explained by the model and the variation explained by the model. Thus, we can obtain
a determination coefficient for the fuzzy model based on the decomposition of the
total variance given by:

$$E[D^2(Y_t, E(Y_t))] = E[D^2(Y_t, E(Y_t \mid x))] + E[D^2(E(Y_t \mid x), E(Y_t))] \tag{21}$$

Therefore, the linear determination coefficient $R^2$ can be defined as:

$$R^2 = \frac{E[D^2(E(Y_t \mid x), E(Y_t))]}{E[D^2(Y_t, E(Y_t))]} = 1 - \frac{E[D^2(Y_t, E(Y_t \mid x))]}{E[D^2(Y_t, E(Y_t))]} \tag{22}$$

The meaning of this index is the same as in the classical regression model. The
estimation problem of the regression parameters is addressed by means of the least
squares (LS) criterion. As shown in [6], applying the appropriate substitutions and
using the concept of distance between two fuzzy numbers, as in Diamond's approach, it
is possible to find the equation of the estimators of all parameters.
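A sample analogue of (21) and (22) can be sketched as follows. The code assumes, as the text does not spell out, that $Y_t$ denotes the triple (Y_m, g(Y_l), h(Y_r)), that expectations are replaced by sample averages, that E(Y_t | x) is estimated by one least squares fit per component, and that λ = ρ = 1/2; all function names are hypothetical.

```python
import numpy as np

def d2(u, v, lam=0.5, rho=0.5):
    """Squared distance (17) applied row-wise to (N, 3) arrays of triples."""
    dm = u[:, 0] - v[:, 0]
    dl = u[:, 1] - v[:, 1]
    dr = u[:, 2] - v[:, 2]
    return dm ** 2 + (dm - lam * dl) ** 2 + (dm + rho * dr) ** 2

def fuzzy_r_squared(x, t):
    """Return both sample forms of R^2 in (22).

    x: (N, K) regressor components; t: (N, 3) transformed response triples.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, t, rcond=None)  # one OLS fit per column of t
    fitted = design @ coef                             # sample version of E(Y_t | x)
    mean = np.broadcast_to(t.mean(axis=0), t.shape)    # sample version of E(Y_t)
    total = d2(t, mean).sum()
    unexplained = d2(t, fitted).sum()
    explained = d2(fitted, mean).sum()
    return 1.0 - unexplained / total, explained / total
```

Because the residuals of each component are orthogonal to the intercept and to every regressor, the cross terms of the quadratic form vanish and the two returned values coincide up to rounding, which mirrors the decomposition (21).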


4 Conclusions
Fuzzy regression models are able to overcome some limitations of classical
regression because they do not need the same strong assumptions. In this paper,
we have presented a review of the main methods introduced in the literature on
this topic and some recent developments regarding the concept of randomness in
fuzzy regression. In practical applications relating to business and management
sciences, fuzzy regression models with fuzzy random variables are better suited
to the characteristics of the data. However, the use of Zadeh's operations in
these models raises some issues: the addition and the multiplication
of fuzzy numbers lead to a considerable increase of the spreads; the
multiplication of two symmetric fuzzy numbers does not yield a symmetric fuzzy
number, or even a fuzzy number with equal spreads; the spreads of Zadeh's product
depend heavily on the modes of the numbers; some important algebraic properties,
such as the distributive property, are valid only in particular circumstances;
the product of two triangular fuzzy numbers is not a triangular fuzzy
number. Therefore, alternative operations that overcome the problems
connected to the addition and the product of fuzzy numbers in fuzzy linear
regression models are needed. Moreover, our research prospects include
considering finite geometric spaces [16, 2], multivalued functions [4] and
algebraic hyperoperations [10] in fuzzy regression models.

References
[1] Achen, C., 1982. Interpreting and using regression. Sage Publications,
California.

[2] Ameri, R., Nozari, T., 2012. Fuzzy hyperalgebras and direct product, Ratio
Mathematica. 24.

[3] Celmins, A., 1987. Least squares model fitting to fuzzy vector data. Fuzzy
Sets and Systems 22(3), 245–269.

[4] Corsini, P., Mahjoob, R., 2010. Multivalued functions, Fuzzy subsets and
Join spaces. Ratio Mathematica 20, 1–41.

[5] Diamond, P., 1988. Fuzzy least squares. Information Sciences 46 (3), 141–
157.

[6] Ferraro, M., Colubi, A., Gonzales-Rodriguez, A., Coppi, R., 2010. A linear
regression model for imprecise response. Int J Approx Reason 21, 759–770.


[7] Ferraro, M., Colubi, A., Gonzales-Rodriguez, A., Coppi, R., 2011. A
determination coefficient for a linear regression model with imprecise response.
Environmetrics 22, 516–529.

[8] Gonzales-Rodriguez, A., Blanco, A., Colubi, A., Lubiano, M., 2009.
Estimation of a simple linear regression model for fuzzy random variables.
Fuzzy Sets Syst 160, 357–370.

[9] Gujarati, D., 2003. Basic Econometrics. McGraw-Hill, New York.

[10] Hošková-Mayerová, Š., Maturo, A., 2016. Fuzzy sets and algebraic
hyperoperations to model interpersonal relations, in:
Recent Trends in Social Systems: Quantitative Theories and
Quantitative Models. Springer Nature, pp. 211–221. URL:
http://dx.doi.org/10.1007/978-3-319-40585-8_19,
doi:10.1007/978-3-319-40585-8_19.

[11] Klir, G., 2006. Uncertainty and information: foundations of generalized
information theory. Wiley, New York.

[12] Kratschmer, V., 2006a. Strong consistency of least-squares estimation in
linear regression models with vague concepts. J Multivar Anal 97, 633–654.

[13] Kratschmer, V., 2006b. Strong consistency of least-squares estimation in
linear regression models with vague concepts. J Multivar Anal 97, 1044–
1069.

[14] Lawson, C., R.J. Hanson, 1995. Solving least squares problems. Classics in
applied mathematics 15.

[15] Maturo, A., Maturo, F., 2013. Research in social sciences: Fuzzy regression
and causal complexity, in: Multicriteria and Multiagent Decision Making
with Applications to Economics and Social Sciences. Springer, pp. 237–249.
URL: http://dx.doi.org/10.1007/978-3-642-35635-3_18,
doi:10.1007/978-3-642-35635-3_18.

[16] Maturo, A., Maturo, F., 2014. Finite geometric spaces,
Steiner systems and cooperative games. Analele Universitatii
"Ovidius" Constanta - Seria Matematica 22. URL:
http://dx.doi.org/10.2478/auom-2014-0015,
doi:10.2478/auom-2014-0015.


[17] Maturo, A., Maturo, F., 2016. Fuzzy events, fuzzy probability
and applications in economic and social sciences, in:
Recent Trends in Social Systems: Quantitative Theories and
Quantitative Models. Springer Nature, pp. 223–233. URL:
http://dx.doi.org/10.1007/978-3-319-40585-8_20,
doi:10.1007/978-3-319-40585-8_20.

[18] Maturo, F., 2016. FUZZINESS: TEORIE E APPLICAZIONI. Aracne
Editrice, Roma, Italy. chapter LA REGRESSIONE FUZZY. pp. 99–110.

[19] Maturo, F., Fortuna, F., 2016. Bell-shaped fuzzy numbers
associated with the normal curve, in: Topics on Methodological
and Applied Statistical Inference. Springer. URL:
http://dx.doi.org/10.1007/978-3-319-44093-4_13,
doi:10.1007/978-3-319-44093-4_13.

[20] Maturo, F., Hošková-Mayerová, Š., 2016. Fuzzy regression models
and alternative operations for economic and social sciences,
in: Recent Trends in Social Systems: Quantitative Theories
and Quantitative Models. Springer Nature, pp. 235–247. URL:
http://dx.doi.org/10.1007/978-3-319-40585-8_21,
doi:10.1007/978-3-319-40585-8_21.

[21] M.S. Yang, C.K., 1996. On a class of fuzzy c-numbers clustering procedures
for fuzzy data. Fuzzy Sets and Syst 84, 49–60.

[22] Nather, W., 2006. Regression with fuzzy random data. Comput Stat Data
Anal 51, 235–252.

[23] Puri, M., Ralescu, D., 1986. Fuzzy random variables. J Math Anal Appl
114, 409–422.

[24] Ramos-Guajardo, A., Colubi, A., Gonzales-Rodriguez, A., 2010. One-sample
tests for a generalized Fréchet variance of a fuzzy random variable.
Metrika 71, 185–202.

[25] Shapiro, A., 2004. Fuzzy regression and the term structure of interest rates
revisited. Proceedings of the 14th International AFIR Colloquium Vol.1,
29–45.

[26] Shapiro, A., 2005. Fuzzy regression models. ARC.

[27] Stock, J., Watson, M., 2009. Introduction to econometrics. Pearson Addison
Wesley.
