INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
ISSN 1841-9836, 13(3), 383-390, June 2018.

Reduction of Conditional Factors in Causal Analysis

H. Liu, I. Dzitac, S. Guo

Haitao Liu*, Sicong Guo
1. Institute of Intelligence Engineering and Mathematics, Liaoning Technical University, Fuxin 123000, China
2. College of Science, Liaoning Technical University, Fuxin 123000, China
*Corresponding author: liuhaitao@lntu.edu.cn

Ioan Dzitac
1. Aurel Vlaicu University of Arad, 310330 Arad, Elena Dragoi, 2, Romania, ioan.dzitac@uav.ro
2. Agora University of Oradea, 410526 Oradea, P-ta Tineretului 8, Romania, idzitac@univagora.ro

Abstract: Faced with the great number of conditional factors in big-data causal analysis, the reduction algorithm put forward in this paper can reasonably reduce their number. Compared with previous reduction methods, we take into consideration not only the influence of the conditional factors on the result factor, but also the relationships among the conditional factors themselves. The basic idea of the proposed algorithm is to build the matrix of mutual deterministic degrees between conditional factors. If a conditional factor f has a high deterministic degree with respect to another conditional factor h, we delete h; if, instead, h has the higher deterministic degree with respect to f, we delete f. With this reduction, we ensure that the conditional factors participating in causal analysis are as irrelevant to each other as possible, which is a reasonable requirement for causal analysis.
Keywords: factor space, causal analysis, reduction of factors, fuzzy logic.

1 Introduction

Causal analysis in factor space [18] was proposed in [22]; it extracts causal rules from the background distribution between a group of factors. This is the original methodology provided by factor space theory for machine learning, classification, decision-making, and so on.
The paper [13] applies those causal rules to causal reasoning, and [17] improves the inductive algorithm introduced in [22]. The paper [1] puts forward the slip-differential algorithm, improving the precision of causal reasoning. The paper [15] gives rule extraction with respect to multi-result factors, which connects to multi-label learning theory [5]. The paper [2] presents a reasonable statement on logistic regression based on fuzzy sets and factor space theory. The paper [14] introduces the historical background of factor space and its relationship with formal concept analysis [6]. Many theoretical papers about factor space can be found in the references [3, 4, 7-12, 16, 19-21, 23, 24]. All this lays a complete foundation for the unified depiction of causal induction and reasoning in artificial intelligence. However, in the face of big data, the number of factors to be processed by causal analysis is surprisingly large. In this paper we discuss how to simplify and merge this large number of conditional factors.

The idea of [22] is that the factor with the strongest influence on the result factor is used first. Using it, we obtain a causal rule and delete some data. Repeating the process, when all the data have been deleted, all unused conditional factors are reduced. This reduction method is determined solely by the deterministic degrees of the conditional factors with respect to the result factor. This paper supplements that idea of reduction: not only do we consider the influence of the conditional factors on the result factor, but the relationships between the conditional factors themselves are also taken into consideration. The deterministic degrees of a conditional factor with respect to the other conditional factors should be considered.
The conditional factors are reduced or merged according to their degrees of mutual determination, and the result is a set of conditional factors that are as unrelated to each other as possible, which is the best condition for causal analysis.

The structure of this paper is as follows: Section 2 introduces the mutual relationship between conditional factors, and Section 3 introduces the reduction algorithm for conditional factors. Section 4 is a short conclusion. This paper presents a mathematical method without real-world applications.

2 Mutual relationship between conditional factors

A factor is a quality root: each factor commands a string of attributes. For example, color is a factor, which commands red, orange, yellow, green, cyan, blue, purple, and so on. Mathematically, a factor is defined as a mapping f : U → X(f). If f is color and U is a group of cars, then X(f) = {red, orange, yellow, green, cyan, blue, purple}, which draws our attention from the group of cars to their colors. X(f) is called the state space of the factor f. When the states are described by natural-language words, they are called qualitative states; of course, a factor can also have a quantitative state space, in which case it reduces to a variable. The factor is thus a generalization of the variable. A factor f is regular if there are at least two objects u and v such that f(u) ≠ f(v).

Considering a set of basic factors F* = {f1, ..., fn}, we can define a synthetic factor from any subset {f(1), ..., f(k)} of F* with the state space X = X(f(1)) × ... × X(f(k)) (× stands for Cartesian product). Denote the synthetic factor as f = f(1) ∪ ... ∪ f(k). It is easy to prove that P(F*) = (P(F*), ∪, ∩, c) forms a factorial Boolean algebra, where the operations ∪ and ∩ are called the synthesis and separation of factors, respectively. Denote XF* = {X(f)} (f ∈ P(F*)); then φ = (U, XF*) is called the factor space defined on U.

A factor f defines an equivalence relation ∼ on the domain U: for any u, v ∈ U, u ∼ v if and only if f(u) = f(v).
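As a concrete illustration of these definitions, the following Python sketch (all names here are ours, not from the paper) models factors as plain functions on a small universe of cars; the synthetic factor takes its states in the Cartesian product of the component state spaces, and each factor induces the equivalence relation u ∼ v iff f(u) = f(v):

```python
# Illustrative sketch (names are ours): a factor is a mapping f: U -> X(f).

def state_space(U, f):
    """X(f) restricted to U: the set of states the factor actually takes."""
    return {f(u) for u in U}

def synthesize(f, h):
    """Synthetic factor f ∪ h: its states live in X(f) × X(h)."""
    return lambda u: (f(u), h(u))

def equivalence_classes(U, f):
    """Classes of the relation u ~ v iff f(u) = f(v), in order of first appearance."""
    classes = {}
    for u in U:
        classes.setdefault(f(u), []).append(u)
    return list(classes.values())

# A small universe of cars described by "color-size" strings.
U = ["red-big", "red-small", "blue-big"]
color = lambda u: u.split("-")[0]
size = lambda u: u.split("-")[1]

print(state_space(U, color))                    # a two-element set of colors
print(equivalence_classes(U, color))            # [['red-big', 'red-small'], ['blue-big']]
print(len(state_space(U, synthesize(color, size))))  # 3 combined states
```

Note that the synthetic factor separates the two red cars that color alone cannot distinguish, which is exactly why synthesis yields a more specific division.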
Denote the subclass of U containing u as [u]f = {v ∈ U | f(v) = f(u)}. We call H(f, U) = {[u]f | u ∈ U} the division of U by f. We say that f is more specific than h, denoted H(f, U) > H(h, U), if for any u there is a v in U such that [v]f ⊆ [u]h, and for any u there is a v in U such that [u]f ⊆ [v]h. It is obvious that H(f, U) > H(h, U) if and only if H(f ∪ h, U) = H(f, U); in this case, for any a ∈ X(h) there are a1, ..., at ∈ X(f) such that [a]h = [a1]f + ... + [at]f. We say that f and h are equivalent if H(f, U) > H(h, U) and H(h, U) > H(f, U).

Suppose that the numbers of subclasses in the divisions of f and h are s and t respectively, with s, t > 1. If H(f ∪ h, U) is the roughest common division more specific than both H(f, U) and H(h, U), then we say that f and h are independent in division. In this case every class of f meets every class of h: for any b ∈ X(h), with a1, ..., as ∈ X(f) the s states of f, [b]h = ([b]h ∩ [a1]f) + ... + ([b]h ∩ [as]f) with all terms non-empty; and for any a ∈ X(f), with b1, ..., bt ∈ X(h) the t states of h, [a]f = ([a]f ∩ [b1]h) + ... + ([a]f ∩ [bt]h) with all terms non-empty.

Given a factor space φ = (U, XF*), select f1, ..., fk and g from XF*, called a set of conditional factors and a result factor respectively, and extract m objects from U to form a sample domain U′; we then obtain the combined state data of these objects with respect to the k + 1 factors. Causal analysis aims to extract causal rules from the conditions to the result based on the sample distribution over U′. One of the key concepts is the deterministic degree of a factor fi with respect to g.

Definition 1. (Wang 2015) If there is an object u ∈ U′ such that [u]fi ⊆ [u]g, then we say that [u]fi is a deterministic class of fi with respect to g. The ratio d of the number of objects in all deterministic classes of fi with respect to g to the number of objects in U′ is called the determination degree of fi with respect to g. When fi(u) = fi(v), we have [u]fi = [v]fi.
To avoid repetition, denote [u]fi = [v]fi = [a]fi; then we have

d(fi, g) = Σ { |[a]fi| : [a]fi is a deterministic class of fi with respect to g } / m    (1)

where |A| stands for the number of elements in A.

In this section, we consider the deterministic degree d(f, h) of a conditional factor f with respect to another conditional factor h. The whole theory is applied on a sample U′ ⊆ U.

Theorem 2. Let f, h be two conditional factors on the sample U′. Factor f is more specific than h on U′ if and only if d(f, h) = 1.

Proof: Suppose that f is more specific than h on U′. For any u ∈ U′ there is v ∈ U′ such that [v]f ⊆ [u]h, which means that [v]f is a deterministic class of f with respect to h. Therefore all elements of U′ are covered by deterministic classes of f with respect to h, and hence d(f, h) = 1. Conversely, suppose that d(f, h) = 1. For any a ∈ X(h), let [a] be the subclass that has state a under h; there must be an element u ∈ U′ such that h(u) = a, and then [a] = [u]h. Since d(f, h) = 1, we have [u]f ⊆ [u]h, i.e., [u]f ⊆ [a]. For any a ∈ X(f), let [a] be the subclass that has state a under f; there must be an element u ∈ U′ such that f(u) = a. Since d(f, h) = 1, we have [u]f ⊆ [u]h, i.e., [a] ⊆ [u]h. Therefore f is more specific than h. □

Theorem 3. If f is more specific than h on U′, and the two factors h and f are not equivalent, then d(h, f) < 1.

Proof: Suppose that d(h, f) = 1. According to Theorem 2, h is more specific than f, and then f and h are equivalent on U′. This is a contradiction. □

Theorem 4. If f is more specific than h on U′, then d(f, g) ≥ d(h, g).

Proof: Suppose that [a]h is a deterministic class of h with respect to g (a ∈ X(h)). Since f is more specific than h on U′, we have H(f ∪ h, U′) = H(f, U′), and then [a]h = [a1]f + ... + [at]f, where [a1]f, ..., [at]f are all deterministic classes of f with respect to g on U′. It is then obvious that d(f, g) ≥ d(h, g). □

Theorem 5.
If f and h are two regular factors mutually independent in division on U′, then d(f, h) = d(h, f) = 0.

Proof: Since f and h are mutually independent in division on U′, every class of f meets every class of h: for any b ∈ X(h), [b]h = ([b]h ∩ [a1]f) + ... + ([b]h ∩ [as]f) with all terms non-empty, and for any a ∈ X(f), [a]f = ([a]f ∩ [b1]h) + ... + ([a]f ∩ [bt]h) with all terms non-empty. Because s, t > 1, this ensures that [a]f \ [b]h ≠ ∅ and [b]h \ [a]f ≠ ∅ for any a ∈ X(f) and b ∈ X(h), so no class of either factor is contained in a class of the other. Hence d(f, h) = d(h, f) = 0. □

There are three kinds of relationship between two conditional factors:

1. d(f, h) is rather large and d(h, f) is rather small;
2. d(f, h) and d(h, f) are both rather large;
3. d(f, h) and d(h, f) are both rather small.

According to the statements above: in case 1, if d(f, g) is large, then we do not need the factor h once f takes part with respect to the result g; in case 2, the factors f and h are closely related to each other, so they are not suitable as conditional factors simultaneously and a reduction is needed; in case 3, the factors f and h are rather independent, so neither needs to be deleted provided they have an important influence on the result factor.
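The determination degree of Definition 1, applied here between two conditional factors, can be computed directly from Eq. (1). The following minimal Python sketch (function names are ours) also exhibits the behavior claimed in Theorems 2 and 3 on a toy sample where f refines h:

```python
# Sketch of Eq. (1) (names are ours): d(f, h) is the fraction of the sample
# covered by classes of f that lie inside some class of h.

def partition(U, f):
    """H(f, U): classes [a]_f keyed by the state a = f(u)."""
    part = {}
    for u in U:
        part.setdefault(f(u), set()).add(u)
    return part

def det_degree(U, f, h):
    """Determination degree d(f, h) on the sample U, per Eq. (1)."""
    h_classes = partition(U, h).values()
    covered = sum(len(cls) for cls in partition(U, f).values()
                  if any(cls <= hc for hc in h_classes))  # [a]_f deterministic w.r.t. h
    return covered / len(U)

# f is more specific than h below, so d(f, h) = 1 (Theorem 2)
# while d(h, f) < 1 (Theorem 3).
U = [1, 2, 3, 4]
f = lambda u: u        # four singleton classes
h = lambda u: u % 2    # two classes {1, 3} and {2, 4}
assert det_degree(U, f, h) == 1.0
assert det_degree(U, h, f) == 0.0
```

Evaluating det_degree on every ordered pair of conditional factors yields exactly the matrix of mutual deterministic degrees used in Step 2 of the algorithm in the next section.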
Table 1: Conditional factors

factor  name       state space
f1      Age        X(f1) = {Old, Middle, Young}
f2      Income     X(f2) = {High, Average, Low}
f3      Student    X(f3) = {Y, N}
f4      Credit     X(f4) = {Very-good, Good, Un-recorded}
f5      Education  X(f5) = {Primary, Junior, University, Graduated}
f6      Civil      X(f6) = {Civil, Private}
f7      Housing    X(f7) = {Rent, Narrow, Mansion}
f8      Car        X(f8) = {Car, Bike}
f9      Health     X(f9) = {Healthy, Sickness}
f10     Residence  X(f10) = {Town, Rural}

Table 2: Causal data

     u1 u2 u3 u4 u5 u6 u7 u8 u9 u10 u11 u12 u13 u14
f1   O  O  M  O  O  M  M  M  M  Y   Y   Y   Y   Y
f2   L  A  H  A  H  L  A  H  L  A   H   A   L   L
f3   N  N  N  N  N  N  N  N  N  Y   N   Y   Y   Y
f4   U  U  V  G  V  U  G  V  G  V   V   V   G   U
f5   P  J  G  U  U  J  U  J  P  G   U   U   U   G
f6   C  C  C  C  C  P  C  P  P  P   C   P   P   P
f7   N  N  M  M  M  N  N  M  N  N   M   R   R   R
f8   B  B  C  C  B  B  C  C  B  C   C   B   B   B
f9   H  S  H  H  S  H  S  S  H  H   H   S   H   S
f10  R  T  T  T  T  R  T  R  R  T   T   T   T   T
g    0  1  2  1  2  0  1  2  0  2   2   2   1   0

3 Reduction of conditional factors

Causal analysis aims to extract rules from the conditional factors to the result factor; the more independent the conditional factors, the better. The reduction of conditional factors obeys the following principle: for a pair of factors that both have high deterministic degrees with respect to the result factor g, delete one of them unless their mutual deterministic degrees are both small (i.e., case 3). This principle aims to take into causal analysis conditional factors that are as independent as possible.

Algorithm
Step 1. Rank the conditional factors according to their deterministic degrees with respect to the result factor g, from high to low.
Step 2. Write the matrix of deterministic degrees between conditional factors.
Step 3. For any i and j, if (d(fi, fj) > 0.5 or d(fj, fi) > 0.5) and d(fi, fj) > d(fj, fi), then delete fj.

The remaining factor sequence is the conditional sequence required by the reduction. If causal analysis is performed according to this sequence, it terminates when the causal tree is formed, and all unused conditional factors are deleted.

Example.
In customer analysis, the goal is to open the market. The utility factor is the purchasing power of the customer, and the form factors are the pieces of information about the customers. We take the form factors as the conditional factors and the utility factor as the result factor of the causal analysis. The conditional factors considered are listed in Table 1. Selecting 14 customers to form a sample universe U′ = {u1, u2, u3, u4, u5, u6, u7, u8, u9, u10, u11, u12, u13, u14}, Table 2 presents the causal data.

Table 3: Frequencies of the factors (deterministic degrees with respect to g, × 14)

f1  f2  f3  f4  f5  f6  f7  f8  f9  f10
0   4   0   6   2   0   0   0   0   0

Table 4: Matrix of mutual deterministic degrees between conditional factors (× 14; "-" marks the diagonal)

      f4  f2  f5  f1  f3  f6  f7  f8  f9  f10
f4    -   0   0   0   0   0   0   4   0   0
f2    4   -   0   0   5   0   4   9   0   5
f5    2   2   -   0   2   0   2   4   4   8
f1    0   0   0   -   9   4   0   0   0   5
f3    0   2   0   2   -   4   0   0   0   4
f6    0   0   0   0   0   -   0   0   0   0
f7    0   0   0   3   3   3   -   3   0   0
f8    0   0   0   0   0   0   0   -   0   0
f9    0   0   0   0   0   0   0   0   -   0
f10   0   0   0   0   4   0   0   0   0   -

The steps of the reduction of conditional factors are as follows:

Step 1. Reorder the conditional factors according to their deterministic degrees with respect to g. Remember that m = 14; for simplicity, we list all frequencies multiplied by 14. The results are given in Table 3. The new order is: f4, f2, f5, f1, f3, f6, f7, f8, f9, f10.

Step 2. The matrix of mutual deterministic degrees between conditional factors is listed in Table 4.

Step 3. For i = 2, j = 8: d(f2, f8) = 9/14 > 0.5 and d(f2, f8) > d(f8, f2) = 0, so delete f8. For i = 5, j = 10: d(f5, f10) = 8/14 > 0.5 and d(f5, f10) > d(f10, f5) = 0, so delete f10. For i = 1, j = 3: d(f1, f3) = 9/14 > 0.5 and d(f1, f3) > d(f3, f1) = 2/14, so delete f3.

After deleting these three conditional factors, the new causal analysis data are presented in Table 5. According to the causal analysis of [22], rule extraction by f4 gives:

Rule 1: If Credit is very good, then the purchasing power is #2.

Taking the customers with very good credit out of U′, Table 6 presents the reduced causal data.
Rule extraction by f4 and f2 gives:

Rule 2: If Credit is unrecorded and Income is low, then the purchasing power is #0.
Rule 3: If Credit is unrecorded and Income is average, then the purchasing power is #1.
Rule 4: If Credit is good and Income is average, then the purchasing power is #1.

Table 5: New causal data

     u1 u2 u3 u4 u5 u6 u7 u8 u9 u10 u11 u12 u13 u14
f4   U  U  V  G  V  U  G  V  G  V   V   V   G   U
f2   L  A  H  A  H  L  A  H  L  A   H   A   L   L
f5   P  J  G  U  U  J  U  J  P  G   U   U   U   G
f1   O  O  M  O  O  M  M  M  M  Y   Y   Y   Y   Y
f6   C  C  C  C  C  P  C  P  P  P   C   P   P   P
f7   N  N  M  M  M  N  N  M  N  N   M   R   R   R
f9   H  S  H  H  S  H  S  S  H  H   H   S   H   S
g    0  1  2  1  2  0  1  2  0  2   2   2   1   0

Table 6: New causal data (8 customers)

     u1 u2 u4 u6 u7 u9 u13 u14
f4   U  U  G  U  G  G  G   U
f2   L  A  A  L  A  L  L   L
f5   P  J  U  J  U  P  U   G
f1   O  O  O  M  M  M  Y   Y
f6   C  C  C  P  C  P  P   P
f7   N  N  M  N  N  N  R   R
f9   H  S  H  H  S  H  H   S
g    0  1  1  0  1  0  1   0

Taking the customers that have already contributed to rule extraction out of U′, the remaining causal data are given in Table 7.

Table 7: New causal data (2 customers)

     u9 u13
f4   G  G
f2   L  L
f5   P  U
f1   M  Y
f6   P  P
f7   N  R
f9   H  H
g    0  1

Rule extraction by f4, f2, and f5 gives:

Rule 5: If Credit is good, Income is low, and Education is University, then the purchasing power is #1.
Rule 6: If Credit is good, Income is low, and Education is Primary, then the purchasing power is #0.

Now the universe U′ is empty and the rule extraction has finished. Only three factors were used; all the others have been deleted. What is the relationship among these three factors?

d(f4, f2) = 0 < 0.5, d(f2, f4) = 4/14 < 0.5;
d(f4, f5) = 0 < 0.5, d(f5, f4) = 2/14 < 0.5;
d(f2, f5) = 0 < 0.5, d(f5, f2) = 2/14 < 0.5.

All the mutual deterministic degrees between them are small, which satisfies the requirement of causal analysis.

4 Conclusion

In the face of the challenge of big data, the number of conditional factors in causal analysis is very large, so the reduction of conditional factors is an important task.
The proposed reduction algorithm can reasonably reduce the number of conditional factors. Compared with previous reduction methods, we take into consideration not only the influence of the conditional factors on the result factor but also the relationships among the conditional factors themselves, by means of the mutual deterministic degrees between conditional factors. Such a reduction ensures that the conditional factors are selected as independently as possible; causal analysis requires such a selection, and this improvement is of great importance in practice.

Acknowledgement

The authors specially thank Professor P. Z. Wang for his guidance and modifications. This study was partially supported by grants (Nos. 61350003, 11401284, 70621001, 70531040) from the Natural Science Foundation of China, and by grant No. L2014133 from the Department of Education of Liaoning Province.

Bibliography

[1] Bao, Y.K.; Ru, H.Y.; Jin, S.J. (2014); A new algorithm of knowledge mining in factor space, Journal of Liaoning Technical University (Natural Science), 33(8), 1141-1144, 2014.
[2] Cheng, Q.F.; Wang, T.T.; Guo, S.C.; Zhang, D.Y.; Jing, K.; Feng, L.; Wang, P.Z. (2017); The Logistic Regression from the Viewpoint of the Factor Space Theory, International Journal of Computers Communications & Control, 12(4), 492-502, 2017.
[3] Dzitac, I. (2015); The Fuzzification of Classical Structures: A General View, International Journal of Computers Communications & Control, 10(6), 772-788, 2015.
[4] Dzitac, I.; Filip, F.G.; Manolescu, M.J. (2017); Fuzzy Logic Is Not Fuzzy: World-renowned Computer Scientist Lotfi A. Zadeh, International Journal of Computers Communications & Control, 12(6), 748-789, 2017.
[5] Furnkranz, J.; Hullermeier, E.; Mencia, E.L.; Brinker, K. (2008); Multilabel classification via calibrated label ranking, Machine Learning, 73(2), 133-153, 2008.
[6] Ganter, B.; Wille, R.
(1996); Formal concept analysis, Wissenschaftliche Zeitschrift - Technische Universität Dresden, 45, 8-13, 1996.
[7] Kandel, A.; Peng, X.T.; Cao, Z.Q.; Wang, P.Z. (1990); Representation of concepts by factor spaces, Cybernetics and Systems: An International Journal, 21(1), 43-57, 1990.
[8] Li, H.X.; Wang, P.Z.; Yen, V.C. (1998); Factor spaces theory and its applications to fuzzy information processing (I): The basics of factor spaces, Fuzzy Sets and Systems, 95(2), 147-160, 1998.
[9] Li, H.X.; Yen, V.C.; Lee, E.S. (2000); Factor space theory in fuzzy information processing: Composition of states of factors and multifactorial decision making, Computers & Mathematics with Applications, 39(1), 245-265, 2000.
[10] Li, H.X.; Yen, V.C.; Lee, E.S. (2000); Models of neurons based on factor space, Computers & Mathematics with Applications, 39(12), 91-100, 2000.
[11] Li, H.X.; Chen, C.P.; Yen, V.C.; Lee, E.S. (2000); Factor spaces theory and its applications to fuzzy information processing: Two kinds of factor space canes, Computers & Mathematics with Applications, 40(6-7), 835-843, 2000.
[12] Li, H.X.; Chen, C.P.; Lee, E.S. (2000); Factor space theory and fuzzy information processing: Fuzzy decision making based on the concepts of feedback extension, Computers & Mathematics with Applications, 40(6-7), 845-864, 2000.
[13] Liu, H.T.; Guo, S.C. (2015); Inference model of causality analysis, Journal of Liaoning Technical University (Natural Science), 34(1), 124-128, 2015.
[14] Liu, H.T.; Dzitac, I.; Guo, S.C. (2018); Reduction of conditional factors in causal analysis, International Journal of Computers Communications & Control, 13(1), 83-98, 2018.
[15] Qu, W.H.; Liu, H.T.; Guo, S.Z. (2017); Multi-target causality analysis in factor space, Fuzzy Systems & Mathematics, 31(6), 74-81, 2017.
[16] Vesselenyi, T.; Dzitac, I.; Dzitac, S.; Vaida, V.
(2008); Surface roughness image analysis using quasi-fractal characteristics and fuzzy clustering methods, International Journal of Computers Communications & Control, 3(3), 304-316, 2008.
[17] Wang, H.D.; Wang, P.Z.; Shi, Y.; Liu, H.T. (2014); Improved factorial analysis algorithm in factor spaces, International Conference on Informatics, 201-204, 2014.
[18] Wang, P.Z.; Sugeno, M. (1982); The factor fields and background structure for fuzzy subsets, Fuzzy Mathematics, 2(2), 45-54, 1982.
[19] Wang, P.Z. (1990); A factor spaces approach to knowledge representation, Fuzzy Sets and Systems, 36(1), 113-124, 1990.
[20] Wang, P.Z.; Zhang, X.H.; Lui, H.C.; Zhang, H.M.; Xu, W. (1995); Mathematical theory of truth-valued flow inference, Fuzzy Sets and Systems, 72(2), 221-238, 1995.
[21] Wang, P.Z.; Jiang, A. (2002); Rules detecting and rules-data mutual enhancement based on factors space theory, International Journal of Information Technology & Decision Making, 1(1), 73-90, 2002.
[22] Wang, P.Z.; Guo, S.C.; Bao, Y.K.; Liu, H.T. (2014); Causality analysis in factor spaces, Journal of Liaoning Technical University (Natural Science), 33(7), 1-6, 2014.
[23] Yuan, X.H.; Wang, P.Z.; Lee, E.S. (1992); Factor space and its algebraic representation theory, Journal of Mathematical Analysis and Applications, 171(1), 256-276, 1992.
[24] Yuan, X.H.; Wang, P.Z.; Lee, E.S. (1994); Factor Rattans, Category FR(Y), and Factor Space, Journal of Mathematical Analysis and Applications, 186(1), 254-264, 1994.