INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
ISSN 1841-9836, 12(4), 577-591, August 2017.

High-speed Train Control System Big Data Analysis Based on
Fuzzy RDF Model and Uncertain Reasoning

D. Zhang

Dalin Zhang
National Research Center of Railway Safety Assessment
Beijing Jiaotong University
Beijing, 100044, China
dalin@bjtu.edu.cn

Abstract: The Chinese high-speed train control system combines computer, communication and
control technologies. Its events are diverse and include sensor data streams, GPS signals,
GSM-R transmission data, real-time video monitoring data, train control software data, etc.
These data have the typical characteristics of big data. If they are well applied, they will be
of great help to operations, maintenance, safety, passenger services, etc. This paper presents
an efficient analysis method for high-speed train control system big data, based on the fuzzy
RDF model and uncertain reasoning. We used the proposed method to analyze data from the
high-speed train control system. The experimental results show that the method is efficient and
scalable when analyzing context-sensitive big data of different structures and types from the
high-speed train control system.
Keywords: high-speed train control system, fuzzy RDF, D-S theory, uncertain reasoning.

1 Introduction

The railway transportation system is a complex system with many interacting factors. Relying on
traditional theory and technical methods, it can no longer meet the requirements of high transport
capacity, high efficiency, high safety and high quality of service. Intelligent rail transport
systems have become a trend, such as the European rail traffic management system, the Japanese
Cyber Rail, the US IRS (Intelligent Railway System), etc. In recent years, the IBM Smarter
Railroad [3], the CISCO Smart + Connected Railway [11] and the SIEMENS Intelligent Train [20]
have promoted the development of the intelligent railway.

The high-speed train control system uses the information provided by the ground, the distance to
the target and the route data to generate the control curve automatically on the train control
equipment. China issued the standard of the Chinese train control system (CTCS) in 2008, which
has 4 levels in total [17]. CTCS-3 uses GSM-R wireless communication for bidirectional
transmission of information between the train and the ground. It uses the radio block center
(RBC) to generate traffic permits, the track circuit to check train occupancy, and the
transponder to position the train. At the same time, it retains the functions of CTCS-2.

The high-speed train control system combines computer, communication and control technologies.
Its events are diverse and include sensor-generated data streams, GPS signals, GSM-R transmission
data, real-time video monitoring data, train control software data, etc. These data have the
typical characteristics of big data: they are massive, distributed, complex and context-sensitive.
If they are well applied, they will be of great help to high-speed rail operations,
maintenance, safety, passenger services, etc.

The dynamic monitoring system (DMS) is a comprehensive system for dynamically monitoring the
status of the equipment of the train control system. It can analyze the data, guide the
maintenance, and provide a basis for decisions. DMS consists of vehicle information collection,
a ground data center and a query terminal. Many complex data analyses are beyond what DMS can
currently support. How to extract useful knowledge from the massive data of DMS effectively is
an urgent need, because the mined knowledge can help the maintenance department achieve rapid
and accurate fault diagnosis and can provide reasonable decision support for preventive
maintenance.

Copyright © 2006-2017 by CCC Publications

For big data, current distributed database technologies such as NoSQL and Hadoop cope well with
storage and processing. If semantic technology is used for the analysis, the main challenge lies
in distributed query and reasoning. For data streams such as streaming media and sensor data,
current distributed computing technology can also meet the demand; the challenge for semantic
technology is stream reasoning, that is, reasoning over incomplete information while the data
keep arriving. Distributed reasoning can use Hadoop/Storm, and stream reasoning can use
continuous (dynamic) queries [2]. The DMS holds many types of data, their structure is very
complex, and they change very quickly. It therefore needs not only high performance but also
data management and real-time analysis. There is currently no good solution that meets all
three requirements.

In view of the above problems, this paper proposes a distributed real-time context-aware data
processing method based on fuzzy linked data and uncertain reasoning. The method extends
the RDF model to a fuzzy RDF model and proposes real-time distributed context reasoning based
on D-S theory.

2 CTCS-3 data context modeling

2.1 CTCS-3 data

The system structure of CTCS-3 is shown in Figure 1. The real-time data of CTCS-3 can be
summarized as follows:

(1) RBC. Generates the traffic permit according to the track circuit, interlocking and other
information; accepts vehicle equipment data through GSM-R.

(2) GSM-R network. Provides two-way communication between vehicle equipment and ground
equipment.

(3) Transponder. Sends positioning information, level conversion information, line parameters
and temporary speed limits to the vehicle equipment.

(4) Secure computer. Generates the dynamic speed curve and monitors the safe operation
of the train.

(5) Track circuit. Checks train occupancy and sends traffic permit information.

(6) CTC system. Provides centralized control of the station signal equipment.

2.2 Linked data

At present, more and more big data applications are introducing semantic technology, which
makes the description of data more standardized and rich in machine-comprehensible semantics.
Rich semantic links make the system more open and interoperable and take big data analysis
down to the knowledge level, which requires big data technology to provide a wealth of related
functions and a simple reasoning ability. Big data technology can effectively manage and use
web-scale unstructured information in a distributed environment, and linked data brings rich
formalized semantics, which makes it a tool for cross-domain integration and intelligent
analysis.



Figure 1: CTCS-3 system structure

Resource description framework (RDF). RDF [13] is a generic model for describing resources as
triples 〈subject, predicate, object〉. The subject is a resource identified by a uniform
resource identifier (URI), such as a DOI or an ISBN, or a blank node with no namespace; the
object can be a resource with a URI, a blank node, or a string value; and the predicate
represents the relationship between the subject and the object.
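
As a minimal illustration of the triple structure, the following Python sketch stores a few
statements as plain tuples and matches them against a pattern. The `ex:` identifiers and the
`match` helper are hypothetical and chosen only for this example; they come neither from the
CTCS-3 data nor from any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # URI or blank-node label
    predicate: str  # URI naming the relationship
    obj: str        # URI, blank-node label, or literal value

# Hypothetical identifiers, used only for illustration.
graph = [
    Triple("ex:G115", "rdf:type", "ex:Train"),
    Triple("ex:G115", "ex:usesControlSystem", "ex:CTCS-3"),
    Triple("ex:CTCS-3", "ex:communicatesVia", "ex:GSM-R"),
]

def match(graph, s=None, p=None, o=None):
    """Return the triples that agree with every non-None field of the pattern."""
    return [t for t in graph
            if (s is None or t.subject == s)
            and (p is None or t.predicate == p)
            and (o is None or t.obj == o)]
```

Calling `match(graph, s="ex:G115")` returns the two statements whose subject is the train.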

More and more big data applications are described and encoded using the RDF model. Metadata
and ontologies make the data semantically recognizable by machines, enrich the semantics of
big data, and give the data better interoperability. Since linked data [7] is semantically
described, it is no longer an isolated piece of information. Combining linked data with big
data mining provides a powerful tool for big data analysis that implements semantic-based
integration of the data. When linked data technology is used in a big data application, we
call it a linked big data application.

Linked data. Linked data is a form of data application that uses URIs as data identifiers, the
RDF triple structure as the data model, and HTTP as the access protocol. It is a simplified
realization of the semantic web, intended to build a web of data. Linked data builds on
existing web technology (HTTP, URL and HTML) and further standardizes the web specifications
through a set of basic principles.

According to the linked data principles, the elements of a triple should be encoded in RDF
as far as possible. The subject in particular must be accessible over open HTTP, and the RDF
description should contain links to other data. A datum in linked data is not an independent,
context-free abstract datum but a clear knowledge unit with a URI identifier and an RDF
description (including cross-domain links), which is the basic semantic unit managed in the
web. Any thing, person, institution, place, event or concept can be described as linked data,
so the development from linked data to big data is inevitable.

2.3 Fuzzy RDF model

The purpose of context modeling is to describe the data and its environment, which plays an
extremely important role in building a context-aware system. At present, context modeling
methods include the key-value model, the tag configuration model, the graph model, the
object-oriented model, the logic-based model and the ontology model. The ontology model has
the advantages of knowledge sharing, logical reasoning, easy knowledge reuse, and so on. The
extended RDF modeling method used in this paper is also a kind of ontology model.

This study used Drupal [12] to convert CTCS-3 data into linked data. The RDF data model
is the basis of linked data, but specific applications inevitably use some domain ontologies,
such as FOAF, Dublin Core, SKOS, OWL, SIOC and so on. Drupal also supports the import
of external ontologies and can define the mapping between these external models and local
content models. Drupal provides a way of mapping content types, fields and nodes to classes,
attributes and objects in the RDF triple model, namely subject, predicate and object. In the
field of CTCS-3 data analysis, some concepts are not deterministic, such as speed, security,
etc., so the RDF model needs to be extended. The context representation in this paper is based
on the fuzzy RDF model, which is optimized for CTCS-3 data processing.

Fuzzy RDF model (FRDF). The FRDF model FR is expressed as FR = 〈fs, fp, fo〉, where fs is
the fuzzy concept, fp is the set of fuzzy attributes of the subject, and fo is a set of fuzzy
concept instances. The attribute set is defined as fp = {p1, p2, · · · , pn}, where pi is an
attribute that represents the relationship between the subject and the object; the instance
set is fo = {o1, o2, · · · , on}, where oi represents a fuzzy instance. The fuzzy degree
function of the FRDF model is µ : fo → [0, 1]. If µ(oi) = 1, we call oi a concrete instance.

For example, "train No. G115" is a fuzzy concept and, at the same time, an instance of
another concept, "train". The statement "the speed of train No. G115 is very fast" is an
example of CTCS-3 data that can be described by the FRDF model. The terms "No. G115", "train",
"speed", "very" and "fast" can all be described as instances in the FRDF model. "Train" is a
fuzzy concept with the fuzzy attribute "driving speed", whose value can be expressed as a
group of instances {S (slow), M (middle), F (fast)}.
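
The definition and the example above can be sketched in Python as follows. This is a minimal
sketch rather than the data structure used in the experiments: the class name `FuzzyTriple`,
the attribute label `drivingSpeed` and the fuzzy degrees are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class FuzzyTriple:
    """One FRDF statement <fs, fp, fo> with a fuzzy degree mu for each object instance."""
    fs: str                                  # fuzzy concept (subject)
    fp: str                                  # fuzzy attribute (predicate)
    fo: dict = field(default_factory=dict)   # object instance -> mu in [0, 1]

    def __post_init__(self):
        # mu must map every instance into [0, 1].
        for inst, mu in self.fo.items():
            if not 0.0 <= mu <= 1.0:
                raise ValueError(f"fuzzy degree of {inst!r} must lie in [0, 1]")

    def concrete_instances(self):
        """Instances whose fuzzy degree is exactly 1."""
        return [o for o, mu in self.fo.items() if mu == 1.0]

# Hypothetical degrees for "the speed of train No. G115 is very fast".
speed = FuzzyTriple("G115", "drivingSpeed", {"S": 0.0, "M": 0.2, "F": 0.9})
```

With these assumed degrees no instance is concrete, and F (fast) carries the largest degree.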

Fuzzy contexts. According to the FRDF model, a fuzzy context FC is defined as FC =
〈FS, FP, FO〉, where FS is a set of fuzzy concepts, FP is a set of fuzzy attributes that
represent the relationship between the subject set FS and the object set FO, and FO is a set
of fuzzy objects.

2.4 FRDF model based on D-S theory

The D-S theory [10] is a generalization of the Bayesian reasoning method. Bayesian reasoning
relies mainly on conditional probability and requires prior probabilities. The D-S theory does
not require prior probabilities, can describe uncertainty, and is widely used to deal with
uncertain data. It organizes the propositions that people are interested in into recognition
frames.

Basic probability assignment (BPA). For a recognition frame Θ, a function
$m : 2^{\Theta} \rightarrow [0, 1]$ is called a BPA function if it satisfies the following
conditions: (1) $m(\emptyset) = 0$; (2) $\sum_{A \subseteq \Theta} m(A) = 1$. For any subset A
of Θ, m(A) is the basic probability assignment of A, which indicates the degree of belief
in A. The BPA is also called the mass function.

Belief function. To express the degree of trust in a proposition, D-S theory introduces
the concept of the belief function Bel, which is related to the BPA by
$\mathrm{Bel} : 2^{\Theta} \rightarrow [0, 1]$, $\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B)$
for all $A \subseteq \Theta$.
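
The two definitions can be checked numerically with a short Python sketch. Mass functions are
represented here as dictionaries from `frozenset` to mass; this representation, the helper
names and the example masses are assumptions for illustration only.

```python
def is_bpa(m, frame, tol=1e-9):
    """Check m(emptyset) = 0 and that the masses over subsets of the frame sum to 1."""
    if m.get(frozenset(), 0.0) != 0.0:
        return False
    if any(not a <= frame or v < 0 for a, v in m.items()):
        return False
    return abs(sum(m.values()) - 1.0) <= tol

def bel(m, a):
    """Bel(A): total mass of the focal elements contained in A."""
    return sum(v for b, v in m.items() if b <= a)

# Hypothetical masses over the speed frame {S, M, F}.
frame = frozenset({"S", "M", "F"})
m = {frozenset({"F"}): 0.6, frozenset({"M", "F"}): 0.3, frame: 0.1}
```

For these masses, the belief in {M, F} collects the mass of {F} and of {M, F}, while the
belief in {S} is zero because no focal element lies inside {S}.
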
D-S rule. For n mass functions m1, m2, · · · , mn on the same recognition frame Θ and any
A ⊆ Θ with A = A1 ∩ A2 ∩ · · · ∩ An, the combination of the n mass functions is as follows:

$$m(A) = (m_1 \oplus m_2 \oplus \cdots \oplus m_n)(A) = \frac{1}{K} \sum_{A_1 \cap \cdots \cap A_n = A} m_1(A_1) \cdot m_2(A_2) \cdots m_n(A_n) \tag{1}$$

$$K = \sum_{A_1 \cap \cdots \cap A_n \neq \emptyset} m_1(A_1) \cdot m_2(A_2) \cdots m_n(A_n) \tag{2}$$

K is called the normalization factor. The remainder 1 − K is the mass that falls on empty
intersections, which reflects the degree of conflict among the evidence. When K = 1, the
evidence does not conflict; when 0 < K < 1, the conflict is incomplete; when K = 0, the
evidence is in full conflict and the rule cannot be applied.
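
Equations (1) and (2) can be sketched in a few lines of Python. The dictionary representation
of mass functions (keys are `frozenset` objects), the function name `combine` and the example
masses are assumptions for illustration, not the implementation used in the experiments.

```python
from itertools import product

def combine(*masses):
    """Combine mass functions (dicts: frozenset -> mass) following equations (1) and (2)."""
    raw = {}
    for combo in product(*(m.items() for m in masses)):
        inter = frozenset.intersection(*(a for a, _ in combo))
        weight = 1.0
        for _, v in combo:
            weight *= v
        raw[inter] = raw.get(inter, 0.0) + weight
    raw.pop(frozenset(), None)   # discard the mass that fell on empty intersections
    k = sum(raw.values())        # K of equation (2)
    if k == 0.0:
        raise ValueError("K = 0: no non-empty intersection carries mass")
    return {a: v / k for a, v in raw.items()}, k

# Hypothetical evidence from two sources about the speed of a train.
theta = frozenset({"S", "M", "F"})
m1 = {frozenset({"F"}): 0.6, theta: 0.4}
m2 = {frozenset({"F"}): 0.5, frozenset({"M"}): 0.3, theta: 0.2}
fused, k = combine(m1, m2)
```

In this example the only pair with an empty intersection is {F} from the first source and {M}
from the second, with product 0.6 × 0.3 = 0.18, so K = 0.82 and the remaining masses are
divided by 0.82.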



D-S theory by itself does not reflect the structure of contextual information or the
relationships within it well, so its application is limited. To exploit its strengths, it must
be paired with a valid context information representation. The D-S theory uses the evidence
recognition frame E and the conclusion recognition frame Θ to divide the information. The
FRDF model divides the context into two layers: the context of low-level acquisition and the
context of high-level reasoning. The two divisions combine naturally to establish the FRDF
model based on the D-S theory. The FRDF model is highly extensible and can easily be expanded
based on the D-S theory.

In the FRDF model based on the D-S theory (FRDFDS), fs is still the fuzzy concept. fp is the
set of attributes of the subject, which has been expanded to contain some special evidence
attributes and conclusion attributes, such as hasConclusions, hasEvidences, hasBPAValue and
so on. fo is the set of concept instances, which has also been expanded to contain some special
evidence and conclusion instances.

At present, five attributes are added to the basic FRDF model: the attribute hasConclusions,
the attribute hasEvidences, the BPA numeric attribute hasBPAValue, the Bel value attribute
hasBel, and the time attribute hasTimetag. In the FRDFDS model, all data are represented in a
unified way, which is the advantage of linked data, and each datum can be considered a node of
a FRDFDS model instance. The hasTimetag attribute is used to get the time context information.
The hasConclusions attribute may map to a single instance or to multiple instances. Likewise,
the hasEvidences attribute corresponds to the collection of all evidence instances. In a
pervasive computing environment, one piece of conclusion context information may serve as
evidence context information for another conclusion. To adapt to the dynamics, diversity and
uncertainty of the context, we add the data attributes hasBel and hasBPAValue to conclusions
and evidence to describe the probability of their values. The FRDFDS model above is universal:
just as in the basic RDF model, relevant attributes and instances can be added to express the
uncertainty information of the context.
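
One way to picture a FRDFDS node is a record that carries the five added attributes and links
evidence to conclusions in both directions. The Python sketch below is an assumption made for
illustration: the class, the `link` helper and the node names are hypothetical, and the Drupal
modules described later may store these attributes differently.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity comparison, because nodes reference each other
class FRDFDSNode:
    name: str
    hasTimetag: float = 0.0                              # time of the observation
    hasBPAValue: dict = field(default_factory=dict)      # frozenset of hypotheses -> mass
    hasBel: float = 0.0                                  # belief in this node's value
    hasEvidences: list = field(default_factory=list)     # lower-layer nodes supporting it
    hasConclusions: list = field(default_factory=list)   # higher-layer nodes it supports

def link(evidence, conclusion):
    """Record that `evidence` supports `conclusion`, in both directions."""
    evidence.hasConclusions.append(conclusion)
    conclusion.hasEvidences.append(evidence)

# Hypothetical nodes: a conclusion of one layer can be evidence for the next.
occupied = FRDFDSNode("trackSectionOccupied", hasTimetag=1.0)
unsafe = FRDFDSNode("sectionUnsafe")
link(occupied, unsafe)
```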

Belief structure. Given a recognition frame Θ, an evidence space E, a mapping F :
E → Θ and a basic belief distribution m, where Θ and E are finite sets, the quadruple
D = 〈Θ, E, F, m〉 is called a belief structure. Based on the basic FRDFDS model given in this
paper, we can construct the belief structure D, then create the instances of the uncertain
context information, and finally apply the evidence combination rule step by step in the
reasoning process. Here m is the basic belief distribution, a quantitative evaluation based on
evidence. Evidence from different sources can be combined by accessing the corresponding
values of m.

3 Multi-layer real-time uncertain context reasoning method

Context reasoning is the key to solving the problem of uncertainty reasoning, and it often
depends on the context model. With the FRDFDS model proposed in this paper, three kinds of
reasoning mechanisms can be used in context reasoning [1], namely ontology reasoning, rule
reasoning and evidence theory reasoning. Rule-based reasoning generates new knowledge by
matching existing facts against predefined rules. For example, when the collected sensor
information is "John is currently in the bedroom, the indoor curtains are closed, the light
intensity is dark", one can infer that "John is sleeping". Ontology-based reasoning is like
rule-based reasoning, except that the rules used are defined by the OWL language itself to
obtain knowledge that is implicit in explicit definitions and declarations, such as the
symmetric property SymmetricProperty, the transitive property TransitiveProperty, etc. For
reasoning based on D-S theory, the evidence set and the conclusion set are each expressed by
multiple recognition frames. The combination method provides a hierarchical reasoning model of
evidence, in which uncertainty is transferred layer by layer according to the combination
rules of D-S theory. The hierarchical model can add new evidence on top of an existing
combination, which greatly improves the efficiency of evidence combination.

CTCS-3 is a complex application of the Internet of things, for which logic-based reasoning
methods become increasingly complex. To handle big data, this paper proposes a distributed
multi-layer real-time uncertain context reasoning framework based on the FRDFDS model. In this
framework, the first layer generates the context state from FRDFDS model instances, which are
generated by the complex event processing engine. Since there may be multiple event processing
agents, the FRDFDS instances are distributed. CTCS-3 data are collected in real time, so the
context state is continually updated with new data. The context state of the (i + 1)th layer
can be inferred from the context of the ith layer. The reasoning process combines traditional
rule-based reasoning with uncertainty reasoning based on the D-S theory.

The overall framework proposed in this paper is given in Algorithm 1, which combines
traditional rule-based reasoning with uncertainty reasoning based on the D-S theory.
Algorithm 1 shows that every context can be expressed by a FRDFDS model instance and that the
framework is multi-layer, real-time and adaptable. In Algorithm 1, the fuzzy degree is
described by the function µ. When all reasoning is finished, the framework triggers the
update_context() procedure. The procedures ds_reasoning() and update_context() are presented
in Algorithm 2 and Algorithm 3.

Algorithm 1 Multi-layer real-time uncertain context reasoning framework
Require: Input Context[layers][instances] = Context[level0][instances0]; // layers represents the
    level of context; instances is a set of FRDFDS model instances
Ensure: Output Context[layers][instances]; // the context at time t
initialize Context[layers][instances];
initialize update = false;
for (all layers ∈ N and layers > 0) do
    for (all FRDFDS model instances in Context[layers][instances]) do
        if instance is uncertain then
            call the D-S theory reasoning procedure ds_reasoning(instance);
        else
            call the rule-based reasoning interface;
        end if
    end for
end for
if update = false then
    call update_context(Context[layers][instances]);
    update = true;
end if
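
The control flow of Algorithm 1 can be sketched in Python as follows. The sketch passes the
reasoning procedures in as arguments, which is a choice made here to keep the example
self-contained; the parameter names and the list-of-lists layout of the context are
assumptions, not the interface of the deployed modules.

```python
def reason(context, is_uncertain, ds_reasoning, rule_reasoning, update_context):
    """Run one pass of the multi-layer framework over context[layer] = [instances]."""
    for layer in range(1, len(context)):    # layers > 0, lowest layer first
        for instance in context[layer]:
            if is_uncertain(instance):
                ds_reasoning(instance)      # uncertain instance: D-S theory reasoning
            else:
                rule_reasoning(instance)    # certain instance: rule-based reasoning
    update_context(context)                 # triggered once, after all reasoning
    return context
```

Layer 0 holds the acquired context and is therefore skipped by the loop, and the update
procedure runs a single time at the end of the pass.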

Algorithm 2 gives a complete uncertainty reasoning process based on D-S theory. The whole
process is based on the belief structure, which maps directly onto the FRDFDS model. The
classical evidence combination formula requires all evidence taking part in the combination to
have the same degree of importance. However, in a pervasive computing environment, evidence
obtained from different sources may differ in importance and belief, so the basic credibility
of the evidence needs to be corrected before the evidence is combined, to reflect these
differences. The formula for correcting the belief is as follows:

$$m'_i(A_i) = \begin{cases} \mathrm{Bel}(E)\, m_i(A_i), & A_i \neq \Theta \\ 1 - \sum_{A_j \neq \Theta} m'_i(A_j), & A_i = \Theta \end{cases} \tag{3}$$

The Bel function differs between context applications, but its choice generally follows two
rules: (1) the more important the evidence, the higher its belief and the greater the
corresponding correction factor, that is, important evidence is assigned a large weight;
(2) the higher the conflict of the evidence, the lower its belief and the smaller the
corresponding correction factor, that is, the evidence is amended more strongly.
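
A short Python sketch of the correction in formula (3) is given below. It assumes that the
correction factor Bel(E) is already known as a number in [0, 1] and that the sum in the second
case runs over all focal elements other than Θ; the function name and the example masses are
hypothetical.

```python
def discount(m, frame, belief):
    """Scale every mass except m(frame) by `belief`; assign the remainder to the frame."""
    if not 0.0 <= belief <= 1.0:
        raise ValueError("belief must lie in [0, 1]")
    corrected = {a: belief * v for a, v in m.items() if a != frame}
    corrected[frame] = 1.0 - sum(corrected.values())
    return corrected

# Hypothetical evidence with a correction factor of 0.5.
theta = frozenset({"S", "M", "F"})
m = {frozenset({"F"}): 0.6, frozenset({"M"}): 0.3, theta: 0.1}
m_corrected = discount(m, theta, 0.5)
```

With a factor of 0.5 the masses of {F} and {M} are halved to 0.3 and 0.15, and the remaining
0.55 moves to Θ, so less reliable evidence commits less mass to specific conclusions.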

Algorithm 2 Uncertainty reasoning based on the D-S theory
Require: Input a context instance in the instance set; // an instance in one layer of the context
Ensure: Output a context instance in the instance set; // the context instance after running the
    reasoning process
Procedure ds_reasoning(instance) {
    initialize conflict factor threshold value = k;
    for (all conclusions in instance) do
        // a conclusion is also a FRDFDS model instance
        get its evidence set;
        calculate the BPA value and the Bel value;
        calculate the conflict factor;
        if conflict_factor ≤ k then
            calculate the Bel value for this conclusion;
        else
            calculate the modified Bel value for the evidence set;
            calculate the Bel value for this conclusion;
        end if
    end for
}

The weighted combination rule was proposed to fuse the evidence, and it overcomes the
shortcomings of evidence theory when the evidence is highly conflicting. However, the
reasoning model of that method is not adaptive. Therefore, this paper improves the combination
rule with the following modified belief formula:

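$$m'(A) = (m'_1 \oplus m'_2 \oplus \cdots \oplus m'_n)(A) = \frac{\sum_{A_1 \cap \cdots \cap A_n = A} \omega_1 m'_1(A_1) \cdot \omega_2 m'_2(A_2) \cdots \omega_n m'_n(A_n)}{\sum_{A_1 \cap \cdots \cap A_n \neq \emptyset} \omega_1 m'_1(A_1) \cdot \omega_2 m'_2(A_2) \cdots \omega_n m'_n(A_n)} \tag{4}$$

$$\omega_i = \frac{\sum_{j=1, j \neq i}^{n} (1 - d_{BPA}(m_i, m_j))}{\sum_{i=1}^{q} \sum_{j=1, j \neq i}^{n} (1 - d_{BPA}(m_i, m_j))} \tag{5}$$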

Here dBPA(mi, mj) is the distance between mi and mj. Formula (5) reflects the different
importance and belief of the evidence, which resolves the one-vote veto problem and the
robustness problem in combining conflicting evidence. It makes full use of the information in
the conflicting evidence and avoids the loss of valid information.
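
The weights of formula (5) can be computed from a matrix of pairwise distances, as in the
Python sketch below. Two assumptions are made here: the distances dBPA(mi, mj) are supplied
precomputed with values in [0, 1], and the denominator sums the same support term over every
evidence source, so that the weights add up to 1. The function name and the example distances
are hypothetical.

```python
def weights(dist):
    """dist[i][j] holds d_BPA(m_i, m_j); returns one weight per evidence source."""
    n = len(dist)
    # Support of source i: how close it is to all the other sources.
    support = [sum(1.0 - dist[i][j] for j in range(n) if j != i) for i in range(n)]
    total = sum(support)
    if total == 0.0:
        raise ValueError("every pair of sources is at distance 1")
    return [s / total for s in support]

# Hypothetical distances: sources 0 and 1 agree, source 2 conflicts with both.
dist = [
    [0.0, 0.1, 0.8],
    [0.1, 0.0, 0.7],
    [0.8, 0.7, 0.0],
]
w = weights(dist)
```

The conflicting source receives the smallest weight instead of vetoing the combination, which
is the behaviour the text attributes to formula (5).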

In order to improve the speed of analysis, we do not look back at historical data; we only
analyze and update the current node data based on the rules and the D-S theory. This paper
uses the transitive property TransitiveProperty to update the state attribute information in
the Drupal module instead of a method based on a finite state machine. As shown in
Algorithm 3, this paper uses fuzzy similarity to update the fuzzy data on the same data node.
In the distributed, multi-layer, real-time uncertain context updating based on similarity, the
former contexts on different nodes are fused by the D-S evidence theory. Based on the
similarity of fuzzy sets, the new context is fused into the existing context during context
update processing. The fuzzy similarity is calculated as follows.

Fuzzy set similarity. Let the recognition frame be Θ = {A1, A2, · · · , An}, and let M and N
be two arbitrary fuzzy subsets of Θ. The similarity between them is

$$\psi(M, N) = 1 - \frac{1}{|\Theta|} \sum_{i=1}^{n} |\mu_M(A_i) - \mu_N(A_i)| \tag{6}$$
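
Formula (6) translates directly into Python. In the sketch below a fuzzy subset is a dictionary
from frame element to membership degree, and a missing element is treated as degree 0; both
choices, as well as the example degrees, are assumptions made for illustration.

```python
def similarity(mu_m, mu_n, frame):
    """psi(M, N) = 1 - (1/|frame|) * sum of |mu_M(A) - mu_N(A)| over the frame."""
    if not frame:
        raise ValueError("the frame must not be empty")
    diff = sum(abs(mu_m.get(a, 0.0) - mu_n.get(a, 0.0)) for a in frame)
    return 1.0 - diff / len(frame)

# Hypothetical membership degrees of the speed context at times t-1 and t.
frame = ["S", "M", "F"]
before = {"S": 0.0, "M": 0.2, "F": 0.9}
after = {"S": 0.1, "M": 0.4, "F": 0.8}
psi = similarity(before, after, frame)
```

Here the absolute differences are 0.1, 0.2 and 0.1, so the similarity is 1 − 0.4/3, and two
identical fuzzy subsets always have similarity 1.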

Algorithm 3 Real-time context update based on similarity
Require: Input Context[layers][instances] at times t and t-1; // layers represents the level of
    context; instances is a set of FRDFDS model instances
Ensure: Output the updated Context[layers][instances];
Procedure update_context(Context[layers][instances]) {
    for (all layers ∈ N) do
        for (all FRDFDS model instances in Context[layers][instances]) do
            initialize similarity threshold value = s;
            get the previous (t-1) instance p_instance;
            calculate the index of similarity between p_instance and instance;
            if index ≤ s then
                fuse instance with the previous instance p_instance;
            end if
        end for
    end for
}
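
The update loop of Algorithm 3 can be sketched in Python as follows. The nested-dictionary
layout, the injected `similarity` and `fuse` functions and the default threshold of 0.8 are
assumptions made to keep the example self-contained; the listing does not fix how two
instances are fused.

```python
def update_context(current, previous, similarity, fuse, s=0.8):
    """Fuse each instance with its t-1 counterpart when their similarity is at most s.

    current, previous: dict layer -> dict instance_id -> instance (times t and t-1).
    """
    for layer, instances in current.items():
        for key, instance in instances.items():
            p_instance = previous.get(layer, {}).get(key)
            if p_instance is None:
                continue                    # no counterpart at time t-1
            if similarity(p_instance, instance) <= s:
                instances[key] = fuse(p_instance, instance)
    return current
```

Instances that are sufficiently similar to their previous state are left untouched, which
limits the work done at each update.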

4 Experiment and evaluation

4.1 DMS

As shown in Figure 2, to monitor the operation of CTCS-3 equipment in real time, Chinese
high-speed trains are equipped with DMS, which consists of on-board information detecting
equipment, a ground data center and an inquiry terminal. The on-board information detecting
equipment collects data from the ATP, transponder, track circuit and RBC and transmits them to
the ground center through the GPRS/GSM-R/WLAN network. Through this remote monitoring method,
the ground center can monitor and handle the working states and faults of the on-board signal
equipment. The business logic of DMS includes data acquisition, storage and analysis, and
command. The main function of the data acquisition part is data sampling and aggregation. The
storage and analysis part comprehensively mines and analyzes the running status data of all
related systems, which enables early warning analysis and fault diagnosis. Based on the
storage and analysis, the command part provides a data sharing service for the daily
production and emergency dispatching of the electrical department and improves the efficiency
of its daily work.

4.2 Data analysis system framework

At present, the analysis capability of DMS needs to be further improved. The work of this
paper can be seen as an extension of DMS, integrated with it as shown in Figure 3. In
Figure 3, the middle part, marked with a dotted line, is the work of this paper; the top part
is the DMS UI, a set of functions that can be used in all aspects of railway operations; the
base part is the DMS data pool.

Figure 2: DMS network structure

Figure 3: Data analysis integration framework

As the left of the middle part of Figure 3 shows, our analysis method is based on a Drupal
module [12]. RDF is the standard framework for the semantic web and is also the recommended
framework for linked data. Linked data technology makes data interoperable, reusable and
easier to use. Drupal is the first mainstream content management system to support semantic
web technology in its core, and it can publish linked data by exposing content as RDF. Our
main work in this paper is laid out on the right of the middle part of the figure. We first
extend RDF to the FRDFDS model based on the D-S theory. All data in the DMS data pool are
expressed as FRDFDS data. We also customize the data reasoning rules on the nodes through our
reasoning rule module. Accordingly, the query plug-in was extended so that it programmatically
filters and displays content. We implemented the algorithms presented in this paper as two
functional plug-ins deployed as Drupal modules.

4.3 Experiment analysis

As shown in Figure 2, there are two types of DMS: one is deployed in the railway bureau and
the other in the Ministry of Railways. Our experiment was deployed in the DMS data center of
the Beijing communication and signaling section of the Beijing railway bureau. We selected the
Beijing-Tianjin inter-city train control system data as the research object. Specifically, we
selected the real-time data of two work areas from 6 am to 6 pm on one day as experimental
data. We collected and analyzed the performance of the algorithms every half hour. The
experiment is detailed, because we try to expose the various problems of our algorithm. The
experiments were divided into 3 groups that ran simultaneously, one on each of 3 IBM x3650 M4
servers. The first group analyzes the data of one work area. Its purpose is to analyze the
influence of the number of layers on the performance of the reasoning algorithms. The second
group analyzes the data of two work areas. Its purpose is to analyze the effect of an increase
in the number of nodes on the performance of the algorithm. The third group also analyzes the
data of one work area, but its analysis algorithm is replaced by the method proposed in [10].
We implemented the method in [10] and compared it with our reasoning method.

The three IBM x3650 M4 servers are connected to the DMS data center through a high-speed
network. Each has 2 Intel Xeon E5-2600 processors and 4 GB of memory and runs Ubuntu 14.04.
Drupal 8.0 with our algorithm was deployed on every server and can automatically generate a
variety of FRDFDS nodes.

The first group of experiments tested the distributed multi-level real-time uncertain context
reasoning method. The results are shown in Figure 4 (bar graph). We recorded the query time of
each of the four layers. The first layer (level0) is the Drupal-based data query time, which
reflects the performance of Drupal; the second, third and fourth layers (level1 to level3) are
higher-level knowledge reasoning queries. In preliminary work, we summarized 24 items of
knowledge that need to be inferred by the algorithm. These items include both certain and
uncertain knowledge and are distributed over these three layers. As the figure shows, the
performance of the algorithm does not decrease significantly from level0 to level2 as the
amount of data increases to a certain value, whereas the level3 calculation time increases
significantly. This is because we deployed several uncertain reasoning tasks, for example,
safety, flexibility and so on.

The second group of experiments analyzes the data of two work areas, which doubles the amount
of experimental data, and focuses on the efficiency of the algorithm in a distributed data
environment. As shown in Figure 4 (line graph), we compare the results of the first group
(sparse data) with those of the second group (dense data). The bar chart in the figure
represents the sparse data of one work area, and the line chart represents the dense data of
two work areas. When the amount of data increases, the performance of the method decreases,
because more data lead to more complex state rules and uncertainty calculations. However, the
processing time of our algorithm does not increase much with the amount of experimental data,
because the method works on the current node and updates it based on the fuzzy similarity at
the end of each calculation, as shown in Algorithm 3. This shows that our algorithm has good
scalability.

To evaluate the efficiency of our algorithm more intuitively, we measured the average
processing time of each layer for the different types of data. As shown in Figure 5, when the
algorithm handles the data of one work area (data1), the average time is 34.0 ms at level 0,
46.9 ms at level 1, 57.1 ms at level 2 and 98.3 ms at level 3. When the algorithm handles the
data of two work areas (data2), the average time is 47.5 ms at level 0, 65.0 ms at level 1,
86.2 ms at level 2 and 110.3 ms at level 3. Although the amount of data doubles, the level 3
processing time increases by only 12.0 ms. This also shows that the method performs well,
because it is similarity-based and avoids the calculation of many context nodes at the top level.
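The per-level growth implied by these figures can be checked with a few lines of arithmetic. The sketch below only reuses the average times quoted in the text; the names `data1`, `data2` and `growth` are illustrative and do not come from the original implementation.

```python
# Average per-level processing times in ms, copied from the text.
data1 = {"level0": 34.0, "level1": 46.9, "level2": 57.1, "level3": 98.3}   # one work area
data2 = {"level0": 47.5, "level1": 65.0, "level2": 86.2, "level3": 110.3}  # two work areas


def growth(before, after):
    """Return {level: (absolute increase in ms, relative increase in %)}."""
    return {
        level: (
            round(after[level] - before[level], 1),
            round(100.0 * (after[level] - before[level]) / before[level], 1),
        )
        for level in before
    }


increase = growth(data1, data2)
for level, (abs_ms, rel_pct) in increase.items():
    print(f"{level}: +{abs_ms} ms ({rel_pct}%)")
```

With these numbers, level 3 shows the smallest increase of the four levels in both absolute and relative terms, which is consistent with the claim that the top level benefits most from the similarity-based update.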

Figure 4: Performance comparison on different data sets

Figure 5: Performance comparison on different data density

We have implemented Jousselme’s method [10] and compared it with the reasoning method
in this paper. The purpose of experiment 3 is to evaluate the correctness of the proposed method
and of Jousselme’s method at 24 time points; the results are shown in Figure 6. In the initial
period, the correctness of the proposed method is slightly lower than that of the method in [10];
at time point 9, it begins to exceed that of the method in [10]; it then keeps improving, and at
time point 15 it reaches a steady state.

Due to the existence of conflicting evidence, reasoning with classical evidence theory may
produce incorrect results, and improved combination rules can mitigate this problem. The
method in [10] uses a weighted combination rule for evidence fusion to overcome the problem
of high conflict between pieces of evidence, but it lacks self-adaptability. On the basis of this
idea, in this paper the belief distribution table is first given by experts and is then adjusted
by human intervention during the concrete calculation. If there is no human intervention, the
original belief distribution table is used. In the first eight time points, we fine-tuned the belief
distribution table accordingly and achieved good results.
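To illustrate why highly conflicting evidence is a problem for classical evidence theory, the following minimal sketch applies Dempster's rule of combination to two sources that almost completely disagree. It also computes the distance between two bodies of evidence as it is commonly defined following [10]. The hypothesis names and mass values are hypothetical and chosen only for illustration; the sketch does not reproduce the weighted combination or the expert-adjusted belief distribution table used in this paper.

```python
import math
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its belief mass.
    Returns (combined mass function, conflict K).
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # product mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}, conflict


def jousselme_distance(m1, m2):
    """Distance sqrt(0.5 * (m1 - m2)^T D (m1 - m2)) with D(A, B) = |A & B| / |A | B|."""
    focal = list(set(m1) | set(m2))
    diff = [m1.get(f, 0.0) - m2.get(f, 0.0) for f in focal]
    total = 0.0
    for i, a in enumerate(focal):
        for j, b in enumerate(focal):
            total += diff[i] * diff[j] * len(a & b) / len(a | b)
    return math.sqrt(0.5 * max(total, 0.0))


# Hypothetical fault hypotheses, used only for this example.
SENSOR = frozenset({"sensor_fault"})
LINK = frozenset({"link_fault"})
SOFTWARE = frozenset({"software_fault"})

m1 = {SENSOR: 0.99, LINK: 0.01}    # source 1 strongly believes in a sensor fault
m2 = {SOFTWARE: 0.99, LINK: 0.01}  # source 2 strongly believes in a software fault

fused, conflict = dempster_combine(m1, m2)
distance = jousselme_distance(m1, m2)
print(f"conflict K = {conflict:.4f}")
print(f"fused mass on link_fault = {fused[LINK]:.4f}")
print(f"Jousselme distance = {distance:.4f}")
```

Both sources consider a link fault very unlikely, yet after normalization the fused result assigns essentially all mass to it. This is the kind of counterintuitive outcome that weighted combination rules and expert adjustment of the belief distribution are meant to avoid; the large distance between the two bodies of evidence signals the disagreement before fusion.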

Figure 6: Correctness of the proposed method and of the method in [10]

In general, as can be seen from the above experiments, the method in this paper is effective
in dealing with distributed, context-sensitive complex events of the Internet of Things. Moreover,
in large-scale networking applications, it has better performance and scalability than the general
method. In addition, combined with dynamic expert intervention, the method achieves high
correctness, and the experimental results show that it is well suited to the field of dynamic
control.

5 Related research

5.1 Railway intelligent transportation system

The railway management department has long used the "failure - repair" work mode. However,
efficient railway operation requires equipment to be maintained and repaired in a timely manner,
so that failures are avoided by analyzing the evolution of the equipment status [25]. The
distributed inspection and monitoring system continuously collects infrastructure status data
and train operating status data, which are converted into relevant information. The evolutionary
trend of this information can be analyzed to gain knowledge of the evolution of the infrastructure
state, which provides decision support for preventive maintenance of the infrastructure.

In recent years, with the development of railway informatization and intelligent
transportation management, China’s railway transport system [18] has gradually built a number
of application-oriented management information systems. At present, these information
management systems deal with different aspects of vehicle and infrastructure operation and
maintenance management, have independent organizational structures and produce different
forms of data. To use the data of each information system to support upper-level applications,
the problem of data sharing must be solved first. On this basis, intelligent processing of data
can be realized using data fusion analysis, expert systems, data mining, knowledge reasoning
and other technologies, in the fields of warning, traffic control, integrated scheduling, resource
management, operation management and fault analysis. This is the railway intelligent
transportation system (RITS), which is proposed according to the actual situation of China’s
railways and makes the information systems work together. To realize the RITS, scholars have
carried out much research, such as data sharing based on meta-data [5], a general data model
based on XML [15], IoT [24], cloud computing technology [8], knowledge reasoning [4] and
so on.

In summary, scholars have begun to study the use of cloud computing and distributed
systems to collect and analyze related information for rail transport, infrastructure management
and maintenance departments [21]. However, this work mainly focuses on transport planning,
operation and management. For the communication and signal systems, no relevant research
or applications have been reported.

The train control system and the GSM-R network used in high-speed trains are large and
complex systems. The application of new technologies has greatly increased the workload and
the difficulty of work in the infrastructure maintenance department. The variety of complex
data analysis processes greatly exceeds the current maintenance capacity. In addition,
maintenance personnel do not have enough time and energy to analyze the data, so a large
amount of data lies idle and is not fully utilized. How to effectively analyze the massive data
produced by various detection and monitoring devices and to extract useful rules and knowledge
is an urgent task for high-speed rail. Such analysis helps the maintenance department to achieve
rapid and accurate fault diagnosis and provides reasonable support for preventive maintenance
and maintenance decisions.

5.2 Context sensitive event processing

The context model plays an important role in the development and application of data
analysis systems in heterogeneous environments. Various context representation models have
been presented [19], including the key-value model, the object-oriented model and the
ontology-based model. The ontology is the best model for event context representation, but
the traditional ontology cannot handle uncertain knowledge, which limits its application in
uncertain event processing. In recent years there has therefore been some research on fuzzy
ontology models and reasoning: the logic-based fuzzy model [6] attempts to integrate fuzzy
logic into the ontology design structure [16]; the distributed fuzzy reasoning Petri net and the
fuzzy ontology [22] have been used extensively for distributed fuzzy reasoning [14].

The main challenge of a context-sensitive system is to make the right decision for the
user’s context in real time. Processing context data in an intelligent way is called context
reasoning. In recent years, there has been some research on the processing of context-sensitive
events. Zhou et al. proposed similarity-based context reasoning, which defines the similarity
between context models [26]. This method does not require the initial context information
and thus reduces the complexity of the reasoning process. Helmer et al. described a framework
for context event processing and summarized the context support of commonly used event
processing systems [9]. Taylor and Leidinger proposed a context-sensitive complex event
processing method based on ontology [23]. However, most current articles discuss the idea and
framework of context-sensitive complex event processing and lack the details of the processing
algorithms.

6 Conclusion

The China high-speed train control system is a combination of computer, communication
and control technologies. Its data is diverse, including sensor-generated data streams, GPS
signals, GSM-R transmission data, real-time video monitoring data, train control software data,
etc. This paper presents an efficient analysis method based on fuzzy linked data and uncertain
reasoning for high-speed train control system big data. We have used the proposed method to
analyze the data of the high-speed train control system. The experimental results show that
the proposed method has good efficiency and scalability for the analysis of big data with
different structures and types and with context sensitivity from the train control system. The
work of this paper is based on a real practical application. In the future, there is still a lot of
work to be done, such as the adaptive distribution of belief values, the RDF representation of
expert knowledge, and the architecture of a reasoning system based on Drupal.

Acknowledgment

This work is sponsored by the National Natural Science Foundation of China under Grant
No. 61502029.



Bibliography

[1] Anicic D. et al. (2011), Etalis: Rule-based reasoning in event processing, Reasoning in
Event-Based Distributed Systems, Volume 347 of the series Studies in Computational
Intelligence, Springer, 99-124, 2011.

[2] Aniello L., Baldoni R., Querzoni L. (2013), Adaptive online scheduling in storm, Proceedings
of the 7th ACM international conference on Distributed event-based systems, ACM, 207-218,
2013.

[3] Dierkx K. (2009), The Smarter Railroad: An Opportunity for the Railroad Industry, IBM
Institute for Business Value, 2009.

[4] Feljan A.V. et al. (2017), Framework for Knowledge Management and Automated Reasoning
Applied on Intelligent Transport Systems, arXiv preprint arXiv:1701.03000, 2017.

[5] Gregor D. et al. (2016), A methodology for structured ontology construction applied to
intelligent transportation systems, Computer Standards & Interfaces, 47, 108-119, 2016.

[6] Gu L. et al. (2014), Trust Model in Cloud Computing Environment Based on Fuzzy Theory,
International Journal of Computers Communications & Control, 9(5), 570-583, 2014.

[7] Hartig O., Bizer C., Freytag J.C. (2009), Executing SPARQL queries over the web of linked
data, The Semantic Web-ISWC, 293-309, 2009.

[8] He S., Song R., Chaudhry S.S. (2014), Service-oriented intelligent group decision support
system: application in transportation management, Information systems frontiers, 16(5),
939-951, 2014.

[9] Helmer S., Poulovassilis A., Xhafa F. (2011), Introduction to Reasoning in Event-Based
Distributed Systems, Reasoning in Event-Based Distributed Systems, Springer Berlin Hei-
delberg, Vol 347, 1-10, 2011.

[10] Jousselme A.L., Grenier D., Bosse E. (2001), A new distance between two bodies of
evidence, Information Fusion, 2(2), 91-101, 2001.

[11] Kondepudi S., Baekelmans J. (2012), Service Delivery Platform: The Foundation of Smart+
Connected Communities, Cisco Smart+ Connected Communities Institute, 2012.

[12] Kumar N. et al. (2016), Drupal 8 Development: Beginner’s Guide, Packt Publishing Ltd,
2016.

[13] Lassila O., Swick R.R. (1999), Resource description framework (RDF) model and syntax
specification, W3C Recommendation, 22 February 1999.

[14] Liu H.C. et al. (2017), Fuzzy Petri nets for knowledge representation and reasoning: A
literature review, Engineering Applications of Artificial Intelligence, 60, 45-56, 2017.

[15] Medjoudj M., Yim P. (2007), Extraction of critical scenarios in a railway level crossing
control system, International Journal of Computers Communications & Control, 2(3),
252-268, 2007.

[16] Nadaban S. (2015), Fuzzy continuous mappings in fuzzy normed linear spaces, International
Journal of Computers Communications & Control, 10(6), 74-82, 2015.



[17] Ning B., Tang T., Qiu K., Gao C., Wang Q. (2004), CTCS-Chinese Train Control System,
Computers in Railways, WIT Press, 393-399, 2004.

[18] Ning B. et al. (2006), Intelligent railway systems in China, IEEE Intelligent Systems, 21(5),
80-83, 2006.

[19] Perera C. et al. (2014), Context aware computing for the internet of things: A survey, IEEE
Communications Surveys & Tutorials, 16(1), 414-454, 2014.

[20] Roop S.S., Ruback L.G. (2001), Intelligent rail crossing control system and train tracking
system, U.S. Patent 6,179,252, 2001.

[21] Tan X., Ai B. (2011), The issues of cloud computing security in high-speed railway, Elec-
tronic and Mechanical Engineering and Information Technology (EMEIT), 2011 Interna-
tional Conference on. IEEE, 8, 4358-4363, 2011.

[22] Rehman Z., Kifor C.V. (2016), An Ontology to Support Semantic Management of FMEA
Knowledge, International Journal of Computers Communications & Control, 11(4), 507-521,
2016.

[23] Taylor K., Leidinger L. (2011), Ontology-driven complex event processing in heterogeneous
sensor networks, ESWC’11 Proceedings of the 8th extended semantic web conference on The
semantic web: research and applications, Part II, 285-299, 2011.

[24] Zhang N. et al. (2014), Optimization scheme of forming linear WSN for safety monitoring
in railway transportation, International Journal of Computers Communications & Control,
9(6), 800-810, 2014.

[25] Zheng W., Hu N. (2015), Automated test sequence optimization based on the maze
algorithm and ant colony algorithm, International Journal of Computers Communications &
Control, 10(4), 593-606, 2015.

[26] Zhou H., Wang Y., Cao K. (2013), Fuzzy DS theory based fuzzy ontology context modeling
and similarity based reasoning, Computational Intelligence and Security (CIS), 2013 9th
International Conference on, IEEE, 707-711, 2013.