

 CHEMICAL ENGINEERING TRANSACTIONS  
 

VOL. 77, 2019 

A publication of 

 
The Italian Association 

of Chemical Engineering 
Online at www.cetjournal.it 

Guest Editors: Genserik Reniers, Bruno Fabiano 
Copyright © 2019, AIDIC Servizi S.r.l. 
ISBN 978-88-95608-74-7; ISSN 2283-9216 

How Big Data & Analytics Can Improve Process and Plant 
Safety and Become an Indispensable Tool for Risk 

Management 
Pankaj Goel a,b, Hans Pasman a,*, Aniruddha Datta b
a Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, 
b Department of Electrical and Computer Engineering Texas A&M University, College Station, Texas 77843-3122 
hjpasman@gmail.com 

With the advances in digitization, Information Technology (IT), and connected devices, data are becoming 
plentiful. And with the past 30 years of developments of Artificial Intelligence tools leading to great 
enhancements in dealing with various levels and types of uncertainty, much has become tangible, where in 
the past it used to remain vague and fuzzy. Tools like neural networks can distil information from datasets, 
while probabilistic methods can characterize randomness. Bayesian causation networks enable finding critical 
pathways and help to design and monitor effective safeguards, while Petri nets enable analysis of time-critical 
events. Interval analysis, Dempster-Shafer theory, and fuzzy logic can assist in delimiting uncertainty in 
measurement results and expert judgment. System dynamics modeling and Functional resonance analysis 
may unravel interactively degrading processes. All this can improve understanding of the communication lines
and interaction mechanisms within a plant's socio-technical system, and of their influence on achievement and
performance. This will result in reformed work processes and manufacturing conditions, and will help in identifying
abnormal trends. Therefore, while planning and prediction are based on observational evidence and trends,
the new technologies will strongly support management in recognizing and evaluating risks, including
safety risks. Although applications of big data and analytics are still young, a few achievements have already
been demonstrated in process control and in reliability prediction of equipment. However, much
more is possible. For example, in the case of process safety performance indicators, lagging indicators are
usually available, but the techniques may stimulate the recording of the more important leading indicators for
predicting the trend of safety and culture in a company in relation to its economic health. The paper
presents more details on the methods and an example of dynamic risk mapping.

1. Introduction 

With the advances in technology over the past four decades, process plants use different control systems such as
Programmable Logic Controllers (PLC), Distributed Control Systems (DCS), and Supervisory Control and Data
Acquisition (SCADA) systems for monitoring and controlling plant operations (Goel et al., 2017b). At the
same time, with developments in IT, communication methods, and connected devices, process plants are
producing very large amounts of data in different forms, stored in ‘data lakes’ (data warehouses). This requires
new and innovative approaches and methods to create Business Intelligence and actionable insights. The
industry can benefit significantly from the use of intelligent systems and big data analytics methods. Several
attributes, such as volume, variety, velocity, value, veracity, variability, and valence, characterize Big Data
(Goel et al., 2017a). Figure 1a shows different data collected during process plant operations. Static data
are data or reports generated over a period that remain fixed for a considerable amount of time, while
dynamic data are continuous and change with time. Structured data refers to data in a
table or specific report format, while unstructured data refers to data primarily expressed as text. The collected
data are usually raw and require pre-processing, cleaning, and analysis to derive the expected
information for decision making. Figure 1b highlights the various data analysis types such as descriptive,

                                
DOI: 10.3303/CET1977127 

 
Paper Received: 7 December 2018; Revised: 18 May 2019; Accepted: 23  June  2019 

Please cite this article as: Goel P., Pasman H., Datta A., 2019, How Big Data & Analytics can improve process and plant safety and become an 
indispensable tool for risk management, Chemical Engineering Transactions, 77, 757-762  DOI:10.3303/CET1977127  



diagnostic, predictive, and prescriptive, and the relation between analysis, decision making, and human input.
Human input is highest at the descriptive analysis level and lowest at the prescriptive analysis level.

2. Data processing methods (analytics) 

Many methods are available for making a prediction about a situation from data that contain uncertainty.
Apart from conventional statistics, in a sequence of increasing
uncertainty handling and decreasing quantitative character, we shall briefly consider here Bayesian causation
networks (BN), interval analysis, Dempster-Shafer theory, fuzzy logic, and functional resonance analysis.
If a cause-effect structure can be derived and cause probabilities, e.g., of failures, are given in discrete or
continuous form, a Bayesian network (BN), with nodes representing the variables and directed edges representing
the causal links, is the most obvious choice. For practitioners, Fenton and Neil (2012) give a clear and
applications-oriented description of BN; Kjaerulff and Madsen (2008) delve deeper. Both prediction of an effect
probability and inference of the most probable cause based on observations are possible. A drawback is that
feedback loops are not allowed. The power of the Bayes approach is that prior information can be updated
with new evidence to yield a posterior distribution.
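
As an illustration of the two directions of inference mentioned above, the following minimal sketch treats a hypothetical two-node network (cause: pump failure; effect: high-level alarm). The node names and all probabilities are assumptions chosen for illustration only; real networks have many nodes and are usually handled with dedicated BN software.

```python
# Hypothetical two-node network: Cause (pump fails) -> Effect (high-level alarm).
# All probabilities below are illustrative assumptions, not plant data.

def predict_effect(p_cause, p_effect_given_cause, p_effect_given_no_cause):
    """Predictive inference: marginal probability of the effect."""
    return (p_effect_given_cause * p_cause
            + p_effect_given_no_cause * (1.0 - p_cause))


def posterior_cause(p_cause, p_effect_given_cause, p_effect_given_no_cause):
    """Diagnostic inference: Bayes' theorem, given that the effect is observed."""
    p_effect = predict_effect(p_cause, p_effect_given_cause, p_effect_given_no_cause)
    return p_effect_given_cause * p_cause / p_effect


prior = 0.01  # assumed prior probability of pump failure
p_alarm = predict_effect(prior, 0.9, 0.05)
posterior = posterior_cause(prior, 0.9, 0.05)
print(f"P(alarm) = {p_alarm:.4f}, P(pump failed | alarm) = {posterior:.4f}")
```

Observing the alarm raises the probability of the assumed cause well above its prior, which is the updating step referred to in the text.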

          
Figure 1a: Process operations data (Goel et al., 2017a)
Figure 1b: Data analysis types and actions
(PFDs: Process Flow Diagrams; P&IDs: Piping & Instrumentation Diagrams; SOPs: Standard Operating Procedures;
CMMS: Centralized Maintenance Management System; LIMS: Laboratory Information Management System)

Interval analysis is applied when the bounds of a value are certain but information within that range is vague.
Alefeld and Mayer (2000) treat interval arithmetic. In risk assessment it is used to express and handle
imprecise probability, typical for epistemic uncertainty (lack of knowledge). Combination with random
(aleatory) information can be shown in a p-box plot, e.g., as a fuzzy cumulative normal distribution whose mean
and/or standard deviation lie within bounds; see also Choudhary et al. (2016). Expert probability estimates form a
category of imprecise probability to which the Dempster-Shafer (DS) belief approach (Shafer, 1976; Dempster, 2008) is
applicable. The subjective expert answers for the probability of event occurrence and non-occurrence need
not sum to 1, as probability theory requires. Therefore, one distinguishes three sections in the interval [0, 1],
separated by belief and plausibility. The interval up to belief is often taken as what the expert is at least sure
of, and plausibility as the highest estimate bound; the part between belief and plausibility represents ignorance. The expert is
assigned a reliability value. To combine the answers of more than one expert with different reliability, the DS
rule has been developed. Sentz and Ferson (2002) tied the DS approach and that of p-box plots together.
Certa et al. (2017) applied the DS rule to combine expert risk estimates as part of failure mode, effects and
criticality analysis (FMECA).
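
The combination of two expert answers can be sketched as follows. This is a minimal illustration of Dempster's rule of combination on a two-element frame; the frame {fail, ok} and the mass values are assumptions, and the reliability weighting of experts mentioned above is not included.

```python
from itertools import product

# Frame of discernment for a hypothetical barrier: it either fails or works.
FRAME = frozenset({"fail", "ok"})


def combine(m1, m2):
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalize."""
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}


def belief(m, hypothesis):
    """Sum of masses of all subsets of the hypothesis."""
    return sum(mass for s, mass in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Sum of masses of all sets that intersect the hypothesis."""
    return sum(mass for s, mass in m.items() if s & hypothesis)


# Illustrative expert answers; mass on FRAME expresses ignorance.
expert_1 = {frozenset({"fail"}): 0.6, frozenset({"ok"}): 0.1, FRAME: 0.3}
expert_2 = {frozenset({"fail"}): 0.5, frozenset({"ok"}): 0.2, FRAME: 0.3}

fused = combine(expert_1, expert_2)
fail = frozenset({"fail"})
print(f"Bel(fail) = {belief(fused, fail):.3f}, Pl(fail) = {plausibility(fused, fail):.3f}")
```

The gap between the printed belief and plausibility is the remaining ignorance after both answers have been merged.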
Zadeh (1965) designed fuzzy sets and logic to deal with vague information. Among many others, Wierman
(2010) described the approach. There are many applications. If a variable value or object description cannot
be characterized sharply, one can nevertheless indicate a center and left and right extremes (fuzzification).
Membership increases from 0 at the extremes to 1 at the center; linear
or curved membership functions can be defined. If variables have to be combined, logic in the form of
IF-THEN-ELSE rules applies. Executing the logic is called inference. Results can be defuzzified either to the
centroid of a fuzzy set (Mamdani model) or to a constant or function (Sugeno model). The approach is heavily
applied in control systems because in complex systems the benefit of the fuzzy approach outweighs that of precision. More recently,
type-2 fuzzy sets have evolved, in particular the interval version. If experts first independently of each
other grade a value linguistically (e.g., high, medium, low) or as an index, and then independently indicate on a
continuous scale an interval for the grade, the mathematical treatment developed by Liu and Mendel (2008)
merges the information into an objective result, facilitating decision-making.
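
Fuzzification, a single rule, and centroid defuzzification can be sketched as below. This is a minimal Mamdani-style illustration; the fuzzy sets "level is high" and "drain rate is large", their corner points, and the percent scales are hypothetical.

```python
def triangular(x, left, center, right):
    """Triangular membership: 0 at the extremes, rising linearly to 1 at the center."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def centroid(membership, lower, upper, steps=2000):
    """Centroid defuzzification, approximated by midpoint sampling on [lower, upper]."""
    width = (upper - lower) / steps
    weighted_sum = 0.0
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        mu = membership(x)
        weighted_sum += x * mu
        total += mu
    if total == 0.0:
        raise ValueError("membership is zero everywhere on the interval")
    return weighted_sum / total


# Hypothetical fuzzy sets on percent scales, chosen only for illustration.
def level_is_high(level):
    return triangular(level, 60.0, 85.0, 110.0)


def drain_rate_is_large(rate):
    return triangular(rate, 50.0, 80.0, 100.0)


# Rule: IF level is high THEN drain rate is large.
# Inference clips the output set at the firing strength of the rule (min operator).
firing_strength = level_is_high(72.5)


def clipped_output(rate):
    return min(firing_strength, drain_rate_is_large(rate))


crisp_drain_rate = centroid(clipped_output, 50.0, 100.0)
print(f"firing strength = {firing_strength:.2f}, drain rate = {crisp_drain_rate:.1f} %")
```

A Sugeno model would replace the clipped output set by a constant or a function of the inputs, so no centroid calculation would be needed.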
Last but not least is the Functional Resonance Analysis Method (FRAM) developed by Hollnagel (2017). It
serves to analyze variability in a socio-technical system (STS) and to determine causal structures for
scenarios. An STS is a hierarchical organizational structure of layers, connected by communication lines,
that controls a technical process. Lack of transparency in such a complex system blurs causation. Each FRAM node
describes a system function and is modeled as a hexagon whose vertices are contacts for Input (I);
Output (O); Resources (R) consumed; Constraints/controls (C); Preconditions (P); and Time (T). By not
detailing the process inside a node but connecting appropriate vertices of different functions, FRAM supports
causation thinking.
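
The node-and-coupling structure described above can be captured in a small data structure. The sketch below is only one possible representation; the class names and the three example functions are hypothetical and are not taken from FRAM tooling.

```python
from dataclasses import dataclass, field

# The six aspects of a FRAM function, as listed in the text.
ASPECTS = {"I": "Input", "O": "Output", "R": "Resources",
           "C": "Constraints/controls", "P": "Preconditions", "T": "Time"}


@dataclass(frozen=True)
class Coupling:
    source: str  # function whose Output is passed on
    target: str  # function that receives it
    aspect: str  # aspect of the target that receives it (I, R, C, P or T)


@dataclass
class FramModel:
    functions: set = field(default_factory=set)
    couplings: list = field(default_factory=list)

    def add_function(self, name):
        self.functions.add(name)

    def couple(self, source, target, aspect):
        """Connect the Output of `source` to one aspect of `target`."""
        if source not in self.functions or target not in self.functions:
            raise KeyError("both functions must be added before coupling")
        if aspect not in ASPECTS or aspect == "O":
            raise ValueError("an Output can only feed I, R, C, P or T of another function")
        self.couplings.append(Coupling(source, target, aspect))

    def upstream(self, target):
        """List (source function, receiving aspect) pairs that feed `target`."""
        return [(c.source, c.aspect) for c in self.couplings if c.target == target]


# Hypothetical functions for a drum-level scenario.
model = FramModel()
for name in ("Measure drum level", "Issue work permit", "Start drain pump"):
    model.add_function(name)
model.couple("Measure drum level", "Start drain pump", "I")
model.couple("Issue work permit", "Start drain pump", "P")
print(model.upstream("Start drain pump"))
```

Listing the upstream couplings of a function shows which variability can propagate into it without modeling what happens inside the node.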

3. Applications 

One approach that would benefit from the developments sketched in the Introduction is the Process Resilience
Analysis Framework (PRAF) developed by Jain et al. (2017). This method of ultimate resort integrates
process plant data, simulation, and optimization to find the bounds of the operating region
in which a plant can operate efficiently and safely (Jain et al., 2018a). The approach relies on the
integration of technical and social factors in the process plant under study (Jain et al., 2018b and c), and it
assumes reliable dynamic risk assessment for decision making. Now, with the availability of data
streams and analytical methods, dynamic and operational risk management can be made much more effective.
In the following sections we shall give an example.

3.1 Dynamic Risk mapping 

Facilities have various subsystems and/or components with complex interactions, which result in a
changing operations environment. This affects the risk profile of the facilities, and hence it is important to study
the emergent behavior of these interactions within the complex systems. So far, the body of literature
concerned with dynamic risk profiles due to emergent behavior of complex process systems using big data
analytics is small. In this section, a systematic methodology is developed and described. For this purpose, the
process unit system is represented as a system of layers, as illustrated in Figure 2. Based on this system of
layers, a dynamic risk profile is obtained by incorporating the wealth of data generated in the facility
from various sources, such as historic information, the Centralized Maintenance Management System (CMMS),
operational data, and the Process Safety Management (PSM) system, in the form of indicators (Jain et al., 2018b).
With real plant data, the risk could also be assessed using contributions from safety culture survey
data, audit reports, and more.

 
Figure 2: Dynamic risk mapping layers; the blue boxes will receive the data streams for the parameters 
determining the risk (PM: Predictive Maintenance; LFIs: Learning From Incidents), from Goel et al. (2017a) 

The dynamic risk evaluation involves different steps similar to a Layer of Protection Analysis (LOPA) study. As
illustrated in Figure 4, the following step-wise methodology involves layer-wise analysis from the plant
layer to the safeguards layer to calculate the final risk as low, medium, or high, as indicated in the matrix of Figure 2.



Step 1 Scenario identification: The scenario is defined in detail by applying basic fault and event trees. Fault tree
analysis helps to identify the initiating and basic events leading to the top event. Event tree analysis supports
the identification of the safety barriers in place to prevent and mitigate the consequence. The scenario probability (see Figure 4, layer 2)
is evaluated from the scenario analysis in the form of the initiating scenario probability leading to a risk of major
consequence. Depending on the scenario, this follows different combinations of AND/OR gate calculations.
Step 2 Plant operations assessment: This step deals with the identification of the dynamic factors based on the
operational hazard layer. These could stem from issued work permits, ongoing SIMOPS (simultaneous
operations), transient operations, previous events, and hazardous area classification. The outcome of this step is the
Operations Hazards Factor, acting as an additional factor leading to increased event probability: in the
conventional case it is not considered; in the dynamic case it is ≤ 1. Contributions to the Operations Hazards Factor by various operational
activities are time-averaged and composed as AND gates; the smaller the value, the larger the effect.
Step 3 Barrier health assessment: This step combines identification of the existing control and recovery
barriers available for the scenario with assessment of their health, based on the condition of items from the
safety barriers layer. Here, the barrier failure probability is evaluated after dividing the probability of failure on demand (PFD) of each
protection layer, assumed independent of the others (IPLs), by a corresponding penalty factor. The penalty
factors are determined based on indicators of maintainability, availability, replacement, and audit (see the right
side of Figure 3). The barrier failure probability is derived as the product of the penalty-factor-adapted PFDs (LOPA approach).
Step 4 Calculation: The final step is to calculate the risk of a major consequence occurring from the collected
operations data. The proposed method follows the LOPA approach, incorporating additional factors based on the
data from dynamic operations. Equation (1) is used to calculate the risk of a major consequence:

\[ \text{Risk} = \text{Scenario probability} \times \text{Barrier failure probability} \,/\, \text{Operations Hazards Factor} \qquad (1) \]

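
Steps 1 to 4 can be sketched in a few lines of code. The sketch assumes that Equation (1) multiplies the scenario probability by the product of the penalty-adapted PFDs and divides by the Operations Hazards Factor, and that penalty factors lie in (0, 1]; all function names and numbers are hypothetical and do not reproduce the values of Figure 4.

```python
from math import prod


def or_gate(probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(values):
    """Product of independent contributions; equals 1 for an empty sequence."""
    return prod(values)


def adapted_pfd(pfd, penalty_factor):
    """PFD divided by its penalty factor (assumed to lie in (0, 1]), capped at 1."""
    if not 0.0 < penalty_factor <= 1.0:
        raise ValueError("penalty factor is assumed to lie in (0, 1]")
    return min(1.0, pfd / penalty_factor)


def risk(scenario_probability, barriers, operations_factors=()):
    """Assumed form of Equation (1): scenario * barrier term / operations hazards factor."""
    barrier_term = prod(adapted_pfd(pfd, penalty) for pfd, penalty in barriers)
    operations_hazards_factor = and_gate(operations_factors)
    if operations_hazards_factor <= 0.0:
        raise ValueError("operations hazards factor must be positive")
    return scenario_probability * barrier_term / operations_hazards_factor


# Step 1: two hypothetical initiating events combined with an OR gate (per year).
scenario = or_gate([0.05, 0.05])

# Conventional case: healthy barriers (penalty 1) and no operational hazards.
static_risk = risk(scenario, [(0.1, 1.0), (0.1, 1.0), (0.01, 1.0)])

# Dynamic case: degraded barriers and ongoing hot work (factor 0.4, as in the text's example).
dynamic_risk = risk(scenario, [(0.1, 0.5), (0.1, 1.0), (0.01, 0.8)],
                    operations_factors=[0.4])

print(f"static: {static_risk:.2e} /yr, dynamic: {dynamic_risk:.2e} /yr")
```

With these illustrative inputs the dynamic risk is several times the static one, because both the degraded barriers and the operations factor below 1 raise the result.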
3.2 Example case study 

An accident scenario is considered to analyze and map the dynamic risk profile. This type of dynamic risk
profile analysis would support more informed operational decisions, improved maintenance plans, work
execution strategies, and overall safer and more reliable operations. Data mining is performed as
follows: discrete parameter values (true [1] or false [0]) are read by the risk
calculation module in a suitable sequence of time frames. Besides the parameter values, inputs to the risk
calculation module are user-defined weights for the fourth-layer parameters, expressing the degree of
effectiveness of the relevant parameter.

 
Figure 3: Left: K.O. drum with piping (Talebberrouane et al., 2016); Right: Relevant penalty scores 

The example scenario (Figure 3) concerns a knock-out drum (K.O.D.), which includes a level switch and a level
transmitter indicator. During normal operation the process stream is captured in the K.O.D. The liquid from
the process stream is discharged with the help of pumps as soon as the level reaches a set point measured
by the level switch. High level occurs 2 to 3 times per day. If the level continues to increase beyond high level, a hazardous
situation with major risk arises from liquid discharge to the flare stack, causing liquid carryover and
spreading of fire or even explosion. For the purpose of the study, we assume that level indication may
malfunction, that of the two pumps in the process stream one is under maintenance and the other may fail to
start, and that the upstream process may be under upset condition (isolation valve fails). In this case, the
following three barriers are available: Barrier 1: High level switch and Basic Process Control System (BPCS)
cutting off flow to the K.O.D. with an operator response; Barrier 2: Operator checking that the BPCS is working; and
Barrier 3: Pressure Relief Valve connected to the vent line. In conventional risk assessment analysts do not
explicitly consider various variables related to human and organizational factors, nor do they consider changes
in the conditions and in input data. The latter, such as component failure data, may have been determined over



the years in the plant but are often estimates based on information from elsewhere. The effects of correlation
and dependencies are usually ignored. Even for this simple scenario, variants are imaginable that may
worsen the situation, such as sticking of the pressure relief valve. In any case, for this simplified example, in
the static quantitative risk assessment (QRA), maintenance influences, which can appear as issued work permits, nearby maintenance
SIMOPS, which can be a threat to the plant, and other events are not considered (the Operations Hazards Factor is 1 in layer 3 of the left
table of Figure 4). Hence, because operational hazards and the health or robustness of barriers are ignored, the
calculated risk seems Low (rounded value 2 × 10⁻⁵/yr).

 
Figure 4: Left: Conventional risk analysis approach result; Right: Dynamic risk mapping result (NA is not 
active).  

However, the dynamic risk mapping approach developed in this study uses data from the plant that inform us
about various parameters of the operations (layer 3), such as whether hot work is taking place. Also, the health of
barriers can be monitored through maintenance inspection and testing results (layer 4). If needed, this can be
followed by repair or, e.g., replacement with a similar instrument, hence confirming availability or not. When
activities are ongoing, hazard values for different operations are assigned based on experience, for example, a
value of 0.4 for hot work. For these values, expert estimates can be used, applying the methods described in
Section 2. In the course of time updates may be established. This way we get a different value of risk
depending upon the actual daily operations in the plant. In this example scenario the value of risk at a certain
time and under given conditions is calculated to be Medium (rounded value 6 × 10⁻⁴/yr). Hence,
dynamic risk mapping, by considering more realistic scenarios and failure values, yields a very
different and more realistic value of risk. This risk value is not constant and may change depending on various
key scenarios during plant operations. In reality even more factors can be taken into account. Data on the
reliability of safety-critical components, such as the pressure relief valve or the level switch, can be collected, and
by Bayesian updating the values can be made more realistic over time. The limited length of this paper
does not allow a more detailed description of how a risk module for this purpose is built, but it is obvious that
for each HAZID-based scenario a cause-effect structure, basically a bowtie, will be built in the form of a
Bayesian or Petri network. In the future, additional indicators can be included, such as pipe vibration, sudden
gas concentration, or power usage. As Albalawi et al. (2017) contend, it is even conceivable to pick up from a



safety-Lyapunov-based economic model predictive controller a signal of a large process disturbance. And if an
event occurs, such a risk module system may help to detect more quickly where the cause of a disturbance may be
found. The system should be able to handle risk during transient concentrations of exposed people and during turnaround
operations, as the latter are prone to accidents. With Big Data & Analytics a real-time risk dashboard
comes within reach.
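
The Bayesian update of component reliability data mentioned above can be illustrated with a conjugate Beta-binomial model. This is one common choice and is not stated in the paper; the prior parameters and the test record below are hypothetical.

```python
def update_pfd(prior_alpha, prior_beta, failures, demands):
    """Update a Beta(alpha, beta) prior on the PFD with observed demand outcomes.

    Returns the posterior parameters and the posterior mean PFD.
    """
    if not 0 <= failures <= demands:
        raise ValueError("failures must lie between 0 and the number of demands")
    post_alpha = prior_alpha + failures
    post_beta = prior_beta + (demands - failures)
    return post_alpha, post_beta, post_alpha / (post_alpha + post_beta)


# Hypothetical generic prior for a relief valve: Beta(1, 99), mean PFD 0.01.
# Hypothetical plant record: 2 failures in 50 proof tests.
alpha, beta, mean_pfd = update_pfd(1.0, 99.0, failures=2, demands=50)
print(f"posterior Beta({alpha:.0f}, {beta:.0f}), mean PFD = {mean_pfd:.3f}")
```

As plant evidence accumulates, the posterior mean moves away from the generic prior toward the observed failure fraction, so the PFD fed into the risk module becomes plant-specific.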

4. Conclusions 

In this study, the authors established a systematic methodology to determine the dynamic risk profile of process
plants. An easy-to-use, Excel-based prototype user interface has been developed for this methodology.
The approach utilizes data that are already recorded in the process plant system and can provide real-time risk
profiles. The method was demonstrated using an example case study of a high liquid level scenario
in a K.O.D. and compared with the conventional method of risk analysis.

Acknowledgments 

We thank the late Dr M. Sam Mannan for his endeavour and stimuli to advance process safety and risk
assessment research for a safer world, and for encouraging this work.

References  

Albalawi, F., Durand H. and Christofides P.G., 2017, Distributed Economic Model Predictive Control for Operational
Safety of Nonlinear Processes, AIChE Journal, 63 (8), 3404-3418.

Alefeld, G. and Mayer G., 2000, Interval analysis: theory and applications, Journal of computational and applied 
mathematics, 121 (1-2), 421-464. 

Certa, A., Hopps F., Inghilleri R. and La Fata C.M., 2017, A Dempster-Shafer Theory-based approach to the Failure 
Mode, Effects and Criticality Analysis (FMECA) under epistemic uncertainty: application to the propulsion system 
of a fishing vessel, Reliability Engineering and System Safety, 159, 69-79. 

Choudhary, A., Voyles I.T., Roy C.J., Oberkampf W.L. and Patil M., 2016, Probability Bounds Analysis Applied to the 
Sandia Verification and Validation Challenge Problem, ASME Journal of Verification, Validation and Uncertainty 
Quantification, 1 (1), 011003 1-13. 

Dempster, A.P., 2008, Upper and lower probabilities induced by a multivalued mapping. Classic Works of the 
Dempster-Shafer Theory of Belief Functions, Springer, 57-72. 

Fenton, N. and Neil M., 2012, Risk assessment and decision analysis with Bayesian networks, CRC Press.

Goel, P., Datta A. and Mannan M.S., 2017a, Application of big data analytics in process safety and risk
management. Big Data (Big Data), 2017 IEEE International Conference on, IEEE.

Goel, P., Datta A. and Mannan M.S., 2017b, Industrial alarm systems: Challenges and opportunities, Journal of Loss
Prevention in the Process Industries, 50, 23-36.

Hollnagel, E., 2017, FRAM: the functional resonance analysis method: modelling complex socio-technical systems,
CRC Press.

Jain, P., Pasman H.J., Waldram S.P., Rogers W.J. and Mannan M.S., 2017, Did we learn about risk control since
Seveso? Yes, we surely did, but is it enough? An historical brief and problem analysis, Journal of Loss
Prevention in the Process Industries, 49, 5-17.

Jain, P., Pasman H.J., Waldram S., Pistikopoulos E. and Mannan M.S., 2018a, Process Resilience Analysis
Framework (PRAF): A systems approach for improved risk and safety management, Journal of Loss Prevention
in the Process Industries, 5, 61-73.

Jain, P., Mentzer R. and Mannan M.S., 2018b, Resilience metrics for improved process-risk decision making: survey, 
analysis and application, Safety science, 108, 13-28. 

Jain, P., Rogers W.J., Pasman H.J. and Mannan M.S., 2018c, A resilience-based integrated process systems 
hazard analysis (RIPSHA) approach: Part II management system layer, Process Safety and Environmental 
Protection, 118, 115-124. 

Kjaerulff, U.B. and Madsen A.L., 2008, Bayesian networks and influence diagrams, Springer Science+ Business 
Media, 200, 114. 

Liu, F. and Mendel J.M., 2008, Encoding words into interval type-2 fuzzy sets using an interval approach, IEEE 
transactions on fuzzy systems, 16 (6), 1503-1521. 

Sentz, K. and Ferson S., 2002, Combination of evidence in Dempster-Shafer theory, Citeseer.

Shafer, G., 1976, A mathematical theory of evidence, Princeton University Press.

Talebberrouane, M., Khan F. and Lounis Z., 2016, Availability analysis of safety critical systems using advanced fault
tree and stochastic Petri net formalisms, Journal of Loss Prevention in the Process Industries, 44, 193-203.

Wierman, M.J., 2010, An introduction to the mathematics of uncertainty, Creighton University, 149-150.

Zadeh, L.A., 1965, Fuzzy sets, Information and Control, 8 (3), 338-353.
